An executable model for Biodiversity Net Gain 4.0
Natural England published the latest version of the Biodiversity Metric - BNG 4.0 in April 2023. It remains under continuing review. BNG 4.0, possibly amended, is likely to become mandatory when the remaining parts of the Environment Act 2021 come into force towards the end of the year. It is a very comprehensive suite of documents with supporting tools designed to aid in the calculation of biodiversity net gain.
A full baseline habitat analysis for a proposed development together with the proposed habitat changes requires the gathering, organisation and processing of a large amount of data. A macro-enabled Excel spreadsheet has been provided in the package, which is expected to be used widely, if not exclusively, for nearly all BNG calculations. Spreadsheets are, however, inherently difficult to comprehend, check and test.
This project is an attempt to provide an open source online tool to perform the same analysis which is comprehensible, checkable against the BNG guidance and testable.
Spreadsheets
Wonderfully useful though spreadsheets are, they do have some major downsides.
Unreadable Formula
Computation is defined using expressions which reference cells or ranges of cells by sheet name and cell address. There are mechanisms for giving cells meaningful names, but these are rarely used and not used in the BNG spreadsheet. This renders the spreadsheet very difficult to read and check without extensive navigation around multiple sheets and sub-tables embedded within sheets. Even then the meaning of the cells has guessed from column and row labels which themselves are mere descriptions rather than having any semantic value.
Macros and hidden columns further limit comprehension.
Lack of a coherent model
Because data and computation are interleaved within a spreadsheet, there is no separation of data from its processing and hence no guarantee that the same kind of data (all the rows in a table) will be computed in the same way.
Lack of a test strategy
Software engineering has developed strategies for enabling repeated testing of software so that the whole system can be re-tested after a minor change has been made which may have unforeseen consequences. Test-driven development goes further and requires test data and the results expected from that data to be created before development so that the behavior can be developed and tested incrementally. Although there are specialist tools such as Rapise which can assist with spreadsheet testing they are expensive.
Readability and testing are important because industry experience shows that an overwhelming number of operational spreadsheets have faults which, in some cases, has led to disastrous consequences for their users. Clearly, a tool which supports legislation should be provably correct.
As the computer scientist Tony Hoare said:
“There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult.”
In this project we strive for simplicity.
A conceptual data model
A Entity-Attribute-Relationship (EAR) conceptual model captures the structure of the data in an information system. EAR models are typically used with Relational Databases which are composed of multiple inter-dependent tables.
Tables (such as the Habitat table in BNG 4.0 which list the 132) are conceptualized as Entities which define the common properties of every row in the table, i.e., every instance of the entity. The main part of this definition is the list of fields in the record which can be thought conceptually as attributes of the entity. Each attribute describes the type of values which can be held in the field, the permitted and default values and whether it is mandatory or optional. Relationships describe how one entity is related to another, typically by common attributes.
An executable conceptual model
An executable model is an extension of a data model which in addition defines the rules for computing attributes which are not raw data. In the software engineering approach, Model-Driven Development, such models can be used to generate the compiled software. For this project, the rules are interpreted using software written in XQuery running on the open source eXist-db XML database. As the user enters data for an instance of an entity (a row of a table), the software will evaluate the rules in the model in an appropriate order and compute the values of all the computed attributes, rather as a spreadsheet does. Here however the rules are expressed in readable formula which use the names of other attributes in the same entity and of those in related tables.
Formula conventions
The formulae are expressed in the XQuery language, although most need only the XPath subset. Two shortcuts are used:
$self/fieldname is a reference to a field in the record being processed.
table(‘tablename’) is a reference to the rows of a named table which may be a base reference table such as Habitat or a project-specific table such as Habitat_Baseline.
XPath expressions allow the rows to be filtered with a condition in [ ] and fields accessed by /fieldname
For example, the rule to compute the list of Habitats in a Broad Habitat is:
table('Habitat')[Broad_Habitat=$self/Broad_Habitat]/Habitat
which means:
Take all the rows of the Habitat table where the Broad_Habitat is the same as this rows Broad_Habitat and return the list of Habitat values in those rows.
To retrieve the Distinctiveness Score of the selected Habitat from the Habitat table:
table('Habitat')[Habitat=$self/Habitat]/Distinctiveness_Score
The language provides a range of common functions. This is the formula used in the overall Project Summary to sum all the On-site Baseline Habitat units:
sum(table('Habitat_Baseline')[On-off-site='On-site']/Total_habitat_units)
Here is the full model for the Habitat_Baseline.
Mapping the Excel Spreadsheet to the Conceptual Model
The reference tables were extracted from the tables in the BNG 4.0 calculator tool and exported as CSV, then loaded into the eXist database using column headings as attribute names. Some names were changed to better reflect their role. The key Habitat table was restructured to have two key columns: Broad Habitat and Habitat.
Three tables were multidimensional, with row headings one dimension and one or two levels of column headings for the other dimensions.
Habitat_Condition: Spreadsheet Table G-8 Condition lookup, Technical Annex 2 Data Table 2 Habitat condition score matrix.
Habitat_Creation_Year: Spreadsheet Table G4 Temporal Multipliers, Technical Annex Data 2 Table 3: Habitat creation matrix.
Habitat Enhancement Year: Spreadsheet Table G-5 Enhancement Temporal, Technical Annex 2 Data Table 3: Habitat enhancement matrix
The data in these tables are sparse, so multiple cells are marked as 'Not Possible'. For computational purposes it is preferable to convert these to simple tables with multiple keys and with 'Not Possible' combinations removed to be dealt separately with when the data is missing.
The tables which contain project-specific data are based on the same tables in the Spreadsheet. For simplicity I combined each pair of On-site and Off-site tables into a single table. This ensures that the only difference is the application of a spatial multiplier for the Off-site entries.
Comparison with the Natural England Spreadsheet
So far, the model is limited to the Area Habitat tables, both on-site and off-site as well as support for tree surveys. Testing has been so far limited to defining a number of real-world cases in both the Natural England Spreadsheet and in the XQuery on-line system and comparing the results.
This is the set of projects so far implemented. All are viewable but a pin number is required to edit all but the Test Project.
Given the complexity of some of the formula, we have identified very few issues:
The computation in the Spreadsheet for Enhancement Habitat Units is at odds with the formula given in Technical Annex 2 BOX TS 7-1. The Spreadsheet formula only uses the Post-Intervention Area, whereas Equation 3 refers to the Pre-Intervention Area as well.
The Off-site Enhancement calculation makes no use of the Spatial Risk multiplier although this is present in the sheet.
Rounding in the Spreadsheet is sometimes incorrect which creates small differences in totals to those in the online version where values are computed throughout to double precision and rounded on display.
Technical Annex 2 Table 2 Habitat Condition Score Matrix and hence Spreadsheet table G8 Condition look-up allows the Bramble Scrub Habitat to have both 'Condition Assessment N/A' and 'Fairly Poor'.
Further work
The project is at the prototype stage and parts of the BNG model are yet to be implemented, in particular the linear habitats of hedges and waterways and some overall validity checks. Automated testing is feasible but has not yet been implemented. It would be interesting to know if Natural England have test data which they could share.
We hope this project will be a useful contribution to the establishment of BNG4.0. A repository for the project is being created on GitHub.
Oct '23
Version 2 of the platform under development.
~ prefix replaced by $self
3 notes
·
View notes
Wilmon + Simon hadn't expected his first day of travel to be spent standing 4 hours on an overcrowded platform because all trains had been canceled, and yet here he was
Hi btw 💜
Nina!! Hi💜Thank you for this lovely sentence.
Simon hadn't expected his first day of travel to be spent standing 4 hours on an overcrowded platform because all trains had been canceled, and yet here he was. God, he knew that when traveling through Germany he'd have troubles because everyone knew Deutsche Bahn, but this was on another level.
After yet another announcement sounded, first in German and then thankfully in English too, apologising for yet another delay, Simon decided to try and find a bit more space to sit down; the train wasn't going to be here anytime soon so queueing up to get on the train as one of the first people didn't have any use right now.
Simon slowly made his way through the crowd with his heavy backpack digging into his shoulders as he apologised profusely for bumping into people, and then he finally some space and a wall to sit down against. Without thinking twice he took off his backpack and slid down the wall until he hit the floor, barely even noticing how close he was to the person next to him, not seeing how the blonde person with his floppy hair was watching him curiously.
"What a shitshow, huh?", Simon suddenly heard, a Swedish accent registrering slowly in his brain, making him look quickly to his left, finally seeing his fellow - handsome - DB-victim, and Simon answered in Swedish: "You can say that again. Want to wait together?"
13 notes
·
View notes