The SEAmBOTH (https://seamboth.wordpress.com/) ecological models were based on biological inventory data and environmental layers developed for the SEAmBOTH study area. Some environmental data were derived from sources developed for other projects, such as the Finnish Inventory Programme for the Underwater Marine Diversity (VELMU). Currently, the model layers included in the data package cover only the Finnish side of the study area (northernmost Bothnian Bay).
The environmental variables used were:
seafloor substrates (rock, boulders, mud, sand and mobile/unstable sediments), chlorophyll-a, depth, nutrients, salinity, surface exposure, seafloor fetch, shallow areas (satellite-derived) and turbidity (satellite-derived).
The ecological models were developed using a machine learning method, generalized boosting machine, with additional functions from boosted regression trees (for details, see the SEAmBOTH final report: https://seamboth.files.wordpress.com/2020/06/seamboth_finalreport.pdf). Datasets were split into training and test sets at a ratio of 70/30. For models with too little data for a training/test split, the full dataset was used in the modelling with 10-fold cross-validation, and independent test evaluations were not calculated for those models. For species with too little data for modelling, a linear, inverse distance model of 100 m was calculated. A “Bernoulli” error distribution was used for the binomial species presence-absence data (1/0). Model performance was assessed by how well the models captured true and false presences and absences (sensitivity and specificity). Modelling parameters depended on the ecological preferences of each species and varied between generalist and specialist species. Depending on the species, tree complexity ranged from 3 to 5, learning rate from 0.005 to 0.01 and bag fraction from 0.7 to 0.9. Model results are provided in a separate .csv file.
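
The 70/30 split and the sensitivity/specificity evaluation described above can be illustrated with a minimal Python sketch. This is not the SEAmBOTH workflow itself: the function names, the random seed and the 0.5 probability threshold are assumptions made for illustration only, and the boosted model fitting step is omitted.

```python
import random


def split_train_test(records, train_fraction=0.7, seed=0):
    """Shuffle records and split them into training and test subsets.

    train_fraction=0.7 mirrors the 70/30 ratio in the text; the seed is
    an illustrative assumption so that the split is reproducible.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def sensitivity_specificity(observed, predicted_prob, threshold=0.5):
    """Return (sensitivity, specificity) for presence-absence data (1/0).

    The 0.5 threshold for turning predicted probabilities into
    presences is an assumption, not a value taken from the source.
    """
    tp = fn = tn = fp = 0
    for obs, prob in zip(observed, predicted_prob):
        pred = 1 if prob >= threshold else 0
        if obs == 1 and pred == 1:
            tp += 1  # true presence
        elif obs == 1 and pred == 0:
            fn += 1  # missed presence
        elif obs == 0 and pred == 0:
            tn += 1  # true absence
        else:
            fp += 1  # false presence
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical example: five test sites with observed presence (1) or
# absence (0) and model-predicted probabilities of presence.
observed = [1, 1, 1, 0, 0]
predicted = [0.9, 0.6, 0.2, 0.1, 0.7]
sens, spec = sensitivity_specificity(observed, predicted)
```

Sensitivity is the share of observed presences that the model predicts as presences, and specificity is the share of observed absences predicted as absences, so the two together describe both kinds of classification error.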