Species Distribution Model Experiment : BCCVL (Sandpit)

The Species Distribution Model Experiment (SDM) is used to investigate the potential distribution of a species under current climatic conditions. This experiment type offers 19 algorithms.

Note: You will need to run a Species Distribution Model before you can run a Climate Change or Biodiverse Experiment.

Overview of SDM methods in the BCCVL

Geographical models

These models use the geographic location of known occurrences of a species to predict the likelihood of presence in other locations, and do not rely on the values of environmental variables.

Circles:	predicts that a species is present at sites within a certain radius around observed occurrences, and absent beyond that radius.
Convex Hull:	predicts that a species is present at sites inside the minimum spatial convex hull around observed occurrences, and absent outside that hull.
Geographic Distance:	predicts species occurrences based on the assumption that the closer to a known presence, the more likely it is to find the species.
Inverse-Distance Weighted Model:	predicts species occurrence probabilities for unknown locations as the average of values at nearby known locations weighted by their inverse distance from the unknown location.
Voronoi Hull Model:	predicts that a species is present inside Voronoi hulls around observed occurrences, and absent outside those hulls. Each hull consists of all points that are at least as close to its known location as to any other known location.
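As a rough illustration of two of the geographical models above, the following minimal Python sketch implements a Circles-style presence test and an inverse-distance weighted prediction. It is not the BCCVL implementation: the function names, the `power` parameter, and the use of straight-line distance on planar coordinates are assumptions made for the example.

```python
import math

def circles_predict(site, occurrences, radius):
    """Circles: present if the site lies within `radius` of any occurrence."""
    return any(math.dist(site, occ) <= radius for occ in occurrences)

def idw_predict(site, known_points, known_values, power=1.0):
    """Inverse-distance weighted average of the values at known locations.

    `power` is an assumed tuning parameter; 1.0 gives plain inverse-distance weights.
    """
    weighted = []
    for point, value in zip(known_points, known_values):
        d = math.dist(site, point)
        if d == 0.0:
            return value  # the site coincides with a known location
        weighted.append((1.0 / d ** power, value))
    total_weight = sum(w for w, _ in weighted)
    return sum(w * v for w, v in weighted) / total_weight

# Hypothetical planar coordinates, for illustration only.
occurrences = [(0.0, 0.0), (4.0, 0.0)]
print(circles_predict((1.0, 0.0), occurrences, radius=1.5))
print(idw_predict((1.0, 0.0), occurrences, [1.0, 0.0]))
```

Real occurrence records are usually longitude/latitude pairs, so a production model would use a geodesic distance rather than `math.dist`.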

Profile models

These models only use occurrence data, and are based on the characterization of the environmental conditions of locations associated with species presence.

Bioclim/Surface Range Envelope:	defines a multi-dimensional environmental space bounded by the minimum and maximum values of environmental variables for all occurrences as the potential range where a species can occur.
Domain:	predicts species occurrence based on the environmental similarity of an unknown site to the nearest occurrence sites using the Gower coefficient of similarity.
Mahalanobis Model:	predicts species occurrence based on the environmental similarity of an unknown site to the nearest occurrence sites using the Mahalanobis distance measure, which is independent of the scales of various predictors.
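The envelope idea behind Bioclim can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the BCCVL code: the variable names (`temp`, `rain`), the sample values, and the helper functions are hypothetical, and the sketch returns a simple inside/outside answer rather than a suitability score.

```python
def fit_envelope(occurrence_env):
    """Return per-variable (min, max) bounds across all occurrence records."""
    variables = occurrence_env[0].keys()
    return {
        v: (min(rec[v] for rec in occurrence_env),
            max(rec[v] for rec in occurrence_env))
        for v in variables
    }

def inside_envelope(site_env, envelope):
    """A site is inside the envelope if every variable falls within its bounds."""
    return all(lo <= site_env[v] <= hi for v, (lo, hi) in envelope.items())

# Hypothetical environmental values sampled at three occurrence sites.
occurrence_env = [
    {"temp": 12.0, "rain": 400.0},
    {"temp": 20.0, "rain": 650.0},
    {"temp": 16.0, "rain": 900.0},
]
envelope = fit_envelope(occurrence_env)
print(envelope)
print(inside_envelope({"temp": 15.0, "rain": 500.0}, envelope))
print(inside_envelope({"temp": 25.0, "rain": 500.0}, envelope))
```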

Statistical models

Statistical models produce estimates of the effect of different environmental variables on the distribution of a species. These models use all the data available to estimate the parameters/coefficients of the predictors, and construct a function that best describes the effect of environmental variables on species occurrence. The suitability of a particular model is often defined by specific model assumptions.

Generalised Linear Model:	a regression model for data with a non-normal distribution, fitted with maximum likelihood estimation.
Generalised Additive Model:	a multiple regression model that uses smoothed functions of the environmental variables to model non-linear relationships.
Multivariate Adaptive Regression Splines:
Flexible Discriminant Analysis:
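To make the Generalised Linear Model entry more concrete, here is a minimal sketch of a logistic regression (a GLM for presence/absence data) fitted by maximum likelihood. It assumes a single predictor and uses plain gradient ascent on the log-likelihood, which is a swapped-in simplification: statistical packages typically use iteratively reweighted least squares. The data, function names, learning rate, and iteration count are all hypothetical.

```python
import math

def sigmoid(z):
    """Logistic link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, learning_rate=0.1, iterations=5000):
    """Estimate intercept b0 and slope b1 by gradient ascent on the
    Bernoulli log-likelihood (averaged over the observations)."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(iterations):
        residuals = [y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys)]
        b0 += learning_rate * sum(residuals) / n
        b1 += learning_rate * sum(r * x for r, x in zip(residuals, xs)) / n
    return b0, b1

# Hypothetical standardised temperature at surveyed sites; 1 = presence, 0 = absence.
xs = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0]
ys = [0, 0, 1, 0, 1, 1, 1]
b0, b1 = fit_logistic(xs, ys)
print("intercept:", b0, "slope:", b1)
print("predicted probability at x = 1.5:", sigmoid(b0 + b1 * 1.5))
```

The fitted slope plays the role of the coefficient described above: its sign and size estimate the effect of the environmental variable on the probability of occurrence.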

Machine learning models

Machine Learning methods are algorithms that typically use one part of the dataset to ‘learn’ and describe the dataset (training) and the other part to make predictions.

Maxent:	predicts species occurrences by finding the distribution that is most spread out, or closest to uniform, while taking into account the limits of the environmental variables of known locations.
Classification Tree:	predicts species occurrence by repeatedly splitting the dataset into mutually exclusive groups based on a threshold value of one of the environmental variables.
Random Forest:	grows many decision trees and averages the predictions of these trees to estimate the importance of each environmental variable.
Boosted Regression Tree/General Boosting Model:	predicts species occurrence probabilities based on a combination of decision trees and boosting. It uses a stagewise procedure to iteratively fit random subsets of the data, and averages the suite of trees in the final model.
Artificial Neural Network:
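The splitting step described for the Classification Tree can be sketched as a search for the single threshold that best separates presences from absences. This is a simplified, hypothetical example: it scores splits by misclassification count, whereas tree implementations commonly use measures such as Gini impurity, and a full tree would repeat the search within each resulting group.

```python
def best_split(values, labels):
    """Return (threshold, errors) for the split on one variable that
    misclassifies the fewest records; labels are 1 (presence) or 0 (absence)."""
    distinct = sorted(set(values))
    # Candidate thresholds are the midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    best = None
    for threshold in candidates:
        left = [lab for val, lab in zip(values, labels) if val <= threshold]
        right = [lab for val, lab in zip(values, labels) if val > threshold]
        # Each group predicts its majority class, so its errors are the minority count.
        errors = (min(sum(left), len(left) - sum(left))
                  + min(sum(right), len(right) - sum(right)))
        if best is None or errors < best[1]:
            best = (threshold, errors)
    return best

# Hypothetical annual rainfall (mm) and presence/absence at six sites.
rainfall = [200, 300, 400, 800, 900, 1000]
presence = [0, 0, 0, 1, 1, 1]
print(best_split(rainfall, presence))  # (600.0, 0)
```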

How to run an SDM in the BCCVL

At the top of the experiment page click on the Experiments tab.
Click on Species Distribution Model Experiment.

Step 1: Description tab

Enter the name for your experiment in the first box (e.g. Current Fox (Vulpes vulpes) Distribution).
(optional) You can also add a description of your experiment in the box below if you want to convey more information. Some researchers use this box to record their research question or hypotheses for later reference.
Click Next.

Step 2: Configuration tab

Select the algorithm/s you would like to use (don't know which to pick? See our quick selection guide) - you can select more than one.
(optional) Configuration

Some algorithms have configuration options (e.g. Boosted Regression Tree) while others do not (e.g. Bioclim).
Once selected, click on the configuration options for your chosen algorithm/s. These are currently set to the standard default values, so you do not need to make changes. However, configuring the model to best fit your data will give a more robust result.

Click Next.

Note: The BCCVL currently hosts 19 algorithms. Choosing which algorithm best suits your experiment and data can be confusing, as can configuring each algorithm. For information on each of these algorithms see our quick guide below. For further information on an algorithm and its configuration options, click on the title of the algorithm you want to explore.

Step 3: Occurrences tab

Select your pre-loaded species occurrence dataset by clicking the Select A Dataset button. Note: If you click this and you have no loaded species occurrence datasets, you will need to visit the dataset page and import or upload the required data.
In the pop-up box select the dataset you wish to use in your SDM. Click Save Changes.
(optional) You can visualise your occurrence data by clicking the green eye icon.
Click Next.

Step 4: Absences tab

You have two options for adding absence data:

Uploaded true absence data: If you have your own absence data you need to import this into your experiment as below.

Click the Select A Dataset button.
In the pop-up box select the pre-loaded absence dataset you wish to use in your SDM. Click Save Changes.
(optional) You can visualise your absence data by clicking the green eye icon.
Click Next.

Pseudo-absence data: The BCCVL can randomly generate pseudo-absence data points for your experiment based on your occurrence data.

Click on the Use Pseudo Absence Points link. A box will drop down.
Check the box next to Pseudo absence points and select the number of absence points you want generated.
Click Next.
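For readers curious what random pseudo-absence generation can look like, here is a minimal Python sketch. It is only an illustration under stated assumptions: the sampling strategy the BCCVL actually uses is not described here, so the sketch assumes points are drawn uniformly within the bounding box of the occurrence records. The function name, its parameters, and the sample coordinates are hypothetical.

```python
import random

def generate_pseudo_absences(occurrences, n_points, seed=None):
    """Draw `n_points` random (lon, lat) points inside the bounding box
    of the occurrence records (an assumed sampling region)."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    lons = [lon for lon, _ in occurrences]
    lats = [lat for _, lat in occurrences]
    return [
        (rng.uniform(min(lons), max(lons)), rng.uniform(min(lats), max(lats)))
        for _ in range(n_points)
    ]

# Hypothetical occurrence records as (longitude, latitude) pairs.
occurrences = [(145.0, -37.5), (147.2, -35.1), (150.9, -33.8)]
for point in generate_pseudo_absences(occurrences, n_points=3, seed=42):
    print(point)
```

A random point may fall on or near a true presence, which is one reason pseudo-absences are treated differently from surveyed absence data.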

Step 5: Climate & Environmental Data tab

Click the Select Available Datasets button.
In the pop-up box you can enter search terms to filter for the required datasets, or browse by scrolling.
Once you have found the dataset/s you are looking for, select them and click Add Layers.
When back on the Climate & Environmental Data tab you can select/deselect data layers.
(optional) You can visualise each of the data layers by clicking the green eye icon. On the right-hand side of the map you can toggle which data layer you want to visualise.
Once you have selected all your environmental and climate layers click Next.

Step 6: Run tab

Ensure you are happy with your experiment design.
Click Start Experiment.
