The BCCVL currently offers 17 algorithms across 4 different categories:
Geographical models
These models use the geographic location of known occurrences of a species to predict the likelihood of presence in other locations, and do not rely on the values of environmental variables.
Circles: | predicts that a species is present at sites within a certain radius around observed occurrences, and absent beyond that radius. |
Convex Hull: | predicts that a species is present at sites inside the minimum spatial convex hull around observed occurrences, and absent outside that hull. |
Geographic Distance: | predicts species occurrence on the assumption that the closer a site is to a known presence, the more likely the species is to be found there. |
Inverse-Distance Weighted Model: | predicts species occurrence probabilities for unknown locations as the average of values at nearby known locations weighted by their inverse distance from the unknown location. |
Voronoi Hull Model: | predicts that a species is present inside Voronoi hulls around observed occurrences, and absent outside those hulls. The hull of a known location consists of all points whose distance to that location is less than or equal to their distance to any other known location. |
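Two of the geographical models above are simple enough to sketch in a few lines. The Python below is an illustrative sketch only, not the BCCVL implementation: the function names, the `radius` and `power` parameters, and the use of planar (Euclidean) distance are all assumptions made for the example. Real coordinates in degrees would need a great-circle distance instead.

```python
import math

def circles_predict(site, occurrences, radius):
    """Circles model: present (True) if the site lies within `radius`
    of at least one observed occurrence, absent (False) otherwise."""
    return any(math.dist(site, occ) <= radius for occ in occurrences)

def idw_predict(site, known_sites, known_values, power=1.0):
    """Inverse-distance weighted average of the values at known locations.

    `power` is a hypothetical tuning parameter; power=1 weights each
    known value by 1 / distance, as described in the text.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for loc, value in zip(known_sites, known_values):
        d = math.dist(site, loc)  # planar distance (assumption)
        if d == 0.0:
            # The site coincides with a known location: use its value directly.
            return float(value)
        w = 1.0 / d ** power
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total
```

For example, `circles_predict((1, 1), [(0, 0), (10, 0)], radius=2)` asks whether the site (1, 1) falls inside a circle of radius 2 around either occurrence.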
Profile models
These models only use occurrence data, and are based on the characterization of the environmental conditions of locations associated with species presence.
Bioclim/Surface Range Envelope: | defines a multi-dimensional environmental space bounded by the minimum and maximum values of environmental variables for all occurrences as the potential range where a species can occur. |
Domain: | predicts species occurrence based on the environmental similarity of an unknown site to the nearest occurrence sites using the Gower coefficient of similarity. |
Mahalanobis Model: | predicts species occurrence based on the environmental similarity of an unknown site to the nearest occurrence sites using the Mahalanobis distance measure, which is independent of the scales of various predictors. |
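The envelope idea behind Bioclim/Surface Range Envelope can be illustrated with a short Python sketch. This is a simplified illustration under stated assumptions, not the BCCVL code: the function names are hypothetical, each occurrence is assumed to be a tuple of environmental values in a fixed variable order, and the prediction is a plain inside/outside answer rather than a graded suitability score.

```python
def fit_envelope(occurrence_env):
    """Return one (minimum, maximum) pair per environmental variable,
    taken across all occurrence records."""
    columns = zip(*occurrence_env)  # regroup values by variable
    return [(min(col), max(col)) for col in columns]

def in_envelope(site_env, envelope):
    """True if every variable at the site lies within the fitted bounds."""
    return all(low <= value <= high
               for value, (low, high) in zip(site_env, envelope))

# Hypothetical occurrences: (mean temperature, annual rainfall)
occurrences = [(10.0, 500.0), (14.0, 800.0), (12.0, 650.0)]
envelope = fit_envelope(occurrences)
```

A site is predicted as potentially suitable only if all of its variables fall inside the envelope, so a single out-of-range variable is enough to exclude it.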
Statistical models
Statistical models produce estimates of the effect of different environmental variables on the distribution of a species. These models use all the data available to estimate the parameters/coefficients of the predictors, and construct a function that best describes the effect of environmental variables on species occurrence. The suitability of a particular model is often defined by specific model assumptions.
Generalised Linear Model: | a regression model for data with a non-normal distribution, fitted with maximum likelihood estimation. |
Generalised Additive Model: | a multiple regression model that uses smoothed functions of the environmental variables to model non-linear relationships. |
Multivariate Adaptive Regression Splines: | |
Flexible Discriminant Analysis: |
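To make the phrase "fitted with maximum likelihood estimation" in the Generalised Linear Model entry more concrete, the sketch below fits a logistic regression (a GLM for presence/absence data) with one environmental variable. It is a minimal illustration and makes several assumptions: the function names, learning rate and iteration count are hypothetical, and plain gradient ascent on the log-likelihood is used for readability, whereas statistical software typically uses iteratively reweighted least squares.

```python
import math

def predict_presence(x, intercept, slope):
    """Probability of presence under a logistic (logit-link) model."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))

def fit_logistic_glm(xs, ys, learning_rate=0.1, iterations=5000):
    """Approximate the maximum likelihood estimates of intercept and slope
    by gradient ascent on the Bernoulli log-likelihood.

    xs: values of one environmental variable (assumed roughly centred)
    ys: 1 for presence, 0 for absence
    """
    intercept, slope = 0.0, 0.0
    n = len(xs)
    for _ in range(iterations):
        grad_intercept = 0.0
        grad_slope = 0.0
        for x, y in zip(xs, ys):
            residual = y - predict_presence(x, intercept, slope)
            grad_intercept += residual
            grad_slope += residual * x
        # Step uphill on the average log-likelihood.
        intercept += learning_rate * grad_intercept / n
        slope += learning_rate * grad_slope / n
    return intercept, slope
```

The fitted slope is the kind of coefficient the text refers to: its sign and size describe the estimated effect of the environmental variable on the probability of occurrence.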
Machine learning models
Machine learning methods are algorithms that typically use one part of the dataset to ‘learn’ and describe the dataset (training) and the other part to make predictions.
Maxent: | predicts species occurrences by finding the distribution that is most spread out, or closest to uniform, while taking into account the limits of the environmental variables of known locations. |
Classification Tree: | predicts species occurrence by repeatedly splitting the dataset into mutually exclusive groups based on a threshold value of one of the environmental variables. |
Random Forest: | grows many decision trees and averages their predictions; the ensemble can also be used to estimate the importance of each environmental variable. |
Boosted Regression Tree/General Boosting Model: | predicts species occurrence probabilities based on a combination of decision trees and boosting. It uses a stagewise procedure to iteratively fit random subsets of the data, and averages the suite of trees in the final model. |
Artificial Neural Network: |
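The splitting step described for the Classification Tree can be sketched as follows. This is an illustrative Python example rather than the BCCVL implementation: the function names are hypothetical, only one environmental variable is considered, and Gini impurity is assumed as the splitting criterion (the text does not say which criterion is used).

```python
def gini(labels):
    """Gini impurity of a list of 0/1 (absence/presence) labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(values, labels):
    """Return (threshold, weighted impurity) for the single split on one
    environmental variable that gives the purest pair of groups."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        left = [lab for v, lab in pairs if v <= threshold]
        right = [lab for v, lab in pairs if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best
```

A full tree would apply `best_split` to every variable, keep the best one, and then repeat the procedure within each resulting group until a stopping rule is met.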