Title: | Ensemble Tool for Predictions from Species Distribution Models |
---|---|
Description: | A tool which allows users to create and evaluate ensembles of species distribution model (SDM) predictions. Functionality is offered through R functions or a GUI (R Shiny app). This tool can assist users in identifying spatial uncertainties and making informed conservation and management decisions. The package is further described in Woodman et al (2019) <doi:10.1111/2041-210X.13283>. |
Authors: | Sam Woodman [aut, cre] |
Maintainer: | Sam Woodman <[email protected]> |
License: | Apache License (== 2) |
Version: | 0.4.4 |
Built: | 2024-11-06 06:01:13 UTC |
Source: | https://github.com/swfsc/esdm |
eSDM: A tool for creating and exploring ensembles of predictions from Species Distribution Models
eSDM provides functionality for overlaying SDM predictions onto a single base geometry and creating and evaluating ensemble predictions. This can be done manually in R, or using the eSDM GUI (an R Shiny app) opened through eSDM_GUI.
eSDM allows users to overlay SDM predictions onto a single base geometry, create ensembles of these predictions via weighted or unweighted averages, calculate performance metrics for each set of predictions and for resulting ensembles, and visually compare ensemble predictions with original predictions. The information provided by this tool can assist users in understanding spatial uncertainties and making informed conservation decisions.
The GUI ensures that the tool is accessible to non-R users, while also providing a user-friendly environment for functionality such as loading other polygons to use and visualizing predictions. However, user choices are restricted to the workflow provided by the GUI.
Sam Woodman [email protected]
Create a weighted or unweighted ensemble of SDM predictions, including associated uncertainty values
ensemble_create(x, x.idx, w = NULL, x.var.idx = NULL, ...)

## S3 method for class 'sf'
ensemble_create(x, x.idx, w = NULL, x.var.idx = NULL, ...)

## S3 method for class 'data.frame'
ensemble_create(x, x.idx, w = NULL, x.var.idx = NULL, ...)
x |
object of class |
x.idx |
vector of column names or numerical indices;
indicates which columns in |
w |
weights for the ensemble; either a numeric vector the same length as |
x.var.idx |
vector of column names or column indices;
indicates columns in |
... |
Arguments to be passed to methods; specifically designed for passing
|
ensemble_create is designed to be used after overlaying predictions with overlay_sdm and (if desired) rescaling the overlaid predictions with ensemble_rescale.
This function implements ensemble methods provided in eSDM_GUI. Note that it does not implement regional exclusion, which must be done manually if not using the GUI.
Ensemble uncertainty is calculated using either the within-model uncertainty (if x.var.idx is specified) or the among-model uncertainty (if x.var.idx is NULL).
See the eSDM GUI manual for applicable formulas.
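The ensemble calculation described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation. The data frame, its values, and the two uncertainty formulas (variance of a weighted sum of independent predictions for the within-model case, weighted variance around the ensemble prediction for the among-model case) are assumptions made for this example. The eSDM GUI manual remains the reference for the formulas actually used.

```r
# Hypothetical overlaid predictions from two models, with variances
preds <- data.frame(
  Density  = c(0.10, 0.20, 0.40),
  Density2 = c(0.30, 0.20, 0.00),
  Var1     = c(0.001, 0.002, 0.004),
  Var2     = c(0.003, 0.001, 0.002)
)
w <- c(0.2, 0.8)  # one weight per set of predictions; the weights sum to 1

x <- as.matrix(preds[, c("Density", "Density2")])
v <- as.matrix(preds[, c("Var1", "Var2")])

# Ensemble prediction: weighted mean across models, row by row
pred_ens <- as.vector(x %*% w)

# Within-model uncertainty (assumed formula): sum(w_i^2 * var_i)
var_within <- as.vector(v %*% w^2)

# Among-model uncertainty (assumed formula): weighted variance of the
# individual predictions around the ensemble prediction
var_among <- as.vector(((x - pred_ens)^2) %*% w)

print(data.frame(pred_ens, var_within, var_among))
```

Where both models predict the same value (the second row here), the among-model variance is zero, while the within-model variance still reflects the variances reported for each model.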
An object of the same class as x with two columns appended to the data frame:
'Pred_ens' - The ensemble predictions
'Var_ens' - The variance of the ensemble predictions, calculated using either the within-model uncertainty (if x.var.idx is specified) or the among-model uncertainty (if x.var.idx is NULL)
Note that all other columns of x will be included in the returned object.
Also, if x is of class sf then 1) the geometry list-column will be the last column of the returned object and 2) the agr attribute will be set as 'constant' for 'Pred_ens' and 'Var_ens'.
ensemble_create(preds.1, c("Density", "Density2"), c(0.2, 0.8))
ensemble_create(preds.1, 1:2, c(0.2, 0.8), c("Var1", "Var2"))
ensemble_create(data.frame(a = 1:5, b = 3:7), c(1, 2))

weights.df <- data.frame(runif(325), c(rep(NA, 100), runif(225)))
ensemble_create(preds.1, c("Density", "Density2"), weights.df, na.rm = TRUE)
Rescale SDM predictions and (if applicable) associated uncertainties
ensemble_rescale(x, x.idx, y, y.abund = NULL, x.var.idx = NULL)
x |
object of class |
x.idx |
vector of column names or column indices;
indicates columns in |
y |
rescaling method; must be either "abundance" or "sumto1". See 'Details' section for descriptions of the rescaling methods |
y.abund |
numeric value; ignored if |
x.var.idx |
vector of column names or column indices;
indicates columns in |
ensemble_rescale is intended to be used after overlaying predictions with overlay_sdm and before creating ensembles with ensemble_create.
The provided rescaling methods are:
'abundance' - Rescale the density values so that the predicted abundance is y.abund
'sumto1' - Rescale the density values so their sum is 1
SDM uncertainty values must be rescaled differently than the prediction values.
Columns specified in x.var.idx must contain variance values.
These values will be rescaled using the formula var(c * x) = c^2 * var(x), where c is the rescaling factor for the associated predictions.
If x.var.idx is not NULL, then the function assumes x.var.idx[1] contains the variance values associated with the predictions in x.idx[1], x.var.idx[2] contains the variance values associated with the predictions in x.idx[2], etc.
Use NA in x.var.idx to indicate a set of predictions that does not have associated uncertainty values (e.g., x.var.idx = c(4, NA, 5)).
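As a rough illustration of the two rescaling methods and of the variance rule var(c * x) = c^2 * var(x), the following base R sketch uses made-up values. The vectors dens, vars, and area_km2 are hypothetical, and the 'abundance' factor is assumed to be y.abund divided by the currently predicted abundance (density times area, summed). It is not the package's code.

```r
# Hypothetical density predictions, variances, and polygon areas
dens     <- c(1, 2, 4, 3)
vars     <- c(0.01, 0.02, 0.04, 0.03)
area_km2 <- c(10, 10, 20, 20)

# 'sumto1': the rescaling factor c is 1 / sum of the density values
c_sum1      <- 1 / sum(dens)
dens_sumto1 <- dens * c_sum1
vars_sumto1 <- vars * c_sum1^2   # var(c * x) = c^2 * var(x)

# 'abundance': choose c so that sum(density * area) equals y.abund
y.abund    <- 50
c_abund    <- y.abund / sum(dens * area_km2)
dens_abund <- dens * c_abund
vars_abund <- vars * c_abund^2

print(sum(dens_sumto1))            # sums to 1
print(sum(dens_abund * area_km2))  # equals y.abund
```

Because the variances scale with the square of c, rescaling them with the same factor as the predictions would misstate the uncertainty of the rescaled values.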
The sf object x with the columns specified by x.idx and x.var.idx rescaled.
The agr attributes of x will be conserved.
ensemble_rescale(preds.1, c("Density", "Density2"), "abundance", 50)
ensemble_rescale(preds.1, c(1, 2), "sumto1")

ensemble_rescale(
  preds.1, c("Density", "Density2"), "abundance", 100, c(3,4)
)
Open the eSDM graphical user interface (GUI), an R Shiny app for creating ensemble predictions using SDM predictions.
eSDM_GUI(launch.browser = TRUE)
launch.browser |
Logical with default of |
Calculate AUC, TSS, and RMSE for given density predictions and validation data
evaluation_metrics(x, x.idx, y, y.idx, count.flag = FALSE)
x |
object of class sf; SDM predictions |
x.idx |
name or index of column in |
y |
object of class sf; validation data |
y.idx |
name or index of column in |
count.flag |
logical; |
If count.flag == TRUE, then eSDM::model_abundance(x, x.idx, FALSE) will be run to calculate predicted abundance and thus calculate RMSE.
Note that this assumes the data in column x.idx of x are density values.
If count.flag == FALSE, then all of the values in column y.idx of y must be 0 or 1.
All rows of x with a value of NA in column x.idx and all rows of y with a value of NA in column y.idx are removed before calculating metrics.
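To make the three metrics concrete, here is a self-contained base R sketch on made-up vectors. It assumes predictions and validation values have already been matched one to one, computes AUC with the rank-sum (Mann-Whitney) identity, and reports the maximum TSS over candidate thresholds. Whether the package selects the threshold this way is an assumption, and all names below are hypothetical.

```r
# Hypothetical matched predictions and presence/absence observations
pred <- c(0.9, 0.1, NA, 0.6, 0.3, 0.8)
obs  <- c(1,   0,   1,  1,   0,   0)

# Drop pairs with an NA on either side before computing metrics
keep <- !is.na(pred) & !is.na(obs)
pred <- pred[keep]
obs  <- obs[keep]

# AUC: probability that a presence is ranked above an absence
n1  <- sum(obs == 1)
n0  <- sum(obs == 0)
auc <- (sum(rank(pred)[obs == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)

# TSS = sensitivity + specificity - 1 at a given threshold
tss_at <- function(th) {
  sens <- mean(pred[obs == 1] >= th)
  spec <- mean(pred[obs == 0] < th)
  sens + spec - 1
}
tss_max <- max(vapply(sort(unique(pred)), tss_at, numeric(1)))

# RMSE between predicted abundances and observed counts
rmse <- function(p, o) sqrt(mean((p - o)^2))
rmse_counts <- rmse(c(2, 4), c(1, 6))

print(c(AUC = auc, TSS = tss_max, RMSE = rmse_counts))
```

AUC and TSS need only presence/absence values, whereas RMSE compares predicted abundance with observed counts, which is why it is only meaningful when count data are supplied.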
A numeric vector with AUC, TSS and RMSE values, respectively.
If count.flag == FALSE, the RMSE value will be NA.
evaluation_metrics(preds.1, 2, validation.data, "sight")

evaluation_metrics(preds.1, "Density2", validation.data, "count", TRUE)
Low resolution GSHHG world map, includes hierarchical levels L1 and L6. Processed using st_make_valid.
gshhg.l.L16
An object of class sfc
http://www.soest.hawaii.edu/pwessel/gshhg/
Calculates the predicted abundance by multiplying the density prediction values by prediction polygon areas
model_abundance(x, dens.idx, sum.abund = TRUE)
x |
object of class |
dens.idx |
name or index of column(s) in |
sum.abund |
logical; whether or not to sum all of the predicted abundances |
Multiplies the values in the specified column(s) (i.e. the density predictions) by the area in square kilometers of their corresponding prediction polygon.
The area of each prediction polygon is calculated using st_area from geos_measures.
x must have a valid crs code to calculate area for these abundance calculations.
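The arithmetic behind this calculation is simple enough to sketch without any spatial objects. In the example below the polygon areas are supplied as a plain numeric vector (area_km2, a hypothetical stand-in for areas that would normally be derived from the polygon geometries), and the density columns are made-up values.

```r
# Hypothetical density predictions (animals per km^2) for three polygons
dens <- data.frame(
  Density  = c(0.5, 1.2, 0.0),
  Density2 = c(0.4, 1.0, 0.1)
)
area_km2 <- c(10, 10, 25)  # assumed polygon areas in square kilometers

# Abundance per polygon: density multiplied by polygon area
abund_by_poly <- as.data.frame(lapply(dens, function(d) d * area_km2))

# Total predicted abundance for each density column
abund_total <- colSums(abund_by_poly)

print(abund_by_poly)
print(abund_total)
```

The per-polygon data frame corresponds to the unsummed case, and the column sums correspond to the summed case with one value per density column.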
If sum.abund == TRUE, then a vector of the same length as dens.idx representing the predicted abundance for the density values in each column.
If sum.abund == FALSE and the length of dens.idx is 1, then a numeric vector with the predicted abundance of each prediction polygon of x.
If sum.abund == FALSE and the length of dens.idx is greater than 1, then a data frame with length(dens.idx) columns of the predicted abundance of prediction polygons.
model_abundance(preds.1, "Density")
model_abundance(preds.1, c(1, 1))
model_abundance(preds.1, c(1, 1), FALSE)
Overlay specified SDM predictions that meet the percent overlap threshold requirement onto base geometry
overlay_sdm(base.geom, sdm, sdm.idx, overlap.perc)
base.geom |
object of class |
sdm |
object of class |
sdm.idx |
names or indices of column(s) with data to be overlaid |
overlap.perc |
numeric; percent overlap threshold, i.e. the percentage of each base geometry polygon that must overlap with SDM prediction polygons for the overlaid density value to be calculated rather than set to NA |
See the eSDM GUI manual for specifics about the overlay process.
This process is equivalent to areal interpolation (Goodchild and Lam 1980), where base.geom is the target, sdm is the source, and the data specified by sdm.idx are spatially intensive.
Note that overlay_sdm removes rows in sdm that have NA values in the first column specified in sdm.idx (i.e. sdm.idx[1]), before the overlay.
Thus, for valid overlay results, all columns of sdm specified in sdm.idx must either have NA values in the same rows or contain only NAs.
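Areal interpolation of spatially intensive data amounts to an area-weighted average, which the following base R sketch illustrates for a single base (target) polygon. The function overlay_value and its inputs are hypothetical. The sketch assumes the overlaid value is averaged over the overlapping area only and that the overlap threshold is inclusive. The eSDM GUI manual should be consulted for what the package actually does.

```r
# Overlay value for one base polygon, given the areas of its intersections
# with source (SDM) polygons and the density of each source polygon.
# Hypothetical helper; not part of eSDM.
overlay_value <- function(base_area, overlap_area, src_dens, overlap.perc) {
  # Percentage of the base polygon covered by source polygons
  perc <- 100 * sum(overlap_area) / base_area
  if (perc < overlap.perc) {
    return(NA_real_)  # not enough overlap: value is set to NA
  }
  # Area-weighted average of the overlapping source densities
  sum(src_dens * overlap_area) / sum(overlap_area)
}

# A base polygon of area 4 overlapped by one source polygon over area 1
val_25 <- overlay_value(4, 1, 0.5, overlap.perc = 25)
val_50 <- overlay_value(4, 1, 0.5, overlap.perc = 50)

# The same base polygon fully covered by two source polygons
val_two <- overlay_value(4, c(1, 3), c(0.5, 0.1), overlap.perc = 50)

print(c(val_25, val_50, val_two))
```

The first two calls use numbers chosen to mirror two 2-by-2 squares offset by one unit, where a quarter of the base polygon is covered. The threshold alone decides whether a value or NA is returned.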
Object of class sf with the geometry of base.geom and the data in the sdm.idx columns of sdm overlaid onto that geometry. Note that this means all columns of sdm not in sdm.idx will not be in the returned object.
Because the data are considered spatially intensive, the agr attribute will be set as 'constant' for all columns in the returned object.
Additionally, the output will match the class of sdm, with regards to the classes tbl_df, tbl, and data.frame. This means that, in addition to being an sf object, if sdm is a tibble then the output will also be a tibble, while if sdm is just a data frame then the output will not be a tibble.
Goodchild, M.F. & Lam, N.S.-N. (1980) Areal interpolation: a variant of the traditional spatial problem. Geo-Processing, 1, 297-312.
pol1.geom <- sf::st_sfc(
  sf::st_polygon(list(rbind(c(1,1), c(3,1), c(3,3), c(1,3), c(1,1)))),
  crs = sf::st_crs(4326)
)
pol2.geom <- sf::st_sfc(
  sf::st_polygon(list(rbind(c(0,0), c(2,0), c(2,2), c(0,2), c(0,0)))),
  crs = sf::st_crs(4326)
)
pol2.sf <- sf::st_sf(data.frame(Dens = 0.5), geometry = pol2.geom,
                     crs = sf::st_crs(4326))

overlay_sdm(pol1.geom, pol2.sf, 1, 25)

# Output 'Dens' value is NA because of higher overlap.perc value
overlay_sdm(pol1.geom, pol2.sf, 1, 50)

# These examples take longer to run
overlay_sdm(sf::st_geometry(preds.1), preds.2, 1, 50)
overlay_sdm(sf::st_geometry(preds.2), preds.1, "Density", 50)
preds.1, preds.2, and preds.3 are objects of class sf that serve as sample sets of SDM density predictions for the eSDM package
preds.1 preds.2 preds.3
Objects of class sf with a column of density predictions (name: Density) and a simple feature list column (name: geometry).
preds.1 also has a second column of sample density predictions (name: Density2), as well as Var1 and Var2, representing the variance
preds.1: An object of class sf (inherits from data.frame) with 325 rows and 5 columns.
preds.2: An object of class sf (inherits from data.frame) with 1891 rows and 2 columns.
preds.3: An object of class sf (inherits from data.frame) with 1445 rows and 2 columns.
preds.1 is a set of sample SDM density predictions created by importing Sample_predictions_2.csv into the eSDM GUI, exporting predictions, and then clipping them to the SoCal_bite.csv region. Two variance columns were also added manually (numbers are randomly generated with a max of 0.01).
preds.2 is a set of sample SDM density predictions created by importing Sample_predictions_1.csv into the eSDM GUI, exporting predictions, and then clipping them to the SoCal_bite.csv region.
preds.3 is a set of sample SDM density predictions created by importing Sample_predictions_4_gdb into the eSDM GUI, exporting predictions, and then clipping them to the SoCal_bite.csv region.
Create polygon(s) from a data frame with coordinates of the polygon centroid(s)
pts2poly_centroids(x, y, ...)
x |
data frame with at least two columns; the first two columns must contain longitude and latitude coordinates, respectively. See 'Details' section for how additional columns are handled |
y |
numeric; the perpendicular distance from the polygon centroid (center) to its edge (i.e. half the length of one side of a polygon) |
... |
passed to st_sf or to st_sfc,
e.g. for passing named arguments |
This function was designed for someone who reads in a .csv file with a grid of coordinates representing SDM prediction points and needs to create prediction polygons with the .csv file coordinates as the polygon centroids. However, the function can be used to create square polygons of any size around the provided points, regardless of whether those polygons touch or overlap. The created polygons are oriented so that, in a 2D plane, their edges are parallel to either the x or the y axis.
If x contains more than two columns, then additional columns will be treated as simple feature attributes, i.e. passed along as the first argument to st_sf.
If a crs is not specified in ..., then the crs attribute of the polygon(s) will be NULL.
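The geometry of each square can be illustrated with a few lines of base R. The helper square_around below is hypothetical and only builds the closed ring of vertex coordinates for one centroid, given the half side length y. Turning such a matrix into an sf polygon (for example by wrapping it in a list and passing it to sf::st_polygon) is left out so that the sketch runs without any packages.

```r
# Build the closed ring of a square centered on (lon, lat), where y is the
# perpendicular distance from the centroid to each edge.
# Hypothetical helper; not part of eSDM.
square_around <- function(lon, lat, y) {
  rbind(
    c(lon - y, lat - y),  # lower left
    c(lon + y, lat - y),  # lower right
    c(lon + y, lat + y),  # upper right
    c(lon - y, lat + y),  # upper left
    c(lon - y, lat - y)   # repeat the first vertex to close the ring
  )
}

sq <- square_around(5, 5, 2.5)
print(sq)
```

With centroids spaced 5 units apart and y = 2.5, neighboring squares share edges exactly, which is the gridded case the function was designed for.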
Object of class sfc (if x has exactly two columns) or class sf (if x has more than two columns). The object will have a geometry type of POLYGON.
If the object is of class sf, the name of the geometry list-column will be "geometry".
# Create an sfc object from a data frame of two columns
x <- data.frame(
  lon = c(5, 10, 15, 20, 5, 10, 15, 20),
  lat = c(5, 5, 5, 5, 10, 10, 10, 10)
)
pts2poly_centroids(x, 2.5, crs = 4326)

# Create an sf object from a data frame of more than two columns
x <- data.frame(
  lon = c(5, 10, 15, 20, 5, 10, 15, 20),
  lat = c(5, 5, 5, 5, 10, 10, 10, 10),
  sdm.pred = runif(8),
  sdm.pred2 = runif(8)
)
pts2poly_centroids(x, 2.5, crs = 4326, agr = "constant")
Create polygon(s) from a data frame with the coordinates of the polygon vertices
pts2poly_vertices(x, ...)
x |
data frame with at least two columns; the first two columns must contain longitude and latitude coordinates, respectively. See 'Details' section for how additional columns are handled |
... |
passed to st_sfc,
e.g. for passing named arguments |
Vertices of different polygons must be demarcated by rows with values of NA in both the first and second columns (i.e. the longitude and latitude columns).
All columns in x besides the first two columns are ignored.
If a crs is not specified in ..., then the crs attribute of the polygon(s) will be NULL.
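The NA-row convention can be illustrated by splitting a coordinate table into one vertex matrix per polygon. The following base R sketch is not the package's code, and the names is_sep, poly_id, and rings are made up for the example.

```r
# Two polygons' vertices, separated by a row of NAs
x <- data.frame(
  lon = c(40, 40, 50, 50, 40, NA, 20, 20, 30, 30, 20),
  lat = c(0, 10, 10, 0, 0, NA, 0, 10, 10, 0, 0)
)

# A separator row has NA in both coordinate columns
is_sep <- is.na(x$lon) & is.na(x$lat)

# Each separator starts a new polygon, so count separators seen so far
poly_id <- cumsum(is_sep) + 1

# Drop separator rows and split the remaining vertices by polygon
rings <- lapply(split(x[!is_sep, 1:2], poly_id[!is_sep]), as.matrix)

print(length(rings))
print(rings[[2]])
```

Each element of rings is a closed ring (the first and last vertices match), which is the form a polygon constructor expects.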
Object of class sfc with the geometry type POLYGON
x <- data.frame(
  lon = c(40, 40, 50, 50, 40),
  lat = c(0, 10, 10, 0, 0)
)
pts2poly_vertices(x, crs = 4326)

# Create an sf object
x <- data.frame(
  lon = c(40, 40, 50, 50, 40, NA, 20, 20, 30, 30, 20),
  lat = c(0, 10, 10, 0, 0, NA, 0, 10, 10, 0, 0)
)
sf::st_sf(Pred = 1:2, geometry = pts2poly_vertices(x, crs = 4326))
Sample validation data created by cropping Validation_data.csv to the SoCal_bite.csv region (.csv files from ...)
validation.data
An object of class sf with 8 rows and 3 variables:
'sight' - 1's and 0's indicating species presence/absence
'count' - number of individuals observed at each point
'geometry' - simple feature list column representing validation data points