ENMwizard
Advanced Techniques for Ecological Niche Modeling Made Easy
This package provides tools that facilitate the use of advanced techniques in ecological niche modeling (ENM) and automate the workflow for modeling multiple species. ENMwizard makes the following steps easier: 1. preparation of occurrence and environmental data (selection of environmental variables, selection of calibration and projection areas); 2. model tuning (thanks to the package ENMeval); 3. model selection and projection. Computationally intensive tasks can be run on a single core or on multiple cores to speed up processing. ENMwizard also implements AICc Model Averaging for MaxEnt models (Gutierrez & Heming, 2018, https://arxiv.org/abs/1807.04346).
Installation
ENMwizard is available at https://github.com/HemingNM/ENMwizard and can be installed directly from GitHub using devtools.
Install from GitHub using devtools
Run the following code from your R console:
install.packages("devtools")
devtools::install_github("HemingNM/ENMwizard")
library(ENMwizard)
Note that ENMwizard is not compatible with ENMeval 2.0.
Sorry about that. I am working to make it compatible with the newest version soon.
Citation
Please cite ENMwizard (and other R packages it depends on) by using:
citation("ENMwizard")
citation("spThin")
citation("ENMeval")
citation("raster")
Steps for niche modeling using ENMwizard
Prepare environmental data
Load occurrence data
First, let's use the occurrence data available in the dismo package.
Bvarieg.occ <- read.table(paste(system.file(package="dismo"),
"/ex/bradypus.csv", sep=""), header=TRUE, sep=",")
head(Bvarieg.occ) # Check first rows
Now we make it a named list, where names correspond to species names.
spp.occ.list <- list(Bvarieg = Bvarieg.occ)
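When several species are stored in one table, the same kind of named list can be built with base R `split()`. The following is only a minimal sketch: the data frame `occ.all`, its species names, and its coordinates are made up for illustration and mimic the `species`, `lon`, `lat` layout of the dismo example file.

```r
# Hypothetical occurrence table with two species (illustrative coordinates)
occ.all <- data.frame(
  species = c("sp_a", "sp_a", "sp_b", "sp_b", "sp_b"),
  lon = c(-65.4, -65.1, -48.2, -47.9, -48.5),
  lat = c(-10.4, -10.9, -22.1, -21.8, -22.6)
)

# Split the table by species; the list names become the species names
spp.occ.list.multi <- split(occ.all, occ.all$species)

names(spp.occ.list.multi)
sapply(spp.occ.list.multi, nrow)
```

Each list element is a data frame for one species, which is the same structure as `spp.occ.list` above.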
Create occ polygon to crop rasters prior to modelling
The occurrence points in the named list are used to create polygons. Notice that you can cluster the occ points using several clustering methods. See differences and choose one that fits your needs:
occ.polys <- set_calibarea_b(spp.occ.list)
occ.polys <- set_calibarea_b(spp.occ.list, k=0, c.m="AP", q=.01) # fewer polygons
occ.polys <- set_calibarea_b(spp.occ.list, k=0, c.m="AP", q=.3)
occ.polys <- set_calibarea_b(spp.occ.list, k=0, c.m="AP", q=.8) # more polygons
occ.polys <- set_calibarea_b(spp.occ.list, k=0, c.m="NB", method = "centroid", index = "duda")
occ.polys <- set_calibarea_b(spp.occ.list, k=0, c.m="NB", method = "centroid", index = "sdindex")
Create buffer
... and the occurrence polygons are buffered using 1.5 degrees.
occ.b <- buffer_b(occ.polys, width = 1.5)
Get and cut environmental layers
Get climate data for historical (near current) conditions. In this example, a directory called 'rasters' is created, and the rasters for historical (near current) conditions are downloaded into it.
# Create directory to store raster files
dir.create("./rasters")
# Download data for present
library(raster)
predictors <- getData('worldclim', var='bio', res=10, path="rasters")
Cut environmental variables for each species (and plot them for visual inspection).
pred.cut <- cut_calibarea_b(occ.b, predictors)
for (i in seq_along(pred.cut)) {
  plot(pred.cut[[i]][[1]])
  plot(occ.polys[[i]], border = "red", add = TRUE)
  plot(occ.b[[i]], add = TRUE)
}
Select the least correlated variables
vars <- select_vars_b(pred.cut, cutoff=.75, names.only = T)
# See selected variables for each species
lapply(vars, function(x)x[[1]])
# remove correlated variables from our variable set
pred.cut <- select_vars_b(pred.cut, cutoff=.75, names.only = F)
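To illustrate what a correlation cutoff of 0.75 means, here is a minimal base R sketch of a greedy correlation filter on simulated data. It is not the algorithm used by `select_vars_b` (whose internals are not described here); the function `drop_correlated` and the simulated variables are hypothetical.

```r
set.seed(1)
n <- 200
bio1 <- rnorm(n)
env <- data.frame(
  bio1 = bio1,
  bio5 = bio1 + rnorm(n, sd = 0.1),  # nearly a copy of bio1
  bio12 = rnorm(n)                   # independent of the others
)

# Greedy filter (hypothetical helper): while any pair of variables exceeds
# the cutoff, take the most correlated pair and drop the member with the
# higher mean absolute correlation to all remaining variables
drop_correlated <- function(x, cutoff = 0.75) {
  keep <- colnames(x)
  repeat {
    if (length(keep) < 2) break
    cm <- abs(cor(x[, keep, drop = FALSE]))
    diag(cm) <- 0
    if (max(cm) <= cutoff) break
    pair <- which(cm == max(cm), arr.ind = TRUE)[1, ]
    mean.cor <- colMeans(cm[, pair])
    keep <- setdiff(keep, colnames(cm)[pair[which.max(mean.cor)]])
  }
  keep
}

sel <- drop_correlated(env, cutoff = 0.75)
sel
```

Only one of `bio1` and `bio5` survives, while the uncorrelated `bio12` is always kept.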
Prepare occurrence data
Filter original dataset
Now we want to remove localities that are too close to each other. We will do this for all species listed in "spp.occ.list".
thinned.dataset.batch <- thin_b(loc.data.lst = spp.occ.list)
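The idea behind spatial thinning can be sketched in base R as follows. This is a deterministic greedy stand-in, not the randomized procedure implemented in spThin, and the 10 km distance, the helper names `haversine_km` and `thin_greedy`, and the coordinates are all assumptions made for illustration.

```r
# Great-circle distance in km (haversine formula, Earth radius 6371 km)
haversine_km <- function(lon1, lat1, lon2, lat2) {
  to.rad <- pi / 180
  dlon <- (lon2 - lon1) * to.rad
  dlat <- (lat2 - lat1) * to.rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to.rad) * cos(lat2 * to.rad) * sin(dlon / 2)^2
  2 * 6371 * asin(pmin(1, sqrt(a)))
}

# Greedy thinning: keep a record only if it is at least `min.km`
# away from every record already kept
thin_greedy <- function(occ, min.km = 10) {
  if (nrow(occ) < 2) return(occ)
  keep <- 1
  for (i in seq_len(nrow(occ))[-1]) {
    d <- haversine_km(occ$lon[i], occ$lat[i], occ$lon[keep], occ$lat[keep])
    if (all(d >= min.km)) keep <- c(keep, i)
  }
  occ[keep, ]
}

# Made-up records: the second one lies about 1.5 km from the first
occ <- data.frame(lon = c(-60, -60.01, -60.5, -61),
                  lat = c(-10, -10.01, -10, -10))
thinned <- thin_greedy(occ, min.km = 10)
thinned
```

The second record is dropped because it is closer than 10 km to the first; the other three are retained.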
Load occurrence data (filtered localities)
After thinning, we choose one dataset for each species for modelling.
occ.locs <- load_thin_occ(thinned.dataset.batch)
Great! Now we are ready to tune the species' ENMs.
Tuning Maxent's feature classes and regularization multiplier via ENMeval
Model tuning using ENMeval
Here we will run ENMevaluate_b, which calls ENMevaluate (from the ENMeval package), to test which combination of Feature Classes and Regularization Multipliers gives the best results. For this, we will partition our occurrence data using the "block" method.
By providing [at least] two lists, occurrence and environmental data, we will be able to evaluate ENMs for as many species as listed in our occ.locs object. For details see ?ENMeval::ENMevaluate. Notice that you can use multiple cores for this task. This is especially useful when there is a large number of models and species.
ENMeval.res.lst <- ENMevaluate_b(occ.locs, pred.cut,
RMvalues = c(1, 1.5), fc = c("L", "LQ", "LP"),
method="block", algorithm="maxent.jar")
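The size of the tuning exercise can be previewed with a small base R sketch. It assumes that every regularization multiplier is crossed with every feature class combination, so the call above would evaluate six candidate models per species; the object name `tune.grid` is only illustrative.

```r
RMvalues <- c(1, 1.5)
fc <- c("L", "LQ", "LP")

# One row per candidate model (feature classes x regularization multiplier)
tune.grid <- expand.grid(fc = fc, rm = RMvalues, stringsAsFactors = FALSE)
tune.grid
nrow(tune.grid)
```

Adding more RM values or feature class combinations multiplies the number of models, which is where multiple cores become helpful.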
Model fitting (calibration)
After tuning the MaxEnt models, we will calibrate them using all occurrence data (i.e. without partitioning them).
# Run model
mxnt.mdls.preds.lst <- calib_mdl_b(ENMeval.o.l = ENMeval.res.lst,
a.calib.l = pred.cut,
mSel = c("LowAIC", "AUC"))
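As background for the "LowAIC" criterion and for AICc model averaging, the sketch below applies the standard Akaike-weight formula to made-up AICc values. It is not taken from the ENMwizard source; the model names, AICc values, and predictions are purely illustrative.

```r
# Illustrative (made-up) AICc values for three candidate models
aicc <- c(L_1 = 1520.3, LQ_1 = 1515.1, LP_1.5 = 1517.4)

# Differences to the best model and Akaike weights
delta <- aicc - min(aicc)
w <- exp(-0.5 * delta) / sum(exp(-0.5 * delta))

names(which.min(aicc))  # model with the lowest AICc
round(w, 3)

# Weighted average of made-up suitability predictions at two sites
# (rows = sites, columns = models, in the same order as `aicc`)
preds <- cbind(L_1 = c(0.20, 0.80), LQ_1 = c(0.30, 0.70), LP_1.5 = c(0.25, 0.90))
avg <- as.vector(preds %*% w)
avg
```

The weights sum to one, so the averaged prediction always lies between the lowest and highest prediction of the individual models.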
Projection
Prepare projection area
Download environmental data
For projection, it is necessary to download raster files with the environmental variables of interest. Rasters with historical (near current) climatic conditions were already downloaded. We will now download climatic conditions for two future periods (2050 and 2070), from two GCMs each, and create one list containing the current and all future climate scenarios.
library(raster)
# Get climate data for future conditions (2050) from two GCMs at RCP 8.5
futAC5085 <- getData('CMIP5', var='bio', res=10, rcp=85, model='AC', year=50, path="rasters")
names(futAC5085) <- names(predictors)
futCC5085 <- getData('CMIP5', var='bio', res=10, rcp=85, model='CC', year=50, path="rasters")
names(futCC5085) <- names(predictors)
# Get climate data for future conditions (2070) from two GCMs at RCP 8.5
futAC7085 <- getData('CMIP5', var='bio', res=10, rcp=85, model='AC', year=70, path="rasters")
names(futAC7085) <- names(predictors)
futCC7085 <- getData('CMIP5', var='bio', res=10, rcp=85, model='CC', year=70, path="rasters")
names(futCC7085) <- names(predictors)
predictors.l <- list(ncurrent = predictors,
futAC5085 = futAC5085,
futCC5085 = futCC5085,
futAC7085 = futAC7085,
futCC7085 = futCC7085)
Select area for projection based on the extent of occ points
Now it is time to define the projection area for each species. The projection area can be the same for all species (as in this example) or be defined individually. Here, the projection area is defined as a square area slightly larger than the original occurrence extent of the species. The environmental layers are then cut to this area, producing for each species a list of climate scenarios: the current climatic conditions and the future climate scenarios defined above.
poly.projection <- set_projarea_b(occ.polys, mult = .1, buffer=FALSE)
plot(poly.projection[[1]], col="gray")
plot(occ.polys[[1]], col="yellow", add=T)
pred.cut.l <- cut_projarea_mscn_b(poly.projection, predictors.l)
plot(poly.projection[[1]], col="gray")
plot(pred.cut.l[[1]][[1]][[1]], add=T)
plot(occ.polys[[1]], add=T)
... if the extent to project is the same for all species
When all species are to be projected using the same current and future climates and in the same region, the following lines can be used to repeat the same list of scenarios for all species (the scenarios could be defined differently for each species if desired).
proj.extent <- extent(c(-109.5, -26.2, -59.5, 18.1))
# coerce to a SpatialPolygons object
proj.extent <- as(proj.extent, 'SpatialPolygons')
pred.cut.l <- cut_projarea_rst_mscn_b(proj.extent, predictors.l, occ.polys)
Model projections
Finally, the model(s) can be projected onto all climatic scenarios. This is performed by the proj_mdl_b function. The function has two arguments: 1) MaxEnt fitted models (see "Model fitting (calibration)" above) and 2) a list of rasters representing all scenarios onto which models will be projected.
This function can be run using a single core (default) or multiple cores. There are two ways of performing parallel processing: by species or by model. If the distributions of only a few species are being modelled and the models are computationally intensive, processing by model will probably give the best results. If there are many species, parallel processing by species (splitting species across the cores of a computer) will probably be faster.
# For single or multiple species
# using a single core (default)
mxnt.mdls.preds.cf <- proj_mdl_b(mxnt.mdls.preds.lst, a.proj.l = pred.cut.l)
# or using multiple cores
mxnt.mdls.preds.cf <- proj_mdl_b(mxnt.mdls.preds.lst, a.proj.l = pred.cut.l, numCores=2)
# plot projections
par(mfrow=c(1,2), mar=c(1,2,1,2))
plot(mxnt.mdls.preds.cf$Bvarieg$mxnt.preds$ncurrent)
plot(mxnt.mdls.preds.cf$Bvarieg$mxnt.preds$futAC5085)
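The "by species" strategy described above can be illustrated with the base `parallel` package. This sketch does not show how `proj_mdl_b` is implemented; `slow_task` is a hypothetical stand-in for an expensive per-species projection, and the species names are made up.

```r
library(parallel)

# Hypothetical stand-in for an expensive per-species task
slow_task <- function(sp) {
  Sys.sleep(0.1)
  paste("projected", sp)
}
spp <- c("sp_a", "sp_b", "sp_c", "sp_d")

# Single core: species are processed one after another
res.serial <- lapply(spp, slow_task)

# Two worker processes: species are split across the workers
cl <- makeCluster(2)
res.parallel <- parLapply(cl, spp, slow_task)
stopCluster(cl)

# Both approaches return the same results in the same order
identical(res.serial, res.parallel)
```

Starting the workers has a fixed cost, so splitting by species tends to pay off only when there are enough species to keep every core busy.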
Create consensual projections across GCMs by (e.g.) year and/or RCP
The climate scenario projections can be grouped and averaged to create consensual projections. Here we downloaded two GCMs for 2050 and two for 2070, both at RCP 8.5. So, the GCMs will be averaged by year.
# create two vectors containing grouping codes
yr <- c(50, 70)
rcp <- c("45", "85")
groups <- list(yr, rcp)
# get names we gave to the predictors
clim.scn.nms <- names(predictors.l)
consensus_gr(groups, clim.scn.nms)
## here we compute the consensual projections
mxnt.mdls.preds.cf <- consensus_scn_b(mcmp.l=mxnt.mdls.preds.cf, groups = list(yr, rcp), ref="ncurrent")
####
## in case you have multiple GCMs by year and RCP, this is an example
## that returns more groups
# grouping codes
yr <- c(2050, 2070)
rcp <- c("RCP45", "RCP85")
groups <- list(yr, rcp)
# names of climate scenarios
clim.scn.nms <- c("CCSM4.2050.RCP45", "MIROC.ESM.2050.RCP45", "MPI.ESM.LR.2050.RCP45",
"CCSM4.2070.RCP45", "MIROC.ESM.2070.RCP45", "MPI.ESM.LR.2070.RCP45",
"CCSM4.2050.RCP85", "MIROC.ESM.2050.RCP85", "MPI.ESM.LR.2050.RCP85",
"CCSM4.2070.RCP85", "MIROC.ESM.2070.RCP85", "MPI.ESM.LR.2070.RCP85")
consensus_gr(groups, clim.scn.nms)
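One way to picture the grouping is as pattern matching between the grouping codes and the scenario names. The base R sketch below is an assumption about the general idea, not a reproduction of `consensus_gr`: each year and RCP combination collects the scenario names that contain both codes.

```r
yr <- c(2050, 2070)
rcp <- c("RCP45", "RCP85")
clim.scn.nms <- c("CCSM4.2050.RCP45", "MIROC.ESM.2050.RCP45", "MPI.ESM.LR.2050.RCP45",
                  "CCSM4.2070.RCP45", "MIROC.ESM.2070.RCP45", "MPI.ESM.LR.2070.RCP45",
                  "CCSM4.2050.RCP85", "MIROC.ESM.2050.RCP85", "MPI.ESM.LR.2050.RCP85",
                  "CCSM4.2070.RCP85", "MIROC.ESM.2070.RCP85", "MPI.ESM.LR.2070.RCP85")

# All combinations of the grouping codes
combos <- expand.grid(yr = yr, rcp = rcp, stringsAsFactors = FALSE)

# For each combination, keep the scenario names containing both codes
groups.l <- lapply(seq_len(nrow(combos)), function(i) {
  hit <- grepl(combos$yr[i], clim.scn.nms, fixed = TRUE) &
    grepl(combos$rcp[i], clim.scn.nms, fixed = TRUE)
  clim.scn.nms[hit]
})
names(groups.l) <- paste(combos$yr, combos$rcp, sep = ".")
groups.l
```

With three GCMs per year and RCP, each of the four groups holds three scenarios, and a consensus projection would be the average over the members of a group.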
Apply thresholds on suitability projections
We now have the projections for each climatic scenario. Next, we must select one (or more) threshold criteria and apply them to the projections.
# 1. Fixed.cumulative.value.1 (fcv1);
# 2. Fixed.cumulative.value.5 (fcv5);
# 3. Fixed.cumulative.value.10 (fcv10);
# 4. Minimum.training.presence (mtp);
# 5. 10.percentile.training.presence (x10ptp);
# 6. Equal.training.sensitivity.and.specificity (etss);
# 7. Maximum.training.sensitivity.plus.specificity (mtss);
# 8. Balance.training.omission.predicted.area.and.threshold.value (bto);
# 9. Equate.entropy.of.thresholded.and.original.distributions (eetd).
mods.thrshld.lst <- thrshld_b(mxnt.mdls.preds.cf, thrshld.i = c(5,7))
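To show what applying a threshold does, the sketch below converts a made-up suitability surface into binary maps using two of the criteria listed above: minimum training presence (4) and 10 percentile training presence (5). The suitability values and presence cells are simulated, and `quantile()` is used as an approximation, so the values MaxEnt itself reports may differ slightly.

```r
set.seed(42)
suit <- matrix(runif(100), nrow = 10)  # made-up suitability map (10 x 10 cells)
pres.cells <- c(5, 17, 23, 48, 51, 66, 72, 80, 91, 99)  # cells with presences
pres.suit <- suit[pres.cells]

# Threshold values derived from suitability at the presence cells
thr.mtp <- min(pres.suit)
thr.10ptp <- quantile(pres.suit, probs = 0.1, names = FALSE)

# Binary maps: TRUE where suitability reaches the threshold
bin.mtp <- suit >= thr.mtp
bin.10ptp <- suit >= thr.10ptp

c(mtp = sum(bin.mtp), x10ptp = sum(bin.10ptp))  # number of suitable cells
```

The minimum training presence threshold keeps every presence cell as suitable, while the 10 percentile threshold is stricter and never predicts a larger area.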
Identify range shifts
Range shifts are differences in suitable areas between climate scenarios. Here we will map where the range has shifted and compute unchanged, lost, and gained areas.
spp_rdiff <- range_shift_b(mods.thrshld.lst, ref.scn = "ncurrent")
# plot maps of range change
breaks <- round(seq(from=-1, to=1, .666), 2)
colors <- colorRampPalette(c("red", "gray", "blue"))(length(breaks)-1)
plot(spp_rdiff[[1]][[1]][[1]], col=colors, breaks=breaks)
# area of range changes
spp_rdiff_a <- get_rsa_b(spp_rdiff)
spp_rdiff_a
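The logic of a range shift map can be sketched with two small binary matrices. This assumes the shift is coded as future minus current, giving -1 for lost, 0 for unchanged, and 1 for gained cells, which is consistent with the three colour classes used above but is not taken from the `range_shift_b` source; the matrices are made up.

```r
# Made-up binary maps (1 = suitable, 0 = unsuitable)
current <- matrix(c(1, 1, 0, 0,
                    1, 0, 0, 1), nrow = 2, byrow = TRUE)
future  <- matrix(c(1, 0, 1, 0,
                    1, 0, 1, 1), nrow = 2, byrow = TRUE)

# -1 = lost, 0 = unchanged, 1 = gained
shift <- future - current
shift

# Number of cells in each class
shift.n <- table(factor(shift, levels = c(-1, 0, 1),
                        labels = c("lost", "unchanged", "gained")))
shift.n
```

Note that "unchanged" in this coding covers both cells that stay suitable and cells that stay unsuitable. Multiplying the cell counts by the cell area would give areas per class.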
Visualize
Plot one projection for current climate and another for a future climatic scenario
plot(mods.thrshld.lst$Bvarieg$ncurrent$binary$x10ptp)
plot(mods.thrshld.lst$Bvarieg$X50.85$binary$x10ptp)
plot_mdl_diff(mxnt.mdls.preds.lst[[1]], mods.thrshld.lst[[1]], sp.nm = "Bvarieg")
plot_mdl_diff_b(mxnt.mdls.preds.cf, mods.thrshld.lst, save=T)
Plot differences between current climate and future climatic scenarios for all thresholds
plot_scn_diff_b(mxnt.mdls.preds.cf, mods.thrshld.lst,
ref.scn = "ncurrent", mSel = "LowAIC", save=F)
Compute metrics
Compute variable contribution and permutation importance
get_cont_permimport_b(mxnt.mdls.preds.cf)
Compute "Fractional Predicted Area" ('n of occupied pixels'/n)
get_fpa_b(mods.thrshld.lst)
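The fractional predicted area is simply the proportion of cells predicted as suitable. A minimal sketch on a made-up binary map follows; excluding NA cells (e.g. sea) from the denominator is an assumption here, not a documented behaviour of `get_fpa_b`.

```r
# Made-up binary map (1 = suitable, 0 = unsuitable, NA = no data)
bin <- matrix(c(1, 0, 0, 1,
                1, 1, 0, 0,
                NA, 0, 0, 1), nrow = 3, byrow = TRUE)

n.valid <- sum(!is.na(bin))        # cells with data
n.occupied <- sum(bin, na.rm = TRUE)  # cells predicted as suitable
fpa <- n.occupied / n.valid
fpa
```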
Compute species' total suitable area
get_tsa_b(mods.thrshld.lst)