[Biomod-commits] binary consensus output
Sami Domisch
Sami.Domisch at senckenberg.de
Thu Jan 20 19:12:40 CET 2011
Dear modellers,
I have a question about how BIOMOD creates the binary consensus predictions. I played around with the test data that comes with the package and modelled the present distribution of two species (Sp185, Sp191) with three algorithms (GLM, CTA, RF; two repetitions each and a single pseudo-absence run, to keep it simple and fast). I used the Projection function with the same present-day variables to obtain the present-day distribution for the whole study area, and then the Ensemble.Forecasting function to get a consensus model based on weighted averages (prob.mean.weighted, weight decay 1.6). So far nothing special; I pasted the code below.
I then compared the binary consensus output BIOMOD created for the two species (i.e. prob.mean.weighted in "Total_consensus_present_Bin") with binary maps I created manually from the probability output (0-1000) of the prob.mean.weighted consensus model, by applying the prob.mean.weighted threshold given in the "consensus_present_results" table (PA1, which uses the total data of the two repetitions PA1_rep1 and PA1_rep2).
Now here is my problem: the number of presence pixels ("1") differs between the two outputs, although it should be identical. For instance, BIOMOD's binary maps for Sp185 and Sp191 have 736 and 1217 presence pixels, respectively, whereas my manually calculated ones have only 678 and 1149 pixels classified as "1". Shouldn't the numbers be the same? How does BIOMOD create the binary results; did I miss something? I guess the difference comes from the partitioning into train/test data vs. using the total data?
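For clarity, this is essentially what my manual thresholding step does (all values below are made up for illustration; in reality the probabilities come from Total_consensus_present and the cutoff from the consensus_present_results table):

```r
# Toy probabilities on BIOMOD's 0-1000 scale (invented values)
probs  <- c(120, 480, 510, 990, 350)
# Placeholder prob.mean.weighted threshold; the real one is read from
# the consensus_present_results table
thresh <- 500
# Manual binarisation: 1 = presence, 0 = absence
binary <- ifelse(probs >= thresh, 1, 0)
sum(binary == 1)  # number of presence pixels
```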
I am interested in a solution because I want to average several projections for one species across different climate scenarios, and binary maps are essential for me. The idea was quite simple: average the probabilities of the different climate-scenario runs for each grid cell, and then average the thresholds of those runs to derive the binary outputs. To check this method I compared the BIOMOD output with the manually calculated one, but there seems to be some kind of discrepancy. Does anybody have a solution, or perhaps know a work-around for this issue?
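To make the idea concrete, here is a toy sketch of the averaging for a single grid cell (the scenario names, probabilities and thresholds are all invented):

```r
# Invented probabilities (0-1000) for one grid cell under three climate scenarios
probs_per_scenario <- c(A1B = 620, A2 = 540, B1 = 480)
# Invented prob.mean.weighted thresholds of the corresponding runs
thresholds <- c(A1B = 510, A2 = 530, B1 = 495)

mean_prob   <- mean(probs_per_scenario)              # averaged probability
mean_thresh <- mean(thresholds)                      # averaged threshold
binary      <- as.integer(mean_prob >= mean_thresh)  # 1 = presence
```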
Any help is appreciated, many thanks in advance!
Sami
#############
load("Sp.Env.rda")
library(BIOMOD)
Initial.State(Response=Sp.Env[, c(17:18)], Explanatory = Sp.Env[,c(4:10)],
IndependentResponse = NULL, IndependentExplanatory = NULL)
Models( GLM = T, TypeGLM = "poly", Test = "AIC",
GBM = F, No.trees = 3000,
GAM = F, Spline = 3,
CTA = T, CV.tree = 50,
ANN = F, CV.ann = 2,
FDA = F,
SRE = F, quant=0.025,
MARS = F,
RF = T,
NbRunEval = 2, DataSplit = 70,
Yweights=NULL, Roc=TRUE, Optimized.Threshold.Roc=TRUE,
Kappa=TRUE, TSS=TRUE, KeepPredIndependent = FALSE, VarImport=5,
NbRepPA=1, strategy="random",
nb.absences=1000)
Projection(Proj = Sp.Env[,4:10],
Proj.name='present',
GLM = T,
GBM = F,
GAM = F,
CTA = T,
ANN = F,
FDA = F,
SRE = F, quant=0.025,
MARS = F,
RF = T,
BinRoc = T, BinKappa = F, BinTSS = F,
FiltRoc = F, FiltKappa = F, FiltTSS = F,
repetition.models=T)
Ensemble.Forecasting(Proj.name= "present",
weight.method='Roc',
PCA.median=F,
binary=T,
bin.method='Roc',
Test=T,
decay=1.6,
repetition.models=T)
# check outputs:
# binary output:
load("proj.present/Total_consensus_present_Bin")
binary_weighted_average <- Total_consensus_present_Bin[,,2]
write.csv( binary_weighted_average, "binary_weighted_average.csv")
# probs 0-1000:
load("proj.present/Total_consensus_present")
probs_weighted_average <- Total_consensus_present[,,2]
write.csv(probs_weighted_average, "probs_weighted_average.csv")
# get appropriate threshold for prob.mean.weighted:
consensus_present_results
#############