[Biomod-commits] binary consensus output
Wilfried Thuiller
wilfried.thuiller at ujf-grenoble.fr
Thu Jan 20 22:14:28 CET 2011
Dear Sami,
Funny enough, despite the answer to your email is relatively easy, by looking at the code, I discovered a problem that I just fixed. Thanks a lot!
You understood almost correctly.
If in the Ensemble.Forecasting call, you have "repetition.models=T", then for the binary transformation apply to Total.Consensus, we take the AVERAGE of the thresholds (and not only PA1). In your example, PA1, PA1_rep1, and PA1_rep2).
If you do not want the repetition added to the ensemble forecast, just replace "repetition.models=F".
Alternatively, you can also remove the projections made with the full model (run on all data) and only keep the average over the different repetitions (made on sub-parts of the data). For this you just to have to put: final.model.out = TRUE.
There was a bug here, a local variable that was removed accidentally during the process. I fixed it.
Look at your example below where I run the two cases. Everything works fine.
In the mean time, I have also modified the two functions Ensemble.Forecasting and Ensemble.Forecasting.raster to allow the binary transformation for the median as well. The threshold is set to the median of the thresholds.
I have uploaded the new functions on R-Forge tonight (rev 258, version 1.1-6.3). Because R-Forge is always a bit long to compile the packages, I have also attached the two updated scripts. If you want to use them already, first load BIOMOD, and then "source" the two scripts.
Please note that I have not tested the Ensemble.Forecasting.raster with these changes. Any bug report is more than welcome (only the binary transformation by the median has been changed).
library(BIOMOD)
source("PATH/Ensemble.Forecasting") ### make sure to set up the correct path to the file)
Hope it helps,
Wilfried
#####################################################
#### Examples to show the results with and without the "full model".
library(BIOMOD)
data(Sp.Env)
Initial.State(Response=Sp.Env[, c(17:18)], Explanatory = Sp.Env[,c(4:10)], IndependentResponse = NULL, IndependentExplanatory = NULL)
Models(GLM = T, TypeGLM = "poly", Test = "AIC", GBM = F, No.trees = 3000,GAM = F, Spline = 3,CTA = T, CV.tree = 50,ANN = F, CV.ann = 2,FDA = F,SRE = F, quant=0.025,MARS = F,RF = T,NbRunEval = 2, DataSplit = 70,Yweights=NULL, Roc=TRUE,Optimized.Threshold.Roc=TRUE,Kappa=TRUE, TSS=TRUE, KeepPredIndependent = FALSE, VarImport=5,NbRepPA=1,strategy="random",nb.absences=1000)
Projection(Proj = Sp.Env[,4:10],Proj.name='present',GLM = T,GBM = F,GAM = F,CTA = T,ANN = F,FDA = F,SRE = F, quant=0.025,MARS = F,RF = T,BinRoc = T, BinKappa = F, BinTSS = F,FiltRoc = F, FiltKappa = F, FiltTSS = F,repetition.models=T)
Ensemble.Forecasting(Proj.name= "present",weight.method='Roc',PCA.median=F,binary=T,bin.method='Roc',Test=T,decay=1.6,repetition.models=T, , final.model.out=F)
# check outputs:
# binary output:
load("proj.present/Total_consensus_present_Bin")
binary_weighted_average <- Total_consensus_present_Bin[,,3]
apply(binary_weighted_average, 2, sum)
load("proj.present/Total_consensus_present")
probs_weighted_average <- Total_consensus_present[,,3]
load("proj.present/consensus_present_results")
sum(BinaryTransformation(probs_weighted_average[,1], mean(consensus_present_results[[1]][[3]][2,])))
sum(BinaryTransformation(probs_weighted_average[,2], mean(consensus_present_results[[2]][[3]][2,])))
#### WITHOUT FULL MODEL
Ensemble.Forecasting(Proj.name= "present",weight.method='Roc',PCA.median=F,binary=T,bin.method='Roc',Test=T,decay=1.6,repetition.models=T, final.model.out=T)
# check outputs:
# binary output: WITHOUT FULL MODELS
load("proj.present/Total_consensus_present_Bin")
binary_weighted_average <- Total_consensus_present_Bin[,,3]
apply(binary_weighted_average, 2, sum)
load("proj.present/Total_consensus_present")
probs_weighted_average <- Total_consensus_present[,,3]
load("proj.present/consensus_present_results")
sum(BinaryTransformation(probs_weighted_average[,1], mean(consensus_present_results[[1]][[3]][3,2:3])))
sum(BinaryTransformation(probs_weighted_average[,2], mean(consensus_present_results[[2]][[3]][3,2:3])))
Le 20 janv. 2011 à 19:12, Sami Domisch a écrit :
> Dear modellers,
>
> I have a question related to the method, how BIOMOD creates the binary consensus predictions. I played a bit around with the test data which comes with the package, and modelled the present distribution of 2 species (Sp185, Sp191) with 3 algorithms (GLM, CTA, RF x 2 repetitions each, one pseudo-absence-run to keep it simple and fast..). I used the Projection-function using the same present-day variables in order to receive the present-day distribution for the whole study area. Subsequently I used the Ensemble.Forecasting-function to get a consensus model using weighted averages (prob.mean.weighted, weight decay 1.6). So far nothing special about it, I pasted the code below.
>
> I now compared the binary consensus output BIOMOD created for the two species (i.e. prob.mean.weighted in "Total_consensus_present_Bin") with the probability-output (0-1000) of the consensus prob.mean.weighted-model, after applying the prob.mean.weighted - threshold which is given in the "consensus_present_results"-table (PA1, which used the total data of the two repetitions PA1_rep1 and PA1_rep2). I thus created the binary results manually.
>
> Now here is my problem: the number of presence-pixels ("1") differ between the two outputs, although they should be identical. For instance, Sp185 and Sp191 have 736 and 1217 presence-pixels, respectively, whereas the manually calculated ones have 678 and 1149 pixels classified as "1". Shouldn't the numbers be the same? How is BIOMOD creating the binary results, did I miss something? I guess this derives from the partitioning of the train/test-data vs. using the total data?
>
> I am interested in a solution since I want to average several projections for one species based on different climate scenarios, and binary maps would be essential for me. The idea was quite simple: to average the probabilities of the different climate-scenario runs for each grid cell, and then average the thresholds of these runs to get the binary outputs. And to check this method, I compared the BIOMOD-output and the manually calculated one. However there seems to be some kind of discrepancy...Has anybody a solution or maybe knows a work-around for this issue?
> Any help is appreciated, many thanks in advance!
> Sami
>
>
> #############
>
> load("Sp.Env.rda")
>
> library(BIOMOD)
>
> Initial.State(Response=Sp.Env[, c(17:18)], Explanatory = Sp.Env[,c(4:10)],
> IndependentResponse = NULL, IndependentExplanatory = NULL)
>
> Models( GLM = T, TypeGLM = "poly", Test = "AIC",
> GBM = F, No.trees = 3000,
> GAM = F, Spline = 3,
> CTA = T, CV.tree = 50,
> ANN = F, CV.ann = 2,
> FDA = F,
> SRE = F, quant=0.025,
> MARS = F,
> RF = T,
> NbRunEval = 2, DataSplit = 70,
> Yweights=NULL, Roc=TRUE, Optimized.Threshold.Roc=TRUE,
> Kappa=TRUE, TSS=TRUE, KeepPredIndependent = FALSE, VarImport=5,
> NbRepPA=1, strategy="random",
> nb.absences=1000)
>
>
> Projection(Proj = Sp.Env[,4:10],
> Proj.name='present',
> GLM = T,
> GBM = F,
> GAM = F,
> CTA = T,
> ANN = F,
> FDA = F,
> SRE = F, quant=0.025,
> MARS = F,
> RF = T,
> BinRoc = T, BinKappa = F, BinTSS = F,
> FiltRoc = F, FiltKappa = F, FiltTSS = F,
> repetition.models=T)
>
>
> Ensemble.Forecasting(Proj.name= "present",
> weight.method='Roc',
> PCA.median=F,
> binary=T,
> bin.method='Roc',
> Test=T,
> decay=1.6,
> repetition.models=T)
>
>
> # check outputs:
>
> # binary output:
> load("proj.present/Total_consensus_present_Bin")
> binary_weighted_average <- Total_consensus_present_Bin[,,2]
> write.csv( binary_weighted_average, "binary_weighted_average.csv")
>
>
> # probs 0-1000:
> load("proj.present/Total_consensus_present")
> probs_weighted_average <- Total_consensus_present[,,2]
> write.csv(probs_weighted_average, "probs_weighted_average.csv")
>
> # get appropriate threshold for prob.mean.weighted:
> consensus_present_results
>
> #############
>
> _______________________________________________
> Biomod-commits mailing list
> Biomod-commits at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/biomod-commits
--------------------------
Dr. Wilfried Thuiller
Laboratoire d'Ecologie Alpine, UMR CNRS 5553
Université Joseph Fourier
BP53, 38041 Grenoble cedex 9, France
tel: +33 (0)4 76 51 44 97
fax: +33 (0)4 76 51 42 79
Email: wilfried.thuiller at ujf-grenoble.fr
Home page: http://www.will.chez-alice.fr
Website: http://www-leca.ujf-grenoble.fr/equipes/tde.htm
FP6 European MACIS project: http://www.macis-project.net
FP6 European EcoChange project: http://www.ecochange-project.eu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/biomod-commits/attachments/20110120/6802bc9c/attachment-0001.htm>
More information about the Biomod-commits
mailing list