[Biomod-commits] binary consensus output

Sami Domisch Sami.Domisch at senckenberg.de
Thu Jan 20 19:12:40 CET 2011

Dear modellers, 

I have a question about how BIOMOD creates the binary consensus predictions. I experimented a bit with the test data that comes with the package and modelled the present distribution of two species (Sp185, Sp191) with three algorithms (GLM, CTA, RF; two repetitions each and a single pseudo-absence run, to keep it simple and fast). I then ran the Projection function on the same present-day variables to obtain the present-day distribution for the whole study area, and finally used the Ensemble.Forecasting function to get a consensus model based on weighted averages (prob.mean.weighted, weight decay 1.6). So far nothing special; the code is pasted below.

I then compared the binary consensus output BIOMOD created for the two species (i.e. prob.mean.weighted in "Total_consensus_present_Bin") with a binary map I created manually: I took the probability output (0-1000) of the prob.mean.weighted consensus model and applied the prob.mean.weighted threshold given in the "consensus_present_results" table (PA1, which uses the pooled data of the two repetitions PA1_rep1 and PA1_rep2).
Here is my problem: the numbers of presence pixels ("1") differ between the two outputs, although they should be identical. For instance, Sp185 and Sp191 have 736 and 1217 presence pixels in the BIOMOD output, respectively, whereas the manually calculated maps have 678 and 1149 pixels classified as "1". Shouldn't the numbers be the same? How does BIOMOD create the binary results? Did I miss something? My guess is that this stems from the partitioning into train/test data versus using the total data?

I am interested in a solution because I want to average several projections for one species based on different climate scenarios, and binary maps are essential for me. The idea is simple: average the probabilities of the different climate-scenario runs for each grid cell, then average the thresholds of those runs to derive the binary outputs. To check this method, I compared the BIOMOD output with the manually calculated one, but there seems to be some kind of discrepancy... Does anybody have a solution, or perhaps know a workaround for this issue?
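In R, the averaging idea would look roughly like this (just a sketch; prob_list and th_vec are placeholder names for the per-scenario probability matrices and thresholds, not BIOMOD objects):

```r
# Sketch of the scenario-averaging idea (placeholder names, not BIOMOD objects):
# prob_list: list of 0-1000 probability matrices, one per climate-scenario run
# th_vec:    the corresponding prob.mean.weighted thresholds of those runs
mean_probs   <- Reduce(`+`, prob_list) / length(prob_list)  # cell-wise mean probability
mean_th      <- mean(th_vec)                                # averaged threshold
scenario_bin <- ifelse(mean_probs >= mean_th, 1, 0)         # final binary map
```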
Any help is appreciated, many thanks in advance!




Initial.State(Response=Sp.Env[, c(17:18)], Explanatory = Sp.Env[,c(4:10)],
        IndependentResponse = NULL, IndependentExplanatory = NULL)
Models(    GLM = T, TypeGLM = "poly", Test = "AIC", 
        GBM = F, No.trees = 3000,
        GAM = F, Spline = 3,
        CTA = T, CV.tree = 50,
        ANN = F, CV.ann = 2,
        FDA = F,
        SRE = F, quant=0.025,
        MARS = F,
        RF = T,
        NbRunEval = 2, DataSplit = 70,
        Yweights=NULL, Roc=TRUE, Optimized.Threshold.Roc=TRUE,
        Kappa=TRUE, TSS=TRUE, KeepPredIndependent = FALSE, VarImport=5,
        NbRepPA=1, strategy="random")
Projection(Proj = Sp.Env[,4:10], Proj.name = "present",
            GLM = T,
            GBM = F,
            GAM = F,
            CTA = T,
            ANN = F,
            FDA = F,
            SRE = F, quant=0.025,
            MARS = F,
            RF = T,
            BinRoc = T, BinKappa = F, BinTSS = F,
            FiltRoc = F, FiltKappa = F, FiltTSS = F)
Ensemble.Forecasting(Proj.name = "present", weight.method = "Roc", decay = 1.6)

# check outputs:

# binary output (third dimension indexes the consensus method;
# slice 2 is prob.mean.weighted here):
binary_weighted_average <- Total_consensus_present_Bin[,,2]
write.csv( binary_weighted_average, "binary_weighted_average.csv")

# probabilities on the 0-1000 scale, same prob.mean.weighted slice:
probs_weighted_average <- Total_consensus_present[,,2]
write.csv(probs_weighted_average, "probs_weighted_average.csv")

# get appropriate threshold for prob.mean.weighted:
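Roughly what I do at this step (the exact indexing into "consensus_present_results" may differ; this is just how I read the threshold out and redo the binarization by hand):

```r
# read the prob.mean.weighted threshold from the consensus results table
# (indexing is approximate -- adjust to the actual structure of the object)
th <- consensus_present_results[["PA1"]]["prob.mean.weighted", "threshold"]

# apply it manually to the 0-1000 probabilities and count presence pixels
manual_bin <- ifelse(probs_weighted_average >= th, 1, 0)
sum(manual_bin == 1)               # manual presence-pixel count
sum(binary_weighted_average == 1)  # BIOMOD's count, for comparison
```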
