[Biomod-commits] Question regarding training/testing/evaluating data sets

Damien Georges damien.georges2 at gmail.com
Wed May 29 14:17:42 CEST 2013

Dear Kristen,

In fact, if you give a specific independent dataset for model evaluation, 
this dataset will be used to evaluate all models. If you also 
decide to do cross-validation (e.g. 75/25), models will be calibrated 
with 75% of the expl.var data, then the optimal threshold for converting 
continuous predictions to binary ones (needed to evaluate models) will be 
determined with the remaining 25% of the expl.var dataset. The model evaluation 
score will then be calculated using the eval.expl.var dataset (using the 
threshold calculated at the previous step).
Ensemble models will be evaluated with the same thresholds (25% of expl.var) 
and the same reference data (eval.expl.var).
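
To make the sequence described above concrete, here is a minimal sketch in 
base R. It is not biomod2 code and does not reproduce what the package does 
internally: the simulated data, the simulate() and tss() helpers, and the 
choice of a single GLM with a TSS-maximising threshold are all assumptions 
made for illustration only.

```r
# Illustrative sketch in base R only; none of this is biomod2 code.
set.seed(42)

# Hypothetical simulated presence/absence data with one predictor.
simulate <- function(n) {
  x <- rnorm(n)
  data.frame(x = x, pa = rbinom(n, 1, plogis(1.5 * x)))
}
train_all <- simulate(400)  # plays the role of resp.var / expl.var
eval_set  <- simulate(200)  # plays the role of eval.resp.var / eval.expl.var

# True skill statistic of binary predictions obtained at a given threshold.
tss <- function(obs, prob, thr) {
  pred <- as.integer(prob >= thr)
  sens <- sum(pred == 1 & obs == 1) / sum(obs == 1)
  spec <- sum(pred == 0 & obs == 0) / sum(obs == 0)
  sens + spec - 1
}

# 1. Calibrate the model on 75% of the training data.
idx   <- sample(nrow(train_all), size = round(0.75 * nrow(train_all)))
calib <- train_all[idx, ]
held  <- train_all[-idx, ]
fit   <- glm(pa ~ x, data = calib, family = binomial)

# 2. Pick the threshold that maximises TSS on the remaining 25%.
cand     <- seq(0.01, 0.99, by = 0.01)
held_p   <- predict(fit, newdata = held, type = "response")
held_tss <- sapply(cand, function(thr) tss(held$pa, held_p, thr))
best_thr <- cand[which.max(held_tss)]

# 3. Score the model on the independent evaluation data,
#    reusing the threshold fixed in step 2.
eval_p   <- predict(fit, newdata = eval_set, type = "response")
eval_tss <- tss(eval_set$pa, eval_p, best_thr)
```

The point of the design is that eval_set is touched only in step 3, so the 
reported score is not influenced by either calibration or threshold selection.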

Working with these three separate datasets is the fairest way to build 
your models. If you have enough data, I recommend working this way.

Hope that helps,


On 24/05/2013 21:32, Kristen Bouska wrote:
> I wanted to make sure I am understanding my model commands correctly. If I
> have the following commands (below), I am calling in a training data set
> (expl.var) and an evaluation data set (eval.expl.var), then I am
> cross-validating the training data set 10 times, splitting it 75/25 for
> each cross-validation. If this is correct, then my evaluation data set
> (eval.expl.var) is not used in the BIOMOD_Modeling evaluation step,
> correct? Is the evaluation data set only used to evaluate the ensembled
> models?
> Sorry if this is confusing, I just want to make sure I am doing what I
> think I am doing when I am running my models.
> myBiomodData <- BIOMOD_FormatingData(resp.var=myResp, resp.xy=myRespXY,
> expl.var=myExpl, resp.name=myRespName, eval.resp.var=myEvalpa,
> eval.expl.var=myEvalenv, eval.resp.xy=myRespEvalXY)
> myBiomodModelOut <- BIOMOD_Modeling(myBiomodData, models = c('GLM', 'GBM',
> 'CTA', 'RF', 'MARS'), models.options=NULL, NbRunEval=10, DataSplit=75,
> Yweights=NULL, Prevalence=NULL, VarImport=3, models.eval.meth =
> c('KAPPA','TSS','ROC'),SaveObj = TRUE, rescal.all.models = TRUE,
> do.full.models = TRUE, modeling.id = "test_date" )
> Thanks!
> Kristen
