[Biomod-commits] Error in newx$cptable[max(keep), 1L] <- cp : subscript out of bounds

jason b mackenzie jasonbmackenzie at gmail.com
Fri Feb 10 10:21:23 CET 2012


Thanks for the quick feedback, and the heads up about limitations with classification methods. 

For this project I am considering modeling species with 30 or more presences out of a total of 1700+ plots. Do you have any recommendations about cut-offs for presence:absence ratios for classification methods and/or for GBM? In the past I've used 30 presences as a cut-off for presence-only modeling (e.g. Maxent, Domain, SRE), but I am new to working with absence data.
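
To make the question concrete, here is a minimal R sketch of how a presence cut-off could be applied before modeling. The data frame `sp`, its species names, and the simulated prevalences are hypothetical stand-ins for the species columns of SP.ENV, and the value of 30 is only the cut-off mentioned above, not a recommendation.

```r
# Hypothetical stand-in for the species columns of SP.ENV (plots x species, coded 0/1).
set.seed(42)
n_plots <- 1700
sp <- data.frame(
  SP_A = rbinom(n_plots, 1, 0.01),  # rare species
  SP_B = rbinom(n_plots, 1, 0.05),
  SP_C = rbinom(n_plots, 1, 0.30)   # common species
)

min_presences <- 30  # assumption: the cut-off discussed in this message

presences  <- colSums(sp == 1)
absences   <- colSums(sp == 0)
prevalence <- presences / nrow(sp)

# One row per species, with a flag for whether it passes the cut-off.
summary_tab <- data.frame(
  species    = names(sp),
  presences  = presences,
  absences   = absences,
  prevalence = round(prevalence, 3),
  keep       = presences >= min_presences,
  stringsAsFactors = FALSE
)
print(summary_tab)

# Names of the species that would go forward to modeling.
species_to_model <- summary_tab$species[summary_tab$keep]
```

The table also gives the presence:absence ratio per species, which is the quantity the question above is about.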


On Feb 10, 2012, at 7:55 PM, Wilfried Thuiller wrote:

> Same as for CTA, I think. Do not forget that those techniques (independently of BIOMOD) need some "good" data to perform well. 
> FDA is a classification method; if it does not manage to classify, it fails. 
> This error, for instance, comes directly from FDA, not from BIOMOD. If you run FDA yourself, you'll get the same error. Do you have enough presence and absence information?
> 
> 
> 
> On 10 Feb 2012, at 09:42, jason b mackenzie wrote:
> 
>> Hi Wilfried
>> 
>> Thanks for the suggestions to drop low performing techniques (CTA, ANN and SRE) and redundant evaluation methods (KAPPA), and to choose another pseudo absence strategy (random). I look forward to reading the paper, and that's all great practical guidance. 
>> 
>> I ran all the models mostly for exploratory purposes, on the assumption that I could identify the worst performers from their evaluation scores and drop them from Ensemble.Forecasting() with flags such as weight.method='proportional', decay = 1.6, qual.th = 0.5. That said, it makes sense to drop weak approaches up front, so thanks again for the advice.
>> 
>> In case there are other methods producing curious errors besides CTA, a different species fell over yesterday on FDA, with the following error:
>> Model=Flexible Discriminant Analysis 
>> Error in family$linkfun(mustart) : 
>>   Argument mu must be a nonempty numeric vector
>> 
>> Cheers, 
>> Jason
>> 
>> 
>> 
>> On Feb 10, 2012, at 6:11 PM, Wilfried Thuiller wrote:
>> 
>>> Dear Jason,
>>> 
>>> The problem is related to the classification tree. It seems the tree has no branches, so BIOMOD cannot determine the optimal number of leaves. 
>>> My take is that your species is very difficult for CTA to model, and it cannot find any good combination of variables explaining the distribution. 
>>> Perhaps we need to add an error trap around this call, as this is not the first time we have seen this error. It is not a BIOMOD bug, but rather the failure of CTA to find an optimal number of leaves. 
>>> I would suggest removing this model for this species.
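
As an illustration of the kind of error trap mentioned above, here is a minimal base-R sketch using tryCatch(). It is not BIOMOD's internal code: `safe_fit`, `toy_fit`, and the `responses` list are hypothetical names, and the toy fitter only imitates a model that fails when it cannot find any structure in the response.

```r
# Wrap any model-fitting call so that an error is recorded instead of stopping the run.
safe_fit <- function(fit_fun, ...) {
  tryCatch(
    list(ok = TRUE, model = fit_fun(...), error = NA_character_),
    error = function(e) list(ok = FALSE, model = NULL, error = conditionMessage(e))
  )
}

# Toy fitter (hypothetical): fails when the response has no variation,
# standing in for a tree that cannot find any split.
toy_fit <- function(y) {
  if (length(unique(y)) < 2) stop("subscript out of bounds")
  mean(y)
}

# Hypothetical responses for three species; the second one cannot be fitted.
responses <- list(
  SP_A  = c(0, 1, 0, 1),
  ERFA2 = c(0, 0, 0, 0),
  SP_C  = c(1, 1, 0, 1)
)

results <- lapply(responses, function(y) safe_fit(toy_fit, y))

# Species whose fit failed; the loop still completed for the others.
failed <- names(results)[!vapply(results, function(r) r$ok, logical(1))]
print(failed)
```

With a wrapper of this kind, a failing species is reported and skipped while the remaining species continue to run.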
>>> 
>>> Looking at the code you sent with this message and the other one, I would also suggest:
>>> 
>>> Removing some techniques. Some generally perform quite poorly compared with the others. CTA and ANN are the weakest ones. SRE is more like a null model, the most liberal. I would drop all three. 
>>> Same for evaluation: you do not need to use both TSS and KAPPA, as they are quite redundant. Just keep TSS, which is less influenced by prevalence, and ROC. 
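
The point about prevalence can be checked with a small worked example. The sketch below uses the standard definitions of TSS (sensitivity + specificity - 1) and Cohen's Kappa computed from a 2x2 confusion matrix; the function name `tss_kappa` and the counts are made up for illustration. Both scenarios have sensitivity 0.8 and specificity 0.9 and differ only in prevalence.

```r
# tp = true presences, fp = false presences, fn = false absences, tn = true absences
tss_kappa <- function(tp, fp, fn, tn) {
  n    <- tp + fp + fn + tn
  sens <- tp / (tp + fn)
  spec <- tn / (fp + tn)
  po   <- (tp + tn) / n                                         # observed agreement
  pe   <- ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n^2 # chance agreement
  c(TSS = sens + spec - 1, Kappa = (po - pe) / (1 - pe))
}

# 1000 plots, prevalence 50%: 500 presences, 500 absences
balanced <- tss_kappa(tp = 400, fp = 50, fn = 100, tn = 450)

# 1000 plots, prevalence 5%: 50 presences, 950 absences
rare <- tss_kappa(tp = 40, fp = 95, fn = 10, tn = 855)

print(round(rbind(balanced, rare), 3))
```

TSS is 0.7 in both cases, whereas Kappa falls from 0.7 to about 0.39 for the rare species, even though the model discriminates equally well.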
>>> 
>>> I would recommend using "random" for pseudo-absence selection instead of SRE or circle, which tend to overfit the data a bit.
>>> I am taking the liberty of sending you this paper, in press with Methods in Ecology and Evolution, which discusses the different strategies for selecting pseudo-absences:
>>> PDF
>>> 
>>> Hope it helps,
>>> 
>>> Wilfried
>>> 
>>> 
>>> 
>>> 
>>> On 7 Feb 2012, at 05:06, jason b mackenzie wrote:
>>> 
>>>> Dear all,
>>>> 
>>>> I recently had a BIOMOD run crash mid-stream after 20+ species appear to have run successfully. The error message is pasted below. Any ideas or workarounds? 
>>>> 
>>>> ...
>>>> #####			 ERFA2 			#####
>>>> #####		   pseudo-absence run 1        		#####
>>>> Model=Artificial Neural Network 
>>>> 	 3 Fold Cross Validation + 3 Repetitions 
>>>> Calibration and evaluation phase: Nb of cross-validations:  3 
>>>> Evaluating Predictor Contributions in  ANN ... 
>>>> Model=Classification tree 
>>>> 	 50 Fold Cross-Validation 
>>>> Error in newx$cptable[max(keep), 1L] <- cp : subscript out of bounds
>>>> > warnings()
>>>> Warning messages:
>>>> 1: In predict.lm(object, newdata, se.fit, scale = 1, type = ifelse(type ==  ... :
>>>>   prediction from a rank-deficient fit may be misleading
>>>> 2: In cor(g.pred[, ], as.integer(.Rescaler4(as.numeric(predict(model.sp,  ... :
>>>>   the standard deviation is zero
>>>> ...
>>>> 
>>>> 
>>>> FYI, not sure if this matters, but my starting conditions were set as follows:
>>>> 
>>>> Initial.State(Response=SP.ENV[,12:75], Explanatory=SP.ENV[,4:11], IndependentResponse=NULL, 
>>>> 		IndependentExplanatory=NULL)
>>>> 
>>>> Models(GLM = FALSE, GAM = TRUE, Spline=4, GBM = TRUE, No.trees = 3000, CTA = TRUE, CV.tree = 50, ANN = TRUE, CV.ann = 3, 
>>>> 		SRE = TRUE, quant = 0.025, FDA = TRUE, MARS = TRUE, RF = TRUE, NbRunEval = 3, DataSplit = 70, 
>>>> 		Yweights = NULL, Roc = TRUE, Optimized.Threshold.Roc = TRUE, Kappa = TRUE, TSS = TRUE, 
>>>> 		KeepPredIndependent = FALSE, VarImport = 5, NbRepPA = 2, strategy = "sre", nb.absences = 1000)
>>>> 
>>>> 
>>>> Thanks in advance for any help!
>>>> 
>>>> Cheers, 
>>>> Jason
>>>> 
>>>> 
>>>> --
>>>> 
>>>> jason b mackenzie
>>>> 80 banambila street
>>>> aranda, act 2614
>>>> 
>>>> 0447 002 629  [mobile]
>>>> jasonbmackenzie    [skype]
>>>> 
>>>> 
>>>> 
>>>> 
>>>> _______________________________________________
>>>> Biomod-commits mailing list
>>>> Biomod-commits at lists.r-forge.r-project.org
>>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/biomod-commits
>>> 
>>> --------------------------
>>> Dr. Wilfried Thuiller
>>> Laboratoire d'Ecologie Alpine, UMR CNRS 5553
>>> Université Joseph Fourier
>>> BP53, 38041 Grenoble cedex 9, France
>>> tel: +33 (0)4 76 51 44 97
>>> fax: +33 (0)4 76 51 42 79
>>> 
>>> Email: wilfried.thuiller at ujf-grenoble.fr
>>> Personal website: http://www.will.chez-alice.fr
>>> Team website: http://www-leca.ujf-grenoble.fr/equipes/emabio.htm
>>> 
>>> ERC Starting Grant TEEMBIO project: http://www.will.chez-alice.fr/Research.html
>>> FP6 European EcoChange project: http://www.ecochange-project.eu


--

jason b mackenzie
80 banambila street
aranda, act 2614

0447 002 629  [mobile]
jasonbmackenzie    [skype]



