[adegenet-forum] Cross validation using xvalDapc
Caitlin Collins
caitiecollins at gmail.com
Wed Sep 30 23:14:46 CEST 2015
Hi Kirsty,
Now that you seem to have cross-validation working, which, if any, of your
questions still remain to be answered?
If you are still looking for help, could you offer me a clarification:
I took a look at the post you made to StackOverflow, copied your data, and
tried to run through the code in your e-mail. But I got stuck because I am
not sure where this came from: x1$Matriline. It didn't seem to be one of
the variables in the "mydat" dataset at the bottom of your post that you
said contained the LDA.scores data you had been working with...
Please let us know what questions or problems you are still running into.
Thanks,
Caitlin.
On Wed, Sep 30, 2015 at 7:19 PM, Kirsty Medcalf <kirsty.m.medcalf at gmail.com>
wrote:
> Hi,
>
> Firstly, I would like to thank you for your previous recommendations; they
> were greatly appreciated. The solution was not obvious at first, but I
> persevered. Thank you again, as I am fairly new to R.
>
> Kind regards to this forum
>
> Kirsty
>
> xval <- xvalDapc(x, grp1$grp, n.pca.max = 2, training.set = 0.7,
> result = "groupMean", center = TRUE, scale = FALSE,
> n.pca = NULL, n.rep = 30, xval.plot = TRUE)
>
> $`Cross-Validation Results`
> n.pca success
> 1 1 0.6111111
> 2 1 0.6666667
> 3 1 0.6666667
> 4 1 0.6111111
> 5 1 0.6190476
> 6 1 0.6190476
> 7 1 0.6111111
> 8 1 0.5634921
> 9 1 0.6111111
> 10 1 0.6111111
> 11 1 0.6190476
> 12 1 0.6666667
> 13 1 0.5079365
> 14 1 0.6190476
> 15 1 0.6190476
> 16 1 0.6666667
> 17 1 0.6111111
> 18 1 0.6111111
> 19 1 0.4603175
> 20 1 0.6111111
> 21 1 0.6111111
> 22 1 0.6666667
> 23 1 0.5634921
> 24 1 0.6666667
> 25 1 0.6666667
> 26 1 0.5079365
> 27 1 0.6111111
> 28 1 0.6190476
> 29 1 0.6111111
> 30 1 0.6666667
>
> $`Median and Confidence Interval for Random Chance`
> 2.5% 50% 97.5%
> 0.2411765 0.3303922 0.4377002
>
> $`Mean Successful Assignment by Number of PCs of PCA`
> 1
> 0.6124339
>
> $`Number of PCs Achieving Highest Mean Success`
> [1] "1"
>
> $`Root Mean Squared Error by Number of PCs of PCA`
> 1
> 0.3907175
>
> $`Number of PCs Achieving Lowest MSE`
> [1] "1"
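
The summary values above can be reproduced by hand from the 30 per-replicate success rates, which may help when reading this output. This is a minimal sketch in base R: the `success` vector is copied from the table above, and the RMSE formula is inferred from the printed numbers (an assumption, not taken from the adegenet source).

```r
# Per-replicate success rates copied from the "Cross-Validation Results" table
success <- c(0.6111111, 0.6666667, 0.6666667, 0.6111111, 0.6190476,
             0.6190476, 0.6111111, 0.5634921, 0.6111111, 0.6111111,
             0.6190476, 0.6666667, 0.5079365, 0.6190476, 0.6190476,
             0.6666667, 0.6111111, 0.6111111, 0.4603175, 0.6111111,
             0.6111111, 0.6666667, 0.5634921, 0.6666667, 0.6666667,
             0.5079365, 0.6111111, 0.6190476, 0.6111111, 0.6666667)

# Mean successful assignment for n.pca = 1
mean_success <- mean(success)

# Assumed formula: root mean squared error of assignment,
# treating 1 (perfect assignment) as the target
rmse <- sqrt(mean((1 - success)^2))

# Upper bound of the 95% interval for random chance, from the output above
chance_upper <- 0.4377002

# TRUE if every replicate beat the upper bound for random chance
all_above_chance <- all(success > chance_upper)

print(c(mean = mean_success, rmse = rmse))
print(all_above_chance)
```

Up to rounding, the two printed values should agree with the reported 0.6124339 and 0.3907175, and every replicate lies above the random-chance interval.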
>
> $DAPC
> #################################################
> # Discriminant Analysis of Principal Components #
> #################################################
> class: dapc
> $call: dapc.data.frame(x = x, grp = grp, n.pca = n.pca, n.da = n.da)
>
> $n.pca: 1 first PCs of PCA used
> $n.da: 1 discriminant functions saved
> $var (proportion of conserved variance): 0.605
>
> $eig (eigenvalues): 58.23
>
>   vector    length content
> 1 $eig      1      eigenvalues
> 2 $grp      80     prior group assignment
> 3 $prior    3      prior group probabilities
> 4 $assign   80     posterior group assignment
> 5 $pca.cent 12     centring vector of PCA
> 6 $pca.norm 12     scaling vector of PCA
> 7 $pca.eig  12     eigenvalues of PCA
>
>   data.frame    nrow ncol content
> 1 $tab          80   1    retained PCs of PCA
> 2 $means        3    1    group means
> 3 $loadings     1    1    loadings of variables
> 4 $ind.coord    80   1    coordinates of individuals (principal components)
> 5 $grp.coord    3    1    coordinates of groups
> 6 $posterior    80   3    posterior membership probabilities
> 7 $pca.loadings 12   1    PCA loadings of original variables
> 8 $var.contr    12   1    contribution of original variables
>
>
> On Tue, Sep 29, 2015 at 9:44 AM, Kirsty Medcalf <
> kirsty.m.medcalf at gmail.com> wrote:
>
>> Hi
>>
>> I am attempting to cross-validate my results from a DAPC analysis with a
>> 70% training set using the function xvalDapc (code below). My data frame is
>> called LDA.scores. This is an updated version of a previous post that takes
>> the recommendations into account, but I am still getting the same
>> error message. Do I have to change my data frame into a list? If so, what
>> would be the correct way to transform the data frame into that format?
>> I was wondering if anyone had a solution to
>> this error message (below). I have looked online and through the
>> available tutorials and still cannot solve this issue. Words cannot
>> describe my gratitude if this is possible.
>>
>> #Permute the data
>>
>> set.seed(999)
>>
>> x<-LDA.scores[,2:13]
>>
>> grp1<-find.clusters(x, max.n.clust=12)
>> dapc1<-dapc(x, grp1$grp)
>>
>> #DAPC analysis
>>
>> windows(width=10, height=7)
>> x<-LDA.scores[,2:13]
>> grp1<-find.clusters(x, max.n.clust=12)
>> dapc1<-dapc(x, grp1$grp)
>> dapc1
>>
>> #Loadings plot
>>
>> contrib <- loadingplot(dapc1$var.contr, axis=2,
>> thres=.07, lab.jitter=1)
>>
>>
>> #Cross Validation
>> windows(width=10, height=7)
>> set.seed(1234)
>> x1 <- LDA.scores
>> str(x1)
>> x1$Matriline<-as.factor(x1$Matriline)
>> xval <- xvalDapc(x1, grp1, n.pca.max = 2, training.set = 0.7,
>> result = "groupMean", center = TRUE, scale = FALSE,
>> n.pca = NULL, n.rep = 30, xval.plot = TRUE)
>>
>> Error in sort.list(y) : 'x' must be atomic for 'sort.list'
>> Have you called 'sort' on a list?
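
One likely cause, judging from the call that later worked in this thread (`xvalDapc(x, grp1$grp, ...)`): `grp1` is the whole list returned by find.clusters, whereas the grouping argument needs the factor stored in `grp1$grp`. The working call also passed `x` (columns 2:13) rather than the full `x1`. Below is a base-R sketch using a hypothetical stand-in for that list; the toy values are made up for illustration and adegenet is not needed to run it.

```r
# Hypothetical stand-in for the list that find.clusters() returns;
# only the $grp element (a factor of cluster labels) matters here.
grp1 <- list(grp = factor(c(1, 2, 3, 1, 2, 3)),
             size = c(2L, 2L, 2L))

# A list is not an atomic vector, but the factor inside it is
is.atomic(grp1)       # FALSE
is.atomic(grp1$grp)   # TRUE

# Sorting the whole list fails; capture the error instead of stopping
list_attempt <- tryCatch(sort.list(grp1), error = function(e) e)
inherits(list_attempt, "error")   # TRUE

# Sorting the factor itself works and returns an ordering of its elements
ord <- sort.list(grp1$grp)
length(ord)
```

Checking `str(grp1)` before the call shows whether a list or a factor is being passed.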
>>
>> During the DAPC analysis, I chose to retain 2 PCs and 2 LDs, and there
>> appear to be 3 clusters. Would n.pca.max=2 be correct?
>>
>> My reproducible data, the logical steps that I took to choose the number
>> of PCs and LDs to retain, and the number of chosen clusters are available
>> on Stack Overflow:
>>
>>
>> http://stackoverflow.com/questions/32704902/discriminant-analysis-of-principal-components-and-how-to-graphically-show-the-di
>>
>> If it is possible to help me, then thank you
>>
>> Best wishes,
>> Kirsty