[adegenet-forum] Cross validation using xvalDapc

Kirsty Medcalf kirsty.m.medcalf at gmail.com
Wed Sep 30 20:19:29 CEST 2015


Hi,

Firstly, I would like to thank you for your previous recommendations; they
were greatly appreciated. The solution was not obvious at first, but I
persevered. Thank you again, as I am fairly new to R. The call that now
runs is below, together with its output.

Kind regards to this forum

Kirsty

xval <- xvalDapc(x, grp1$grp, n.pca.max = 2, training.set = 0.7,
                 result = "groupMean", center = TRUE, scale = FALSE,
                 n.pca = NULL, n.rep = 30, xval.plot = TRUE)

$`Cross-Validation Results`
   n.pca   success
1      1 0.6111111
2      1 0.6666667
3      1 0.6666667
4      1 0.6111111
5      1 0.6190476
6      1 0.6190476
7      1 0.6111111
8      1 0.5634921
9      1 0.6111111
10     1 0.6111111
11     1 0.6190476
12     1 0.6666667
13     1 0.5079365
14     1 0.6190476
15     1 0.6190476
16     1 0.6666667
17     1 0.6111111
18     1 0.6111111
19     1 0.4603175
20     1 0.6111111
21     1 0.6111111
22     1 0.6666667
23     1 0.5634921
24     1 0.6666667
25     1 0.6666667
26     1 0.5079365
27     1 0.6111111
28     1 0.6190476
29     1 0.6111111
30     1 0.6666667

$`Median and Confidence Interval for Random Chance`
     2.5%       50%     97.5%
0.2411765 0.3303922 0.4377002

$`Mean Successful Assignment by Number of PCs of PCA`
        1
0.6124339

$`Number of PCs Achieving Highest Mean Success`
[1] "1"

$`Root Mean Squared Error by Number of PCs of PCA`
        1
0.3907175

$`Number of PCs Achieving Lowest MSE`
[1] "1"

$DAPC
#################################################
# Discriminant Analysis of Principal Components #
#################################################
class: dapc
$call: dapc.data.frame(x = x, grp = grp, n.pca = n.pca, n.da = n.da)

$n.pca: 1 first PCs of PCA used
$n.da: 1 discriminant functions saved
$var (proportion of conserved variance): 0.605

$eig (eigenvalues): 58.23

  vector    length content
1 $eig      1      eigenvalues
2 $grp      80     prior group assignment
3 $prior    3      prior group probabilities
4 $assign   80     posterior group assignment
5 $pca.cent 12     centring vector of PCA
6 $pca.norm 12     scaling vector of PCA
7 $pca.eig  12     eigenvalues of PCA

  data.frame    nrow ncol content
1 $tab          80   1    retained PCs of PCA
2 $means        3    1    group means
3 $loadings     1    1    loadings of variables
4 $ind.coord    80   1    coordinates of individuals (principal components)
5 $grp.coord    3    1    coordinates of groups
6 $posterior    80   3    posterior membership probabilities
7 $pca.loadings 12   1    PCA loadings of original variables
8 $var.contr    12   1    contribution of original variables
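
As a rough guide to reading the output above, the sketch below recomputes
the summary figures from the per-replicate table using base R only. The
vector `success` is copied by hand from the `success` column, and
`chance_upper` is the 97.5% bound reported for random chance. The formula
used for the root mean squared error, sqrt(mean((1 - success)^2)), is an
assumption about how that summary is derived; it is used here because it
agrees with the printed value to about five decimal places.

```r
# Per-replicate success rates, copied from the 'success' column above
# (all 30 replicates used n.pca = 1)
success <- c(
  0.6111111, 0.6666667, 0.6666667, 0.6111111, 0.6190476,
  0.6190476, 0.6111111, 0.5634921, 0.6111111, 0.6111111,
  0.6190476, 0.6666667, 0.5079365, 0.6190476, 0.6190476,
  0.6666667, 0.6111111, 0.6111111, 0.4603175, 0.6111111,
  0.6111111, 0.6666667, 0.5634921, 0.6666667, 0.6666667,
  0.5079365, 0.6111111, 0.6190476, 0.6111111, 0.6666667
)

# Upper (97.5%) bound for random chance, copied from the output above
chance_upper <- 0.4377002

# Mean successful assignment across replicates
mean_success <- mean(success)

# Assumed definition of the RMSE: the error of a replicate is 1 - success
rmse <- sqrt(mean((1 - success)^2))

# Does the mean success exceed what random assignment would give?
beats_chance <- mean_success > chance_upper

print(round(c(mean = mean_success, rmse = rmse), 4))
print(beats_chance)
```

The mean success (about 0.61) lies above the upper bound of the interval
for random chance (about 0.44). Only one value of n.pca appears in the
table, so the "best" number of PCs reported above is 1 because nothing
else was compared.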




Kirsty Medcalf

kirsty.m.medcalf at gmail.com

+447963374030

skype contact: kirsty.medcalf
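
For anyone who hits the same "must be atomic" error: the working call
above differs from the earlier one quoted below in that it passes
`grp1$grp` (the cluster assignments) rather than the whole `grp1` object,
and `x` (columns 2 to 13) rather than the full data frame. The base-R
sketch below illustrates why the first change matters, assuming the error
arose when the grouping argument was sorted or tabulated. The `grp1`
object here is a hand-made stand-in that uses the component names
returned by find.clusters; its values are invented for illustration.

```r
# Hypothetical stand-in for the list returned by find.clusters();
# the component names match adegenet, the values are made up
grp1 <- list(
  Kstat = NULL,
  stat  = c("K=3" = 100),
  grp   = factor(c(1, 2, 3, 1, 2, 3)),
  size  = c(2, 2, 2)
)

# The whole object is a list, so it cannot be sorted as a vector
is.list(grp1)
list_result <- tryCatch(
  sort.list(grp1),
  error = function(e) conditionMessage(e)  # keep the message, do not stop
)
print(list_result)

# The $grp component is a factor (atomic), which sorts without error
is.factor(grp1$grp)
factor_result <- sort.list(grp1$grp)
print(factor_result)
```

With real data this corresponds to giving `grp1$grp` as the second
argument of xvalDapc, as in the working call above.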

On Tue, Sep 29, 2015 at 9:44 AM, Kirsty Medcalf <kirsty.m.medcalf at gmail.com>
wrote:

> Hi
>
> I am attempting to cross-validate my results from a DAPC analysis with a
> 70% training set using the function xvalDapc (code below).  My data frame
> is called LDA.scores. This is an updated version of a previous post, revised
> to take the recommendations into account, but I am still getting the same
> error message.  Do I have to change my data frame into a list? If so, what
> is the correct way to convert the data frame into that format?
> I was wondering if anyone had a solution to this error message (below).
> I have looked online and through the available tutorials and still cannot
> solve this issue.  Words cannot describe my gratitude if this is
> possible.
>
>  #Permute the data
>
> set.seed(999)
>
> x<-LDA.scores[,2:13]
>
>    grp1<-find.clusters(x, max.n.clust=12)
>    dapc1<-dapc(x, grp1$grp)
>
> #DAPC analysis
>
> windows(width=10, height=7)
> x<-LDA.scores[,2:13]
> grp1<-find.clusters(x, max.n.clust=12)
> dapc1<-dapc(x, grp1$grp)
> dapc1
>
> #Loadings plot
>
> contrib <- loadingplot(dapc1$var.contr, axis=2,
>                        thres=.07, lab.jitter=1)
>
>
> #Cross Validation
> windows(width=10, height=7)
> set.seed(1234)
> x1 <- LDA.scores
> str(x1)
> x1$Matriline<-as.factor(x1$Matriline)
> xval <- xvalDapc(x1, grp1, n.pca.max = 2, training.set = 0.7,
>                  result = "groupMean", center = TRUE, scale = FALSE,
>                  n.pca = NULL, n.rep = 30, xval.plot = TRUE)
>
> Error in sort.list(y) : 'x' must be atomic for 'sort.list'
> Have you called 'sort' on a list?
>
> During the DAPC analysis, I chose to retain 2 PCs and 2 LDs, and there
> appear to be 3 clusters. Would n.pca.max=2 be correct?
>
> My reproducible data, the steps that I took to choose the number of PCs
> and LDs to retain, and the number of clusters chosen are available on
> Stack Overflow:
>
>
> http://stackoverflow.com/questions/32704902/discriminant-analysis-of-principal-components-and-how-to-graphically-show-the-di
>
> If it is possible to help me, then thank you
>
> Best wishes,
> Kirsty