[adegenet-forum] Request of DAPC analysis
Das, Roma (ICRISAT-IN)
r.das at cgiar.org
Wed Aug 28 06:20:59 CEST 2019
Hello Everyone,
I am running DAPC with the adegenet package for cluster analysis. However, I am not sure whether I am selecting n.pca and n.clust correctly based on cross-validation.
These are the steps I am following:
1. I am using a genind object
2. Ran grp <- find.clusters() on it and interactively chose n.pca and n.clust. Based on the plots, I selected n.pca=200 and n.clust=21.
3. Next, I used xvalDapc() to get a rough idea of the number of PCs to retain:
xval <- xvalDapc(tab(fdat, NA.method = "mean"), grp$grp, n.pca.max = 300, n.rep = 30)
4. Based on the number of PCs achieving the highest mean success and the lowest MSE, I selected n.pca=50.
5. I then narrowed the search to n.pca = 30:60:
xval_optimum <- xvalDapc(tab(fdat, NA.method = "mean"), grp$grp, n.pca = 30:60, n.rep = 100, parallel = "multicore", ncpus = 6L)
6. Finally, I selected n.pca=30, based on the number of PCs achieving the highest mean success and the lowest MSE in xval_optimum.
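The steps above can be sketched end to end as follows. This is a minimal, non-interactive sketch rather than the exact commands that were run. It assumes the nancycats example data shipped with adegenet as a stand-in for fdat, passes n.pca and n.clust explicitly so that find.clusters() does not prompt, and uses small illustrative values (50 PCs, 5 clusters, 10 replicates) that are not those of the post. The helper names mat, best_success, best_mse and n_try are made up for the example.

```r
library(adegenet)

# Stand-in data (assumption): the post uses its own genind object, fdat
data(nancycats)
fdat <- nancycats

set.seed(1)  # k-means and cross-validation are both stochastic

# Step 2: k-means clustering on retained PCs; values are illustrative only
grp <- find.clusters(fdat, n.pca = 50, n.clust = 5)

# Allele table with missing values replaced by the mean
mat <- tab(fdat, NA.method = "mean")

# Step 3: coarse cross-validation over a grid of PC counts up to n.pca.max
xval <- xvalDapc(mat, grp$grp, n.pca.max = 60, n.rep = 10, xval.plot = FALSE)

# Step 4: the two criteria named in the post, as reported by xvalDapc()
best_success <- as.integer(xval[["Number of PCs Achieving Highest Mean Success"]])
best_mse     <- as.integer(xval[["Number of PCs Achieving Lowest MSE"]])

# Step 5: finer search in a window around the coarse optimum
n_try <- seq(max(1, best_mse - 5), best_mse + 5)
xval_optimum <- xvalDapc(mat, grp$grp, n.pca = n_try, n.rep = 10,
                         xval.plot = FALSE)

# Step 6: read the refined optimum and the DAPC fitted with it
xval_optimum[["Number of PCs Achieving Lowest MSE"]]
xval_optimum$DAPC$n.pca
```

When the two criteria disagree, both numbers are available in the result list, so the choice between them stays explicit.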
My questions are:
7. From cross-validation, the optimum number of PCs seems to be 30. Should I re-run find.clusters() with n.pca=30 and select n.clust interactively from the plot?
8. Should I then re-run dapc() with n.pca=30 and the n.clust obtained in step 7? Please advise.
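For concreteness, the re-run proposed in these two questions could look like the sketch below. It only illustrates the proposal and does not say whether re-clustering is the right thing to do. It again assumes the nancycats example data as a stand-in for fdat. The value n_clust = 5 is a placeholder for whatever would be read off the plot, and the names n_pca_cv, n_clust, grp2 and dapc_fit are made up for the example.

```r
library(adegenet)

# Stand-in data (assumption): the post uses its own genind object, fdat
data(nancycats)
fdat <- nancycats

set.seed(1)

n_pca_cv <- 30  # number of PCs selected by cross-validation in the post
n_clust  <- 5   # placeholder; in the post this would be chosen from the plot

# Question 7: re-run k-means clustering on 30 retained PCs
grp2 <- find.clusters(fdat, n.pca = n_pca_cv, n.clust = n_clust)

# Question 8: DAPC with the cross-validated n.pca and the new groups
dapc_fit <- dapc(fdat, pop = grp2$grp, n.pca = n_pca_cv, n.da = n_clust - 1)

# Cross-tabulate k-means groups against DAPC posterior assignments
table(kmeans = grp2$grp, dapc = dapc_fit$assign)
```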
Thanks,
Roma