[adegenet-forum] Interpretation of DAPC results
Kirsty Medcalf
kirsty.m.medcalf at gmail.com
Sun Oct 11 08:25:14 CEST 2015
Dear Jombart and the adegenet forum,
I have used the function find.clusters and run a DAPC to create a
scatterplot. I have also run a cross-validation with the function xvalDapc,
using a 70 % training set over 30 replicates. If possible, I would be very
grateful for advice on the interpretation of my results.
The code can be found on my Stack Overflow page and the output figures are
attached.
http://stackoverflow.com/questions/32704902/discriminant-analysis-of-principal-components-and-how-to-graphically-show-the-di
I ran a DAPC as well as a basic PCA and an LDA. The PCA output gives two
PCs to explain the variance in the data, and the LDA found one discriminant
function to describe the variance of the data. My analysis is of general
multivariate data rather than an exploration of genetic clusters. The
reproducible data can be found at the link above. Could anyone explain why
the DAPC found more structure in the data, identifying 3 clusters? More
specifically, would anyone mind reading my Stack Overflow page to check
that I have followed the correct steps to build an accurate model?
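For context, as I understand it DAPC first runs a PCA and then a
discriminant analysis on the retained PCs. The sketch below illustrates
that two-step idea on R's built-in iris data with prcomp and MASS::lda. It
is not my data and not adegenet's implementation, and the names pcs,
fit_1pc and fit_2pc are only illustrative. It also shows that the number of
discriminant functions cannot exceed the smaller of the number of retained
PCs and the number of groups minus one.

```r
library(MASS)  # provides lda(); assumed to be installed alongside R

# Step 1: PCA on the four numeric iris measurements (centred, not scaled)
pca <- prcomp(iris[, 1:4], center = TRUE, scale. = FALSE)
pcs <- pca$x  # scores of individuals on the principal components

# Step 2: LDA on the retained PCs, using the 3 species as groups
fit_1pc <- lda(pcs[, 1, drop = FALSE], grouping = iris$Species)
fit_2pc <- lda(pcs[, 1:2], grouping = iris$Species)

# Count the discriminant functions obtained in each case
n_da_1pc <- ncol(fit_1pc$scaling)
n_da_2pc <- ncol(fit_2pc$scaling)
print(c(one_pc = n_da_1pc, two_pcs = n_da_2pc))
```

With one retained PC and three groups only one discriminant function is
possible, which is consistent with the $n.pca: 1 and $n.da: 1 lines in my
DAPC output.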
My cross-validation code and output (below) show, as I understand it, how
often individuals were assigned to the correct group compared with random
chance. Would I be right in saying that the mean rate of successful
assignment is 61 %, and that the optimal model retains only one PC? My
goal is to completely understand each step of this analysis.
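To make the 61 % figure concrete, the summary values can be recomputed from
the 30 per-replicate success rates listed in the output in this post. This
is only a sketch for checking the arithmetic: the vector is typed in by
hand, the variable names are illustrative, and the formula
sqrt(mean((1 - success)^2)) is my assumption about how the reported root
mean squared error is derived (it does reproduce the reported value).

```r
# Per-replicate success rates copied from $`Cross-Validation Results`
success <- c(
  0.5833333, 0.6000000, 0.6000000, 0.6666667, 0.5833333, 0.6666667,
  0.6000000, 0.5833333, 0.6666667, 0.5833333, 0.6000000, 0.5833333,
  0.5833333, 0.6666667, 0.6000000, 0.6666667, 0.5833333, 0.6666667,
  0.5833333, 0.6000000, 0.6666667, 0.6666667, 0.5833333, 0.4666667,
  0.6666667, 0.6666667, 0.5166667, 0.6666667, 0.6000000, 0.5166667
)

# Upper bound (97.5 % quantile) of the random-chance interval in the output
chance_upper <- 0.4355208

# Mean successful assignment across the 30 replicates
mean_success <- mean(success)

# Assumed definition of the reported error: distance of success from 1
rmse <- sqrt(mean((1 - success)^2))

# Is the mean success above the upper bound expected by chance?
beats_chance <- mean_success > chance_upper

print(round(c(mean = mean_success, rmse = rmse), 4))
print(beats_chance)
```

If this reading is right, the 61 % is the mean proportion of held-out
individuals assigned to their correct group when one PC is retained, and it
lies above the whole random-chance interval.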
Thank you very much for your patience in reading this whole post. If anyone
is able to offer advice on interpreting these results, thank you in
advance.
Best wishes
Kirsty
My cross validation code and results are:
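library(adegenet)  # provides xvalDapc()
# x and grp1 come from the code in the Stack Overflow post linked above
xval <- xvalDapc(x, grp1$grp, n.pca.max = 2, training.set = 0.7,
                 result = "groupMean", center = TRUE, scale = FALSE,
                 n.pca = NULL, n.rep = 30, xval.plot = TRUE)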
$`Cross-Validation Results`
   n.pca   success
1      1 0.5833333
2      1 0.6000000
3      1 0.6000000
4      1 0.6666667
5      1 0.5833333
6      1 0.6666667
7      1 0.6000000
8      1 0.5833333
9      1 0.6666667
10     1 0.5833333
11     1 0.6000000
12     1 0.5833333
13     1 0.5833333
14     1 0.6666667
15     1 0.6000000
16     1 0.6666667
17     1 0.5833333
18     1 0.6666667
19     1 0.5833333
20     1 0.6000000
21     1 0.6666667
22     1 0.6666667
23     1 0.5833333
24     1 0.4666667
25     1 0.6666667
26     1 0.6666667
27     1 0.5166667
28     1 0.6666667
29     1 0.6000000
30     1 0.5166667
$`Median and Confidence Interval for Random Chance`
     2.5%       50%     97.5%
0.2360938 0.3270833 0.4355208
$`Mean Successful Assignment by Number of PCs of PCA`
1
0.6094444
$`Number of PCs Achieving Highest Mean Success`
[1] "1"
$`Root Mean Squared Error by Number of PCs of PCA`
1
0.3939708
$`Number of PCs Achieving Lowest MSE`
[1] "1"
$DAPC
#################################################
# Discriminant Analysis of Principal Components #
#################################################
class: dapc
$call: dapc.data.frame(x = x, grp = grp, n.pca = n.pca, n.da = n.da)
$n.pca: 1 first PCs of PCA used
$n.da: 1 discriminant functions saved
$var (proportion of conserved variance): 0.605
$eig (eigenvalues): 54.9

  vector    length content
1 $eig      1      eigenvalues
2 $grp      80     prior group assignment
3 $prior    3      prior group probabilities
4 $assign   80     posterior group assignment
5 $pca.cent 12     centring vector of PCA
6 $pca.norm 12     scaling vector of PCA
7 $pca.eig  12     eigenvalues of PCA

  data.frame    nrow ncol content
1 $tab          80   1    retained PCs of PCA
2 $means        3    1    group means
3 $loadings     1    1    loadings of variables
4 $ind.coord    80   1    coordinates of individuals (principal components)
5 $grp.coord    3    1    coordinates of groups
6 $posterior    80   3    posterior membership probabilities
7 $pca.loadings 12   1    PCA loadings of original variables
8 $var.contr    12   1    contribution of original variables
Kirsty Medcalf
-------------- next part --------------
A non-text attachment was scrubbed...
Name: PCA2.jpeg
Type: image/jpeg
Size: 63078 bytes
Desc: not available
URL: <http://lists.r-forge.r-project.org/pipermail/adegenet-forum/attachments/20151010/4f9461a6/attachment-0003.jpeg>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Biplot.jpeg
Type: image/jpeg
Size: 98022 bytes
Desc: not available
URL: <http://lists.r-forge.r-project.org/pipermail/adegenet-forum/attachments/20151010/4f9461a6/attachment-0004.jpeg>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: DAPC.jpeg
Type: image/jpeg
Size: 195092 bytes
Desc: not available
URL: <http://lists.r-forge.r-project.org/pipermail/adegenet-forum/attachments/20151010/4f9461a6/attachment-0005.jpeg>