[adegenet-forum] DAPC for non-structured populations

Joao Faria jfariaos at gmail.com
Mon Jan 21 19:02:57 CET 2013


Dear Thibaut and DAPC users,



I've been exploring DAPC for the past week on a dataset of 10
microsatellite loci and 114 individuals from 3 geographically distinct
locations. I have already used individual-based modelling (Structure) and
genetic differentiation indices (Fst, Rst, Dest) to explore, among other
things, the structuring of these populations. As expected, I found no
structure at all (it's a crustacean species with a very high larval
dispersal capacity). I wanted to use DAPC to confirm these results and to
represent graphically the absence of divergence among these populations,
but it consistently fails to produce a sensible output. I'll try to explain
my procedure briefly:



I've performed the analysis in two different ways: one using find.clusters,
and the other assuming that the number of clusters equals the number of
sampled locations (which are geographically separated).



1. Using find.clusters

I retained all PCs (110) and got the lowest BIC for K = 2, with
individuals from each actual group (ori) roughly equally divided between
the two inferred groups (inf). (I guess this alone would be enough to
consider my populations undifferentiated, but let's move forward.)



I used the 2 inferred groups to perform a DAPC and retained a number of PCs
equal to 1/3 of the number of individuals (38 PCs, ~60% of the cumulative
variance, so quite a lot of information is lost). With two clusters I get
a single discriminant function (one eigenvalue). If I then plot the DAPC
with scatter, I obtain the density of the individuals along the single
discriminant function and get perfectly differentiated clusters!
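
To make the arithmetic behind this step explicit, here is a minimal Python
sketch (not adegenet code; both function names are made up for
illustration). It assumes the usual bound for a linear discriminant
analysis, namely that the number of discriminant functions cannot exceed
min(K - 1, number of retained PCs), and it reads the "1/3 of the number of
individuals" choice as an integer division.

```python
def max_discriminant_functions(n_groups, n_pcs):
    """Upper bound on the number of discriminant functions: min(K - 1, PCs)."""
    return min(n_groups - 1, n_pcs)


def pcs_one_third_rule(n_individuals):
    """Number of PCs under the 'N / 3' choice described in the post."""
    return n_individuals // 3


n_individuals = 114

print(pcs_one_third_rule(n_individuals))      # 38 PCs for 114 individuals
print(max_discriminant_functions(2, 38))      # 1 function for K = 2
print(max_discriminant_functions(3, 70))      # 2 functions for K = 3
```

With K = 2 there is only one axis to plot, which is why the scatter output
is a density curve along a single discriminant function.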



1.a) I understand that I've lost some information by retaining so few PCs,
but shouldn't there still be enough to observe undifferentiated populations?



1.b) Is there any limitation in find.clusters that prevents one from
getting K = 1, so that the smallest K one can work with is K = 2, even if
the method splits roughly half of the samples from each original group
(ori) into each cluster? Is such inconsistency with the original groups a
sign of lack of structure?



2. Using populations as prior groups (3 clusters)

Number of PCs retained = 70, chosen so as to capture a large share of the
variation (~95%). All discriminant functions were retained (n.da = 2). As
the number of retained PCs is too large (> N/3), the DAPC overfits the
discriminant functions and perfectly (wrongly) differentiates the three
clusters. At this point, I looked at the a-score of the previous DAPC and
got 37 as the optimal number of PCs. Nevertheless, the mean a-score was
only 0.07, a very low number, with the highest value (at the optimal
number of PCs) being 0.11. Is this a clear sign of poor fitting? When
performing the DAPC retaining the 37 PCs, I still get a perfect
discrimination of clusters.
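
The overfitting I describe can be illustrated outside adegenet with a small
Python sketch. This is not DAPC and not my data: it is a stand-in under
loud assumptions. Random Gaussian columns play the role of retained PCs,
the group labels are arbitrary (unrelated to the data), only two groups are
used instead of three, and a least-squares linear discriminant replaces the
discriminant analysis step. All names in it are hypothetical. The point is
only that apparent separation on the training individuals grows with the
number of retained axes even when there is no structure at all.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def training_accuracy(n_ind, n_axes, rng):
    """Fit a two-group least-squares discriminant on pure noise and return
    the share of individuals assigned to their (arbitrary) own group."""
    # Arbitrary labels coded +1 / -1, independent of the simulated axes.
    y = [1.0 if i % 2 == 0 else -1.0 for i in range(n_ind)]
    # Intercept column plus n_axes random axes (stand-ins for retained PCs).
    x = [[1.0] + [rng.gauss(0.0, 1.0) for _ in range(n_axes)]
         for _ in range(n_ind)]
    p = n_axes + 1
    # Normal equations (X'X) beta = X'y.
    xtx = [[sum(row[j] * row[k] for row in x) for k in range(p)]
           for j in range(p)]
    xty = [sum(x[i][j] * y[i] for i in range(n_ind)) for j in range(p)]
    beta = solve(xtx, xty)
    correct = 0
    for i in range(n_ind):
        score = sum(x[i][j] * beta[j] for j in range(p))
        if (score > 0) == (y[i] > 0):
            correct += 1
    return correct / n_ind


rng = random.Random(2013)  # fixed seed so the run is repeatable
accuracy = {k: training_accuracy(114, k, rng) for k in (5, 37, 70)}
for k, acc in accuracy.items():
    print(k, "axes -> training accuracy", round(acc, 2))
```

Because the labels carry no information, any accuracy well above 0.5 here
comes from fitting noise, which is what I suspect is happening in my own
plots when many PCs are retained.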



2.a) Is it possible to get a visual representation of unstructured
populations using DAPC?


I might be attempting an impossible DAPC visualization and overlooking a
lot of methodological constraints, and I do apologize for this long message!



Thanks in advance for your help.



Best Regards


João Faria

PhD student

University of Azores