[adegenet-forum] DAPC analysis and interpretation
Elodie Blanchet
blanchet.elodie at gmail.com
Fri Jul 22 14:39:55 CEST 2011
Hi Valeria,
thanks a lot for your help!
1) Concerning the assignment of "actual" groups to "assigned" clusters:
I did not expect to find 15 clusters, but I did expect the majority of
each "actual" population to be assigned to a single cluster (e.g. 80% of
"actual" population 1 assigned to cluster A).
I did not expect 15 clusters because I am working on an invasive plant
sampled along a dispersal corridor.
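
The comparison described in point 1 can be written down as a small
cross-tabulation. The sketch below is only an illustration in plain
Python, not adegenet code: the function name majority_share and the toy
labels are hypothetical. It counts, for each "actual" population, how
its individuals are spread over the assigned clusters and reports the
share that falls in the most common one. In R the same table would
typically be built with table() on the two label vectors.

```python
from collections import Counter, defaultdict


def majority_share(actual, assigned):
    """For each actual population, return (most common cluster, share in it).

    actual   -- population label of each individual
    assigned -- cluster label given to each individual by the clustering
    """
    per_pop = defaultdict(Counter)
    for pop, cluster in zip(actual, assigned):
        per_pop[pop][cluster] += 1

    result = {}
    for pop, counts in per_pop.items():
        cluster, n = counts.most_common(1)[0]
        result[pop] = (cluster, n / sum(counts.values()))
    return result


# Hypothetical toy data: two populations of five individuals each.
actual = ["pop1"] * 5 + ["pop2"] * 5
assigned = ["A", "A", "A", "A", "B", "B", "B", "A", "C", "B"]

shares = majority_share(actual, assigned)
for pop, (cluster, share) in sorted(shares.items()):
    print(f"{pop}: {share:.0%} of individuals in cluster {cluster}")
```

A high share for every population would mean that the inferred clusters
respect the sampled populations even when there are fewer clusters than
populations.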
2) Then, concerning the individuals with low membership probability: I
agree that it is normal to observe such individuals, but I wonder what I
can deduce about the clusters revealed by the function when I combine
this observation with the first one (see above).
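
One way to look at point 2 is to count, cluster by cluster, how many
individuals are assigned with weak support. The sketch below is a
plain-Python illustration under stated assumptions: the posterior
matrix is invented, the function name low_support_by_cluster is
hypothetical, and the 0.6 threshold simply echoes the 60% figure from
the original message quoted further down.

```python
def low_support_by_cluster(posterior, threshold=0.6):
    """Summarise weakly supported assignments per cluster.

    posterior -- one row per individual, one membership probability per cluster
    Returns {cluster index: (n assigned, n with max posterior < threshold)}.
    """
    summary = {}
    for row in posterior:
        # The assigned cluster is the one with the highest posterior probability.
        best = max(range(len(row)), key=row.__getitem__)
        n, low = summary.get(best, (0, 0))
        summary[best] = (n + 1, low + int(row[best] < threshold))
    return summary


# Hypothetical posterior membership probabilities for four individuals.
posterior = [
    [0.90, 0.05, 0.05],
    [0.55, 0.40, 0.05],
    [0.10, 0.85, 0.05],
    [0.30, 0.25, 0.45],
]

summary = low_support_by_cluster(posterior)
for cluster, (n, low) in sorted(summary.items()):
    print(f"cluster {cluster}: {low} of {n} individuals below threshold")
```

A cluster made up mostly of weakly supported individuals is a candidate
for merging when "k" is reduced; a cluster with only a few such
individuals is less of a concern.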
But running the "find.clusters" function with more iterations will
probably give more consistent results.
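
Whether more iterations actually help can be checked by tallying the
"k" selected in each repeated run. This is a minimal sketch in plain
Python; the list of values is hypothetical and summarize_k is an
illustrative helper, not part of adegenet.

```python
from collections import Counter


def summarize_k(ks):
    """Return (most frequent k, share of runs selecting it, counts per k)."""
    counts = Counter(ks)
    k, n = counts.most_common(1)[0]
    return k, n / len(ks), dict(sorted(counts.items()))


# Hypothetical k values selected in ten repeated clustering runs.
ks = [8, 7, 8, 9, 8, 8, 6, 8, 7, 8]

best_k, share, counts = summarize_k(ks)
print(f"most frequent k = {best_k} ({share:.0%} of runs); counts = {counts}")
```

If the share of runs agreeing on the same "k" rises when the number of
iterations is increased, the earlier inconsistency was probably due to
the clustering not converging rather than to the data.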
Thanks a lot for your help.
All the best.
Elodie
On 22/07/2011 13:39, valeria montano wrote:
> Hi Elodie,
>
> I can only offer a quick opinion, which I hope will be of some
> interest to you.
>
> To obtain a consistent estimate of the number of clusters, you can try
> increasing the number of iterations (n.iter); that should work. I
> ran into the problem of an inconsistent number of clusters when
> retaining only a few components, but I assume you're retaining all the
> components.
>
> When you say "actual groups", I guess that you expected to see your 15
> pops divided into 15 clusters.
>
> If your pops are "actually" 15 pops, maybe your loci are not powerful
> enough to detect them. In any case, I wouldn't say that they are lying
> to you; it's merely the point of view of your 11 SSR.
> I would say that, in general, population structure is a matter of
> shades between complete isolation and panmixia. If different sets of
> molecular data for the same sample give a concordant indication of
> structure, one can probably assume that this is the best way to
> cluster the individuals and that it mirrors reality quite well.
> If you have other information that makes you almost sure that your
> pops are 15 (I don't know, maybe something like: my pops are
> physically divided among 15 valleys, or other spatial information), you
> could try to run a sPCA. If you get a significant global structure
> (and there is a chance, since you're working with nice plants and not
> stupid humans), you can see whether one of the components gives you the
> expected 15 pops. Considering the result obtained with the DAPC, it
> probably won't be the first component, but maybe the second or the
> third... who knows. This could be a test to see whether there is a
> global structure above the 15 pops, with your 15 pops being a kind of
> secondary structure (sorry, I am not explaining myself very well). In
> that case, you can be fairly sure that your 11 SSR are giving you a
> good genetic point of view. Otherwise, if none of what I've described
> happens, you can only trust your 11 SSR and their clustering, and try
> to find a good biological explanation to convince yourself and the
> rest of the world that your number of clusters is the best for your
> individuals, or type more markers...
>
> Concerning the individuals with low probability, I have to confess
> that I've never worked at the individual level, but I imagine that
> it's perfectly normal to have such individuals in any cluster
> analysis. They might be hybrids, an expression of the genetic/spatial
> continuity existing among natural pops.
>
> I don't know what else to add...
>
> good luck
>
> Valeria
>
> On 21 July 2011 10:12, Elodie Blanchet <blanchet.elodie at gmail.com> wrote:
>
> Dear Dr. Jombart and Adegenet users,
>
> I have some questions about DAPC analysis.
>
> I am working on a tetraploid plant, with 11 SSR markers and 15
> populations sampled with 30 individuals each.
>
> 1) When I ran the ‘find.clusters’ function, the elbow in the curve of
> BIC values was not very clear, so I ran it many times. But I obtained
> a different optimal number of clusters from run to run, even when I
> increased the “max.n.cluster” option.
>
> I agree that it is based on Bayesian computation, but in this
> case how can I choose the “best” number of clusters?
>
> Maybe these inconsistent results between runs are due to the
> sampling pattern of my populations, which lie along a corridor
> (thus suggesting a stepping-stone model of dispersal?).
>
> 2) Besides, if I take the most frequent “k” after ten runs of the
> “find.clusters” function (k=8), I observe that the actual groups do
> not correspond to the inferred groups. I mean that, in the best case,
> only 17.5% of my actual groups are assigned to the clusters
> revealed by the analysis. Even though individual posterior membership
> was above 75% in most cases, I do not know whether the genetic
> structure revealed by the analysis is supported or not.
>
> 3) Moreover, some of the clusters revealed by the analysis are
> made up of individuals with a posterior membership probability
> <60%. How should these clusters be interpreted? I would tend to run
> the analysis again and reduce “k”…?
>
> Sorry for this long mail, I hope it is sufficiently clear.
>
> Thanks in advance for your help.
>
> Elodie