[adegenet-forum] DAPC analysis and interpretation

valeria montano mirainoshojo at gmail.com
Fri Jul 22 13:39:54 CEST 2011


Hi Elodie,

I can only offer a tentative opinion, which I hope will be of some
interest to you.

To obtain a consistent estimate of the number of clusters, you can try
increasing the number of iterations (n.iter); that should help. I ran into
the problem of an inconsistent number of clusters when retaining only a few
components, but I assume you're retaining all the components.
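
As a rough illustration of why repeated runs can disagree, here is a minimal
sketch in base R. It assumes that find.clusters relies on k-means, with n.iter
and n.start controlling the iterations and random starts, and that its BIC has
the form n*log(WSS/n) + log(n)*k. The data are simulated and the helper
kmeans_bic is hypothetical; nothing here comes from Elodie's data set.

```r
# Simulated stand-in for the PCA scores of 120 individuals (NOT real data).
set.seed(1)
centres <- rbind(c(0, 0), c(4, 0), c(0, 4), c(4, 4))
scores <- do.call(rbind, lapply(seq_len(nrow(centres)), function(i) {
  cbind(rnorm(30, mean = centres[i, 1]), rnorm(30, mean = centres[i, 2]))
}))

# BIC-like statistic for one k-means run with k clusters.
# Assumption: this mirrors the statistic computed by find.clusters.
kmeans_bic <- function(dat, k, n_start, n_iter) {
  n <- nrow(dat)
  # Non-convergence warnings are expected with few iterations; silence them.
  fit <- suppressWarnings(
    kmeans(dat, centers = k, nstart = n_start, iter.max = n_iter)
  )
  n * log(fit$tot.withinss / n) + log(n) * k
}

# Repeat the same analysis 20 times with few vs. many starts and iterations.
bic_few  <- replicate(20, kmeans_bic(scores, k = 8, n_start = 1,  n_iter = 10))
bic_many <- replicate(20, kmeans_bic(scores, k = 8, n_start = 50, n_iter = 1000))

# Run-to-run spread of the BIC value at the same k.
c(sd_few = sd(bic_few), sd_many = sd(bic_many))
```

If the BIC at a given k moves less from run to run, the shape of the BIC curve
(and so the chosen k) should be more stable too. In adegenet the corresponding
settings would be the n.iter and n.start arguments of find.clusters.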

When you say "actual groups", I guess that you expected to see your 15 pops
divided into 15 clusters.
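
One way to look at this is to cross-tabulate the sampled populations against
the inferred clusters. The sketch below uses simulated labels only (the
vectors pop and grp are hypothetical); with adegenet one would presumably
tabulate pop(x) against the grp element returned by find.clusters.

```r
# Hypothetical labels: 15 sampled pops of 30 individuals, 8 inferred clusters.
set.seed(42)
pop <- factor(rep(paste0("pop", 1:15), each = 30),
              levels = paste0("pop", 1:15))

# Assumption for illustration: neighbouring pops share a cluster,
# and about 20% of individuals are assigned at random.
shared <- ceiling(as.integer(pop) / 2)            # cluster ids 1..8
noise  <- sample(1:8, length(pop), replace = TRUE)
grp <- factor(ifelse(runif(length(pop)) < 0.8, shared, noise), levels = 1:8)

# Rows = sampled pops, columns = inferred clusters.
tab <- table(pop, grp)
tab

# Share of each pop that falls in its most common cluster.
modal_share <- apply(tab, 1, max) / rowSums(tab)
round(modal_share, 2)
```

A table like this shows whether several pops collapse into one cluster (as
expected when k is smaller than the number of pops) or whether each pop is
scattered across many clusters.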

If your pops really are 15 distinct pops, maybe your loci are not powerful
enough to detect them. In any case, I wouldn't say that they are lying to
you; it's merely the point of view of your 11 SSRs.
I would say that, in general, population structure is a matter of degree
between complete isolation and panmixia. If different sets of molecular data
for the same sample give a concordant indication of structure, one can
probably assume that this is the best way to cluster the individuals and
that it mirrors reality quite well.
If you have other information that makes you almost sure that there are 15
pops (I don't know, maybe something like: my pops are physically divided
among 15 valleys, or other spatial information), you could try running an
sPCA. If you get a significant global structure (and there is a chance,
since you're working with nice plants and not stupid humans), you can see
whether one of the components gives you the expected 15 pops. Considering
the result obtained with the DAPC, it probably won't be the first component,
but maybe the second or the third... who knows. This could be a test to see
whether there is a global structure above the 15 pops, with your 15 pops
forming a kind of secondary structure (sorry, I am not explaining myself
very well). In that case, you could be fairly confident that your 11 SSRs
are giving you a good genetic point of view. Otherwise, if none of this
happens, you can only trust your 11 SSRs and their clustering, and try to
find a good biological explanation to convince yourself and the rest of the
world that your number of clusters is the best for your individuals, or type
more markers...

Concerning the individuals with low membership probability, I have to
confess that I've never worked at the individual level, but I imagine that
it's perfectly normal to have such individuals in any cluster analysis. They
might be hybrids, an expression of the genetic/spatial continuity that
exists among natural pops.
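
To inspect those individuals, one could flag everyone whose highest posterior
membership falls below a chosen threshold. This is a minimal sketch with a
made-up posterior matrix (post is hypothetical); for a real analysis, the
posterior element of the object returned by dapc would take its place, and
the 0.60 cut-off is simply the value mentioned in the question.

```r
# Hypothetical posterior membership matrix:
# rows = individuals, columns = clusters, each row sums to 1.
post <- rbind(
  ind1 = c(0.95, 0.03, 0.02),
  ind2 = c(0.55, 0.40, 0.05),
  ind3 = c(0.10, 0.85, 0.05),
  ind4 = c(0.34, 0.33, 0.33)
)
colnames(post) <- paste0("cluster", 1:3)

# Highest posterior per individual and the cluster it points to.
max_post <- apply(post, 1, max)
assigned <- colnames(post)[apply(post, 1, which.max)]

# Individuals whose best assignment is below the threshold.
threshold <- 0.60
ambiguous <- names(max_post)[max_post < threshold]

data.frame(assigned, max_post, ambiguous = max_post < threshold)
```

Looking at which clusters the ambiguous individuals are split between (for
example, two geographically adjacent groups) may help decide whether they
look like admixed individuals or whether k is simply too high.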

I don't know what else to add...

good luck

Valeria

On 21 July 2011 10:12, Elodie Blanchet <blanchet.elodie at gmail.com> wrote:

> Dear Dr. Jombart and Adegenet users,
>
> I have some questions about DAPC analysis.
>
> I work on a tetraploid plant, with 11 SSR markers and 15 populations
> sampled with 30 individuals each.
>
> 1) When I ran the ‘find.clusters’ function, the elbow in the curve of BIC
> values was not very clear, so I ran it many times. But I obtained a
> different optimal number of clusters each time, even when I increased the
> “max.n.clust” option.
>
> I understand that it relies on Bayesian computation, but in that case how
> can I choose the “best” number of clusters?
>
> Maybe these inconsistent results between runs are due to the sampling
> pattern of my populations, which lie along a corridor (thus suggesting a
> stepping-stone model of dispersal?).
>
> 2) Besides, when I take the most frequent “k” after ten runs of the
> “find.clusters” function (k=8), I observe that the actual groups do not
> correspond to the inferred groups. I mean that, in the best case, only
> 17.5% of my actual groups correspond to clusters revealed by the analysis.
> Even though individual posterior membership was above 75% in most cases, I
> do not know whether the genetic structure revealed by the analysis is
> supported or not.
>
> 3) Moreover, some of the clusters revealed by the analysis are made up of
> individuals with a posterior membership probability <60%. How should I
> interpret these clusters? I would tend to run the analysis again with a
> smaller “k”…?
>
> Sorry for this long mail, I hope it is sufficiently clear.
>
> Thanks in advance for your help.
>
> Elodie