Hi all<div><br></div><div>I think one of the most important differences to consider when comparing Fst with DAPC is how a pop is defined. The link Vladimir sent about the use of previously labelled pops in molecular anthropology is a really good example of a non-genetic approach to defining pops, which as such probably implies erroneous assumptions by default (unfortunately for me, I have collected some experience with that). By contrast, starting from individuals to infer the best number of clusters (can we then call them pops, maybe?) by optimizing the allelic variance is definitely better, at least in genetics. Once this is done, a "pops tree" obtained with Fst or with DAPC would probably look quite similar (without forgetting that the two approaches interpret the variance decomposition differently, as explained by Thibaut, and make different assumptions, as stated by Vladimir). But I've never actually tested it...</div>
<div><br></div><div>In general, I guess the clusters could be regarded as pops, but this is presumably quite far from defining panmictic groups. So, biologically speaking, I don't know the best way to consider the clusters. It would be great to have a method to define panmictic groups without needing to test an explicit biological model... but clustering is really useful.</div>
<div><br></div><div>Best regards</div><div><br></div><div>Valeria</div><div><br></div><div><div><div class="gmail_quote">
2011/5/4 Jombart, Thibaut <span dir="ltr"><<a href="mailto:t.jombart@imperial.ac.uk" target="_blank">t.jombart@imperial.ac.uk</a>></span><br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div>
<div style="direction:ltr;font-family:Tahoma;color:#000000;font-size:10pt">Hi there,<br>
<br>
I don't think there is actually a problem with using Fst in this case. Even if the HWE assumption does not hold, it can be used as a between-group distance measure. It is actually very closely related to the quantity optimised by DAPC: Fst is (between-group variance)/(total variance),
while DAPC optimises (between-group variance)/(within-group variance). However, any other distance measure (e.g. those implemented in dist.genpop) can be used.<br>
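<br>
A minimal sketch of how the two ratios relate, assuming a single numeric variable and three made-up groups (real Fst estimators and DAPC work on multi-locus allele data, so this only illustrates the variance decomposition). Because total variation = between-group + within-group, the DAPC-style ratio is a monotonic function of the Fst-style ratio:<br>
<br>
```python
# Toy illustration only: one numeric variable in three hypothetical groups.
# The names fst_like and dapc_like are labels for the two ratios discussed,
# not actual Fst or DAPC computations.
groups = {
    "A": [1.0, 2.0, 3.0],
    "B": [4.0, 5.0, 6.0],
    "C": [7.0, 8.0, 9.0],
}


def mean(values):
    return sum(values) / len(values)


def sum_of_squares(values, center):
    """Sum of squared deviations of values from center."""
    return sum((v - center) ** 2 for v in values)


all_values = [v for vals in groups.values() for v in vals]
grand_mean = mean(all_values)

# Classical decomposition: total = between + within.
total_ss = sum_of_squares(all_values, grand_mean)
within_ss = sum(sum_of_squares(vals, mean(vals)) for vals in groups.values())
between_ss = sum(
    len(vals) * (mean(vals) - grand_mean) ** 2 for vals in groups.values()
)

fst_like = between_ss / total_ss    # between / total
dapc_like = between_ss / within_ss  # between / within

print("between/total :", fst_like)
print("between/within:", dapc_like)
```
<br>
With these toy numbers, between/total is 0.9 and between/within is 9.0, consistent with dapc_like = fst_like / (1 - fst_like).<br>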
<br>
I think one of the main interests of representing the between-group distances on the DAPC scatterplot is that in some cases, especially on lower-order axes, coordinates might not fully display the relationships between groups. For instance, imagine a structure
with 6 populations on three islands (a,b,c), (d,e), (f), assuming (f) is more distant from the other two islands. One axis might emphasize (a,b,c) vs (d,e), and (f) could fall close to the origin. Representing a minimum spanning tree based on between-population
distances will remind us that (f) is fairly isolated, and prevent the naive interpretation that it is related to both (a,b,c) and (d,e).<br>
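<br>
The island example can be sketched as follows. This is only an illustration: the pairwise distances are invented, and the tree is built with a plain Prim's algorithm rather than any particular package function.<br>
<br>
```python
# Hypothetical between-population distances for the six populations:
# (a, b, c) share one island, (d, e) another, and f sits alone.
distances = {
    ("a", "b"): 0.020, ("a", "c"): 0.030, ("b", "c"): 0.040,
    ("d", "e"): 0.025,
    ("a", "d"): 0.110, ("a", "e"): 0.120, ("b", "d"): 0.100,
    ("b", "e"): 0.115, ("c", "d"): 0.105, ("c", "e"): 0.125,
    ("a", "f"): 0.300, ("b", "f"): 0.310, ("c", "f"): 0.320,
    ("d", "f"): 0.290, ("e", "f"): 0.305,
}


def dist(u, v, table):
    """Look up a symmetric distance stored under either key order."""
    return table[(u, v)] if (u, v) in table else table[(v, u)]


def minimum_spanning_tree(nodes, table):
    """Prim's algorithm: grow the tree by the cheapest edge leaving it."""
    visited = [nodes[0]]
    edges = []
    while len(visited) != len(nodes):
        candidates = [
            (dist(u, v, table), u, v)
            for u in visited
            for v in nodes
            if v not in visited
        ]
        weight, u, v = min(candidates)
        edges.append((u, v, weight))
        visited.append(v)
    return edges


populations = ["a", "b", "c", "d", "e", "f"]
mst = minimum_spanning_tree(populations, distances)
for u, v, weight in mst:
    print(u, "-", v, weight)
```
<br>
In this made-up case (f) is attached to the tree by a single, long edge, which is the kind of reminder of its isolation that the scatterplot axes alone might not give.<br>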
<br>
Cheers<br>
<br>
Thibaut<br>
<br>
<br>
<div style="font-family:Times New Roman;color:rgb(0, 0, 0);font-size:16px">
<hr>
<div style="direction:ltr"><font color="#000000" face="Tahoma" size="2"><b>From:</b> Vladimir Mikryukov [<a href="mailto:vmikryukov@gmail.com" target="_blank">vmikryukov@gmail.com</a>]<br>
<b>Sent:</b> 04 May 2011 08:23<br>
<b>To:</b> adegenet forum<br>
<b>Cc:</b> Mac Campbell; Jombart, Thibaut<div><br>
<b>Subject:</b> Re: [adegenet-forum] Population clustering idea<br>
</div></font><br>
</div><div><div></div><div>
<div></div>
<div>Hello,<br>
Please correct me if I'm wrong,<br>
but I think that viewing population differentiation with Fst has many limitations as well.<br>
Why should one switch from a more robust method (DAPC doesn't care about Hardy-Weinberg equilibrium or linkage disequilibrium, does it?) to the other (Fst) approach?<br>
Perhaps the principal component scores obtained could be used for that?<br>
Or would this method overestimate the differentiation?<br>
<br>
Using other genetic distance measures (especially those which assume a particular mutation model, e.g. IAM or SMM for microsatellites) on real data could be tricky as well.<br>
<br>
Vladimir.<br>
<br>
<br>
PS. A brief summary of Fst's assumptions can be found here:<br>
<a href="https://anthrogenetics.wordpress.com/2010/10/11/problems-with-fst-based-methods-human-populations-violate-important-assumptions/" target="_blank">https://anthrogenetics.wordpress.com/2010/10/11/problems-with-fst-based-methods-human-populations-violate-important-assumptions/</a><br>
<br>
Or at least I'd suggest using a bias-corrected differentiation index (Dest), as in the DEMEtics package (see reference). However, in my experience it is usually highly correlated with Fst (Mantel's r = 0.7 - 0.96).<br>
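<br>
For readers unfamiliar with the statistic, here is a minimal sketch of Mantel's r, assuming it is taken as the Pearson correlation between the matching off-diagonal entries of two distance matrices. The two 4 x 4 matrices are invented for illustration, and the significance test (usually done by permuting rows and columns) is left out:<br>
<br>
```python
from math import sqrt


def upper_triangle(matrix):
    """Return the entries above the diagonal, row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mantel_r(m1, m2):
    """Mantel statistic: correlation of the two upper triangles."""
    return pearson(upper_triangle(m1), upper_triangle(m2))


# Hypothetical pairwise values for four populations (not real data).
fst_matrix = [
    [0.00, 0.05, 0.10, 0.20],
    [0.05, 0.00, 0.08, 0.15],
    [0.10, 0.08, 0.00, 0.12],
    [0.20, 0.15, 0.12, 0.00],
]
dest_matrix = [
    [0.00, 0.10, 0.22, 0.35],
    [0.10, 0.00, 0.15, 0.30],
    [0.22, 0.15, 0.00, 0.20],
    [0.35, 0.30, 0.20, 0.00],
]

r = mantel_r(fst_matrix, dest_matrix)
print("Mantel r:", round(r, 3))
```
<br>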
<br>
Gerlach G., Jueterbock A., Kraemer P., Deppermann J., Harmand P. (2010) Calculations of population differentiation based on Gst and D: forget Gst but not all of statistics! Molecular Ecology 19(18): 3845-3852.<br>
<br>
<br>
<div class="gmail_quote">On Tue, May 3, 2011 at 10:53 PM, Mac Campbell <span dir="ltr">
<<a href="mailto:macampbell2@alaska.edu" target="_blank">macampbell2@alaska.edu</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204, 204, 204);padding-left:1ex">
Hi,
<div><br>
</div>
<div>Yes, I agree there are many limitations to viewing populations from a tree-like perspective. Initially, I was interested in quantifying how far apart the groups are on a scatter plot because it was hard to tell. I think the code Vladimir sent me does just
that; at least it tells me which ones are closer to each other.</div>
<div><br>
</div>
<div>It would be cool to have a more biologically meaningful (Fst-based) approach implemented. Another thing that came to mind: if I wanted to use something like IMa2, I would need an assumption, in tree form, of how the populations are related. </div>
<div><br>
</div>
<font color="#888888">
<div>Mac</div>
</font>
<div>
<div></div>
<div>
<div><br>
<div class="gmail_quote">On Sat, Apr 30, 2011 at 8:56 AM, Jombart, Thibaut <span dir="ltr">
<<a href="mailto:t.jombart@imperial.ac.uk" target="_blank">t.jombart@imperial.ac.uk</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204, 204, 204);padding-left:1ex">
<div>
<div style="direction:ltr;font-family:Tahoma;color:rgb(0, 0, 0);font-size:10pt">
Hello, <br>
<br>
That's a good question. Actually, I thought about implementing something along these lines for the dapc scatterplot. I agree with Russell's point that relationships between populations are not necessarily best represented by fully bifurcating trees. However, linking
the populations which are the closest according to a given distance measure (e.g. Fst) does make sense. I would go for a minimum spanning tree, which is a nice way of showing which are the closest neighbours in terms of genetic distances. It won't be too
much of a pain to code either.<br>
<br>
I will be working on the next adegenet release over the weeks to come, so will probably give it a go soon.<br>
<br>
Cheers<br>
<br>
Thibaut<br>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div></div></div>
</div>
</div>
<br></blockquote></div><br></div></div>