[adegenet-forum] DAPC and s.class

Thibaut Jombart thibautjombart at gmail.com
Wed May 30 11:17:06 CEST 2018


Hi Sela

For question A: the analyses need not agree - cf figure 1 of the DAPC paper:
https://bmcgenet.biomedcentral.com/articles/10.1186/1471-2156-11-94

This is because PCA and DAPC maximise different criteria: PCA maximises
the total variance of the projected data, whereas DAPC maximises the ratio
of between-group variance to total variance.
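
To make the difference between the two criteria concrete, here is a small
sketch in plain Python (not R, and not adegenet's implementation). The
2-D data, the group layout and all function names are made up for
illustration. It scans candidate axes by angle and reports which axis
maximises each criterion.

```python
import math

# Toy 2-D data (made up for illustration): two groups that share the same
# wide spread along x but differ in their mean along y.
xs = [-4.0, -2.0, 0.0, 2.0, 4.0]
jitter = [0.2, -0.2, 0.0, -0.2, 0.2]
group_a = [(x, 0.0 + j) for x, j in zip(xs, jitter)]
group_b = [(x, 1.0 + j) for x, j in zip(xs, jitter)]
groups = [group_a, group_b]


def mean(values):
    return sum(values) / len(values)


def project(points, angle_deg):
    """Project 2-D points onto the unit axis at the given angle."""
    t = math.radians(angle_deg)
    return [x * math.cos(t) + y * math.sin(t) for x, y in points]


def total_variance(angle_deg):
    """PCA-style criterion: variance of all projected points."""
    scores = [s for g in groups for s in project(g, angle_deg)]
    m = mean(scores)
    return mean([(s - m) ** 2 for s in scores])


def between_over_total(angle_deg):
    """Discriminant-style criterion: between-group / total variance."""
    projected = [project(g, angle_deg) for g in groups]
    scores = [s for p in projected for s in p]
    m = mean(scores)
    between = sum(len(p) * (mean(p) - m) ** 2 for p in projected) / len(scores)
    return between / total_variance(angle_deg)


angles = range(180)  # candidate axis directions, in whole degrees
pca_angle = max(angles, key=total_variance)
da_angle = max(angles, key=between_over_total)

print("axis maximising total variance:", pca_angle, "degrees")
print("axis maximising between/total variance:", da_angle, "degrees")
```

With this toy data the variance-maximising axis lies along x (0 degrees),
where the two groups overlap completely, while the between/total ratio is
maximised along y (90 degrees), where they separate. Group centers that
coincide on the first axis are far apart on the second.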

For question B: in very 'rectangular' data (large p, small n), one can
easily find a combination of alleles providing perfect discrimination. I
would recommend using cross-validation (xvalDapc in adegenet) to make sure
you don't retain too many PCA axes in the DAPC. You will likely find that
the fewer axes you retain in the analysis, the more scattered individuals
are. If they remain at the same locations, this likely indicates identical
genotypes.
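
The overfitting risk with large p and small n can be sketched as follows,
again in plain Python and only as an illustration. This is not DAPC: a
nearest-centroid rule stands in for the discriminant step, and "number of
variables kept" stands in for "number of PCA axes retained". The data are
pure noise with arbitrary group labels, and all names and sizes are
assumptions chosen for the example.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable

n_per_group, n_vars = 10, 200   # few individuals, many variables
labels = [0] * n_per_group + [1] * n_per_group
# Pure noise: the group labels carry no information about the data.
data = [[rng.gauss(0.0, 1.0) for _ in range(n_vars)] for _ in labels]


def centroid(rows, k):
    """Mean of the first k variables over the given rows."""
    return [sum(r[j] for r in rows) / len(rows) for j in range(k)]


def nearest_group(row, centroids, k):
    """Index of the centroid closest to row (squared Euclidean distance)."""
    dists = [sum((row[j] - c[j]) ** 2 for j in range(k)) for c in centroids]
    return dists.index(min(dists))


def accuracy(k, leave_one_out):
    """Share of individuals assigned to their own group using k variables."""
    hits = 0
    for i, (row, lab) in enumerate(zip(data, labels)):
        # Optionally drop individual i before computing the group centroids.
        train = [(r, l) for t, (r, l) in enumerate(zip(data, labels))
                 if not (leave_one_out and t == i)]
        cents = [centroid([r for r, l in train if l == g], k) for g in (0, 1)]
        hits += nearest_group(row, cents, k) == lab
    return hits / len(data)


for k in (2, 20, 200):
    print(k, "variables:",
          "in-sample", accuracy(k, leave_one_out=False),
          "| leave-one-out", accuracy(k, leave_one_out=True))
```

Even though the groups here are meaningless, in-sample assignment should
look increasingly good as more variables are kept, while the
leave-one-out figures should stay far lower. That gap is what
cross-validation is meant to expose when choosing how many axes to
retain.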

Best
Thibaut




--
Dr Thibaut Jombart
Lecturer, Department of Infectious Disease Epidemiology, Imperial College
London
Head of RECON: repidemicsconsortium.org
WHO Consultant - outbreak analysis
https://thibautjombart.netlify.com
Twitter: @TeebzR
+44(0)20 7594 3658

On 26 May 2018 at 20:39, Hanan Sela <hans at post.tau.ac.il> wrote:

> Hello
> I am trying to study the genetic diversity of a collection of a wheat wild
> relative. I have a matrix of 105 individuals and ~10000 SNPs. Each
> individual is assigned to a collection site (population). I have done
> several analyses to investigate the genetic diversity and how it is
> distributed geographically.
> 1.  dudi.pca followed by s.class of dudi.pca$li using the population
> origin as the classing factor.
> 2. dapc using the population origin as the grouping factor.
> 3. dapc using the grouping results of find.clusters.
>
> I have two problems:
>
> A. The population centers from s.class in analysis 1 do not agree with
> those from dapc in analysis 2: some population centers that are close to
> each other in analysis 1 are very distant from each other in analysis 2,
> while the grouping from find.clusters is similar to the plot from
> dudi.pca and s.class.
>
> B. In both dapc analyses, all individuals of each group have the exact
> same coordinates.
>
> Please help with the interpretation of the results. Which analysis
> better represents the true genetic distances between populations?
>
> Thank you
>
> Hanan Sela Ph.D.
> Curator of the Lieberman Cereal Seed Bank
> The Institute for Cereal Crops Improvement
> Tel-Aviv University
> P.O. Box 39040
> Tel Aviv 6139001
> Israel
>
> E: hans at tauex.tau.ac.il <hans at tauex.tau.ac.il>
> P: 972-3-6405773
> M: 972-50-5727458
> F: 972-3-6407857