[adegenet-forum] PCA query?

AVIK RAY avik.ray.kol at gmail.com
Tue Jun 21 20:08:27 CEST 2011


Dear Thibaut
Thanks for the very helpful reply; it seems DAPC is more suitable for my 
dataset and for the question I'm looking at!
I did a few mock runs to see some initial results, and the BIC curve 
seems to level off gradually after K=9. However, with STRUCTURE 
(Bayesian) and FLOCK (maximum likelihood) the number of putative clusters 
appears to be 2 or 3, so I am wondering what causes this difference, or 
whether I am interpreting it wrongly. Also, my dataset contains a lot of 
missing data; does that matter much? Should I remove those and then try again?
I am attaching the BIC and retained-PC curves for reference.
Thanks
cheers

AVIK


On 6/20/2011 6:58 PM, Jombart, Thibaut wrote:
> Hello,
>
> in none, as far as PCoA / MDS are concerned, they do the same as PCA, but just allow for using fancier Euclidean distances. Losing information in terms of total variance does not necessarily imply losing information in terms of group discrimination. But if you're looking for clusters, you don't necessarily need to reduce the dimensionality of the data - most clustering algorithms don't.
>
> Please have a look at the DAPC paper which is really on these topics. You may also be interested in the DAPC vignette for the next release of adegenet.
> DAPC paper is here:
> http://www.biomedcentral.com/1471-2156/11/94
>
> DAPC vignette is there:
> http://adegenet.r-forge.r-project.org/files/adegenet-dapc.pdf
>
> Cheers
>
> Thibaut
>
> ________________________________________
> From: adegenet-forum-bounces at r-forge.wu-wien.ac.at [adegenet-forum-bounces at r-forge.wu-wien.ac.at] on behalf of AVIK RAY [avik.ray.kol at gmail.com]
> Sent: 20 June 2011 13:12
> To: adegenet-forum at r-forge.wu-wien.ac.at
> Subject: [adegenet-forum] PCA query?
>
> Hi all
> I am a bit confused about PCA in general. I did a PCA in adegenet and it
> showed a plot with multiple clusters. My data is tetraploid
> microsatellite data and I need to find potential clusters, i.e. groups of
> individuals that are more similar to each other than to others based on
> allele data. But if I am not mistaken, PCA converts allele information
> into synthetic variables and does clustering where we tend to lose a lot
> of information, since it will select most but not all alleles; so in that
> sense, does PCoA / multidimensional scaling or simply cluster analysis
> (e.g. K-means or hierarchical clustering) make more sense?
> Thanks in advance for reply
>
> AVIK
>
> _______________________________________________
> adegenet-forum mailing list
> adegenet-forum at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum
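
The point in the quoted reply that PCoA / MDS "do the same as PCA" when plain Euclidean distances are used can be checked numerically. The following is only a minimal sketch and is not from the thread: it uses Python with NumPy rather than adegenet, and the data matrix is random made-up numbers, not microsatellite data. It compares PCA scores with classical PCoA scores computed from Euclidean distances on the same matrix.

```python
import numpy as np

# Hypothetical data: 20 individuals x 6 variables (random, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 6))
Xc = X - X.mean(axis=0)  # centre each column

# PCA scores from the singular value decomposition of the centred matrix.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
pca_scores = U * s

# Classical PCoA: double-centre the squared Euclidean distances,
# then eigendecompose the resulting matrix.
D2 = ((Xc[:, None, :] - Xc[None, :, :]) ** 2).sum(axis=2)
n = D2.shape[0]
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ D2 @ J
evals, evecs = np.linalg.eigh(B)
order = np.argsort(evals)[::-1][: X.shape[1]]  # keep the 6 largest eigenvalues
pcoa_scores = evecs[:, order] * np.sqrt(evals[order])

# Axes are only defined up to sign, so compare absolute values.
same = np.allclose(np.abs(pca_scores), np.abs(pcoa_scores), atol=1e-6)
print("PCA and Euclidean PCoA scores agree up to sign:", same)
```

With other distances (the "fancier" ones mentioned in the reply) the two sets of scores would generally differ; the sketch only covers the plain Euclidean case.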

-------------- next part --------------
A non-text attachment was scrubbed...
Name: PCs.bmp
Type: image/bmp
Size: 339798 bytes
Desc: not available
URL: <http://lists.r-forge.r-project.org/pipermail/adegenet-forum/attachments/20110621/e472e02a/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: DAPC-NoClust-BIC.bmp
Type: image/bmp
Size: 875318 bytes
Desc: not available
URL: <http://lists.r-forge.r-project.org/pipermail/adegenet-forum/attachments/20110621/e472e02a/attachment-0003.bin>