[adegenet-forum] DAPC/sPCA. Over-parameterization issues?

Jombart, Thibaut t.jombart at imperial.ac.uk
Thu Jun 16 17:50:29 CEST 2011


Hello, 

k-means is not subject to over-parametrization, so if the BIC plot gives a clear answer there are certainly 3 clusters in your data. You should have a look at the early release of the dapc tutorial, which discusses the over-parametrization issue in DAPC.
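
One way to probe over-parametrization is to compare a-scores (observed reassignment success minus the success obtained with randomly permuted groups) for a DAPC retaining many PCs and one retaining few. The lines below are only a minimal sketch, not taken from the tutorial: the rupica dataset stands in for your data, and the values 40 and 10 for n.pca are arbitrary assumptions chosen for illustration.

```r
library(adegenet)

# Example data shipped with adegenet (stand-in for your own genind object)
data(rupica)
set.seed(1)

# k-means clustering with K fixed to 3, so nothing is asked interactively
grp <- find.clusters(rupica, n.pca = 40, n.clust = 3)

# Same groups, two DAPCs: many retained PCs vs. few (arbitrary values)
dapc_many <- dapc(rupica, pop = grp$grp, n.pca = 40, n.da = 2)
dapc_few  <- dapc(rupica, pop = grp$grp, n.pca = 10, n.da = 2)

# a-score: reassignment success corrected by random-group reassignment
a_many <- a.score(dapc_many, n.sim = 10)
a_few  <- a.score(dapc_few,  n.sim = 10)

# Mean a-scores; a value near zero suggests the discrimination is
# not much better than what random groups would give
print(c(many_PCs = a_many$mean, few_PCs = a_few$mean))
```

If the discrimination only looks good when many PCs are retained and the a-score drops towards zero, over-fitting is a plausible explanation; optim.a.score can be used to explore a wider range of n.pca values.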

I have never seen sPCA fail to retrieve an existing spatial structure, including under settings with very low differentiation (see for instance the rupica dataset), unless a completely weird connection network is chosen (which basically does not reflect the spatial proximities between studied entities). 
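
For reference, here is a rough sketch of how the global and local tests can be run against an explicitly chosen connection network. It uses the rupica dataset; the distance-based network (type 5, neighbours between 0 and 2300 units apart) and nperm = 99 are assumptions made for the example, and the network should be replaced by one that reflects the spatial proximities in your own sampling.

```r
library(adegenet)

data(rupica)
set.seed(1)

# Spatial coordinates stored with the genind object
xy <- rupica$other$xy

# Connection network: neighbours defined by distance (type 5),
# returned as a listw object of spatial weights
cn <- chooseCN(xy, ask = FALSE, type = 5, d1 = 0, d2 = 2300,
               result.type = "listw", plot.nb = FALSE)

# Monte Carlo tests for global and local structures
g_test <- global.rtest(rupica$tab, cn, nperm = 99)
l_test <- local.rtest(rupica$tab, cn, nperm = 99)

print(c(global = g_test$pvalue, local = l_test$pvalue))
```

Re-running the tests with a different network (for instance K nearest neighbours, type 6) is a cheap way to check that a negative result is not driven by the network definition.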

A trivial biological explanation would be that your individuals come from 3 different populations of origin.

Have you considered mapping the Discriminant Functions of your DAPC over the geographic space? You can use s.value or colorplot for this. Or simply map the different group memberships. That will tell you if the structures you identified are also spatial structures.
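
A minimal sketch of such maps, assuming a genind object with coordinates stored in other$xy (rupica is used here as a stand-in, and n.pca = 20 is an arbitrary choice):

```r
library(adegenet)  # also attaches ade4, which provides s.value

data(rupica)
set.seed(1)
xy <- rupica$other$xy

# Clusters and DAPC with fixed settings (no interactive prompts)
grp   <- find.clusters(rupica, n.pca = 20, n.clust = 3)
dapc1 <- dapc(rupica, pop = grp$grp, n.pca = 20, n.da = 2)

# Map both discriminant functions at once as colour channels
colorplot(xy, dapc1$ind.coord, transp = TRUE)

# Map the first discriminant function as sized squares
s.value(xy, dapc1$ind.coord[, 1])

# Or simply map group memberships, one colour per cluster
plot(xy, col = as.integer(dapc1$grp), pch = 20,
     xlab = "x", ylab = "y")
```

Clusters that form contiguous patches on these maps point to a spatial structure; clusters scattered over the whole area suggest the differentiation is not spatial.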

Cheers

Thibaut

 
________________________________________
From: adegenet-forum-bounces at r-forge.wu-wien.ac.at [adegenet-forum-bounces at r-forge.wu-wien.ac.at] on behalf of jaa695 at mail.usask.ca [jaa695 at mail.usask.ca]
Sent: 16 June 2011 16:01
To: adegenet-forum at r-forge.wu-wien.ac.at
Subject: [adegenet-forum]  DAPC/sPCA. Over-parameterization issues?

Dear all,

I've been trying to figure out the real meaning of my latest adegenet analyses,
and hope some of you can help.
Study case: continuous sampling (i.e. no a priori defined populations) of a
large mammal (n= 500 individuals, 34 STR loci) at a regional scale.
Puzzling results: When using DAPC on my dataset I do find some evidence for the
existence of three differentiated clusters (lowest BIC at 3 clusters; delta
BIC between 1 cluster and 3 clusters is 16; when plotting, there is no overlap
among clusters; assign.per.pop ranges from 0.97 to 0.99). However, when looking
at my sPCA results I do not find any evidence for either local or global
structure. Although there are potential biological explanations for the above
results (e.g. assortative mating), these are extremely unlikely, and my feeling
is that the DAPC clusters might be the result of over-parameterization (494 x
292 matrix of genotypes; 75 retained PCs, 80-85% cumulative variance; 2
retained discriminant functions). Any thoughts?

THANKS!

/Jose



_______________________________________________
adegenet-forum mailing list
adegenet-forum at lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/adegenet-forum

