[adegenet-forum] interpretation sPCA

Tue Oct 4 14:37:40 CEST 2011

Dear Evert, 

I don't think the existence of a cline can be used to infer the origin of an organism. In this case the cline you obtain is certainly compatible with a 'central' origin, but the origin could just as well be at either end of the cline, or anywhere in between. All the pattern says is that gene flow is somehow negatively related to geographic distance. More generally, no multivariate analysis result is directional. It would be reassuring if the outcome of sPCA roughly matched that of DAPC, although the two methods are different. This can easily be checked by plotting DAPC scores on the map. Discrepancies can arise, for instance, when non-spatial genetic structures are the strongest (DAPC will then pick those up first). Another cause would be the absence of spatial structure. It is safer to perform a global.rtest (although it lacks power) and to check the screeplot of sPCA before interpreting structures.

Testing the origin of your populations would require population-level data. The idea is that within-population diversity decreases as we move away from the origin, due to repeated bottlenecks. If you don't have population data, one workaround would be to use moving windows to map diversity geographically, and then use a simple optimisation procedure to find the 'optimal' origin. I don't know if this has been done before, so it might be newish. I have developed a package "geoGraph" (on R-Forge, not on CRAN: https://r-forge.r-project.org/R/?group_id=348) which does this (apart from the moving windows) and has a vignette illustrating the whole process.
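
The moving-window workaround described above could be sketched roughly as follows. This is only an illustration in plain Python, not the geoGraph implementation: the function names (window_diversity, best_origin), the use of straight-line distances, the per-sample diversity values, and the choice of a Pearson correlation as the criterion to optimise are all assumptions made for the example.

```python
import math

def window_diversity(points, centres, radius):
    """Mean diversity of the samples lying within `radius` of each window centre.

    points  : list of (x, y, diversity) tuples, one per sampled individual
    centres : list of (x, y) window centres
    Windows containing no sample are skipped.
    """
    out = []
    for cx, cy in centres:
        vals = [d for x, y, d in points if math.hypot(x - cx, y - cy) <= radius]
        if vals:
            out.append((cx, cy, sum(vals) / len(vals)))
    return out

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)

def best_origin(windows, candidates):
    """Candidate location whose distance to the windows is most negatively
    correlated with window diversity (a simple grid search)."""
    diversity = [d for _, _, d in windows]

    def score(c):
        dist = [math.hypot(x - c[0], y - c[1]) for x, y, _ in windows]
        return pearson(dist, diversity)

    return min(candidates, key=score)

# Toy data (entirely made up): diversity declines linearly away from (2, 3).
grid = [(x, y) for x in range(7) for y in range(7)]
samples = [(x, y, 1.0 - 0.1 * math.hypot(x - 2, y - 3)) for x, y in grid]

windows = window_diversity(samples, grid, radius=1.5)
print(best_origin(windows, grid))
```

Straight-line distance is the crudest possible choice here; distances that account for landscape features or plausible dispersal routes would be a natural refinement, and the window radius will affect how smooth the diversity map is.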

Cheers

Thibaut.

________________________________________
From: Thomas, Evert (Bioversity-Colombia) [E.Thomas at CGIAR.ORG]
Sent: 03 October 2011 21:48
To: Jombart, Thibaut; Linda Rutledge; adegenet-forum at r-forge.wu-wien.ac.at
Subject: interpretation sPCA

Dear Thibaut,

I have a question regarding the interpretation of sPCA scores as visualized in a color plot or as interpolated lagged scores. I am working with intraspecific data at continental level and found a strong gradient in my data, with a clear separation of a northern and a southern group. On a number of grounds I believe that the center of origin of the species I am working with is located at the "genotone" (or whatever to call it; I mean the grey area between both groups where the genetic differentiation is steepest). Does this make sense with the theory behind sPCA? I think the species moved north and south from the putative center of origin and developed into different genotypes, which becomes apparent in the visualization of the sPCA...

And should the outcome of an sPCA be somewhat reflected in the outcomes of discriminant analysis of principal components or are these really two different methods? (I apologize for my ignorance)

Many thanks in advance

Evert