[adegenet-forum] Question about genetic structure in admixed populations

Valeria Montano mirainoshojo at gmail.com
Thu Sep 5 10:59:43 CEST 2013

Dear Jutta,

Cluster analysis can be tricky when the samples analysed are distributed
along a gradient. If there is no clear-cut subdivision, this can lead to
contradictory results (have a look at this paper
You may want to consider using TESS or BAPS with the admixture model
option. Both programs allow you to include the geographic coordinates as
prior information, and the admixture model is a way to model spatial
gradients. If you tested for IBD with a Mantel test, be careful: a
significant Mantel test is not necessarily due to IBD, because the
correlation between geographic and genetic distances can be significant
under different spatial/migratory schemes. I think your DAPC is OK, apart
from the fact that there is no need to run find.clusters with the number of
PCs indicated by optim.a.score. That procedure is used to optimize the
discriminant space among clusters in the DAPC itself. To assign individuals
to clusters you can simply retain all the variance (even though in your case
it is almost the same, given that you kept 98%). The only other thing: I
would try a maximum number of clusters of around 20, more than your number
of sampling locations. You can also give sPCA a try.
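
As a rough sketch of the order of steps described above: the code below uses
adegenet's bundled nancycats data as a stand-in for your dataset. That choice
is an assumption, as are the starting value of 50 PCs for the first DAPC and
the automatic "diffNgroup" choice of K, which only replaces the interactive
BIC plot so that the example runs unattended.

```r
library(adegenet)

# Stand-in data shipped with adegenet (assumption: replace with your own genind object).
data(nancycats)
x <- nancycats

# Step 1: identify clusters using ALL principal components.
# Asking for more PCs than exist (here 200) keeps every available axis.
# max.n.clust = 20 follows the suggestion above.
# choose.n.clust = FALSE avoids the interactive prompt; "diffNgroup" is one of
# the automatic criteria and is used here only for illustration.
set.seed(1)
grp <- find.clusters(x, n.pca = 200, max.n.clust = 20,
                     choose.n.clust = FALSE, criterion = "diffNgroup")

# Step 2: describe those clusters with a first DAPC.
# 50 PCs is an arbitrary, generous starting value (assumption).
n.da  <- nlevels(grp$grp) - 1
dapc0 <- dapc(x, pop = grp$grp, n.pca = 50, n.da = n.da)

# Step 3: use the a-score only to choose how many PCs the DAPC retains,
# not to choose the PCs given to find.clusters.
opt   <- optim.a.score(dapc0, n.sim = 10, plot = FALSE)
dapc1 <- dapc(x, pop = grp$grp, n.pca = opt$best, n.da = n.da)

# Cross-tabulate sampling sites against inferred clusters to see whether
# the clusters have any geographic context.
site_by_cluster <- table(pop(x), grp$grp)
print(site_by_cluster)
```

With your own data, inspecting the BIC curve from find.clusters interactively
(choose.n.clust = TRUE) is probably more informative than any automatic
criterion.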

Hope this helps



On 4 September 2013 15:03, Jutta Geismar <Jutta.Geismar at senckenberg.de> wrote:

> Dear Mr Jombart and DAPC users,
>
> I used DAPC to analyze genetic structure in a small region with 20
> microsatellite markers. I analyzed 330 individuals (14 sampling sites) and
> found little genetic differences (FST, D Jost), but a significant isolation
> by distance pattern. A cluster analysis in STRUCTURE resulted in four
> clusters (STRUCTURE Harvester) but all individuals had more or less equal
> posterior probability in all of the four inferred clusters. Therefore I
> assume a panmictic population structure. Since STRUCTURE is known for some
> problems analyzing datasets under IBD I analyzed the data with DAPC. DAPC
> resulted in 3 or 4 clusters (I tested up to K=7 to be sure), but in
> both cases these were randomly distributed among all individuals without a
> geographic context. Only 94 individuals were not assigned to one cluster
> with more than 90% and therefore would be counted as “admixed” (example in
> DAPC tutorial). To me the results of STRUCTURE and DAPC are in conflict with
> each other, but I don’t know what a panmictic population would look like in
> DAPC. Distances between sites are small and it is very likely that gene
> flow occurs among my sampling points, which might cause problems in genetic
> cluster analyses. I don’t know if I made any mistake in my thinking, that’s
> why I want to explain my procedure briefly:
> 1. I used dapc and chose 1/3 of the sample size as PC (as
> suggested) and counted DAs in the plot (100% of the variability was
> included, 110 PC, 13 DA)
> 2. To reduce variability I used optim.a.score (smart FALSE). The
> best a-score was around 0.2 (PC 61)
> 3. After that I wanted to estimate the number of clusters by
> find.clusters and used the a-score as number of PCs and repeated the dapc
> (conserved variance was still 98%, 61 PCs, 2 DA)
> I chose k in the BIC values after which the decrease was less compared to
> the previous, but not the lowest k.
> If I have some mistakes in my procedure I would appreciate some advice.
> But also if the procedure is okay I cannot explain the contrariness of
> Thanks a lot in advance for some help.
> Jutta Geismar
> PhD student
> Germany
