[adegenet-forum] relevant way to compare posterior probabilities between DAPC with the same prior groups and the same individuals

Sun Jun 8 14:57:39 CEST 2014

Hi everyone,

I have performed DAPC on a set of 934 individuals, using 10 predefined 
groups.

I did this with different sets of SNPs (coming from epigenetic assays 
in different tissues). Now I would like to compare the posterior 
assignments to find out whether the tissue has an effect, and I don't 
know what the best way to do this would be.

I have thought about the following:

1- compare the slot assign.per.pop of the summary(dapc), which is the 
proportion of individuals a posteriori assigned to their original prior 
group, for each group. For me this is a vector of 10 values.

To make it clearer, what I want to compare is something like this:

            prior 1   prior 2   ...   prior j   ...   prior 10
tissue 1    p1,1      p1,2      ...             ...   p1,10
...                                   ...
tissue i                              pi,j

where pi,j is the proportion of individuals from prior j correctly 
assigned to j, using tissue i.
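
As a rough sketch of how I picture building this table (written in 
Python only for illustration; the group labels, tissue names and 
assignments below are made up, and the helper assign_per_group is my 
own name, not an adegenet function):

```python
from collections import defaultdict

def assign_per_group(prior, assigned):
    """Proportion of individuals assigned back to their own prior group."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for grp, post in zip(prior, assigned):
        total[grp] += 1
        if grp == post:
            correct[grp] += 1
    return {grp: correct[grp] / total[grp] for grp in sorted(total)}

# Made-up data: 6 individuals, 2 prior groups, 2 tissues (illustration only).
prior = ["A", "A", "A", "B", "B", "B"]
assigned_by_tissue = {
    "tissue1": ["A", "A", "B", "B", "B", "B"],
    "tissue2": ["A", "B", "B", "B", "A", "B"],
}

# One row per tissue, one column per prior group: the p(i,j) table.
table = {t: assign_per_group(prior, a) for t, a in assigned_by_tissue.items()}
for tissue, row in table.items():
    print(tissue, row)
```

With the real data, each row would hold the 10 values of 
assign.per.pop for one tissue.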

I cannot really use anova, because I have only one value per group per 
tissue.
I think it is useless to repeat the dapc to get several values for each 
category just to be able to do an anova, because if the results come 
from multiple simulations, I suppose they would be very close to each 
other.
So I don't know what the error on this proportion of correct 
reassignment would be. Maybe if I knew the error associated with these 
proportions I could conclude.
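
One thing I could imagine, as a sketch only: treat each proportion as a 
binomial proportion k/n (k individuals correctly reassigned out of n in 
the prior group) and compute a standard error or a Wilson score 
interval. This assumes the individuals behave like independent trials, 
which I am not sure holds here since the same individuals are used to 
build the DAPC. The counts below are made up.

```python
import math

def standard_error(k, n):
    """Binomial standard error of the proportion k/n."""
    p = k / n
    return math.sqrt(p * (1 - p) / n)

def wilson_interval(k, n, z=1.96):
    """Wilson score interval for k/n; z=1.96 gives roughly 95% coverage."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = k / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

# Made-up counts: 80 of 100 individuals reassigned to their prior group.
k, n = 80, 100
se = standard_error(k, n)
low, high = wilson_interval(k, n)
print(f"p = {k / n:.3f}, SE = {se:.3f}, interval = ({low:.3f}, {high:.3f})")
```

Two tissues whose intervals for the same prior group overlap a lot 
would then be hard to tell apart, but I don't know if this is a valid 
way to reason for DAPC.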

I started doing chi-squared tests on the posterior group sizes, but this 
is not really relevant because the posterior groups are a mix of the 
correct and the wrong assignments.

2- compare the probabilities of assignment at the level of the individual.

That is, create a table with these fields:
individual - priorgrp - post proba of assignment to prior grp - tissue

And then do something like a glm( post proba ~ priorgrp + tissue ).

I cannot do an anova because, within one cluster and one tissue, the 
proba is not normally distributed, so I assume a generalized linear 
model is the better choice.
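
To show the layout I have in mind for that table, here is a small 
sketch (Python for illustration only; the posterior values are made up, 
and long_table is a hypothetical helper, not part of adegenet). Each 
individual contributes one row per tissue, holding the posterior 
probability of its own prior group:

```python
def long_table(prior, posterior_by_tissue):
    """Build rows: individual, priorgrp, post_proba (to prior group), tissue."""
    rows = []
    for tissue, posterior in posterior_by_tissue.items():
        for ind, (grp, probs) in enumerate(zip(prior, posterior), start=1):
            rows.append({
                "individual": ind,
                "priorgrp": grp,
                "post_proba": probs[grp],  # probability of the prior group only
                "tissue": tissue,
            })
    return rows

# Made-up posteriors for 2 individuals, 2 groups, 2 tissues (illustration only).
prior = ["A", "B"]
posterior_by_tissue = {
    "tissue1": [{"A": 0.90, "B": 0.10}, {"A": 0.20, "B": 0.80}],
    "tissue2": [{"A": 0.60, "B": 0.40}, {"A": 0.55, "B": 0.45}],
}

rows = long_table(prior, posterior_by_tissue)
for row in rows:
    print(row)
```

The model would then be fitted on these rows, with post_proba as the 
response and priorgrp and tissue as factors.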

Or, use a manova: the same as the glm, except that instead of taking 
only the posterior proba of assignment to the prior grp, I take the 
vector of proba of assignment to every group. So far I have not found a 
clear statement of the conditions for applying a manova, so I am not 
sure whether I can use it with the distribution I have.

How would you compare posterior probabilities of DAPC?
I hope this is not too unclear.

Thank you in advance,

Guillaume

PS: I have not been able to find this information: how are the 
posterior probabilities of assignment computed? By simulation or 
analytically? If by simulation, how many iterations are performed?