[adegenet-forum] relevant way to compare posterior probabilities between DAPC with the same prior groups and the same individuals
Guillaume Louvel
guillaumelouvel at hotmail.fr
Sun Jun 8 14:57:39 CEST 2014
Hi everyone,
I have performed DAPC on a set of 934 individuals, using 10 predefined
groups.
I did this with different sets of SNPs (coming from epigenetics assays
in different tissues);
now I would like to compare the posterior assignments, to know if the
tissue has an effect, and I don't know what would be the best way.
I have thought about the following:
1- compare the slot assign.per.pop of summary(dapc), which gives, for
each group, the proportion of individuals assigned a posteriori to their
original prior group. For me, this is a vector of 10 values.
To make it clearer, what I want to compare is something like this:
            prior 1   prior 2   ...   prior j   ...   prior 10
tissue 1    p1,1      p1,2      ...             ...   p1,10
...         ...
tissue i                              pi,j
where pi,j is the proportion of individuals from prior j correctly
assigned to j, using tissue i.
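
A minimal sketch of how such a table could be built, written in Python purely for illustration (the analysis itself is done in R with adegenet). The function name correct_assignment_table, the argument names and the toy data are all hypothetical; "assigned" is assumed to hold, for each individual, the group with the highest posterior probability in a given tissue.

```python
from collections import defaultdict

def correct_assignment_table(prior, assigned_by_tissue):
    """Return {tissue: {group: proportion correctly reassigned}}.

    prior: list of prior group labels, one per individual.
    assigned_by_tissue: {tissue: list of posterior group labels},
        in the same individual order as `prior`.
    """
    table = {}
    for tissue, assigned in assigned_by_tissue.items():
        n_total = defaultdict(int)    # individuals per prior group
        n_correct = defaultdict(int)  # of which reassigned to that group
        for g_prior, g_post in zip(prior, assigned):
            n_total[g_prior] += 1
            if g_prior == g_post:
                n_correct[g_prior] += 1
        table[tissue] = {g: n_correct[g] / n_total[g] for g in n_total}
    return table

# Toy example: 6 individuals, 2 prior groups, 2 tissues (made-up data).
prior = ["A", "A", "A", "B", "B", "B"]
assigned_by_tissue = {
    "tissue1": ["A", "A", "B", "B", "B", "B"],
    "tissue2": ["A", "B", "B", "B", "B", "A"],
}
table = correct_assignment_table(prior, assigned_by_tissue)
for tissue, row in table.items():
    print(tissue, row)
```

Each row of the result corresponds to one tissue i and each entry to one pi,j.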
I cannot really use anova, because I have only one value per group per
tissue.
I think it is useless to repeat the dapc in order to get several values
for each category to be able to do an anova, because if the results
come from multiple simulations, they would be really close I suppose.
So I don't know what would be the error values of this proportion of
correct reassignment. Maybe if I knew what is the error associated with
these proportions I could conclude.
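
One possible way to attach an error to each proportion, sketched here in Python for illustration only, is a Wilson score interval for a binomial proportion. This is an assumption about what "error" could mean, not something DAPC or adegenet provides: it treats the individuals of a group as independent trials and ignores that the same individuals were used to fit the DAPC, so the intervals are probably too narrow. The function name wilson_interval is hypothetical.

```python
import math

def wilson_interval(n_correct, n_total, z=1.96):
    """Wilson score interval for a proportion n_correct / n_total.

    z=1.96 corresponds to an approximate 95% interval.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    p = n_correct / n_total
    denom = 1 + z * z / n_total
    centre = (p + z * z / (2 * n_total)) / denom
    half = z * math.sqrt(p * (1 - p) / n_total
                         + z * z / (4 * n_total ** 2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, centre - half), min(1.0, centre + half)

# Made-up example: 80 of 100 individuals of a group correctly reassigned.
lo, hi = wilson_interval(80, 100)
print(round(lo, 4), round(hi, 4))
```

Two proportions pi,j and pi',j whose intervals overlap widely would then be hard to tell apart, with the caveat that the tissues share the same individuals and are therefore not independent samples.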
I started doing chi-squared tests on the posterior group sizes, but this
is not really relevant because the posterior groups are a mix of the
correct and the wrong assignments.
2- compare at the level of the individual the probabilities of assignment.
That is, create a table with those fields :
individual - priorgrp - post proba of assignment to prior grp - tissue
And then do something like a glm( post proba ~ priorgrp + tissue ).
I cannot do an anova because for one cluster and for one tissue the
proba doesn't have a normal distribution, so I assume it is better with
the generalized linear model.
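
The table with those four fields could be assembled as in the sketch below, in Python for illustration only. The name long_table, the field name post_proba and the toy posterior values are hypothetical; the posterior probabilities are assumed to be available per tissue as one group-to-probability mapping per individual.

```python
def long_table(individuals, prior, posterior_by_tissue):
    """Build one row per (individual, tissue).

    individuals: list of individual identifiers.
    prior: list of prior group labels, same order as `individuals`.
    posterior_by_tissue: {tissue: list of {group: posterior proba}},
        same individual order.
    """
    rows = []
    for tissue, posterior in posterior_by_tissue.items():
        for ind, grp, probs in zip(individuals, prior, posterior):
            rows.append({
                "individual": ind,
                "priorgrp": grp,
                # posterior probability of assignment to the prior group
                "post_proba": probs[grp],
                "tissue": tissue,
            })
    return rows

# Toy example: 2 individuals, 2 groups, 2 tissues (made-up values).
individuals = ["ind1", "ind2"]
prior = ["A", "B"]
posterior_by_tissue = {
    "tissue1": [{"A": 0.9, "B": 0.1}, {"A": 0.3, "B": 0.7}],
    "tissue2": [{"A": 0.6, "B": 0.4}, {"A": 0.5, "B": 0.5}],
}
rows = long_table(individuals, prior, posterior_by_tissue)
for row in rows:
    print(row)
```

Each individual appears once per tissue in this table, so the rows are repeated measures on the same individuals rather than independent observations.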
Or, use a manova: same as the glm, except that instead of taking only
the posterior proba of assignment to the prior grp, I take the vector of
proba of assignment to every group. For now I haven't clearly found the
conditions to apply a manova, so I am not sure if I can apply it with
the distribution I have.
How would you compare posterior probabilities of DAPC ?
I hope this is not too unclear.
Thank you in advance,
Guillaume
PS: I have not been able to find this information, but how are the
posterior probabilities of assignment established? By simulation or
analytically? If by simulation, how many iterations are performed?