[Traminer-users] pseudo-ANOVA
Claire Lemercier
Claire.Lemercier at ens.fr
Mon Jun 21 12:57:35 CEST 2010
Dear Matthias,
Many thanks for this. Just a follow-up about the pseudo-F and pseudo-R2
histograms (I am mostly just curious, as the important thing for me at
this stage is that the p-value is significant, and remains so if I
restrict the analysis to households with at least 3 members).
My results now look like this (after 5000 permutations, for all households):
Pseudo ANOVA table:
            SS   df       MSE
Exp   5700.474  469 12.154529
Res   3653.928  862  4.238896
Total 9354.402 1331  7.028101

Test values (p-values based on 4999 permutations):
   PseudoF  PseudoR2 PseudoF_Pval PseudoT PseudoT_Pval
  2.867381 0.6093894            0     Inf            0
However, when I look at the histograms, the values of PseudoF cluster
around 1 and those of PseudoR2 around 0.35. Why are they so different
from the PseudoF and PseudoR2 given in the general results?
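
A plausible reading, sketched here rather than stated as the definitive answer: the histograms show the statistics obtained under random permutation of the household labels, not the observed ones. If the within-group sum of squares is the sum of pairwise dissimilarities in a group divided by the group size (an assumption about the pseudo-ANOVA decomposition), then under random relabelling the expected PseudoR2 is df_Exp / df_Total and the PseudoF sits near 1. The base R sketch below checks this against the table above and against toy data; `pseudo_ss`, `pseudo_r2` and the simulated points are hypothetical helpers for illustration and are not part of TraMineR.

```r
# Values copied from the pseudo-ANOVA table above.
ss_exp <- 5700.474; ss_res <- 3653.928; ss_tot <- 9354.402
df_exp <- 469;      df_res <- 862;      df_tot <- 1331

# Observed statistics, recomputed from the table.
obs_r2 <- ss_exp / ss_tot                        # about 0.609
obs_f  <- (ss_exp / df_exp) / (ss_res / df_res)  # about 2.867

# Value around which PseudoR2 should cluster under random permutation
# (assumption: SS defined from pairwise dissimilarities, see lead-in).
null_r2 <- df_exp / df_tot                       # about 0.352

# --- Toy check on simulated data (hypothetical, not the voting data) ---

# Pseudo sum of squares of the units in idx: sum of pairwise
# dissimilarities divided by the number of units. d is a full symmetric
# matrix, so every pair is counted twice, hence the factor 2.
pseudo_ss <- function(d, idx) sum(d[idx, idx]) / (2 * length(idx))

# Pseudo-R2 of grouping g for dissimilarity matrix d.
pseudo_r2 <- function(d, g) {
  total    <- pseudo_ss(d, seq_len(nrow(d)))
  residual <- sum(sapply(split(seq_along(g), g),
                         function(idx) pseudo_ss(d, idx)))
  1 - residual / total
}

set.seed(1)
n <- 60; m <- 20                          # 60 units in 20 groups of 3
x <- matrix(rnorm(n * 2), ncol = 2)       # random points, no group effect
d <- as.matrix(dist(x))                   # Euclidean dissimilarities
g <- rep(seq_len(m), each = n / m)

# Distribution of PseudoR2 when group labels are shuffled at random.
perm_r2 <- replicate(2000, pseudo_r2(d, sample(g)))
c(simulated = mean(perm_r2), expected = (m - 1) / (n - 1))
```

On this reading, 469 / 1331 is roughly 0.35, which matches where the PseudoR2 histogram clusters, and the observed 0.609 lies far in the right tail of that distribution, consistent with the reported p-value.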
All the best,
Claire.
> Dear Claire,
>
> Your questions are indeed very interesting. There are some cases in which
> the Pseudo T statistic becomes infinite, for instance when some groups
> have 2 or fewer units, and I guess that this is your case. This means that
> you should not interpret the T statistic. You therefore cannot conclude
> whether the discrepancies (pseudo-variances) differ significantly
> between groups.
>
> The p-value is significant and that is what you should look at. The
> absolute value of the R2 may be difficult to interpret. It should be
> compared to the R2 values obtained by random permutation (to get an idea
> of whether 0.609 is high or low). These values are stored in the object
> returned by dissassoc. You can easily get a histogram using:
> da <- dissassoc(...)
> hist(da)
>
> The values are stored in da$perms$t[, 1] (for the PseudoF statistic) or
> da$perms$t[, 2] for Pseudo R2.
>
> The list of pseudo-variances for each household can be easily recovered.
> da$groups gives the size and pseudo-variance of each factor level (i.e.
> each household in your case). You just need to remove the last row (which
> gives the total n and pseudo-variance). This can be done with the
> following code:
> da$groups[-nrow(da$groups),]
>
> Regarding the high number of levels, I think that this is not a problem.
> The permutation test gives you the probability of obtaining, by random
> permutation alone, an R2 at least as high as yours. To be sure, I
> suggest using at least 5000 permutations:
> da <- dissassoc(..., R=5000)
>
> Hope this helps,
> Matthias Studer
>
> On 17.06.2010 09:37, Claire Lemercier wrote:
>
>> Hi all,
>> I am using the pseudo-ANOVA routine (dissassoc) of TraMineR; I think
>> that I understand it correctly when it deals with "classical"
>> categorical variables like sex, but I want to be sure that I make no
>> mistake in interpreting the results in a case where the variable has
>> hundreds of levels.
>> We have 1332 individual sequences clustered in 470 households and we
>> want to test if persons in the same household tend to vote similarly
>> at similar timepoints. We produced a distance through optimal matching
>> (with parameters giving an important weight to simultaneity) and
>> dissassoc gives these results for households:
>>
>> Pseudo ANOVA table:
>>             SS   df       MSE
>> Exp   5700.474  469 12.154529
>> Res   3653.928  862  4.238896
>> Total 9354.402 1331  7.028101
>>
>> Test values (p-values based on 999 permutation):
>>    PseudoF  PseudoR2 PseudoF_Pval PseudoT PseudoT_Pval
>>   2.867381 0.6093894            0     Inf            0
>>
>> Is the "Inf" for PseudoT a problem? Am I right in understanding that
>> this shows a very strong association and that, despite the high
>> number of categories, it is highly significant? I also wanted to
>> check whether household homogeneity could be concentrated
>> in only some of the households. With an overall pseudo-variance of 7,
>> a vast majority of even the largest households have an internal
>> pseudo-variance of less than 5. Am I correct to think that this
>> points in the right direction?
>> All the best,
>> Claire
>>
>>
>>
>> _______________________________________________
>> Traminer-users mailing list
>> Traminer-users at lists.r-forge.r-project.org
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/traminer-users