[Traminer-users] pseudo-ANOVA

Claire Lemercier Claire.Lemercier at ens.fr
Mon Jun 21 12:57:35 CEST 2010


Dear Matthias,
Many thanks for this. Just a follow-up about the pseudo-F and pseudo-R2 
histograms (I'm mostly just curious, as the important thing for me at 
this stage is that the p-value is significant, and remains so if I 
restrict the analysis to households with at least 3 members).
My results now look like this (after 5000 permutations, for all households):
Pseudo ANOVA table:
            SS   df       MSE
Exp   5700.474  469 12.154529
Res   3653.928  862  4.238896
Total 9354.402 1331  7.028101

Test values  (p-values based on 4999 permutations):
  PseudoF  PseudoR2 PseudoF_Pval PseudoT PseudoT_Pval
 2.867381 0.6093894            0     Inf            0
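
The two headline statistics can be recomputed from the table itself, which is a handy sanity check when reading the output. A minimal sketch in base R, assuming (as in a classical ANOVA) that the pseudo-F is the ratio of the explained to the residual mean square and that the pseudo-R2 is the explained share of the total sum of squares. The variable names are illustrative, not TraMineR's:

```r
# Values copied from the pseudo-ANOVA table above
ss_exp <- 5700.474; df_exp <- 469
ss_res <- 3653.928; df_res <- 862
ss_tot <- 9354.402

# Assumed definitions: ratio of mean squares, and explained share of total SS
pseudo_f  <- (ss_exp / df_exp) / (ss_res / df_res)
pseudo_r2 <- ss_exp / ss_tot

print(c(PseudoF = pseudo_f, PseudoR2 = pseudo_r2))
```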

However, when I look at the histograms, the values of PseudoF cluster 
around 1 and those of PseudoR2 around 0.35. Why are they so different 
from the PseudoF and PseudoR2 given in the general results?
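
One possible reading, offered as a sketch rather than a definitive answer: the histograms show the statistics obtained when the household labels are shuffled, so they describe what chance alone would produce, whereas the table reports the values for the real labels. With 470 groups among 1332 sequences, chance alone already accounts for roughly 469/1331, i.e. about 0.35, of the total, with a pseudo-F close to 1. The base R simulation below illustrates this with an ordinary one-way ANOVA on pure noise. The data, the group sizes and the helper name `r2_oneway` are illustrative assumptions, not the voting data and not TraMineR code.

```r
set.seed(1)
n <- 1332; k <- 470
# Hypothetical grouping: every group is non-empty, sizes otherwise arbitrary
g <- factor(c(seq_len(k), sample(seq_len(k), n - k, replace = TRUE)))
y <- rnorm(n)  # pure noise: no real group effect

# Share of the total sum of squares explained by the grouping
r2_oneway <- function(y, g) {
  group_means <- ave(y, g)  # each unit's own group mean
  sum((group_means - mean(y))^2) / sum((y - mean(y))^2)
}

# Permutation distribution: shuffle the labels and recompute the statistics
r2_perm <- replicate(500, r2_oneway(y, sample(g)))
f_perm  <- (r2_perm / (k - 1)) / ((1 - r2_perm) / (n - k))

print(c(mean_R2 = mean(r2_perm),
        expected_R2 = (k - 1) / (n - 1),
        mean_F = mean(f_perm)))
```

If this reading is right, the observed values (2.87 and 0.61) should be compared with the spread of the histograms, not with their centre.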
All the best,
Claire.
> Dear Claire,
>
> Your questions are indeed very interesting. There are some cases where 
> the Pseudo T statistic becomes infinite, for instance when some groups 
> have 2 or fewer units, and I guess that this is your case. This means that 
> you should not interpret the T statistic. Therefore, you cannot conclude 
> whether the discrepancies (pseudo-variances) differ significantly 
> between groups.
>
> The p-value is significant, and that is what you should look at. The 
> absolute value of the R2 may be difficult to interpret. It should be 
> compared to the R2 values obtained by random permutation (to get an idea 
> of whether 0.609 is high or low). These values are stored in the object 
> returned by dissassoc. You can easily get a histogram using:
> da <- dissassoc(...)
> hist(da)
>
> The values are stored in da$perms$t[, 1] (for the PseudoF statistic) or 
> da$perms$t[, 2] for Pseudo R2.
>
> The list of pseudo-variances for each household can be easily recovered. 
> da$groups provides the size and pseudo-variance of each factor level (i.e. 
> each household in your case). You should just remove the last line (which 
> gives the total n and pseudo-variance). This can be done with the 
> following code:
> da$groups[-nrow(da$groups),]
>
> Regarding the high number of levels, I think that this is not a problem. 
> The permutation test estimates the probability of obtaining an R2 at 
> least as high as yours by chance. To be sure, I suggest using at least 
> 5000 permutations:
> da <- dissassoc(..., R=5000)
>
> Hope this helps,
> Matthias Studer
>
> On 17.06.2010 09:37, Claire Lemercier wrote:
>   
>> Hi all,
>> I am using the pseudo-ANOVA routine (dissassoc) of TraMineR; I think 
>> that I understand it correctly when it deals with "classical" 
>> categorical variables like sex, but I want to be sure that I make no 
>> mistake in interpreting the results in a case where the variable has 
>> hundreds of levels.
>> We have 1332 individual sequences clustered in 470 households and we 
>> want to test whether persons in the same household tend to vote similarly 
>> at similar timepoints. We computed distances through optimal matching 
>> (with parameters giving a large weight to simultaneity) and 
>> dissassoc gives these results for households:
>>
>> Pseudo ANOVA table:
>>            SS   df       MSE
>> Exp   5700.474  469 12.154529
>> Res   3653.928  862  4.238896
>> Total 9354.402 1331  7.028101
>>
>> Test values  (p-values based on 999 permutation):
>>  PseudoF  PseudoR2 PseudoF_Pval PseudoT PseudoT_Pval
>> 2.867381 0.6093894            0     Inf            0
>>
>> Is the "Inf" for PseudoT a problem? Am I right in understanding that 
>> this shows a very strong association and that, despite the high 
>> number of categories, it is highly significant? I also wanted to 
>> control for the possibility that household homogeneity is concentrated 
>> in only some of the households. With an overall pseudo-variance of 7, 
>> a vast majority of even the largest households have an internal 
>> pseudo-variance of less than 5. Am I correct to think that this 
>> points in the right direction?
>> All the best,
>> Claire
>>
>>
>>
>> _______________________________________________
>> Traminer-users mailing list
>> Traminer-users at lists.r-forge.r-project.org
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/traminer-users 



