[Traminer-users] pseudo-ANOVA results

Thu Jun 17 09:37:14 CEST 2010

Hi all,
I am using the pseudo-ANOVA routine (dissassoc) of TraMineR; I think 
that I understand it correctly when it deals with "classical" 
categorical variables like sex, but I want to be sure that I make no 
mistake in interpreting the results in a case where the variable has 
hundreds of levels.
We have 1332 individual sequences clustered in 470 households and we 
want to test if persons in the same household tend to vote similarly at 
similar timepoints. We computed pairwise distances by optimal matching 
(with parameters that give a heavy weight to simultaneity), and 
dissassoc gives these results for households:

Pseudo ANOVA table:
            SS   df       MSE
Exp   5700.474  469 12.154529
Res   3653.928  862  4.238896
Total 9354.402 1331  7.028101

Test values  (p-values based on 999 permutation):
  PseudoF  PseudoR2 PseudoF_Pval PseudoT PseudoT_Pval
 2.867381 0.6093894            0     Inf            0
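
As a sanity check on how the table reads, here is a minimal Python sketch that recomputes the two summary statistics from the sums of squares. It assumes the usual ANOVA-style definitions (pseudo-F as the ratio of explained to residual mean squares, pseudo-R2 as the explained share of the total sum of squares). The function names are hypothetical and this is not TraMineR's own code; it only confirms that these definitions reproduce the reported values.

```python
# Values copied from the pseudo-ANOVA table above.
ss_exp, df_exp = 5700.474, 469    # explained (between households)
ss_res, df_res = 3653.928, 862    # residual (within households)
ss_tot, df_tot = 9354.402, 1331   # total


def pseudo_f(ss_exp, df_exp, ss_res, df_res):
    """Ratio of explained to residual mean squares (assumed definition)."""
    return (ss_exp / df_exp) / (ss_res / df_res)


def pseudo_r2(ss_exp, ss_tot):
    """Share of the total sum of squares explained by the grouping."""
    return ss_exp / ss_tot


f_value = pseudo_f(ss_exp, df_exp, ss_res, df_res)
r2_value = pseudo_r2(ss_exp, ss_tot)

print(f"pseudo-F  = {f_value:.4f}")
print(f"pseudo-R2 = {r2_value:.4f}")
```

The degrees of freedom are consistent with 470 households (469 = 470 - 1) and 1332 sequences (862 = 1332 - 470), and the explained and residual sums of squares add up to the total.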

Is the "Inf" for PseudoT a problem? Am I right in understanding that 
this shows a very strong association and that, despite the high number 
of categories, it is highly significant? I also wanted to check whether 
household homogeneity could be concentrated in only some of the 
households. With an overall pseudo-variance of about 7, a vast majority 
of even the largest households have an internal pseudo-variance of less 
than 5. Am I correct to think that this points in the right direction?
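
To make the per-household comparison concrete, here is a minimal sketch of how an internal pseudo-variance can be computed for each group from a distance matrix. It assumes the discrepancy definition that I believe TraMineR's dissvar uses, namely the sum of all pairwise distances within the group divided by 2n^2. The helper names and the toy data are hypothetical and are not our real distances.

```python
from collections import defaultdict


def discrepancy(dist, members):
    """Pseudo-variance of a group: sum of d[i][j] over all ordered pairs / (2 n^2)."""
    members = list(members)
    n = len(members)
    total = sum(dist[i][j] for i in members for j in members)
    return total / (2 * n * n)


def within_group_discrepancy(dist, groups):
    """Map each group label to the discrepancy among its own members."""
    by_group = defaultdict(list)
    for index, label in enumerate(groups):
        by_group[label].append(index)
    return {label: discrepancy(dist, idx) for label, idx in by_group.items()}


# Toy symmetric distance matrix for four persons (hypothetical values).
toy_dist = [
    [0, 2, 6, 6],
    [2, 0, 6, 6],
    [6, 6, 0, 4],
    [6, 6, 4, 0],
]
toy_households = ["A", "A", "B", "B"]

overall = discrepancy(toy_dist, range(len(toy_dist)))
per_household = within_group_discrepancy(toy_dist, toy_households)

print("overall:", overall)
for label, value in sorted(per_household.items()):
    print(label, value)
```

Under this definition a one-person household has a discrepancy of 0 by construction, so such households are best set aside when looking at the distribution of internal pseudo-variances.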
All the best,
Claire