[Traminer-users] dissmfac coefficients

Thu Aug 26 15:21:13 CEST 2010

Hello,

As in regression, moderate collinearity should not cause much trouble. 
With strong multicollinearity, however, you may run into computational 
problems (as in multiple regression).

Please consider the following example:

library(TraMineR)

## Defining a state sequence object
data(mvad)
mvad.seq <- seqdef(mvad[, 17:86])

## Building dissimilarities
mvad.lcs <- seqdist(mvad.seq, method="LCS")

## Multifactor discrepancy analysis (R = number of permutations)
print(dissmfac(mvad.lcs ~ Grammar + gcse5eq, data=mvad, R=1000))

The result will be:

   Variable  PseudoF   PseudoR2 p_value
1  Grammar 21.86726 0.02595961       0
2  gcse5eq 90.87825 0.10788564       0
3    Total 67.27347 0.15950143       0

The "Total" pseudo-R2 (last line) tells us that the full model 
"explains" around 16% of the discrepancy of the sequences.

The pseudo-R2 computed for each covariate can be interpreted as the loss 
of explanatory power if this covariate is removed from the full model. 
If we remove the gcse5eq covariate, the R2 of the model decreases by 0.108.
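
To make the "loss of explanatory power" reading concrete, here is a minimal sketch in base R. It is not the dissmfac code: the data are simulated, the names toy, pseudo_r2, r2_full and r2_drop_score are made up for the illustration, and I assume a pseudo-R2 defined as the share of the centered (Gower) matrix reproduced by the covariates. I use squared Euclidean distances so that this share matches the ordinary R2 logic.

```r
set.seed(1)
n <- 100

# Simulated covariates (hypothetical, not the mvad data)
toy <- data.frame(
  group = factor(rep(c("a", "b"), each = n / 2)),
  score = rnorm(n)
)

# Two numeric outcomes that depend on both covariates, plus noise
y <- cbind(as.numeric(toy$group) + rnorm(n), toy$score + rnorm(n))

# Squared Euclidean distances between the simulated cases
d <- as.matrix(dist(y))^2

# Assumed definition: pseudo-R2 = tr(H G) / tr(G), where G is the
# double-centered distance matrix and H the hat matrix of the covariates
pseudo_r2 <- function(formula, data, d) {
  n <- nrow(d)
  centering <- diag(n) - matrix(1 / n, n, n)
  gower <- -0.5 * centering %*% d %*% centering
  X <- model.matrix(formula, data)
  hat <- X %*% solve(crossprod(X), t(X))
  sum(diag(hat %*% gower)) / sum(diag(gower))
}

r2_full <- pseudo_r2(~ group + score, toy, d)
r2_without_score <- pseudo_r2(~ group, toy, d)

# Contribution of score: what the full model loses when score is removed
r2_drop_score <- r2_full - r2_without_score
print(c(full = r2_full, without_score = r2_without_score,
        drop = r2_drop_score))
```

The last value plays the same role as the PseudoR2 column reported for a single covariate: it is a difference between two nested models, not the R2 of a model containing that covariate alone.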

A more in-depth presentation (with formulae) can be found in section 5 
of this article: 
http://mephisto.unige.ch/pub/publications/gr/Studer_akdm_2010.pdf

Hence, in your case, the pseudo-R2 of 3.68E-02 for exper3 tells you that 
the R2 of your model will decrease by 3.68E-02 if you remove the 
exper3 covariate (but keep exper1, exper2, exper4, exper5...). exper7 
is significant, hence it adds valuable information to the full model 
(although less than exper1).
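
If you want to check how strongly the recoded exper covariates overlap, here is a minimal sketch in base R. The exper vector below is made up (a balanced variable with levels 1 to 8), not your data, and the threshold coding follows what you describe in your message; the names thresholds, X and design_rank are only for the illustration.

```r
# Hypothetical experience variable with levels 1 to 8 (not the real data)
exper <- rep(1:8, each = 25)

# Threshold recoding: exper1 = 1 vs 2-8, exper2 = 1-2 vs 3-8, ...,
# exper7 = 1-7 vs 8
thresholds <- sapply(1:7, function(k) as.integer(exper > k))
colnames(thresholds) <- paste0("exper", 1:7)

# Pairwise correlations between the recoded covariates
print(round(cor(thresholds), 2))

# Rank of the design matrix (intercept + 7 threshold dummies):
# full rank means the covariates are correlated but not redundant
X <- cbind(Intercept = 1, thresholds)
design_rank <- qr(X)$rank
print(design_rank == ncol(X))

# Condition number as a rough indicator of numerical trouble
print(kappa(X, exact = TRUE))
```

With this toy vector the design matrix has full rank, so the covariates are clearly correlated without being exact linear combinations of each other. The same check on your own data would tell you whether you are in the "strong multicollinearity" situation or not.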

Did I answer your question?

Matthias

On 24.08.2010 10:16, Alexandre Pollien wrote:
> Hello
>
> I have a question about the interpretation of the dissmfac function. I 
> hope it is not a stupid one and that it is understandable.
>
> Can the pseudo-F and pseudo-R2 be interpreted and used in the same way 
> as regression coefficients? In other words, should we pay attention 
> to problems of collinearity? Is this really a multifactorial system or 
> simply a series of independent coefficients?
> I ask these questions because I did an analysis with strongly 
> correlated variables, and the results do not seem bad:
> I would like to determine the breakpoint of an experience effect : 
> exper = 1 to 8 and I recode it like this:
>
> exper1 (=1 / 2->8)
> exper2 (= 1->2 / 3->8)
> exper3 (= 1->3 / 4->8)
> exper4 (= 1->4 / 5->8)
> etc. ...
>
> I suspect that the experience effect is very strong at first, then 
> gradually decreases. The results are:
>
> PseudoR2:
> 7.49E-02
> 4.57E-02
> 3.68E-02
> 9.54E-03
> 7.49E-03
> 7.44E-03
> 4.49E-03
>
> The passage from one unit of experience to two produces a change, as 
> do 2 to 3 and 3 to 4. But a break seems to occur at the 4th level of 
> experience. Having 4 or 5 (or 6, 7, etc.) units of experience seems 
> to make little difference (even if the 4th p-value = 0.000). Is that 
> right?
>
> Thanks a lot
>
> Alexandre