[Traminer-users] p-values for sequential association rules?
Gilbert Ritschard
Gilbert.Ritschard at unige.ch
Wed Mar 11 23:09:06 CET 2015
Hi Aron,
The p-value is that of the implication strength (ImplicStat). This criterion is a z value: the lower ImplicStat is (i.e., the more negative), the greater the implication strength of the rule. Your first two rules have an implication strength of about -1, and P(Z < -1) is about .16 for a standard normal distribution. Likewise, your 3rd rule has an implication strength of .08, and P(Z < .08) = .53. A rule has a significant implication strength when its p-value is less than .05.
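To make the arithmetic concrete, here is a minimal sketch that recomputes the p-values as the lower-tail standard normal probability of ImplicStat. It is written in Python with only the standard library; the helper name normal_cdf and the ALPHA constant are illustrative choices, not part of TraMineR. In R, pnorm() gives the same lower-tail probability.

```python
import math

def normal_cdf(z: float) -> float:
    """P(Z < z) for a standard normal variable, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# ImplicStat values copied from the output quoted further down in this thread
implic_stat = {
    "(I1) => (I2)": -0.9770084,
    "(I2) => (I1)": -0.9770084,
    "(I2) => (I3)": 0.0805823,
}

ALPHA = 0.05  # significance threshold mentioned above

p_values = {rule: normal_cdf(z) for rule, z in implic_stat.items()}
for rule, p in p_values.items():
    verdict = "significant" if p < ALPHA else "not significant"
    print(f"{rule}: p = {p:.4f} ({verdict})")
```

Under this reading, none of the three rules reaches the .05 threshold, which agrees with the p.value column of the output.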
For more explanation see for example,
Ritschard, G., V. Pisetta and D.A. Zighed (2008), Inducing and evaluating classification trees with statistical implicative criteria, in R. Gras, E. Suzuki, F. Guillet and F. Spagnolo (eds), Statistical Implicative Analysis: Theory and Applications, Series Studies in Computational Intelligence, Vol. 127. Berlin: Springer, 397-420.
Preprint: http://mephisto.unige.ch/pub/publications/gr/ritsch-pisetta-zighed_bookGras_final_plain.pdf
Best,
Gilbert
From: traminer-users-bounces at lists.r-forge.r-project.org [mailto:traminer-users-bounces at lists.r-forge.r-project.org] On Behalf Of Aron Lindberg
Sent: Wednesday, March 11, 2015 16:29
To: traminer-users at lists.r-forge.r-project.org
Subject: [Traminer-users] p-values for sequential association rules?
Hi,
When running sequential association rules I get several values in the output:
  Rules         Support  Conf       Lift       Standardlift  JMeasure    ImplicStat  p.value
1 (I1) => (I2)  15       0.7142857  0.7482993  -1.1607143    0.45894146  -0.9770084  0.1642825
2 (I2) => (I1)  17       0.8095238  0.8480726  -0.4404762    0.20127953  -0.9770084  0.1642825
3 (I2) => (I3)  17       0.8095238  0.9373434   0.5193644    0.01626898   0.0805823  0.5321129
From here I understand what support, confidence, and lift are:
http://stackoverflow.com/questions/27947556/traminerseqerules-help-page
However, what does the p-value mean? It seems to be strongly and negatively correlated with the confidence, but at the same time I have many sequences that combine high support, confidence, and lift and are still insignificant.
Hence: what does the p-value pertain to? Can the rules be meaningfully interpreted even when the p-value is insignificant?
Best,
Aron
--
Aron Lindberg
Doctoral Candidate, Information Systems
Weatherhead School of Management
Case Western Reserve University
aronlindberg.github.io