[Traminer-users] p-values for sequential association rules?

Gilbert Ritschard Gilbert.Ritschard at unige.ch
Wed Mar 11 23:09:06 CET 2015


Hi Aron,

The p-value is that of the implication strength (ImplicStat). This criterion is a z value: the lower the ImplicStat (the more negative it is), the greater the implication strength of the rule. Your first two rules have an implication strength of about -1, and P(Z < -1) is about .16 for a standard normal distribution. Likewise, the third rule has an implication strength of .08, and P(Z < .08) = .53. A rule has a significant implication strength when its p-value is less than .05.
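
To make the arithmetic concrete, here is a minimal sketch that recomputes the p.value column from the ImplicStat column as a lower-tail standard normal probability. It is written in Python with only the standard library and is not TraMineR code; the helper name lower_tail_p is made up for the illustration, and the ImplicStat values are copied from the output quoted below. In R, pnorm(z) returns the same lower-tail probability.

```python
from statistics import NormalDist

# ImplicStat values copied from the seqerules output quoted in this thread.
implic_stat = {
    "(I1) => (I2)": -0.9770084,
    "(I2) => (I1)": -0.9770084,
    "(I2) => (I3)": 0.0805823,
}

def lower_tail_p(z):
    """Return P(Z < z) for a standard normal variable Z (hypothetical helper)."""
    return NormalDist().cdf(z)

# p-value of each rule: lower-tail probability of its implication statistic.
p_values = {rule: lower_tail_p(z) for rule, z in implic_stat.items()}

# z value below which the p-value drops under .05 (about -1.645).
z_crit = NormalDist().inv_cdf(0.05)

for rule, p in p_values.items():
    z = implic_stat[rule]
    verdict = "significant" if z < z_crit else "not significant"
    print(f"{rule}: ImplicStat = {z:+.4f}, p-value = {p:.4f} ({verdict})")
```

Under these assumptions the computed values agree with the reported p.value column to about four decimals, and none of the three rules reaches the ImplicStat of roughly -1.645 needed for significance at the .05 level.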

For more explanation see for example,
Ritschard, G., V. Pisetta and D.A. Zighed (2008), Inducing and evaluating classification trees with statistical implicative criteria, in R. Gras, E. Suzuki, F. Guillet and F. Spagnolo (eds), Statistical Implicative Analysis: Theory and Applications, Series Studies in Computational Intelligence, Vol. 127. Berlin: Springer, 397-420.
Preprint: http://mephisto.unige.ch/pub/publications/gr/ritsch-pisetta-zighed_bookGras_final_plain.pdf

Best,
Gilbert

From: traminer-users-bounces at lists.r-forge.r-project.org [mailto:traminer-users-bounces at lists.r-forge.r-project.org] On Behalf Of Aron Lindberg
Sent: Wednesday, March 11, 2015 16:29
To: traminer-users at lists.r-forge.r-project.org
Subject: [Traminer-users] p-values for sequential association rules?

Hi,

When running sequential association rules I get several values in the output:

         Rules Support      Conf      Lift Standardlift   JMeasure ImplicStat   p.value
1 (I1) => (I2)      15 0.7142857 0.7482993   -1.1607143 0.45894146 -0.9770084 0.1642825
2 (I2) => (I1)      17 0.8095238 0.8480726   -0.4404762 0.20127953 -0.9770084 0.1642825
3 (I2) => (I3)      17 0.8095238 0.9373434    0.5193644 0.01626898  0.0805823 0.5321129

From here I understand what support, confidence, and lift are:
http://stackoverflow.com/questions/27947556/traminerseqerules-help-page

However, what does the p-value mean? It seems to be strongly and negatively correlated with the confidence, yet I have many rules that combine high support, confidence, and lift and are still insignificant.

Hence: what does the p-value pertain to? Can the rules be meaningfully interpreted even when the p-value is insignificant?

Best,
Aron

--
Aron Lindberg

Doctoral Candidate, Information Systems
Weatherhead School of Management
Case Western Reserve University
aronlindberg.github.io