[Traminer-users] p-values for sequential association rules?

Aron Lindberg aron.lindberg at case.edu
Sat Mar 14 01:21:24 CET 2015


Does that mean that, for a rule to be considered interesting, it needs both a lift above 1 and a p-value below 0.05? Or could a rule with a lift above 1 and a non-significant p-value still be of value?



-- 
Aron Lindberg
Doctoral Candidate, Information Systems
Weatherhead School of Management
Case Western Reserve University
aronlindberg.github.io

On Thu, Mar 12, 2015 at 12:42 AM, Gilbert Ritschard
<Gilbert.Ritschard at unige.ch> wrote:

> Lift and implication strength are not equivalent, and can well differ as you have observed.
> Lift measures whether the chance of observing the conclusion increases when the condition holds, while the implication strength measures whether the number of counter-examples decreases when the condition holds. A significant p-value means that the latter decrease is significant.
> Best.
> Gilbert
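To make the distinction in the reply above concrete, here is a minimal Python sketch that computes both measures from the same counts. It assumes the textbook definitions: lift as confidence divided by the relative frequency of the conclusion, and the basic Gras implication index (observed minus expected counter-examples, divided by the square root of the expected count, with no continuity correction). The function name `rule_measures` and all counts are hypothetical, and this is not TraMineR's code; in particular, how counter-examples are counted for sequence rules may differ.

```python
from math import sqrt
from statistics import NormalDist


def rule_measures(n, n_a, n_b, n_ab):
    """Return (lift, implication index, p-value) for a rule A => B.

    n    : total number of cases
    n_a  : cases where the condition A holds
    n_b  : cases where the conclusion B holds
    n_ab : cases where both A and B hold
    """
    confidence = n_ab / n_a
    lift = confidence / (n_b / n)

    counter_examples = n_a - n_ab        # A holds but B does not
    expected = n_a * (n - n_b) / n       # expected count under independence
    index = (counter_examples - expected) / sqrt(expected)

    # Lower tail: a strongly negative index means few counter-examples.
    p_value = NormalDist().cdf(index)
    return lift, index, p_value


# Hypothetical counts: same proportions, ten times more cases in `large`.
small = rule_measures(n=40, n_a=10, n_b=20, n_ab=7)
large = rule_measures(n=400, n_a=100, n_b=200, n_ab=70)

for label, (lift, index, p) in (("small", small), ("large", large)):
    print(f"{label}: lift = {lift:.2f}, index = {index:.3f}, p = {p:.4f}")
```

Under these assumptions both samples have the same lift (1.4), but only the larger one yields a p-value below 0.05. Lift is a ratio of proportions and does not change with sample size, whereas the implication index grows in magnitude as the counts grow.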
> From: traminer-users-bounces at lists.r-forge.r-project.org [mailto:traminer-users-bounces at lists.r-forge.r-project.org] On Behalf Of Aron Lindberg
> Sent: Thursday, March 12, 2015 02:18
> To: Users questions
> Cc: Users questions
> Subject: Re: [Traminer-users] p-values for sequential association rules?
> Thanks Gilbert,
> That’s very helpful! How do I reconcile this with the interpretation of the lift? For example, I have many rules where the lift is below 1 (so that, judged by lift alone, they are of no interest), but where the p-value is significant. I also have cases showing the reverse scenario: lift > 1, but non-significant p-values. Is it the case that the lift is similar to a coefficient with some error around it, so that large lift values are sometimes not significant?
> Best,
> Aron
> On Wed, Mar 11, 2015 at 3:09 PM, Gilbert Ritschard <Gilbert.Ritschard at unige.ch> wrote:
> Hi Aron,
> The p-value is that of the implication strength (ImplicStat). This criterion is a z value. The lower ImplicStat is (the more negative its value), the greater the implication strength of the rule. Your first two rules have an implication strength of about -1, and P(Z<-1) is about .16 for a standard normal distribution. Likewise, for the 3rd rule, you have an implication strength of .08, and P(Z<.08) = .53. When the p-value is less than .05, the rule has a significant implication strength.
> For more explanation see for example,
> Ritschard, G., V. Pisetta and D.A. Zighed (2008), Inducing and evaluating classification trees with statistical implicative criteria, in R. Gras, E. Suzuki, F. Guillet and F. Spagnolo (eds), Statistical Implicative Analysis: Theory and Applications, Series Studies in Computational Intelligence, Vol. 127. Berlin: Springer, 397-420.
> Preprint: http://mephisto.unige.ch/pub/publications/gr/ritsch-pisetta-zighed_bookGras_final_plain.pdf
> Best,
> Gilbert
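As a check on the explanation above, the following minimal Python sketch recomputes the p-values as lower-tail probabilities P(Z < ImplicStat) of a standard normal distribution. The ImplicStat values are copied from the output quoted in this thread; the function name `implication_p_value` is hypothetical, and the sketch only illustrates the relation described in the reply, not how TraMineR computes it internally.

```python
from statistics import NormalDist


def implication_p_value(implic_stat: float) -> float:
    """Lower-tail probability P(Z < implic_stat) for a standard normal Z."""
    return NormalDist().cdf(implic_stat)


# ImplicStat values copied from the rule output quoted in this thread.
implic_stats = {
    "(I1) => (I2)": -0.9770084,
    "(I2) => (I1)": -0.9770084,
    "(I2) => (I3)": 0.0805823,
}

p_values = {rule: implication_p_value(z) for rule, z in implic_stats.items()}

for rule, p in p_values.items():
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{rule}: p = {p:.7f} ({verdict} at the 0.05 level)")
```

The results should agree closely with the p.value column of the quoted output (about 0.164 for the first two rules and about 0.532 for the third).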
> From: traminer-users-bounces at lists.r-forge.r-project.org [mailto:traminer-users-bounces at lists.r-forge.r-project.org] On Behalf Of Aron Lindberg
> Sent: Wednesday, March 11, 2015 16:29
> To: traminer-users at lists.r-forge.r-project.org
> Subject: [Traminer-users] p-values for sequential association rules?
> Hi,
> When running sequential association rules I get several values in the output:
>          Rules Support      Conf      Lift Standardlift   JMeasure ImplicStat   p.value
> 1 (I1) => (I2)      15 0.7142857 0.7482993   -1.1607143 0.45894146 -0.9770084 0.1642825
> 2 (I2) => (I1)      17 0.8095238 0.8480726   -0.4404762 0.20127953 -0.9770084 0.1642825
> 3 (I2) => (I3)      17 0.8095238 0.9373434    0.5193644 0.01626898  0.0805823 0.5321129
> From here I understand what support, confidence, and lift are:
> http://stackoverflow.com/questions/27947556/traminerseqerules-help-page
> However, what does the p-value mean? It seems to be strongly negatively correlated with the confidence, but at the same time I have many rules that combine high support, confidence, and lift and are still not significant.
> Hence: what does the p-value pertain to? Can the rules be meaningfully interpreted even when the p-value is not significant?
> Best,
> Aron