[Traminer-users] p-values for sequential association rules?
Aron Lindberg
aron.lindberg at case.edu
Thu Mar 12 02:18:17 CET 2015
Thanks Gilbert,
That’s very helpful! How do I reconcile this with the interpretation of the lift? For example, I have many rules where the lift is below 1 (so, judged by lift alone, the rule is of no interest) but the p-value is significant. I also have the reverse case: lift > 1 but an insignificant p-value. Is the lift similar to a coefficient with some error around it, so that even large lift values are sometimes insignificant?
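As a quick numerical check (a minimal sketch in Python, not TraMineR's own code): the p-values in the output quoted below appear to be the lower-tail standard normal probability of ImplicStat, P(Z < ImplicStat), as Gilbert describes. The helper name `implic_p_value` and the `ALPHA` constant are my own, for illustration only; in R, `pnorm(z)` computes the same quantity.

```python
from statistics import NormalDist


def implic_p_value(implic_stat: float) -> float:
    """Lower-tail standard normal probability P(Z < implic_stat).

    Hypothetical helper: it only mirrors the description in this thread.
    """
    return NormalDist().cdf(implic_stat)


# (rule, lift, ImplicStat) copied from the output quoted below
rules = [
    ("(I1) => (I2)", 0.7482993, -0.9770084),
    ("(I2) => (I1)", 0.8480726, -0.9770084),
    ("(I2) => (I3)", 0.9373434, 0.0805823),
]

ALPHA = 0.05  # significance threshold mentioned in the thread

for rule, lift, z in rules:
    p = implic_p_value(z)
    # Lift and p-value are printed side by side because they are separate
    # columns of the output: lift is a ratio, the p-value comes from a z-statistic.
    print(f"{rule}: lift={lift:.3f} ImplicStat={z:+.3f} "
          f"p={p:.4f} significant={p < ALPHA}")
```

The computed values should agree with the p.value column to within rounding, and none of the three rules falls below the .05 threshold.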
Best,
Aron
--
Aron Lindberg
Doctoral Candidate, Information Systems
Weatherhead School of Management
Case Western Reserve University
aronlindberg.github.io
On Wed, Mar 11, 2015 at 3:09 PM, Gilbert Ritschard
<Gilbert.Ritschard at unige.ch> wrote:
> Hi Aron,
> The p-value is that of the implication strength (ImplicStat). This criterion is a z-value: the lower ImplicStat is (i.e., the more negative), the greater the implication strength of the rule. Your first two rules have an implication strength of about -1, and P(Z < -1) is about .16 for a standard normal distribution. Likewise, the 3rd rule has an implication strength of .08, and P(Z < .08) = .53. When the p-value is less than .05, the rule has a significant implication strength.
> For more explanation see for example,
> Ritschard, G., V. Pisetta and D.A. Zighed (2008), Inducing and evaluating classification trees with statistical implicative criteria, in R. Gras, E. Suzuki, F. Guillet and F. Spagnolo (eds), Statistical Implicative Analysis: Theory and Applications, Series Studies in Computational Intelligence, Vol. 127. Berlin: Springer, 397-420.
> Preprint: http://mephisto.unige.ch/pub/publications/gr/ritsch-pisetta-zighed_bookGras_final_plain.pdf
> Best,
> Gilbert
> From: traminer-users-bounces at lists.r-forge.r-project.org [mailto:traminer-users-bounces at lists.r-forge.r-project.org] On Behalf Of Aron Lindberg
> Sent: Wednesday, March 11, 2015 16:29
> To: traminer-users at lists.r-forge.r-project.org
> Subject: [Traminer-users] p-values for sequential association rules?
> Hi,
> When running sequential association rules I get several values in the output:
>   Rules          Support      Conf      Lift Standardlift   JMeasure ImplicStat   p.value
> 1 (I1) => (I2)        15 0.7142857 0.7482993   -1.1607143 0.45894146 -0.9770084 0.1642825
> 2 (I2) => (I1)        17 0.8095238 0.8480726   -0.4404762 0.20127953 -0.9770084 0.1642825
> 3 (I2) => (I3)        17 0.8095238 0.9373434    0.5193644 0.01626898  0.0805823 0.5321129
> From here I understand what support, confidence, and lift are:
> http://stackoverflow.com/questions/27947556/traminerseqerules-help-page
> However, what does the p-value mean? It seems to be strongly negatively correlated with the confidence, but at the same time I have many sequences with combinations of high support, confidence, and lift that are still insignificant.
> Hence: what does the p-value pertain to? Can the rules be meaningfully interpreted even when the p-value is insignificant?
> Best,
> Aron