[GenABEL-dev] P-values

Mon Jun 23 15:21:36 CEST 2014

Hi all,

I have been trying to figure out an efficient way to calculate p-values, and I think I have found a good compromise between speed and accuracy. 99% of the regressions will yield a non-significant p-value, that is to say p > 0.05 or, more conservatively, p > 0.1. It is easy to know a priori whether the p-value can be significant by looking at the t-score it originates from. For any t-score < 1.28 the (one-sided) p-value will not go below 0.1, for either a t-distribution or a normal distribution. In these cases (9X%) an approximation of the p-value with an error of 10^-4 to 10^-5 is enough. This calculation is efficient, as it only involves evaluating a quadratic polynomial.

For possibly significant p-values, with t-score > 1.28, a proper calculation of the p-value can be done. Note that a p-value calculation involves approximating the integral of the distribution used, either Student's t (n < 1000?) or the normal distribution.
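As a rough illustration of the two-tier idea, here is a minimal Python sketch. It assumes the normal distribution only (no Student's t), a one-sided upper-tail p-value, and the quadratic taken from the Wolfram Alpha link further down in this message. The function names and the `T_CUTOFF` constant are made up for the example and are not part of any GenABEL package.

```python
import math

T_CUTOFF = 1.28  # threshold quoted in the message (assumed one-sided)


def upper_tail_exact(z):
    """One-sided upper-tail p-value of the standard normal, 1 - Phi(z)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def upper_tail_quadratic(z):
    """Quadratic from the linked plot; only meant for 0 <= z <= T_CUTOFF."""
    return 0.5 - 0.1 * z * (4.4 - z)


def p_value(z):
    """Hypothetical dispatcher: cheap polynomial below the cutoff, erfc above."""
    z = abs(z)
    if z < T_CUTOFF:
        return upper_tail_quadratic(z)
    return upper_tail_exact(z)


if __name__ == "__main__":
    for z in (0.0, 0.5, 1.0, 1.5, 3.0):
        print(z, p_value(z))
```

Because the quadratic is decreasing on [0, 1.28] and stays above 0.1 there, the cheap path in this sketch never reports a p-value below 0.1, so a result that could be significant always goes through the exact branch.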

How are the different GenABEL packages handling this at the moment? This is a speedup that can be applied to any p-value calculation.

Plot of the error between 1 - ncdf(x) and the polynomial approximation, for x from 0 to 1.28:

http://www.wolframalpha.com/input/?i=y+%3D+%28%281%2F2+-+1%2F2*erf%28x%2Fsqrt%282%29%29%29+-+%281%2F2-%280.1*x*%284.4-x%29%29%29%29+from+0+to+1.28
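The expression plotted in that link is (1/2 - 1/2*erf(x/sqrt(2))) - (1/2 - 0.1*x*(4.4 - x)) on [0, 1.28]. For anyone who wants to reproduce the error without Wolfram Alpha, the following Python sketch scans the same interval on a grid. The grid size and the names are arbitrary choices for the example.

```python
import math


def tail_error(x):
    """Difference between the exact normal upper tail and the quadratic."""
    exact = 0.5 * math.erfc(x / math.sqrt(2.0))   # 1 - ncdf(x)
    approx = 0.5 - 0.1 * x * (4.4 - x)            # polynomial from the link
    return exact - approx


# Evaluate on 1001 evenly spaced points covering [0, 1.28].
xs = [i * 1.28 / 1000 for i in range(1001)]
max_abs_error = max(abs(tail_error(x)) for x in xs)

print(f"max |error| on [0, 1.28]: {max_abs_error:.2e}")
```

Evaluated this way, the largest deviation of this particular quadratic appears to be a few times 10^-3 (around x = 0.3), which is looser than the 10^-4 to 10^-5 mentioned above, so a refitted polynomial may be needed if that tolerance matters.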

-Alvaro