From yurii.aulchenko at gmail.com Fri Jun 20 11:03:18 2014
From: yurii.aulchenko at gmail.com (Yury Aulchenko)
Date: Fri, 20 Jun 2014 11:03:18 +0200
Subject: [GenABEL-dev] Fwd: Missing file for unit test in GenABEL
References: <20140620084443.GA17492@an3as.eu>
Message-ID:

FYI

Begin forwarded message:

> From: Andreas Tille
> Subject: Missing file for unit test in GenABEL
> Date: June 20, 2014 at 10:44:43 GMT+2
> To: Yurii Aulchenko, Debian Med Packaging Team
>
> Hi Yurii,
>
> I'm trying to update the Debian package of GenABEL. For some time now
> there has been an effort to automatically run the unit tests of software
> where available. Since I noticed that GenABEL comes with unit tests I
> tried
>
> $ make test
> export RCMDCHECK=FALSE;\
> cd ../../tests;\
> R --vanilla --slave < doRUnit.R
> /bin/sh: 2: cd: can't cd to ../../tests
> /bin/sh: 3: cannot open doRUnit.R: No such file
> make: *** [test] Error 2
>
> As you can see the file doRUnit.R is missing. It would be great if you
> could include this file in the source distribution to make sure we
> can reproduce your exact test procedure in the Debian package.
>
> Kind regards and thanks for providing GenABEL as Free Software
>
> Andreas.
>
> --
> http://fam-tille.de

From lennart at karssen.org Fri Jun 20 12:18:04 2014
From: lennart at karssen.org (L.C. Karssen)
Date: Fri, 20 Jun 2014 12:18:04 +0200
Subject: [GenABEL-dev] Why is doRUnit.R removed from the final package?
Message-ID: <53A40A5C.6000903@karssen.org>

Dear list,

I just noticed a commit in the Debian packaging system for the Debian
package of GenABEL (r-cran-genabel) [1]. The packager (Andreas Tille)
wrote in the log that a file is missing (tests/doRUnit.R). It turns out
that this file is removed in our makedistrib_GenABEL.sh script [2]. Does
anyone remember why this is done?

Thanks,

Lennart.
[1] http://anonscm.debian.org/viewvc/debian-med?view=revision&revision=17252
[2] https://r-forge.r-project.org/scm/viewvc.php/pkg/GenABEL-general/distrib_scripts/makedistrib_GenABEL.sh?view=markup&revision=1684&root=genabel

--
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
L.C. Karssen
Utrecht
The Netherlands

lennart at karssen.org
http://blog.karssen.org
GPG key ID: A88F554A
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-

From yurii.aulchenko at gmail.com Fri Jun 20 12:25:55 2014
From: yurii.aulchenko at gmail.com (Yury Aulchenko)
Date: Fri, 20 Jun 2014 12:25:55 +0200
Subject: [GenABEL-dev] Why is doRUnit.R removed from the final package?
In-Reply-To: <53A40A5C.6000903@karssen.org>
References: <53A40A5C.6000903@karssen.org>
Message-ID: <8F9D92B7-2F93-4587-B4DE-436299A5A897@gmail.com>

because CRAN requested NOT to include unit tests in the distrib - my
understanding was that it takes too long + tests are not stable; again,
for the latter my understanding was that it is not specific to GenABEL,
it is something general

Lennart, so you think we should address Andreas to our SVN?

Yurii

From lennart at karssen.org Fri Jun 20 14:13:32 2014
From: lennart at karssen.org (L.C. Karssen)
Date: Fri, 20 Jun 2014 14:13:32 +0200
Subject: [GenABEL-dev] Why is doRUnit.R removed from the final package?
In-Reply-To: <8F9D92B7-2F93-4587-B4DE-436299A5A897@gmail.com>
References: <53A40A5C.6000903@karssen.org> <8F9D92B7-2F93-4587-B4DE-436299A5A897@gmail.com>
Message-ID: <53A4256C.4070406@karssen.org>

Hi Yurii,

I see Andreas already contacted you before I noticed his commit.

On 20-06-14 12:25, Yury Aulchenko wrote:
> because CRAN requested NOT to include unit tests in the distrib - my
> understanding was that it takes too long + tests are not stable; again,
> for the latter my understanding was that it is not specific to GenABEL,
> it is something general

I quickly checked a few packages on CRAN and I'm not sure what the
results mean:
- Rcpp: has doRUnit.R
- MASS: has tests, but no doRUnit.R (maybe not using RUnit?)
- ggplot2: has tests, but no doRUnit.R (maybe not using RUnit?)
- Hmisc: has tests, but no doRUnit.R (maybe not using RUnit?)

As far as I can see the "Writing R Extensions" manual doesn't mention
anything about doRUnit.R specifically.

> Lennart, so you think we should address Andreas to our SVN?

I guess you meant "add" instead of "address"? If so, then no, that
wasn't my idea. Andreas is one of the main forces behind the Debian Med
team, which focusses on making Debian packages for Medical/Life Sciences
software. I don't think he is interested in participating in the
development of GenABEL specifically (but we could ask).

Lennart.

From lennart at karssen.org Mon Jun 23 14:32:54 2014
From: lennart at karssen.org (L.C. Karssen)
Date: Mon, 23 Jun 2014 14:32:54 +0200
Subject: [GenABEL-dev] [Genabel-commits] r1748 - in pkg/OmicABELnoMM: . src tests
In-Reply-To: <538482AF.2060507@karssen.org>
References: <20140527120853.AB60C1873C7@r-forge.r-project.org> <538482AF.2060507@karssen.org>
Message-ID: <53A81E76.7080306@karssen.org>

Hi Alvaro,

I just tried to build OmicABELnoMM and the file tests/Makefile.am is
missing from SVN.
It looks like you have forgotten to add tests/Makefile.am to SVN, right?
I must have overlooked the missing Makefile.am file when reviewing your
commit (see mail message below).

Thanks,

Lennart.

On 27-05-14 14:18, L.C. Karssen wrote:
> Hi Alvaro,
>
> On 27-05-14 14:08, noreply at r-forge.r-project.org wrote:
>> Author: afrank
>> Date: 2014-05-27 14:08:53 +0200 (Tue, 27 May 2014)
>> New Revision: 1748
>>
>> Modified:
>> pkg/OmicABELnoMM/Makefile.am
>> pkg/OmicABELnoMM/configure.ac
>> pkg/OmicABELnoMM/src/Algorithm.cpp
>> pkg/OmicABELnoMM/tests/Makefile
>> pkg/OmicABELnoMM/tests/test.cpp
>> Log:
>> Automake integration of tests now runs them using make check. Tests are also compiled along with the normal executable.
>
> That sounds good!
>
>> Modified: pkg/OmicABELnoMM/tests/Makefile
>> ===================================================================
>
> Now that you have a Makefile.am, the Makefile itself can be removed from
> SVN.
>
> Thanks a lot!
>
> Lennart.
>
>> To get the complete diff run:
>> svnlook diff /svnroot/genabel -r 1748

From alvaro.frank at rwth-aachen.de Mon Jun 23 14:52:42 2014
From: alvaro.frank at rwth-aachen.de (Frank, Alvaro Jesus)
Date: Mon, 23 Jun 2014 12:52:42 +0000
Subject: [GenABEL-dev] [Genabel-commits] r1748 - in pkg/OmicABELnoMM: . src tests
In-Reply-To: <53A81E76.7080306@karssen.org>
References: <20140527120853.AB60C1873C7@r-forge.r-project.org> <538482AF.2060507@karssen.org>,<53A81E76.7080306@karssen.org>
Message-ID: <244CF001646FF74FB34F372310A332C57BBD78@MBX5.rwth-ad.de>

Hi Lennart,

I will fix it. At the moment I am working on doing efficient
calculations of t-score, R^2 and p-value. These are as expensive as the
calculation of the regression slopes themselves, so the design phase has
been quite time consuming. I will have a prototype soon.

-Alvaro

From alvaro.frank at rwth-aachen.de Mon Jun 23 15:21:36 2014
From: alvaro.frank at rwth-aachen.de (Frank, Alvaro Jesus)
Date: Mon, 23 Jun 2014 13:21:36 +0000
Subject: [GenABEL-dev] P-values
Message-ID: <244CF001646FF74FB34F372310A332C57BBD8F@MBX5.rwth-ad.de>

Hi all,

I have been trying to figure out an efficient way to calculate p-values,
and it seems that I have found an efficient compromise between speed
and accuracy. 99% of the regressions will yield a non-significant
p-value, that is to say p > 0.05 or, even more conservatively, p > 0.1.
It is easy to know a priori whether the p-value can be significant by
looking at the t-score it originates from. For any t-score < 1.28 the
p-value will not go below 0.1 for a t-distribution or normal
distribution. In these cases (9X%) an approximation of the p-value with
an error of 10^-(4~5) is enough. This calculation is efficient,
involving only the evaluation of a quadratic polynomial. For possibly
significant p-values, with t-score > 1.28, a proper calculation of the
p-value can be done. Note that a p-value calculation involves
approximating the integral of the distribution used, either Student's t
(n<1000?) or the normal distribution.

How are the different GenABEL packages handling this at the moment? This
is a speedup that can be applied to any p-value calculation.
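
A minimal sketch of the thresholding idea described above, assuming the one-tailed normal case and the quadratic 1/2 - 0.1*x*(4.4 - x) that appears in the Wolfram Alpha link of this message. The function names, the use of |t| and the grid-based error scan are illustrative assumptions, not code from any GenABEL package. The scan prints the measured maximum absolute error so that the accuracy of the quadratic can be checked rather than taken on trust.

```python
import math

T_THRESHOLD = 1.28  # cut-off quoted in the message; kept as a parameter below


def exact_upper_tail(t):
    """One-tailed normal p-value, 1 - Phi(t), via the complementary error function."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))


def approx_upper_tail(t):
    """Quadratic taken from the linked plot: 1/2 - 0.1 * t * (4.4 - t)."""
    return 0.5 - 0.1 * t * (4.4 - t)


def p_value(t, threshold=T_THRESHOLD):
    """Cheap polynomial below the threshold, exact evaluation at or above it.

    Assumption: only the magnitude |t| of the score is used.
    """
    t = abs(t)
    return approx_upper_tail(t) if t < threshold else exact_upper_tail(t)


def max_abs_error(threshold=T_THRESHOLD, steps=1000):
    """Largest absolute gap between exact and quadratic on an even grid over [0, threshold]."""
    grid = (i * threshold / steps for i in range(steps + 1))
    return max(abs(exact_upper_tail(x) - approx_upper_tail(x)) for x in grid)


if __name__ == "__main__":
    print("max abs error on [0, 1.28]:", max_abs_error())
    print("p_value(0.5):", p_value(0.5))
    print("p_value(3.0):", p_value(3.0))
```

Keeping `threshold` as an argument leaves the cut-off adjustable without touching the two tail functions.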

Plotted error between 1-ncdf(x) and the polynomial approximation:

http://www.wolframalpha.com/input/?i=y+%3D+%28%281%2F2+-+1%2F2*erf%28x%2Fsqrt%282%29%29%29+-+%281%2F2-%280.1*x*%284.4-x%29%29%29%29+from+0+to+1.28

-Alvaro

From lennart at karssen.org Mon Jun 23 15:22:16 2014
From: lennart at karssen.org (L.C. Karssen)
Date: Mon, 23 Jun 2014 15:22:16 +0200
Subject: [GenABEL-dev] [Genabel-commits] r1748 - in pkg/OmicABELnoMM: . src tests
In-Reply-To: <244CF001646FF74FB34F372310A332C57BBD78@MBX5.rwth-ad.de>
References: <20140527120853.AB60C1873C7@r-forge.r-project.org> <538482AF.2060507@karssen.org>, <53A81E76.7080306@karssen.org> <244CF001646FF74FB34F372310A332C57BBD78@MBX5.rwth-ad.de>
Message-ID: <53A82A08.7030005@karssen.org>

Hi Alvaro,

On 23-06-14 14:52, Frank, Alvaro Jesus wrote:
> Hi Lennart,
>
> I will fix it.

Thanks!

> At the moment I am working on doing efficient calculations of t-score,
> R^2 and p-value. These are as expensive as the calculation of the
> regression slopes themselves, so the design phase has been quite time
> consuming. I will have a prototype soon.

Sounds cool!
By the way, I've been rather busy lately, but I've seen your messages
about big data files on this list. Since your mails are quite
substantial I need a longer stretch of time to think them over and
compose a meaningful reply. I intend to do so in the coming weeks.

Best,

Lennart.

From lennart at karssen.org Mon Jun 23 15:51:21 2014
From: lennart at karssen.org (L.C. Karssen)
Date: Mon, 23 Jun 2014 15:51:21 +0200
Subject: [GenABEL-dev] P-values
In-Reply-To: <244CF001646FF74FB34F372310A332C57BBD8F@MBX5.rwth-ad.de>
References: <244CF001646FF74FB34F372310A332C57BBD8F@MBX5.rwth-ad.de>
Message-ID: <53A830D9.8020008@karssen.org>

Hi Alvaro,

On 23-06-14 15:21, Frank, Alvaro Jesus wrote:
> Hi all,
>
> I have been trying to figure out an efficient way to calculate p-values,
> and it seems that I have found an efficient compromise between
> speed and accuracy.

That sounds cool. An (unnecessary?) word of caution though: I don't know
how much research you have done on this before starting, but make sure
to look at existing implementations of p-value calculation. Given that
in genetics p-values can become extremely small (e.g. 1e-100 or smaller)
numerical problems abound.
See also http://dx.doi.org/10.1016/j.csda.2008.11.028

> 99% of the regressions will yield a non-significant p-value,
> that is to say p > 0.05 or, even more conservatively, p > 0.1. It is easy
> to know a priori whether the p-value can be significant by looking at
> the t-score it originates from. For any t-score < 1.28 the p-value will
> not go below 0.1 for a t-distribution or normal distribution. In these
> cases (9X%) an approximation of the p-value with an error of 10^-(4~5) is

Do you mean an absolute or relative error of 1e-4 here?

> enough. This calculation is efficient, involving only the evaluation
> of a quadratic polynomial. For possibly significant p-values, with
> t-score > 1.28, a proper calculation of the p-value can be done.

I like the idea of setting a threshold for t, below which doing slow but
accurate calculations makes no sense. If that gives a considerable speed
up, go for it!
I would make the actual threshold a user-definable option, by the way.
There may be use cases where people want to have accurate p-values even
for non-significant cases.

> Note that a p-value calculation involves approximating the integral of
> the distribution used, either Student's t (n<1000?) or the normal
> distribution.

Be sure to check out existing implementations (e.g. in Boost) on how
integration is being handled. Somewhere at the back of my mind the term
'incomplete Gamma function' comes to mind.

> How are the different GenABEL packages handling this at the moment? This
> is a speedup that can be applied to any p-value calculation.

The R-based packages will most likely use the functions provided by R,
e.g. pchisq() for calculating the p-value of a chi^2 statistic.

In ProbABEL we are now implementing the calculation of p-values, but the
actual calculation is being left to the Boost libraries. Currently we
calculate the p-values irrespective of the value of the statistic. I'll
profile the code to see how much time this takes and whether it makes
sense to only calculate p-values for statistics with a value above a
certain threshold.

> Plotted error between 1-ncdf(x) and the polynomial approximation:
>
> http://www.wolframalpha.com/input/?i=y+%3D+%28%281%2F2+-+1%2F2*erf%28x%2Fsqrt%282%29%29%29+-+%281%2F2-%280.1*x*%284.4-x%29%29%29%29+from+0+to+1.28

Ah, that leads me to believe the answer to my question above is that you
mean an absolute error of ~1e-4, correct?

Best,

Lennart.

From alvaro.frank at rwth-aachen.de Mon Jun 23 16:02:21 2014
From: alvaro.frank at rwth-aachen.de (Frank, Alvaro Jesus)
Date: Mon, 23 Jun 2014 14:02:21 +0000
Subject: [GenABEL-dev] P-values
In-Reply-To: <53A830D9.8020008@karssen.org>
References: <244CF001646FF74FB34F372310A332C57BBD8F@MBX5.rwth-ad.de>, <53A830D9.8020008@karssen.org>
Message-ID: <244CF001646FF74FB34F372310A332C57BBDD8@MBX5.rwth-ad.de>

> That sounds cool. An (unnecessary?) word of caution though: I don't know
> how much research you have done on this before starting, but make sure
> to look at existing implementations of p-value calculation. Given that
> in genetics p-values can become extremely small (e.g. 1e-100 or smaller)
> numerical problems abound.
> See also http://dx.doi.org/10.1016/j.csda.2008.11.028
>
> Be sure to check out existing implementations (e.g. in Boost) on how
> integration is being handled. Somewhere at the back of my mind the term
> 'incomplete Gamma function' comes to mind.

Such extreme values are directly related to the t-score, and it starts
being a problem at around 20+. For those cases I am set to use
arbitrary-precision functions as in:

http://www.mpfr.org/mpfr-current/mpfr.html

Which leads me to the criticism that if people only report p-values as
their work, perhaps there is a cultural and scientific problem with the
workflow and conceptual basis. But if they want their 3.5*10^-300 I will
calculate it.

> I would make the actual threshold a user-definable option, by the way.
> There may be use cases where people want to have accurate p-values even
> for non-significant cases.

I was thinking the same. From local workflows here at Helmholtz I see
that any insignificant value gets ignored completely unless it comes to
doing Manhattan plots.
So another setting would be to not calculate or store anything at all
for non-significant t-scores, which is helpful during early discovery of
significant signals. For final calculations for plotting and reporting,
the setting can then be changed to enable those. This effectively
removes the "Big data" problem, since 99% of the data is meaningless for
determining whether the slope is significant.

> In ProbABEL we are now implementing the calculation of p-values, but the
> actual calculation is being left to the Boost libraries. Currently we
> calculate the p-values irrespective of the value of the statistic. I'll
> profile the code to see how much time this takes and whether it makes
> sense to only calculate p-values for statistics with a value above a
> certain threshold.

I was considering Boost; if your profiling is good, perhaps Boost is
doing the thresholding already. Let me know, so I can see whether I
should switch over to Boost too.

-A Frank

From alvaro.frank at rwth-aachen.de Mon Jun 23 16:38:43 2014
From: alvaro.frank at rwth-aachen.de (Frank, Alvaro Jesus)
Date: Mon, 23 Jun 2014 14:38:43 +0000
Subject: [GenABEL-dev] P-values
In-Reply-To: <244CF001646FF74FB34F372310A332C57BBDD8@MBX5.rwth-ad.de>
References: <244CF001646FF74FB34F372310A332C57BBD8F@MBX5.rwth-ad.de>, <53A830D9.8020008@karssen.org>, <244CF001646FF74FB34F372310A332C57BBDD8@MBX5.rwth-ad.de>
Message-ID: <244CF001646FF74FB34F372310A332C57BBDF2@MBX5.rwth-ad.de>

Sorry for the p-value spam, but...

On another note, it is important to note that a t-score is equivalent
(in the 2-tailed case or with a well-defined head/tail) to a p-value,
since there is a 1-1 correspondence between a t-score and the p-value
obtained after integrating the corresponding PDF. Reporting p<0.05 is as
meaningful/meaningless as reporting t-score > 1.64485 (normal CDF,
1-tailed test). Instead of reporting p<3524.1646*10^-300 one could just
report t-score > 50 too. This is a cultural issue of how the workflow
has always been and it really needs revisiting.
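
To illustrate the 1-1 correspondence, a small sketch assuming the normal approximation and a one-tailed test. It converts between a score and its p-value with Python's standard library, and reports log10(p) through the leading-order tail asymptotic 1 - Phi(z) ~ phi(z)/z for scores so large that the p-value itself underflows double precision. The function names are hypothetical, and the asymptotic is only a rough approximation that is reasonable for large z.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def score_for_p(p):
    """Score z whose one-tailed normal p-value equals p."""
    return STD_NORMAL.inv_cdf(1.0 - p)


def p_for_score(z):
    """One-tailed normal p-value, 1 - Phi(z); underflows to 0.0 for very large z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def log10_p_for_large_score(z):
    """log10 of the upper tail using 1 - Phi(z) ~ phi(z) / z (large z only)."""
    ln_p = -0.5 * z * z - math.log(z) - 0.5 * math.log(2.0 * math.pi)
    return ln_p / math.log(10.0)


if __name__ == "__main__":
    print("score for p = 0.05:", score_for_p(0.05))
    print("p for score 50 in double precision:", p_for_score(50.0))
    print("log10(p) for score 50:", log10_p_for_large_score(50.0))
```

The score stays an ordinary floating-point number in every case, while the p-value for a score of 50 cannot be represented as a double and has to be carried on a log scale or in arbitrary precision.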

As a side side note, since all that matters is whether a line is present
(correlation), what about proposing a less expensive method that looks
for this relation in very loose terms? Since the result really doesn't
matter beyond identifying a "signal" for further study, the whole GWAS
could be revisited to still give 'meaningful' results by identifying
slopes in a visual or similar manner. Just a thought.

A Frank

From lennart at karssen.org Mon Jun 23 17:38:28 2014
From: lennart at karssen.org (L.C. Karssen)
Date: Mon, 23 Jun 2014 17:38:28 +0200
Subject: [GenABEL-dev] P-values
In-Reply-To: <244CF001646FF74FB34F372310A332C57BBDF2@MBX5.rwth-ad.de>
References: <244CF001646FF74FB34F372310A332C57BBD8F@MBX5.rwth-ad.de>, <53A830D9.8020008@karssen.org>, <244CF001646FF74FB34F372310A332C57BBDD8@MBX5.rwth-ad.de> <244CF001646FF74FB34F372310A332C57BBDF2@MBX5.rwth-ad.de>
Message-ID: <53A849F4.5070903@karssen.org>

Hi Alvaro,

On 23-06-14 16:38, Frank, Alvaro Jesus wrote:
> Sorry for the p-value spam, but...

:-).

> On another note, it is important to note that a t-score is equivalent
> (in the 2-tailed case or with a well-defined head/tail) to a p-value,
> since there is a 1-1 correspondence between a t-score and the p-value
> obtained after integrating the corresponding PDF.

I completely agree.
However (and maybe superfluous), note that p-values can also be calculated for statistics other than the t-statistic. In that sense I can appreciate the fact that the p-value can be used to 'define' significance. Instead of remembering significance thresholds for the t-distribution, chi^2 distribution, Z-distribution, F-distribution and others, I can simply remember the p-value threshold. That said, you are not the only one concerned about the simple reporting of p-values or the taking of 0.05 as a strict threshold for significance. See for example this recent feature in Nature: http://dx.doi.org/10.1038/506150a Another very good read is the paper "Why most published research findings are false" by epidemiologist John Ioannidis, which puts the p-value debate in a broader context by looking at reproducibility. Back in 1998 Kenneth Rothman (a renowned epidemiologist) even discouraged the use of p-values in the journal "Epidemiology", of which he was the editor, see http://journals.lww.com/epidem/Citation/1998/01000/That_Confounded_P_Value_.4.aspx. It looks like he didn't succeed... > Reporting p<0.05 is as meaningful/meaningless as reporting t-score > > 1.64485 (Normal CDF 1 tail test) . > Instead of reporting the p<3524.1646*10^-300 one could just report > t-score > 50 too. Yup. And numerically that would be easier to do as well. > > This is a cultural issue of how the workflow has always been and it > really needs revisiting. > As a side side note, perhaps since all that matters is if a line is > present (correlation) what about proposing a less expensive method that > looks for this relation in very loose terms? Since the result really > doesn't matter beyond identifying a "signal" for further study, the > whole GWAS could be revisited to still give 'meaningful' results by > identifying slopes ina visual or similar manner. Just a thought. Not only that, but remember that a p-value is the result of some form of hypothesis testing.
So it's not only about whether a certain coefficient is different from zero (the usual hypothesis), but also whether the alternative hypothesis makes sense in the first place. While we're at it :-), here are few other things that should be changed in the culture of the field: - coefficients have physical units: these should be reported! So instead of writing "beta is 3" one should add units (e.g. km/h, mmol/l per copy of the A allele, kg/year, whatever). - number of significant digits that are reported (and doing so consistently throughout the paper). - version numbers of software used for a given analysis should be reported. A lot of these issues are addressed every once in a while, and until the change in culture has taken place, the least thing you can do is keep arguing the case(s) each time they come up (or when giving lectures, etc.). Best, Lennart. > > A Frank > > > > > ------------------------------------------------------------------------ > *From:* genabel-devel-bounces at lists.r-forge.r-project.org > [genabel-devel-bounces at lists.r-forge.r-project.org] on behalf of Frank, > Alvaro Jesus [alvaro.frank at rwth-aachen.de] > *Sent:* Monday, June 23, 2014 4:02 PM > *To:* L.C. Karssen; genabel-devel at lists.r-forge.r-project.org > *Subject:* Re: [GenABEL-dev] P-values > > That sounds cool. An (unnecessary?) word of caution though: I don't know > how much research you have done on this before starting, but make sure > to look at existing implementations of p-value calculation. Given that > in genetics p-values can become extremely small (e.g. 1e-100 or smaller) > numerical problems abound. > See also http://dx.doi.org/10.1016/j.csda.2008.11.028 > > Be sure to check out existing implementations (e.g. in Boost) on how > integration is being handled. Somewhere at the back of my mind the term > 'incomplete Gamma function' comes to mind. > > Such extreme values are directly related to the t-score, and it starts > being a problem at around 20+. 
> For those cases I am set to use arbitrary precise functions as in: > http://www.mpfr.org/mpfr-current/mpfr.html > > Which leads me to the critic that if people only report p-values as > their work, perhaps there is a cultural and scientific problem with the > workflow and conceptual basis. But if they want their 3.5*10^-300 I will > calculate it. > > I would make the actual threshold a user-definable option, by the way. > There may be use cases where people want to have accurate p-values even > for non-significant cases. > > I was thinking the same. From Local workflows here at Helmholtz I see > that any insignificant value gets ignored completely unless it comes to > doing Manhattan plots. So another setting would be to not calculate or > store at all anything for non significant t-scores, which is helpfull > during early discovery of significant signals. For Final calculations > for plotting and reporting, the setting can then be changed to enable > those. This effectivly removes the "Big data" problem, since 99% of the > data is meaningless to determining whether the slope is significant. > > In ProbABEL we are now implementing the calculation of p-values, but the > actual calculation is being left to the Boost libraries. Currently we > calculate the p-values irrespective of the value of the statistic. I'll > profile the code to see how much time this takes and whether it makes > sense to only calculate p-values for statistics with a value above a > certain threshold. > > I was considering boost, if your profiling is good, perhaps boost is > doing the threshold already. Let me know to see if I should switch over > to boost too. > > -A Frank > ________________________________________ > From: genabel-devel-bounces at lists.r-forge.r-project.org > [genabel-devel-bounces at lists.r-forge.r-project.org] on behalf of L.C. 
> Karssen [lennart at karssen.org] > Sent: Monday, June 23, 2014 3:51 PM > To: genabel-devel at lists.r-forge.r-project.org > Subject: Re: [GenABEL-dev] P-values > > Hi Alvaro, > > On 23-06-14 15:21, Frank, Alvaro Jesus wrote: >> Hi all, >> >> I have been trying to figure out an efficient way to calculate p-values, >> and it seems that I managed to come to an efficient compromise between >> speed an accuracy. > > That sounds cool. An (unnecessary?) word of caution though: I don't know > how much research you have done on this before starting, but make sure > to look at existing implementations of p-value calculation. Given that > in genetics p-values can become extremely small (e.g. 1e-100 or smaller) > numerical problems abound. > See also http://dx.doi.org/10.1016/j.csda.2008.11.028 > >> 99% of the regressions will yield a non significant p >> value, that is to say p > 0.05 or even conservatively 0.1 . It is easy >> to know apriori if the p-value is significant, by looking at the t-score >> where it originates from. For any t-score < 1.28 the pvalue will not go >> below 0.1 for a t-distribution or normal distribution. In this cases >> (9X%) an aproximation with an error of 10^-(4~5) of the p-value is > > Do you mean an absolute or relative error of 1e-4 here? > >> enough. This calculation is efficient involving only a quadratic >> polynomial to be approximated. For possible significant p-values, with >> t-score > 1.28 a proper calculation of the p-value can be done. > > I like the idea of setting a threshold for t, below which doing slow but > accurate calculations make no sense. If that gives a considerable speed > up, go for it! > I would make the actual threshold a user-definable option, by the way. > There may be use cases where people want to have accurate p-values even > for non-significant cases. > >> Note >> that a p-value calculation involves approximating the integral of the >> distribution used, either t-students (n<1000?) or normal distribution. 
> > Be sure to check out existing implementations (e.g. in Boost) on how > integration is being handled. Somewhere at the back of my mind the term > 'incomplete Gamma function' comes to mind. > >> >> How are the different genabel packages handling this at the moment? This >> is a speedup that can be applied to any p-value calculation. > > The R-based packages will most likely use the functions provided by R, > e.g. pchisq() for calculating the p-value of a chi^2 statistic. > In ProbABEL we are now implementing the calculation of p-values, but the > actual calculation is being left to the Boost libraries. Currently we > calculate the p-values irrespective of the value of the statistic. I'll > profile the code to see how much time this takes and whether it makes > sense to only calculate p-values for statistics with a value above a > certain threshold. > >> >> Plotted Error between 1-ncdf(x) and polynomial approx: >> >> > http://www.wolframalpha.com/input/?i=y+%3D+%28%281%2F2+-+1%2F2*erf%28x%2Fsqrt%282%29%29%29+-+%281%2F2-%280.1*x*%284.4-x%29%29%29%29+from+0+to+1.28 >> > > ah, that leads me to believe the answer to my question above is that you > mean an absolute error of ~1e-4, correct? > > > Best, > > Lennart. > >> >> -Alvaro >> >> >> _______________________________________________ >> genabel-devel mailing list >> genabel-devel at lists.r-forge.r-project.org >> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel >> > > -- > *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* > L.C. Karssen > Utrecht > The Netherlands > > lennart at karssen.org > http://blog.karssen.org > GPG key ID: A88F554A > -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*- > -- *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* L.C. Karssen Utrecht The Netherlands lennart at karssen.org http://blog.karssen.org GPG key ID: A88F554A -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*- -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 213 bytes Desc: OpenPGP digital signature URL: From lennart at karssen.org Mon Jun 23 18:39:35 2014 From: lennart at karssen.org (L.C. Karssen) Date: Mon, 23 Jun 2014 18:39:35 +0200 Subject: [GenABEL-dev] P-values In-Reply-To: <244CF001646FF74FB34F372310A332C57BBDD8@MBX5.rwth-ad.de> References: <244CF001646FF74FB34F372310A332C57BBD8F@MBX5.rwth-ad.de>, <53A830D9.8020008@karssen.org> <244CF001646FF74FB34F372310A332C57BBDD8@MBX5.rwth-ad.de> Message-ID: <53A85847.3050600@karssen.org> Hi Alvaro, On 23-06-14 16:02, Frank, Alvaro Jesus wrote: > That sounds cool. An (unnecessary?) word of caution though: I don't know > how much research you have done on this before starting, but make sure > to look at existing implementations of p-value calculation. Given that > in genetics p-values can become extremely small (e.g. 1e-100 or smaller) > numerical problems abound. > See also http://dx.doi.org/10.1016/j.csda.2008.11.028 > > Be sure to check out existing implementations (e.g. in Boost) on how > integration is being handled. Somewhere at the back of my mind the term > 'incomplete Gamma function' comes to mind. > > Such extreme values are directly related to the t-score, and it starts > being a problem at around 20+. > For those cases I am set to use arbitrary precise functions as in: > http://www.mpfr.org/mpfr-current/mpfr.html Thanks for the link. I'm not really familiar with arbitrary precision math, and this seems like a good place to start reading and playing. > > Which leads me to the critic that if people only report p-values as > their work, perhaps there is a cultural and scientific problem with the > workflow and conceptual basis. But if they want their 3.5*10^-300 I will > calculate it. For now, that's the way life is. ;-) > > I would make the actual threshold a user-definable option, by the way. > There may be use cases where people want to have accurate p-values even > for non-significant cases. 
> > I was thinking the same. From Local workflows here at Helmholtz I see > that any insignificant value gets ignored completely unless it comes to > doing Manhattan plots. That's mostly my experience as well. However, when it comes to meta-analysing data from different groups, the data (sometimes including p-values) need to be present for all SNPs. > So another setting would be to not calculate or > store at all anything for non significant t-scores, which is helpfull > during early discovery of significant signals. I think that such an option would be a good thing. > For Final calculations > for plotting and reporting, the setting can then be changed to enable > those. This effectivly removes the "Big data" problem, since 99% of the > data is meaningless to determining whether the slope is significant. I agree. > > In ProbABEL we are now implementing the calculation of p-values, but the > actual calculation is being left to the Boost libraries. Currently we > calculate the p-values irrespective of the value of the statistic. I'll > profile the code to see how much time this takes and whether it makes > sense to only calculate p-values for statistics with a value above a > certain threshold. > > I was considering boost, if your profiling is good, perhaps boost is > doing the threshold already. Let me know to see if I should switch over > to boost too. I'm not familiar with MPFR, but from a maintainability point of view it may be worth considering Boost since we're already using that in ProbABEL. Moreover, since Boost contains many other algorithms and is extensively used, I also assume it is well-tested. Another point to consider is that Boost is C++ whereas MPFR is C. Since most of the code we write is C++ it makes sense to go for Boost (again, from a maintainability point of view; from a performance PoV that may be different). As to the profiling: I'll keep you posted! Best, Lennart.
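The threshold idea discussed in this thread can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not code from ProbABEL or any other GenABEL package: all function names are hypothetical, a standard normal distribution and a one-sided test are assumed, and std::erfc stands in for whatever Boost or MPFR routine would be used in practice. The quadratic is the one from the Wolfram Alpha link in the original proposal. Evaluating it by hand at z = 0.3 gives 0.377 against an exact tail of about 0.382, so its absolute error may be nearer 5e-3 than 1e-4 in parts of the interval; that seems worth checking before relying on it.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// All names below are hypothetical and chosen for illustration only.

// Upper-tail probability P(Z > z) of the standard normal distribution.
// Using erfc directly avoids the cancellation in 1 - cdf(z) for large z.
double upper_tail_exact(double z) {
    return 0.5 * std::erfc(z / std::sqrt(2.0));
}

// Quadratic approximation from the thread: 1/2 - 0.1 * z * (4.4 - z).
// Only intended for 0 <= z < 1.28.
double upper_tail_quadratic(double z) {
    return 0.5 - 0.1 * z * (4.4 - z);
}

// One-sided p-value with a user-definable threshold: statistics below the
// threshold get the cheap approximation, everything else the exact tail.
// A threshold of 0 switches the approximation off.
double p_value(double z, double threshold = 1.28) {
    assert(threshold >= 0.0);
    if (z >= 0.0 && z < threshold) {
        return upper_tail_quadratic(z);
    }
    return upper_tail_exact(z);
}

// Largest absolute difference between the two functions on [0, threshold],
// sampled on an evenly spaced grid of steps + 1 points.
double max_abs_error(double threshold, int steps) {
    double worst = 0.0;
    for (int i = 0; i <= steps; ++i) {
        const double z = threshold * i / steps;
        const double diff = std::fabs(upper_tail_quadratic(z) - upper_tail_exact(z));
        worst = std::max(worst, diff);
    }
    return worst;
}
```

Calling p_value(z) uses the default threshold of 1.28, while p_value(z, 0.0) always takes the exact branch, which covers the case where accurate p-values are wanted for non-significant results as well. max_abs_error(1.28, 1000) gives a quick numerical check of the approximation error on the interval where it is used.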
> > -A Frank > ________________________________________ > From: genabel-devel-bounces at lists.r-forge.r-project.org > [genabel-devel-bounces at lists.r-forge.r-project.org] on behalf of L.C. > Karssen [lennart at karssen.org] > Sent: Monday, June 23, 2014 3:51 PM > To: genabel-devel at lists.r-forge.r-project.org > Subject: Re: [GenABEL-dev] P-values > > Hi Alvaro, > > On 23-06-14 15:21, Frank, Alvaro Jesus wrote: >> Hi all, >> >> I have been trying to figure out an efficient way to calculate p-values, >> and it seems that I managed to come to an efficient compromise between >> speed an accuracy. > > That sounds cool. An (unnecessary?) word of caution though: I don't know > how much research you have done on this before starting, but make sure > to look at existing implementations of p-value calculation. Given that > in genetics p-values can become extremely small (e.g. 1e-100 or smaller) > numerical problems abound. > See also http://dx.doi.org/10.1016/j.csda.2008.11.028 > >> 99% of the regressions will yield a non significant p >> value, that is to say p > 0.05 or even conservatively 0.1 . It is easy >> to know apriori if the p-value is significant, by looking at the t-score >> where it originates from. For any t-score < 1.28 the pvalue will not go >> below 0.1 for a t-distribution or normal distribution. In this cases >> (9X%) an aproximation with an error of 10^-(4~5) of the p-value is > > Do you mean an absolute or relative error of 1e-4 here? > >> enough. This calculation is efficient involving only a quadratic >> polynomial to be approximated. For possible significant p-values, with >> t-score > 1.28 a proper calculation of the p-value can be done. > > I like the idea of setting a threshold for t, below which doing slow but > accurate calculations make no sense. If that gives a considerable speed > up, go for it! > I would make the actual threshold a user-definable option, by the way. 
> There may be use cases where people want to have accurate p-values even > for non-significant cases. > >> Note >> that a p-value calculation involves approximating the integral of the >> distribution used, either t-students (n<1000?) or normal distribution. > > Be sure to check out existing implementations (e.g. in Boost) on how > integration is being handled. Somewhere at the back of my mind the term > 'incomplete Gamma function' comes to mind. > >> >> How are the different genabel packages handling this at the moment? This >> is a speedup that can be applied to any p-value calculation. > > The R-based packages will most likely use the functions provided by R, > e.g. pchisq() for calculating the p-value of a chi^2 statistic. > In ProbABEL we are now implementing the calculation of p-values, but the > actual calculation is being left to the Boost libraries. Currently we > calculate the p-values irrespective of the value of the statistic. I'll > profile the code to see how much time this takes and whether it makes > sense to only calculate p-values for statistics with a value above a > certain threshold. > >> >> Plotted Error between 1-ncdf(x) and polynomial approx: >> >> > http://www.wolframalpha.com/input/?i=y+%3D+%28%281%2F2+-+1%2F2*erf%28x%2Fsqrt%282%29%29%29+-+%281%2F2-%280.1*x*%284.4-x%29%29%29%29+from+0+to+1.28 >> > > ah, that leads me to believe the answer to my question above is that you > mean an absolute error of ~1e-4, correct? > > > Best, > > Lennart. > >> >> -Alvaro >> >> >> _______________________________________________ >> genabel-devel mailing list >> genabel-devel at lists.r-forge.r-project.org >> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel >> > > -- > *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* > L.C. 
Karssen > Utrecht > The Netherlands > > lennart at karssen.org > http://blog.karssen.org > GPG key ID: A88F554A > -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*- > -- *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* L.C. Karssen Utrecht The Netherlands lennart at karssen.org http://blog.karssen.org GPG key ID: A88F554A -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*- -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 213 bytes Desc: OpenPGP digital signature URL: From alvaro.frank at rwth-aachen.de Tue Jun 24 11:51:06 2014 From: alvaro.frank at rwth-aachen.de (Frank, Alvaro Jesus) Date: Tue, 24 Jun 2014 09:51:06 +0000 Subject: [GenABEL-dev] automake boost Message-ID: <244CF001646FF74FB34F372310A332C57BC165@MBX5.rwth-ad.de> Hi Lennart, I am, having trouble with automake recognizing boost root path. I used AX_BOOST_BASE but the root path (/usr/local/) is still not being recognized. and the path $Boost_root is still not helping me, perhaps I am setting it wrong. Do you also compile it all or keep the macros as they are? Thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: From lennart at karssen.org Tue Jun 24 12:02:14 2014 From: lennart at karssen.org (L.C. Karssen) Date: Tue, 24 Jun 2014 12:02:14 +0200 Subject: [GenABEL-dev] automake boost In-Reply-To: <244CF001646FF74FB34F372310A332C57BC165@MBX5.rwth-ad.de> References: <244CF001646FF74FB34F372310A332C57BC165@MBX5.rwth-ad.de> Message-ID: <53A94CA6.7030100@karssen.org> Hi Alvaro, On 24-06-14 11:51, Frank, Alvaro Jesus wrote: > Hi Lennart, > > I am, having trouble with automake recognizing boost root path. > I used AX_BOOST_BASE but the root path (/usr/local/) is still not being > recognized. and the path $Boost_root is still not helping me, perhaps I > am setting it wrong. > Do you also compile it all or keep the macros as they are? 
I use the Boost package provided by Ubuntu, so I haven't compiled them myself. In that case the libraries are installed in /usr/include/boost/.

In ProbABEL's configure.ac I test for boost as follows (see the branch at https://r-forge.r-project.org/scm/viewvc.php/branches/ProbABEL-pvals/ProbABEL/?root=genabel):
----------------------------------------
if test "x$with_boost_math" != "xno"; then
    AC_MSG_NOTICE([building using the Boost Math library enabled])
    AC_ARG_WITH([boost-include-path],
        [AS_HELP_STRING([--with-boost-include-path],
            [location of the Boost headers, defaults to /usr/include/boost])],
        [CXXFLAGS+=" -I${withval}"
         CPPFLAGS+=" -I${withval}"],
        [CXXFLAGS+=' -I/usr/include/boost'
         CPPFLAGS+=' -I/usr/include/boost'])

    # Check for the Boost Math header files
    AC_CHECK_HEADERS([boost/math/distributions.hpp])
    if test x$ac_cv_header_boost_math_distributions_hpp = xno; then
        AC_MSG_ERROR([Could not find the Boost Math header files. Did \
you specify --with-boost-include-path correctly? Or use --without-boost \
to disable the calculation of p-values.])
    fi
else
    AC_MSG_NOTICE([not using the Boost Math libraries, so no p-values in the \
output])
fi
AM_CONDITIONAL([WITH_BOOST_MATH], test "x$with_boost_math" != "xno")
----------------------------------------

This creates a variable WITH_BOOST_MATH that can be used in AutoMake's Makefile.am files:
----------------------------------------
if WITH_BOOST_MATH
palinear_CXXFLAGS += -DWITH_BOOST_MATH
endif
----------------------------------------

And then in the .cpp files, for example:
----------------------------------------
#if WITH_BOOST_MATH
    std::vector pval;
#endif
----------------------------------------

Hope that helps.

Lennart.

> > Thanks! > > > _______________________________________________ > genabel-devel mailing list > genabel-devel at lists.r-forge.r-project.org > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel >
--
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
L.C.
Karssen Utrecht The Netherlands lennart at karssen.org http://blog.karssen.org GPG key ID: A88F554A -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*- -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 213 bytes Desc: OpenPGP digital signature URL: From alvaro.frank at rwth-aachen.de Tue Jun 24 13:19:23 2014 From: alvaro.frank at rwth-aachen.de (Frank, Alvaro Jesus) Date: Tue, 24 Jun 2014 11:19:23 +0000 Subject: [GenABEL-dev] automake boost In-Reply-To: <53A94CA6.7030100@karssen.org> References: <244CF001646FF74FB34F372310A332C57BC165@MBX5.rwth-ad.de>, <53A94CA6.7030100@karssen.org> Message-ID: <244CF001646FF74FB34F372310A332C57BC18E@MBX5.rwth-ad.de> Thanks for the info, I tried my version (which fails to recognize boost at all) and yours (for which configure doesn't recognize boost either, but I can still get the compiler to include the files) and I get the following error (lots of similar ones after this): /usr/include/boost/iterator/iterator_facade.hpp:43:19: error: expected identifier before '(' token template class iterator_facade; by just including once the following: #include Any Ideas? ________________________________________ From: genabel-devel-bounces at lists.r-forge.r-project.org [genabel-devel-bounces at lists.r-forge.r-project.org] on behalf of L.C. Karssen [lennart at karssen.org] Sent: Tuesday, June 24, 2014 12:02 PM To: genabel-devel at lists.r-forge.r-project.org Subject: Re: [GenABEL-dev] automake boost Hi Alvaro, On 24-06-14 11:51, Frank, Alvaro Jesus wrote: > Hi Lennart, > > I am, having trouble with automake recognizing boost root path. > I used AX_BOOST_BASE but the root path (/usr/local/) is still not being > recognized. and the path $Boost_root is still not helping me, perhaps I > am setting it wrong. > Do you also compile it all or keep the macros as they are? I use the Boost package provided by Ubuntu, so I haven't compiled them myself.
In that case the libraries are installed in /usr/include/boost/. In ProbABEL's configure.ac I test for boost as follows (see the branch at https://r-forge.r-project.org/scm/viewvc.php/branches/ProbABEL-pvals/ProbABEL/?root=genabel): ---------------------------------------- if test "x$with_boost_math" != "xno"; then AC_MSG_NOTICE([building using the Boost Math library enabled]) AC_ARG_WITH([boost-include-path], [AS_HELP_STRING([--with-boost-include-path], [location of the Boost headers, defaults to /usr/include/boost])], [CXXFLAGS+=" -I${withval}" CPPFLAGS+=" -I${withval}"], [CXXFLAGS+=' -I/usr/include/boost' CPPFLAGS+=' -I/usr/include/boost']) # Check for the Boost Math header files AC_CHECK_HEADERS([boost/math/distributions.hpp]) if test x$ac_cv_header_boost_math_distributions_hpp = xno; then AC_MSG_ERROR([Could not find the Boost Math header files. Did \ you specify --with-boost-include-path correctly? Or use --without-boost \ to disable the calculation of p-values.]) fi else AC_MSG_NOTICE([not using the Boost Math libraries, so no p-values in the \ output]) fi AM_CONDITIONAL([WITH_BOOST_MATH], test "x$with_boost_math" != "xno") ---------------------------------------- This creates a variable WITH_BOOST_MATH that can be used in AutoMake's Makefile.am files: ---------------------------------------- if WITH_BOOST_MATH palinear_CXXFLAGS += -DWITH_BOOST_MATH endif ---------------------------------------- And then in the .cpp files, for example: ---------------------------------------- #if WITH_BOOST_MATH std::vector pval; #endif ---------------------------------------- Hope that helps. Lennart. > > Thanks! > > > _______________________________________________ > genabel-devel mailing list > genabel-devel at lists.r-forge.r-project.org > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel > -- *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* L.C. 
Karssen Utrecht The Netherlands lennart at karssen.org http://blog.karssen.org GPG key ID: A88F554A -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*- From alvaro.frank at rwth-aachen.de Tue Jun 24 14:11:24 2014 From: alvaro.frank at rwth-aachen.de (Frank, Alvaro Jesus) Date: Tue, 24 Jun 2014 12:11:24 +0000 Subject: [GenABEL-dev] automake boost In-Reply-To: <244CF001646FF74FB34F372310A332C57BC18E@MBX5.rwth-ad.de> References: <244CF001646FF74FB34F372310A332C57BC165@MBX5.rwth-ad.de>, <53A94CA6.7030100@karssen.org>, <244CF001646FF74FB34F372310A332C57BC18E@MBX5.rwth-ad.de> Message-ID: <244CF001646FF74FB34F372310A332C57BC1AB@MBX5.rwth-ad.de> I am truly stuck with this. The library installs correctly but including it just breaks the compile. ________________________________________ From: genabel-devel-bounces at lists.r-forge.r-project.org [genabel-devel-bounces at lists.r-forge.r-project.org] on behalf of Frank, Alvaro Jesus [alvaro.frank at rwth-aachen.de] Sent: Tuesday, June 24, 2014 1:19 PM To: L.C. Karssen; genabel-devel at lists.r-forge.r-project.org Subject: Re: [GenABEL-dev] automake boost Thanks for the info, I tried my version (which fails to recognize boost at all) and yours (for which configure doesn't recognize boost either, but I can still get the compiler to include the files) and I get the following error (lots of similar ones after this): /usr/include/boost/iterator/iterator_facade.hpp:43:19: error: expected identifier before '(' token template class iterator_facade; by just including once the following: #include Any Ideas? ________________________________________ From: genabel-devel-bounces at lists.r-forge.r-project.org [genabel-devel-bounces at lists.r-forge.r-project.org] on behalf of L.C.
Karssen [lennart at karssen.org] Sent: Tuesday, June 24, 2014 12:02 PM To: genabel-devel at lists.r-forge.r-project.org Subject: Re: [GenABEL-dev] automake boost Hi Alvaro, On 24-06-14 11:51, Frank, Alvaro Jesus wrote: > Hi Lennart, > > I am, having trouble with automake recognizing boost root path. > I used AX_BOOST_BASE but the root path (/usr/local/) is still not being > recognized. and the path $Boost_root is still not helping me, perhaps I > am setting it wrong. > Do you also compile it all or keep the macros as they are? I use the Boost package provided by Ubuntu, so I haven't compiled them myself. In that case the libraries are installed in /usr/include/boost/. In ProbABEL's configure.ac I test for boost as follows (see the branch at https://r-forge.r-project.org/scm/viewvc.php/branches/ProbABEL-pvals/ProbABEL/?root=genabel): ---------------------------------------- if test "x$with_boost_math" != "xno"; then AC_MSG_NOTICE([building using the Boost Math library enabled]) AC_ARG_WITH([boost-include-path], [AS_HELP_STRING([--with-boost-include-path], [location of the Boost headers, defaults to /usr/include/boost])], [CXXFLAGS+=" -I${withval}" CPPFLAGS+=" -I${withval}"], [CXXFLAGS+=' -I/usr/include/boost' CPPFLAGS+=' -I/usr/include/boost']) # Check for the Boost Math header files AC_CHECK_HEADERS([boost/math/distributions.hpp]) if test x$ac_cv_header_boost_math_distributions_hpp = xno; then AC_MSG_ERROR([Could not find the Boost Math header files. Did \ you specify --with-boost-include-path correctly? 
Or use --without-boost \ to disable the calculation of p-values.]) fi else AC_MSG_NOTICE([not using the Boost Math libraries, so no p-values in the \ output]) fi AM_CONDITIONAL([WITH_BOOST_MATH], test "x$with_boost_math" != "xno") ---------------------------------------- This creates a variable WITH_BOOST_MATH that can be used in AutoMake's Makefile.am files: ---------------------------------------- if WITH_BOOST_MATH palinear_CXXFLAGS += -DWITH_BOOST_MATH endif ---------------------------------------- And then in the .cpp files, for example: ---------------------------------------- #if WITH_BOOST_MATH std::vector pval; #endif ---------------------------------------- Hope that helps. Lennart. > > Thanks! > > > _______________________________________________ > genabel-devel mailing list > genabel-devel at lists.r-forge.r-project.org > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel > -- *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* L.C. Karssen Utrecht The Netherlands lennart at karssen.org http://blog.karssen.org GPG key ID: A88F554A -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*- _______________________________________________ genabel-devel mailing list genabel-devel at lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel From lennart at karssen.org Tue Jun 24 14:21:57 2014 From: lennart at karssen.org (L.C. Karssen) Date: Tue, 24 Jun 2014 14:21:57 +0200 Subject: [GenABEL-dev] automake boost In-Reply-To: <244CF001646FF74FB34F372310A332C57BC18E@MBX5.rwth-ad.de> References: <244CF001646FF74FB34F372310A332C57BC165@MBX5.rwth-ad.de>, <53A94CA6.7030100@karssen.org> <244CF001646FF74FB34F372310A332C57BC18E@MBX5.rwth-ad.de> Message-ID: <53A96D65.9080604@karssen.org> Hi Alvaro, So if I understand you correctly this is the situation: - When using my version autoconf doesn't find boost. Correct? - But nevertheless if you try to include the erf.hpp file (and manually (?) 
instruct the compiler where to find the header files), you get the
error below.

How do you compile your code? Manually invoking g++ or using a Makefile
generated by automake?

Could you provide me with a minimum working example (actually a minimum
breaking example) of configure.ac, Makefile.am and main.cpp to test?


Lennart.

On 24-06-14 13:19, Frank, Alvaro Jesus wrote:
> Thanks for the info,
>
> I tried my version (which fails to recognize boost at all) and yours (with which configure doesn't recognize boost, but I can still get the compiler to include the files), and I get the following error (lots of similar ones after this):
>
> /usr/include/boost/iterator/iterator_facade.hpp:43:19: error: expected identifier before ‘(’ token
> template class iterator_facade;
>
> by just including once the following:
>
> #include
>
> Any ideas?

--
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
L.C. Karssen
Utrecht
The Netherlands

lennart at karssen.org
http://blog.karssen.org
GPG key ID: A88F554A
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 213 bytes
Desc: OpenPGP digital signature
URL:

From alvaro.frank at rwth-aachen.de Tue Jun 24 14:25:28 2014
From: alvaro.frank at rwth-aachen.de (Frank, Alvaro Jesus)
Date: Tue, 24 Jun 2014 12:25:28 +0000
Subject: [GenABEL-dev] automake boost
In-Reply-To: <53A96D65.9080604@karssen.org>
References: <244CF001646FF74FB34F372310A332C57BC165@MBX5.rwth-ad.de>, <53A94CA6.7030100@karssen.org> <244CF001646FF74FB34F372310A332C57BC18E@MBX5.rwth-ad.de>, <53A96D65.9080604@karssen.org>
Message-ID: <244CF001646FF74FB34F372310A332C57BC1BF@MBX5.rwth-ad.de>

Using any working copy of boost, including the file
#include
will lead to a failed compilation IF the file was NOT included as the first file in main.cpp,
which is ridiculous.

I got it working right now. It could be some kind of redefinition of something relevant only to this code.
________________________________________
From: L.C. Karssen [lennart at karssen.org]
Sent: Tuesday, June 24, 2014 2:21 PM
To: Frank, Alvaro Jesus; genabel-devel at lists.r-forge.r-project.org
Subject: Re: [GenABEL-dev] automake boost

Hi Alvaro,

So if I understand you correctly this is the situation:
- When using my version autoconf doesn't find boost. Correct?
- But nevertheless if you try to include the erf.hpp file (and manually (?) instruct the compiler where to find the header files), you get the error below.

How do you compile your code? Manually invoking g++ or using a Makefile generated by automake?
Could you provide me with a minimum working example (actually a minimum breaking example) of configure.ac, Makefile.am and main.cpp to test?

Lennart.

From lennart at karssen.org Tue Jun 24 14:27:22 2014
From: lennart at karssen.org (L.C. Karssen)
Date: Tue, 24 Jun 2014 14:27:22 +0200
Subject: [GenABEL-dev] automake boost
In-Reply-To: <53A96D65.9080604@karssen.org>
References: <244CF001646FF74FB34F372310A332C57BC165@MBX5.rwth-ad.de>, <53A94CA6.7030100@karssen.org> <244CF001646FF74FB34F372310A332C57BC18E@MBX5.rwth-ad.de> <53A96D65.9080604@karssen.org>
Message-ID: <53A96EAA.8040806@karssen.org>

Hi Alvaro,

A quick follow-up: I dug up the attached file with which I made my first
Boost steps (exploring how to compute p-values for ProbABEL). If you run

make p-from-chi2

in the directory that contains the .cpp file it should compile if Boost
is in a standard location (no Makefile needed because make will use its
built-in rules). Otherwise

g++ p-from-chi2.cpp -o p-from-chi2

with the appropriate -I lines for the Boost include files should work.


Lennart.

On 24-06-14 14:21, L.C. Karssen wrote:
> Hi Alvaro,
>
> So if I understand you correctly this is the situation:
> - When using my version autoconf doesn't find boost. Correct?
> - But nevertheless if you try to include the erf.hpp file (and manually
> (?) instruct the compiler where to find the header files), you get the
> error below.
>
> How do you compile your code? Manually invoking g++ or using a Makefile
> generated by automake?
>
> Could you provide me with a minimum working example (actually a minimum
> breaking example) of configure.ac, Makefile.am and main.cpp to test?
>
>
> Lennart.

--
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
L.C. Karssen
Utrecht
The Netherlands

lennart at karssen.org
http://blog.karssen.org
GPG key ID: A88F554A
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
-------------- next part --------------
A non-text attachment was scrubbed...
Name: p-from-chi2.cpp
Type: text/x-c++src
Size: 1186 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 213 bytes
Desc: OpenPGP digital signature
URL:

From lennart at karssen.org Tue Jun 24 14:28:40 2014
From: lennart at karssen.org (L.C. Karssen)
Date: Tue, 24 Jun 2014 14:28:40 +0200
Subject: [GenABEL-dev] automake boost
In-Reply-To: <244CF001646FF74FB34F372310A332C57BC1BF@MBX5.rwth-ad.de>
References: <244CF001646FF74FB34F372310A332C57BC165@MBX5.rwth-ad.de>, <53A94CA6.7030100@karssen.org> <244CF001646FF74FB34F372310A332C57BC18E@MBX5.rwth-ad.de>, <53A96D65.9080604@karssen.org> <244CF001646FF74FB34F372310A332C57BC1BF@MBX5.rwth-ad.de>
Message-ID: <53A96EF8.4040606@karssen.org>

On 24-06-14 14:25, Frank, Alvaro Jesus wrote:
> Using any working copy of boost, including the file
> #include
> will lead to a failed compilation IF the file was NOT included as the first file in the main.cpp
> which is ridiculous.
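
A side note on the include-order failure quoted above. The thread does not identify the cause, so this is only a guess: an "expected identifier before ‘(’ token" reported inside a library header is a typical symptom of a preprocessor macro, defined by something included earlier, that collides with a short identifier used in that header (the macro I from C99's complex.h is a well-known example). Including the library header first hides the problem because the macro does not exist yet. The sketch below is self-contained and does not use Boost; the macro I and the template facade_like are hypothetical stand-ins chosen purely for illustration.

```cpp
#include <string>

// Hypothetical stand-in for a macro defined by an earlier header or by
// project code (assumption: the real project defines something similar).
#define I (0 + 1)

// With the macro still active, "template <class I, class V>" would be
// preprocessed into "template <class (0 + 1), class V>" and the compiler
// would complain about an unexpected '(' token. Saving and removing the
// macro around the declaration keeps the declaration intact.
#pragma push_macro("I")
#undef I

template <class I, class V>
struct facade_like {
    // Returns a fixed label; it only shows that the template was parsed.
    static std::string name() { return "facade_like"; }
};

// Restore the macro for code that still relies on it.
#pragma pop_macro("I")

// Uses the restored macro, which expands to (0 + 1).
inline int macro_value() { return I; }

// Instantiates the template and returns its label.
inline std::string describe() { return facade_like<int, double>::name(); }
```

Here describe() yields "facade_like" and macro_value() yields 1. In a real project the simpler remedies would be to include the Boost headers before the header that defines the macro, as reported in the quoted message, or to rename the macro.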

I agree, that shouldn't happen. Weird!

>
> I got it working right now. It could be some kind of redefinition of something relevant only to this code.

Good luck!


Lennart.

--
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
L.C. Karssen
Utrecht
The Netherlands

lennart at karssen.org
http://blog.karssen.org
GPG key ID: A88F554A
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 213 bytes
Desc: OpenPGP digital signature
URL:

From yurii.aulchenko at gmail.com Fri Jun 27 14:29:11 2014
From: yurii.aulchenko at gmail.com (Yurii Aulchenko)
Date: Fri, 27 Jun 2014 14:29:11 +0200
Subject: [GenABEL-dev] Missing file for unit test in GenABEL
In-Reply-To: <20140626134702.GA19423@an3as.eu>
References: <20140620084443.GA17492@an3as.eu> <20140626134702.GA19423@an3as.eu>
Message-ID: <1145061290386285028@unknownmsgid>

Dear Andreas,

Sorry for the delay in answering.

We specifically remove unit tests before submitting to CRAN. Our dev
repo does contain a version with unit tests, and also tags for the
versions.

So one option would be to use our code repo instead of what is on
CRAN. What do you think?

Yurii

----------------------
Yurii Aulchenko
(sent from mobile device)

> On Jun 26, 2014, at 15:47, Andreas Tille wrote:
>
> Hi Yurii,
>
>> On Fri, Jun 20, 2014 at 10:44:43AM +0200, Andreas Tille wrote:
>> Hi Yurii,
>>
>> I'm trying to update the Debian package of GenABEL. For some time
>> there has been an effort to automatically run unit tests of software if
>> available. Since I noticed that GenABEL comes with unit tests I
>> tried
>>
>> $ make test
>> export RCMDCHECK=FALSE;\
>> cd ../../tests;\
>> R --vanilla --slave < doRUnit.R
>> /bin/sh: 2: cd: can't cd to ../../tests
>> /bin/sh: 3: cannot open doRUnit.R: No such file
>> make: *** [test] Error 2
>>
>>
>> As you can see the file doRUnit.R is missing. It would be great if you
>> could include this file into the source distribution to make sure we
>> can reproduce your exact test procedure in the Debian package.
>
> I cloned https://github.com/cran/GenABEL.git and had a look into the
> code. I realised that the last tag containing the file doRUnit.R was
> 1.6-7. I wonder how you are doing unit testing in the current version.
>
> Thanks for any enlightenment
>
> Andreas.
>
> --
> http://fam-tille.de

From andreas at an3as.eu Fri Jun 27 14:34:55 2014
From: andreas at an3as.eu (Andreas Tille)
Date: Fri, 27 Jun 2014 14:34:55 +0200
Subject: [GenABEL-dev] Missing file for unit test in GenABEL
In-Reply-To: <1145061290386285028@unknownmsgid>
References: <20140620084443.GA17492@an3as.eu> <20140626134702.GA19423@an3as.eu> <1145061290386285028@unknownmsgid>
Message-ID: <20140627123455.GC17339@an3as.eu>

Hi Yurii,

On Fri, Jun 27, 2014 at 02:29:11PM +0200, Yurii Aulchenko wrote:
> Dear Andreas,
>
> Sorry for the delay in answering.

No problem. :-)

> We specifically remove unit tests before submitting to CRAN. Our dev
> repo does contain a version with unit tests, and also tags for the
> versions.

Do you have any reasons to remove the tests from the tarball? I could
imagine that larger chunks of data might be used, but I assumed that
GenABEL.data exists for this very purpose ... and so I packaged it
for Debian as well.

> So one option would be to use our code repo instead of what is on
> CRAN. What do you think?

While this would be possible in principle, I personally really prefer a
downloadable source tarball including testing features if at all possible.
However, if you have strong reasons, I would find ways to pull the
needed files from the repository.

Kind regards

Andreas.

--
http://fam-tille.de