[GenABEL-dev] [Genabel-commits] r1658 - pkg/ProbABEL/src
L.C. Karssen
lennart at karssen.org
Thu Mar 27 10:36:03 CET 2014
Hi Yurii,
Ouch! Good catch!
That's what you get when coding too late at night ;-).
Thanks a lot,
Lennart.
On 24-03-14 09:46, Yurii Aulchenko wrote:
> Sum of logs = log of prods
>
> The latter is in principle dangerous practice when we multiply
> probabilities - the product is getting close to zero very fast
> potentially leading to numerical problems (of which machine zero is
> the smallest because easily detected)
>
> Yurii
>
> ----------------------
> Yurii Aulchenko
> (sent from mobile device)
>
>> On Mar 21, 2014, at 11:57 PM, "noreply at r-forge.r-project.org" <noreply at r-forge.r-project.org> wrote:
>>
>> Author: lckarssen
>> Date: 2014-03-21 23:57:20 +0100 (Fri, 21 Mar 2014)
>> New Revision: 1658
>>
>> Modified:
>> pkg/ProbABEL/src/reg1.cpp
>> Log:
>> Speed-ups in ProbABEL's logistic regression.
>> Profiling showed lots of (expensive) calls to exp() and log(). I got rid of ~ 1/3 of the exp() calls by saving the result in a variable and reusing it in the calculation of exp(mu) / ( 1+exp(mu) ).
>> The number of log() calls was reduced even more by using the fact that sum_i( log(x_i) ) = log( prod_i(x_i) )
>>
>> In total this gives roughly a speed up of 30% -- 40% when reading txt dosage files and roughly 20% -- 30% when using filevector files (measured for dosage data).
>>
>>
>> Modified: pkg/ProbABEL/src/reg1.cpp
>> ===================================================================
>> --- pkg/ProbABEL/src/reg1.cpp 2014-03-21 21:05:44 UTC (rev 1657)
>> +++ pkg/ProbABEL/src/reg1.cpp 2014-03-21 22:57:20 UTC (rev 1658)
>> @@ -719,7 +719,8 @@
>> double emu = eMu.get(i, 0);
>> double value = emu;
>> double zval;
>> - value = exp(value) / (1. + exp(value));
>> + double expval = exp(value);
>> + value = expval / (1. + expval);
>> residuals[i] = (rdata.Y).get(i, 0) - value;
>> eMu.put(value, i, 0);
>> W.put(value * (1. - value), i, 0);
>> @@ -778,11 +779,24 @@
>> beta.print();
>> }
>> // std::cout << "beta:\n"; beta.print();
>> - // compute likelihood
>> +
>> + // Compute the likelihood. The following commented code gives
>> + // the 'easy to understand' algorithm. The code that's
>> + // actually used is mathematically equivalent (remember:
>> + // log(a*b) = log(a)+log(b)), but faster because log() is
>> + // relatively expensive.
>> + // for (int i = 0; i < eMu.nrow; i++) {
>> + // loglik += rdata.Y[i] * eMu_us[i] - log(1. +
>> + // exp(eMu_us[i]));
>> + // }
>> prevlik = loglik;
>> loglik = 0.;
>> - for (int i = 0; i < eMu.nrow; i++)
>> - loglik += rdata.Y[i] * eMu_us[i] - log(1. + exp(eMu_us[i]));
>> + double logterm = 1;
>> + for (int i = 0; i < eMu.nrow; i++) {
>> + loglik += rdata.Y[i] * eMu_us[i];
>> + logterm *= 1. + exp(eMu_us[i]);
>> + }
>> + loglik += - log(logterm);
>>
>> delta = fabs(1. - (prevlik / loglik));
>> niter++;
>>
>> _______________________________________________
>> Genabel-commits mailing list
>> Genabel-commits at lists.r-forge.r-project.org
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-commits
> _______________________________________________
> genabel-devel mailing list
> genabel-devel at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
>
--
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
L.C. Karssen
Utrecht
The Netherlands
lennart at karssen.org
http://blog.karssen.org
GPG key ID: A88F554A
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 230 bytes
Desc: OpenPGP digital signature
URL: <http://lists.r-forge.r-project.org/pipermail/genabel-devel/attachments/20140327/31445998/attachment.sig>
More information about the genabel-devel
mailing list