[GenABEL-dev] [Genabel-commits] r1671 - in branches/ProbABEL-0.50: examples src

Mon Apr 7 09:06:31 CEST 2014

Hi Maarten,

On 02-04-14 22:25, noreply at r-forge.r-project.org wrote:
> Author: maartenk
> Date: 2014-04-02 22:25:43 +0200 (Wed, 02 Apr 2014)
> New Revision: 1671
> 
> Modified:
>    branches/ProbABEL-0.50/examples/mmscore.R
>    branches/ProbABEL-0.50/src/probabel
>    branches/ProbABEL-0.50/src/reg1.cpp
> Log:
> reg1.cpp : simplified code of mmscore(palinear) and fixed failing test_mms.sh check
> probabel: introduce check that verifies phenotype file does exists before running pa* (prevents errors while running the script)  
> mmscore.R: fixed small typo
> 
> Modified: branches/ProbABEL-0.50/examples/mmscore.R
> ===================================================================
> --- branches/ProbABEL-0.50/examples/mmscore.R	2014-04-02 15:57:31 UTC (rev 1670)
> +++ branches/ProbABEL-0.50/examples/mmscore.R	2014-04-02 20:25:43 UTC (rev 1671)
> @@ -88,5 +88,5 @@
>  ## 2) residuals of the phenotype, which will be the new phenotype that
>  ## ProbABEL will analyse.
>  
> -## Mow, go to ProbABEL and start analysis
> +## Now, go to ProbABEL and start analysis

Thanks :-).

>  
> 
> Modified: branches/ProbABEL-0.50/src/probabel
> ===================================================================
> --- branches/ProbABEL-0.50/src/probabel	2014-04-02 15:57:31 UTC (rev 1670)
> +++ branches/ProbABEL-0.50/src/probabel	2014-04-02 20:25:43 UTC (rev 1671)
> @@ -169,6 +169,11 @@
>  
>  
>  my $phename = $ARGV[5];
> +if (! -e $phename.".PHE"){
> +die "Phenotype file $phename.PHE does not exists. The phenotype file should be specified without the .PHE extension.\n";
> +}
> +
> +
>  # By default the output file prefix is the same as the name of the
>  # phenotype file (minus the .PHE extension and any paths)
>  use File::Basename;
> 
> Modified: branches/ProbABEL-0.50/src/reg1.cpp
> ===================================================================
> --- branches/ProbABEL-0.50/src/reg1.cpp	2014-04-02 15:57:31 UTC (rev 1670)
> +++ branches/ProbABEL-0.50/src/reg1.cpp	2014-04-02 20:25:43 UTC (rev 1671)
> @@ -313,35 +313,21 @@
>  void linear_reg::mmscore_regression(const mematrix<double>& X,
>          const masked_matrix& W_masked, LDLT<MatrixXd>& Ch) {
>  
> -
>      VectorXd Y = reg_data.Y.data.col(0);
> -    if (X.data.cols() == 3)
> -    {
> -        Matrix<double, Dynamic, 3> tXW = W_masked.masked_data->data * X.data;
> -        Matrix2d xWx = tXW.transpose() * X.data;
> -        Ch = LDLT<MatrixXd>(xWx);
> -        Vector3d beta_3f = Ch.solve(tXW.transpose() * Y);
> -        sigma2 = (Y - tXW * beta_3f).squaredNorm();
> -        beta.data = beta_3f;
> -    }
> -    else if (X.data.cols() == 2)
> -    {
> -        Matrix<double,  Dynamic,2> tXW =  W_masked.masked_data->data*X.data;
> -        Matrix2d xWx = tXW.transpose() * X.data;
> -        Ch = LDLT<MatrixXd> (xWx);
> -        Vector2d beta_2f = Ch.solve(tXW.transpose() * Y);
> -        sigma2 = (Y - tXW * beta_2f).squaredNorm();
> -        beta.data = beta_2f;
> -    }
> -    else
> -    {
> -        // next line is  5997000 flops
> -        MatrixXd tXW = X.data.transpose() * W_masked.masked_data->data;
> -        Ch = LDLT<MatrixXd>(tXW * X.data); // 17991 flops
> -        beta.data = Ch.solve(tXW * Y); //5997 flops
> -        //next line is: 1000+5000+3000= 9000 flops
> -        sigma2 = (Y - tXW.transpose() * beta.data).squaredNorm();
> -    }

Glad to see the if/else go! This is much cleaner (and apparently not
slower).

> +    /*
> +     in ProbABEL <0.50 this calculation was performed like t(X)*W
> +     This changed to W*X since this is better vectorized since the left hand
> +     side has more rows: this introduces an additional transpose, but can be
> +     neglected compared to the speedup this brings(about a factor 2 for the
> +     palinear with 1 predictor)
> +     */
> +    MatrixXd tXW = W_masked.masked_data->data * X.data;

I think the variable naming should be more apropriate here: tXW sounds
like X^t * W, but you store W * X in that variable.

> +    MatrixXd xWx = tXW.transpose() * X.data;

Similarly here, I'm not sure how to interpret xWx. Since you calculate
(W*X)^t * X a name like WXtX seems more reasonable.

> +    Ch = LDLT<MatrixXd>(xWx);
> +    VectorXd beta_vec = Ch.solve(tXW.transpose() * Y);
> +    sigma2 = (Y - tXW * beta_vec).squaredNorm();
> +    beta.data = beta_vec;
> +
>  }

Thanks for the good work!

Lennart.

>  
>  void linear_reg::logLikelihood(const mematrix<double>& X) {
> 
> _______________________________________________
> Genabel-commits mailing list
> Genabel-commits at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-commits
> 

-- 
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
L.C. Karssen
Utrecht
The Netherlands

lennart at karssen.org
http://blog.karssen.org
GPG key ID: A88F554A
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 230 bytes
Desc: OpenPGP digital signature
URL: <http://lists.r-forge.r-project.org/pipermail/genabel-devel/attachments/20140407/8699ebf0/attachment.sig>