[GenABEL-dev] using reshuffle

Содбо Шарапов sharapovsodbo at gmail.com
Thu Jul 25 08:44:17 CEST 2013


Dear all!
I have committed the newest version of reshuffle.
Now reshuffle works 2x faster! =)
Reasons:

  -- An ostringstream (oss) now caches the output.

  -- These declarations were taken out of the loops and placed above them:
     double* buf = new double[per_trait_per_snp];
     char s[30];

  -- A single (int64_t) cast on the whole expression is used instead of
     (int64_t)bla + (int64_t)bla + (int64_t)bla.
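
To illustrate the first two points, here is a minimal sketch, not the actual reshuffle code. The function name format_rows, its parameters and the "%.6g" format are hypothetical and chosen only for the example. It allocates the buffers once above the loops and collects all output in an ostringstream, so the caller can write the result in one call. Offsets are kept in a 64-bit-capable type (std::size_t) from the start, which is one way to avoid repeated casts. This is an assumption about the intent of the third point.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of reshuffle): formats n_snps rows with
// per_snp values each and returns the whole text as one string.
std::string format_rows(const std::vector<double>& data,
                        std::size_t n_snps, std::size_t per_snp) {
    assert(data.size() >= n_snps * per_snp);

    std::ostringstream oss;            // output cache, flushed by the caller
    std::vector<double> buf(per_snp);  // allocated once, above the loops
    char s[30];                        // scratch buffer, also hoisted

    for (std::size_t snp = 0; snp < n_snps; ++snp) {
        // The offset is computed in std::size_t, so no per-term casts.
        const std::size_t offset = snp * per_snp;
        for (std::size_t i = 0; i < per_snp; ++i) {
            buf[i] = data[offset + i];
        }
        for (std::size_t i = 0; i < per_snp; ++i) {
            std::snprintf(s, sizeof s, "%.6g", buf[i]);
            oss << s << (i + 1 < per_snp ? '\t' : '\n');
        }
    }
    return oss.str();
}
```

Calling format_rows({1.5, 2.0, 3.25, 4.0}, 2, 2) returns two tab-separated lines, "1.5	2" and "3.25	4", which can then be written to the file with a single write.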

To find the "hot spots" in reshuffle, I used

GNU Profiler (gprof)
GNU coverage testing tool (gcov)

Both are very useful for finding the right places in a program to optimize!

Now a 5 GB CLAK-GWAS output is converted to 16 GB of txt files in 380 sec,
or about 6 minutes.
Machine: Intel Core i7 930, 8 GB RAM. It is not a cluster node; I think
reshuffle would run faster on a cluster node =)

There are still problems with extracting heritability and writing slim data.
I'll check them soon.





2013/7/20 Содбо Шарапов <sharapovsodbo at gmail.com>

> Hello!
> I will check reshuffle tomorrow.
> On 20.07.2013 at 22:26, "Yurii Aulchenko" <yurii.aulchenko at gmail.com>
> wrote:
>
>> Another point: apparently you do not check boundaries - e.g. when I try to
>> get results for trait #200,000 (I have 107,000 only) I get the core dump.
>>
>> YA
>>
>> On Sat, Jul 20, 2013 at 5:15 PM, Yurii Aulchenko <
>> yurii.aulchenko at gmail.com> wrote:
>>
>>> Hi Sodbo,
>>>
>>> It seems that reshuffle does not work correctly, at least I can not get
>>> to the results with it (see below). I use a dataset with ~107k traits and
>>> ~280k SNPs.
>>>
>>> Any idea? - do I do something wrong?
>>>
>>> YA
>>>
>>> With perl-extractor I get chi2 of 62
>>>
>>> ya567666 at cluster:~[167]$ perl extractCell.pl
>>> /hpcwork/df938257/natgen/B2 329 209602 | gawk '{print $_,($2/$4)^2}'
>>>  -0.165153577923775 0.580845952033997 0.0298683661967516
>>> 0.0734809562563896 -0.00155110028572381 62.4845
>>>
>>> But this is not the case with reshuffle (and also I do not get any
>>> output with reshuffle /hpcwork/df938257/natgen/B2 --chi=30, while I know
>>> there are such chi2's in the results)
>>>
>>> ya567666 at cluster:~[167]$ reshuffle /hpcwork/df938257/natgen/B2
>>> --snps=209602 --traits=329 --chi
>>> Finish iout_file read   0.11 sec
>>> Start_write_chi_data=0.14 sec
>>> End_write_chi_trait     spm_1_AND_spmp_23 0.14 sec
>>> Finish_write_chi_data   0.14 sec
>>> Finish reshuffling 0.14 sec
>>> ya567666 at cluster:~[168]$ cat chi_data.txt
>>> SNP     Trait   beta_1  beta_SNP        se_1    se_SNP  cov_SNP_1
>>> Chi2
>>> rs4902242       spm_1_AND_spmp_23       -0.00234050769358873
>>>  -0.0338250175118446     0.128280490636826   0.0329618416726589
>>> 0.0770578160881996      1.05306001371466
>>> ya567666 at cluster:~[169]$
>>>
>>
>>
>>
>> --
>> -----------------------------------------------------
>> Yurii S. Aulchenko
>>
>> [ LinkedIn <http://nl.linkedin.com/in/yuriiaulchenko> ] [ Twitter<http://twitter.com/YuriiAulchenko>] [
>> Blog <http://yurii-aulchenko.blogspot.nl/> ]
>>
>
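
The boundary problem reported above (trait #200,000 requested when only 107,000 exist) could be guarded against with a check like the one below. This is only a sketch: checked_index is a hypothetical helper, not a function in reshuffle, and it assumes that the numbers given on the command line are 1-based.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical validator (not part of reshuffle).
// index: number given by the user, assumed here to be 1-based.
// count: how many traits (or SNPs) the input file actually holds.
// Returns the zero-based position, or throws instead of reading past the end.
std::size_t checked_index(long long index, std::size_t count,
                          const std::string& what) {
    if (index < 1 || static_cast<unsigned long long>(index) > count) {
        throw std::out_of_range(what + " index " + std::to_string(index) +
                                " is outside 1.." + std::to_string(count));
    }
    return static_cast<std::size_t>(index - 1);
}
```

With 107,000 traits, checked_index(329, 107000, "trait") returns 328, while checked_index(200000, 107000, "trait") throws std::out_of_range. The caller can then print the message and exit cleanly instead of producing a core dump.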


-- 
_________________________________
With best regards

Sodbo Zh. Sharapov
Phone:  +79831347688
Email:    sharapovsodbo at gmail.com
             sharapov at bionet.nsc.ru
Skype:   sharapovsodbo