[GenABEL-dev] export.plink() uses too much memory in GenABEL v1.7-0 (beta)

L.C. Karssen l.karssen at erasmusmc.nl
Tue Jul 3 15:33:51 CEST 2012


That is certainly an option, although then the question is when a
data set counts as 'large'. Do we take the user's total RAM into account?
The platform (on 32-bit Windows any process can only use 2GB RAM)?
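
To make the 'large' question concrete, here is a rough sketch of a size
estimate. It assumes each genotype in a "ped" file is written as two
single-character alleles, each followed by one separator, so 4 bytes per
genotype. The function names and the 50%-of-RAM threshold are hypothetical
and are not part of GenABEL; real memory use while exporting may well be
higher than the size of the text written.

```cpp
#include <cstdint>

// Rough size in bytes of the genotype part of a PLINK "ped" file.
// Assumption: 4 bytes per genotype ("A G "), ignoring the six leading
// columns of each line.
constexpr std::uint64_t ped_genotype_bytes(std::uint64_t n_people,
                                           std::uint64_t n_snps) {
    return n_people * n_snps * 4;
}

// Hypothetical rule of thumb: call a data set "large" when the estimate
// exceeds a chosen fraction of the RAM available to the process.
constexpr bool is_large(std::uint64_t n_people, std::uint64_t n_snps,
                        std::uint64_t ram_bytes, double fraction = 0.5) {
    return static_cast<double>(ped_genotype_bytes(n_people, n_snps)) >
           fraction * static_cast<double>(ram_bytes);
}
```

Under these assumptions the 3700 x 700k data set from the start of this
thread comes to roughly 10.4 GB of genotype text, and the ~750 x 100k one
to about 0.3 GB.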

Anyway, before implementing your suggestion I want to do some more
checks. Recently we exported a not too large dataset (~750 people, ~100k
SNPs) and that seemed to work ok, but checking the genotypes afterwards
showed that many homozygous ones (but not all as far as I can tell) were
incorrectly exported.
This may simply be the same bug, not leading to a crash this time because
it was run on a machine with 128GB RAM and finished before completely
filling that up. But I'd like to be sure.
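
For reference, the earlier fix quoted below concerned "a new without
delete". The following is only a minimal sketch of that kind of leak and
of an RAII alternative; format_snp_line() and the 0/1/2 genotype coding are
hypothetical and are not taken from the GenABEL source.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Leaky pattern (illustration only): a buffer obtained with new[] that is
// never released leaks on every call.
//
//   char *line = new char[n_people * 4];
//   ... fill and write line ...
//   // missing: delete[] line;

// RAII alternative: std::string owns its buffer and releases it when it
// goes out of scope, also if an exception is thrown.
// Assumed coding: 0, 1, 2 = copies of allele "B"; anything else = missing.
std::string format_snp_line(const std::vector<int> &genotypes) {
    std::string line;
    line.reserve(genotypes.size() * 4);
    for (std::size_t i = 0; i < genotypes.size(); ++i) {
        switch (genotypes[i]) {
            case 0:  line += "A A"; break;
            case 1:  line += "A B"; break;
            case 2:  line += "B B"; break;
            default: line += "0 0"; break;  // missing genotype
        }
        if (i + 1 < genotypes.size()) line += ' ';
    }
    // Sanity check: 3 characters per genotype plus single separators.
    assert(genotypes.empty() || line.size() == genotypes.size() * 4 - 1);
    return line;
}
```

Writing one such line per SNP and letting the buffer go out of scope
before the next one keeps the peak memory proportional to the number of
people rather than to the whole data set.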



Lennart.

On 07/03/2012 02:23 PM, Yurii Aulchenko wrote:
> What about keeping the default as before ("ped"), but introducing a
> message when "ped" is used, warning that with large data sets it is
> strongly recommended to use the "tped" format?
> 
> Yurii
> 
> On Mon, Jul 2, 2012 at 7:59 PM, L.C. Karssen <lennart at karssen.org> wrote:
>> Dear List,
>>
>> Sorry for digging deep into the past, but this issue of export.plink()
>> still hasn't been resolved. After a recent question by e-mail I opened a
>> bug report and started a forum thread on the subject:
>> -
>> https://r-forge.r-project.org/tracker/?func=detail&atid=2058&aid=2055&group_id=505
>> - http://forum.genabel.org/viewtopic.php?f=6&t=652
>>
>> Since solving the issue costs more time than I can presently afford, I
>> suggest making export.plink(..., transpose=TRUE) the default (instead
>> of transpose=FALSE).
>> Moving to a new default will need to be communicated very clearly, but I
>> think shipping a function that is broken by default is even worse.
>>
>> What are your opinions?
>>
>>
>> Best,
>>
>> Lennart.
>>
>> On 12/07/2011 05:36 PM, L.C. Karssen wrote:
>>> Dear Yurii,
>>>
>>> We just tried svn revision 827 and had the same problem. We killed the
>>> program at 51% memory usage (32GB). So, unfortunately I think the
>>> problem is not solved yet.
>>>
>>>
>>> Lennart
>>>
>>>
>>>
>>> On 07-12-11 11:36, Yury Aulchenko wrote:
>>>> Ok, should be fixed in r823 just committed.
>>>>
>>>> Let me know if the problem persists
>>>>
>>>> On Dec 7, 2011, at 11:13 AM, Yury Aulchenko wrote:
>>>>
>>>>> I think this is something I introduced in rev. 810-814 (a new without
>>>>> delete). Now (hopefully) fixed, will commit changes in next hour.
>>>>> -------------------------------------------------------
>>>>> Yurii Aulchenko, PhD, Dr. Habil.
>>>>> Independent researcher and consultant
>>>>> yurii [dot] aulchenko [at] gmail [dot] com
>>>>>
>>>>> On Dec 7, 2011, at 11:03 AM, L.C. Karssen wrote:
>>>>>
>>>>>> Dear list,
>>>>>>
>>>>>> We just tried to convert a GenABEL object to plink format using
>>>>>> export.plink() from GenABEL v 1.7-0 (still under development,
>>>>>> package built from SVN yesterday), and it nearly brought the machine
>>>>>> to a halt because it used all available memory (RAM + swap).
>>>>>>
>>>>>> Our GenABEL object contained almost 3700 individuals and about 700k
>>>>>> SNPs.
>>>>>>
>>>>>> Have others experienced this as well? I haven't looked at Yurii's
>>>>>> latest implementation of the function in C++ yet. Hopefully I will
>>>>>> be able to find some time later today. Does anyone here know how we
>>>>>> could limit memory usage in C++?
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Lennart.
>>>>
>>>>
>>>
>>
>>
>>
>>
>> _______________________________________________
>> genabel-devel mailing list
>> genabel-devel at lists.r-forge.r-project.org
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
> 

-- 
-----------------------------------------------
dr. L.C. Karssen
Erasmus MC
Department of Epidemiology
Room Ee2224

Postbus 2040
3000 CA Rotterdam
The Netherlands

phone: +31-10-7044217
fax: +31-10-7044657
email: l.karssen at erasmusmc.nl
GPG key ID: 0E1D39E3
-----------------------------------------------


