[GenABEL-dev] DatABEL

Benjamin Hofner benjamin.hofner at fau.de
Mon May 4 11:52:04 CEST 2015


Dear Lennart,

Thank you very much for your reply.

We will start by copying all saved files along when transferring the
information. As we use databel objects inside more complex objects, we
will have to think about ways to do this (semi-)automatically.
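
A rough sketch of what such a (semi-)automatic copy could look like is
given below. It only uses base R; the helper name
copy_with_backing_files and the assumption that the backing files are
named <base>.fvi and <base>.fvd are ours for illustration, and the
usage part works on empty dummy files rather than real DatABEL data.
Whether the path stored in the object is relative or absolute is
something we would still have to check.

```r
# Hypothetical helper (not part of DatABEL): copy an .Rda file together
# with the .fvi/.fvd pair it refers to, keeping the file names unchanged.
# 'base' is the backing file name without extension (assumption).
copy_with_backing_files <- function(rda, base, dest_dir) {
  files <- c(rda, paste0(base, c(".fvi", ".fvd")))
  absent <- files[!file.exists(files)]
  if (length(absent) > 0) {
    stop("missing file(s): ", paste(absent, collapse = ", "))
  }
  dir.create(dest_dir, recursive = TRUE, showWarnings = FALSE)
  ok <- file.copy(files, dest_dir, overwrite = TRUE)
  if (!all(ok)) {
    stop("copy failed for: ", paste(files[!ok], collapse = ", "))
  }
  invisible(file.path(dest_dir, basename(files)))
}

# Usage with dummy files standing in for real data
src <- tempfile("src")
dir.create(src)
base <- file.path(src, "genotypes")
file.create(paste0(base, c(".fvi", ".fvd")))
rda <- file.path(src, "data.Rda")
dummy <- 1
save(dummy, file = rda)
copied <- copy_with_backing_files(rda, base, file.path(tempdir(), "dest"))
```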

As a short follow-up regarding parallelization: we are not estimating
SNP effects in parallel but fitting multiple models that each combine
many SNPs. So the option you raised might not be suitable, and we will
need to think about this some more. Perhaps it will help to look at the
filevector documentation. Did you mean to include a link for
reference [1]?
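
To make the direction we are considering concrete, here is a minimal
sketch of the general pattern: the forked workers do not receive an
open handle, only the file name, and each worker opens the file itself.
The sketch uses only base R and the parallel package, with a plain
binary file of doubles standing in for the backing file. The helpers
write_matrix_file, read_column and fit_one are made up for
illustration, and we have not tested whether the same pattern carries
over to databel objects.

```r
library(parallel)

# Hypothetical stand-in for a backing file: a matrix of doubles stored
# column by column.
write_matrix_file <- function(x, path) {
  con <- file(path, open = "wb")
  on.exit(close(con))
  writeBin(as.double(x), con)  # as.double() keeps column-major order
  invisible(path)
}

# Open the file in the calling process, read column j, close it again.
read_column <- function(path, j, nrow) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  seek(con, where = (j - 1) * nrow * 8)  # 8 bytes per double
  readBin(con, what = "double", n = nrow)
}

# One task = one model that combines several columns (SNPs).
fit_one <- function(cols, path, nrow, y) {
  X <- vapply(cols, read_column, numeric(nrow), path = path, nrow = nrow)
  lm.fit(cbind(1, X), y)$coefficients
}

set.seed(1)
n <- 50
p <- 6
snps <- matrix(rnorm(n * p), nrow = n)
y <- rnorm(n)
path <- write_matrix_file(snps, tempfile(fileext = ".bin"))

models <- list(c(1, 2), c(3, 4, 5), c(2, 6))
# mclapply() relies on forking, which is not available on Windows.
cores <- if (.Platform$OS.type == "windows") 1L else 2L
par_fits <- mclapply(models, fit_one, path = path, nrow = n, y = y,
                     mc.cores = cores)
seq_fits <- lapply(models, fit_one, path = path, nrow = n, y = y)
```

Since every worker opens and closes its own connection, nothing that
depends on the parent process's handle has to survive the fork.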

Thanks for your help,
Benjamin

On 01.05.2015 at 16:57, L.C. Karssen wrote:
> Hi Benjamin,
>
> Thanks for your interest in DatABEL. Because most of DatABEL was
> developed before I took over maintenance of the package I have put our
> development mailing list in CC. Just in case one of the other developers
> wants to chime in.
>
> On 28-04-15 14:29, Benjamin Hofner wrote:
>> Hi Lennart,
>>
>> we are currently trying to use your package DatABEL to store the data
>> for complex GWAS analysis. We are not using your standard tool sets
>> implemented in GenABEL and co but are trying to implement a novel method
>> ourselves. Currently, we are facing several problems which are most
>> likely related to the fact that you store the data on the HDD and use
>> pointers (?) to access the data.
>
> The DatABEL package is basically an R interface to a lower-level library
> written in C++, which we call filevector [1]. Maybe it's worth looking
> at that as well. In the source code repo at [1] you will also find a few
> utilities written in C++ to convert text files to and from fvi/fvd files.
>
> When you create a DatABEL object in R it is indeed basically a pointer
> to the data in the backing file. The .fvi file contains index data which
> is then used to quickly read the actual data from the .fvd file.
>
>>
>> 1) How can one store and share databel objects? I.e. is it possible to
>> store a databel object using save("objectname", file = "data.Rda")? On
>> one system it works fine.
>
> So you say you basically create a DatABEL object using databel() and
> then want to save that object. Interesting, I never tried that.
>
>> It seems to be transferable if one moves the
>> Rda file together with the fvd and fvi files (and does not rename these).
>
> Yes, that's what I expect. Because of the large amount of data, the
> data itself is not copied from the .fv{i,d} files into the object (and
> therefore not into your .Rda file) when the object is created. As you
> surmised, the object is only a pointer to the data (with some
> associated information like the buffer size).
>
>> Couldn't one include this file in the Rda file and/or allow altering
>> the path via
>>
>> backingfilename() <- "newpath/filename"
>>
>> 2) We are trying to use multicore aka mclapply techniques to speed up
>> computations.
>
> If I understand it correctly, you would like to share a (saved) DatABEL
> object among several processes where each process works on a subset of
> the data in that object. Is that correct?
>
> My first reaction is to say that (imputed) genetic data is usually
> already split into several hundred files (assuming 1kG imputed data), so
> you could simply use those for data parallelism.
> But I can see that parallel access to a subset of a DatABEL object has
> its use.
>
>> However, this does not work as the forked processes seem
>> to have lost the pointer to the databel file. Sequentially, i.e., using
>> lapply, everything works fine. Do you have any experiences here?
>
> Unfortunately not.
>
>
> Best regards,
>
> Lennart.
>
>
>> Can you
>> provide any help? If necessary, I can try to provide a minimal example
>> that reproduces this problem/error.
>>
>> Best regards,
>> Benjamin
>

-- 
******************************************************************************
Dr. rer. nat. Benjamin Hofner

Institut für Medizininformatik, Biometrie und Epidemiologie
Friedrich-Alexander-Universität Erlangen-Nürnberg
Waldstr. 6 - 91054 Erlangen - Germany

Tel: +49-9131-85-22707
Fax: +49-9131-85-25740

Office:
   Room 3.036
   Universitätsstraße 22
   (Entrance on the left side of the building; signposted IMBE)

benjamin.hofner at fau.de

http://www.imbe.med.uni-erlangen.de/cms/benjamin_hofner.html
http://www.benjaminhofner.de
******************************************************************************

