[GenABEL-dev] new approach for data storage in GenABEL package

Yury Aulchenko yurii.aulchenko at gmail.com
Mon Nov 18 12:54:16 CET 2013


On Nov 15, 2013, at 17:21 PM, L.C. Karssen <lennart at karssen.org> wrote:

> Hi Maksim,
> 
> On 14-11-13 22:38, Maksim Struchalin wrote:
>> In this email, I propose a new approach that reduces the total size of
>> the data from 8Mb to 2Mb, which in turn reduces the entire GenABEL size
>> from 12Mb to 6Mb.
> 
> I guess you mean B (bytes) instead of b (bits) here :-).
> 
>> 
>> "R CMD check --as-cran" reports that the following sub-directories have
>> too large a size: data (2.3Mb), exdata (5.7Mb) and libs (2.6Mb). After
>> the last GenABEL submission to CRAN, the maintainers suggested creating
>> a new package called GenABELdata and moving all the data there. I went
>> through the data and found that:
>> 1) "exdata" directory can be compressed by gzip and reduced from 5.8Mb
>> -> 1.1Mb.
>>    - The function gunzip() from the R.utils package can decompress
>> the files. It works on any OS.
>>    - Moreover: the native R function read.table() can read gzip files
>> without decompression.
>>    - Even more: it looks like the biggest file "srgenos.dat" was
>> used only once, a long time ago, for generating "srdta.RData" and now it
>> is just sitting there and eating space needlessly.
> 
> Sounds like a waste of space!
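
To illustrate the point about read.table() and gzip files, here is a minimal sketch in base R. The data frame and the temporary file are made-up stand-ins, not actual GenABEL files; gunzip() from R.utils would only be needed if a decompressed copy on disk were required.

```r
# Stand-in data (hypothetical, not a real GenABEL file).
demo <- data.frame(id = c("p1", "p2", "p3"), snp1 = c(0, 1, 2),
                   stringsAsFactors = FALSE)

# Write it to a temporary gzip-compressed text file.
gz_path <- tempfile(fileext = ".dat.gz")
con <- gzfile(gz_path, open = "wt")
write.table(demo, con, quote = FALSE, row.names = FALSE)
close(con)

# read.table() opens the path through file(), which detects gzip
# compression on reading, so no explicit decompression step is needed.
restored <- read.table(gz_path, header = TRUE, stringsAsFactors = FALSE)

# The round trip should give back the same values.
stopifnot(identical(restored$id, demo$id),
          all(restored$snp1 == demo$snp1))
```
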
> 
>> 2) We can delete some files from the "data" directory. The deleted files
>> will be generated on the user's computer based on the files from exdata.
>> It can be done during INSTALLATION (a line in Makefile?) or on the first
>> load (by running the function .onAttach() in R/zzz.R).
> 
> This sounds like a perfectly acceptable option.


I suggest this is done in the "example" which makes use of this data, NOT in the INSTALL etc. - we should make things as "robust" as possible and interfere as little as possible with the usual workflow (which is very much system-specific, in that we will need to test on all platforms).
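
A rough sketch of what generating the data inside an example could look like, under some assumptions: the helper name build_example_data(), the cache location and the toy input are all hypothetical, and saveRDS() merely stands in for whatever conversion the real data need. In the package itself the raw file would presumably be located with system.file() rather than written to a temporary file as done here.

```r
# Hypothetical helper: build the derived data from a compressed raw file
# the first time an example asks for it, then reuse the cached copy.
build_example_data <- function(raw_gz,
                               cache = file.path(tempdir(), "example_data.rds")) {
  if (!file.exists(cache)) {
    raw <- read.table(raw_gz, header = TRUE, stringsAsFactors = FALSE)
    saveRDS(raw, cache)  # stand-in for the real conversion step
  }
  readRDS(cache)
}

# Toy raw file (made-up content) so that the sketch runs on its own.
raw_gz <- tempfile(fileext = ".dat.gz")
con <- gzfile(raw_gz, open = "wt")
writeLines(c("id trait", "p1 1.5", "p2 2.5"), con)
close(con)

# What an example section would call; a second call reuses the cache.
example_cache <- tempfile(fileext = ".rds")
example_data <- build_example_data(raw_gz, cache = example_cache)
```

Nothing here runs at install or load time, so it only touches the user's system when the example is actually executed, and it writes only to the session's temporary directory.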


> 
>> It will reduce
>> total size of "data" directory from 2.3Mb to 800Kb.
> 
> Fantastic! If no one has other objections I say: go ahead.
> 
> 
> Best,
> 
> Lennart.
> 
> 
>> 
>> Any objections/suggestions?
>> 
>> best,
>> Maksim
>> 
>> 
>> _______________________________________________
>> genabel-devel mailing list
>> genabel-devel at lists.r-forge.r-project.org
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
>> 
> 
> -- 
> -----------------------------------------------------------------
> L.C. Karssen
> Utrecht
> The Netherlands
> 
> lennart at karssen.org
> http://blog.karssen.org
> 
> Please do not send me Word or PowerPoint files!
> See http://www.gnu.org/philosophy/no-word-attachments.nl.html
> ------------------------------------------------------------------
> 


