[GenABEL-dev] new approach for data storage in GenABEL package

Maksim Struchalin m.v.struchalin at mail.ru
Thu Nov 14 22:38:52 CET 2013


In this email, I propose a new approach that reduces the total size of 
the data from 8Mb to 2Mb, which in turn reduces the entire GenABEL size 
from 12Mb to 6Mb.

"R CMD check --as-cran" reports that the following sub-directories are 
too large: data (2.3Mb), exdata (5.7Mb) and libs (2.6Mb). After the 
last GenABEL submission to CRAN, the maintainers suggested creating a 
new package called GenABELdata and moving all the data there. I went 
through the data and found that:
1) The "exdata" directory can be compressed with gzip and reduced from 
5.8Mb to 1.1Mb.
     - The function gunzip() from the R.utils package can decompress 
the files. It works on any OS.
     - Moreover, the native R function read.table() can read gzip files 
without decompressing them first.
     - Furthermore, it looks like the biggest file, "srgenos.dat", was 
used only once, a long time ago, to generate "srdta.RData", and now it 
just sits there taking up space needlessly.
2) We can delete some files from the "data" directory. The deleted files 
would be regenerated on the user's computer from the files in exdata. 
This can be done during installation (a line in the Makefile?) or on the 
first load (by running the function .onAttach() in R/zzz.R). This would 
reduce the total size of the "data" directory from 2.3Mb to 800Kb.
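
To make the two points above concrete, here is a minimal sketch in base R. 
It writes a small table to a gzip-compressed file, reads it back with 
read.table() without decompressing it first, and then regenerates a 
derived .RData file only if that file is missing. The file names, the toy 
data and the helper build_rdata_if_missing() are hypothetical and are not 
part of GenABEL; a real implementation would point at the actual files in 
exdata and would be called from the installation step or from .onAttach().

```r
# Sketch only: file names and build_rdata_if_missing() are hypothetical,
# not part of GenABEL.

# 1) Write a small table into a gzip-compressed file using base R.
tmp_dir <- tempdir()
gz_path <- file.path(tmp_dir, "example.dat.gz")
toy <- data.frame(id = c("p1", "p2", "p3"), snp1 = c(0L, 1L, 2L))
con <- gzfile(gz_path, open = "wt")
write.table(toy, con, quote = FALSE, row.names = FALSE)
close(con)

# 2) read.table() detects the compression and reads the file directly.
back <- read.table(gz_path, header = TRUE, stringsAsFactors = FALSE)

# 3) Regenerate a derived .RData file only when it does not exist yet.
build_rdata_if_missing <- function(gz_file, rdata_file) {
  if (!file.exists(rdata_file)) {
    example_data <- read.table(gz_file, header = TRUE,
                               stringsAsFactors = FALSE)
    save(example_data, file = rdata_file)
  }
  invisible(rdata_file)
}

rdata_path <- file.path(tmp_dir, "example.RData")
build_rdata_if_missing(gz_path, rdata_path)
```

One caveat for the first-load variant: the installed package directory may 
not be writable by the user, so generating the files at installation time 
may be the more robust of the two options.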

Any objections/suggestions?

best,
Maksim