[GenABEL-dev] function for conversion a plink format file to a GenABEL format file

Maksim Struchalin m.v.struchalin at mail.ru
Fri Nov 15 05:53:35 CET 2013


An easy way to write a function for converting a plink-format file to a 
GenABEL-format file:

Use plink's support for 'plug-in' functions 
(http://pngu.mgh.harvard.edu/~purcell/plink/rfunc.shtml). This allows us 
to write a simple R script (myscript.R) which is called by plink (plink 
--file mydata --R myscript.R). plink reads the file mydata (which is in 
plink format) and iteratively, SNP by SNP, transfers all the data to the 
script myscript.R. This script contains a function 
Rplink(PHENO,GENO,CLUSTER,COVAR) which takes every SNP (GENO 
variable) and stores it in *flv format by calling DatABEL functions.
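
A minimal sketch of what myscript.R could look like, under stated 
assumptions: GENO is taken to be an individuals-by-SNPs matrix, and the 
return value follows the plug-in convention of "number of values, then 
the values" for each SNP (check both against the plink page above). The 
helper store_snp and the file name mydata_genotypes.bin are hypothetical; 
it appends to a plain binary file only as a stand-in for the DatABEL 
write step, which is not shown here.

```r
# Hypothetical stand-in for the DatABEL write step: append one SNP's
# genotypes to a plain binary file (4 bytes per genotype). A real
# version would write into a filevector file instead.
store_snp <- function(genotypes, path = "mydata_genotypes.bin") {
  con <- file(path, open = "ab")      # open for binary append
  on.exit(close(con))                 # close again after every SNP
  writeBin(as.integer(genotypes), con)
  invisible(length(genotypes))
}

# Entry point called by plink. GENO is assumed to hold one column per
# SNP and one row per individual.
Rplink <- function(PHENO, GENO, CLUSTER, COVAR) {
  per_snp <- function(g) {
    n <- store_snp(g)
    # Assumed return convention: count of values, then the values.
    # Here the single value is the number of genotypes stored.
    c(1, n)
  }
  as.numeric(apply(GENO, 2, per_snp))
}
```

Opening and closing the file inside store_snp mirrors the 
open/store/close cycle per SNP, so the same script could be used to time 
that cycle once the stand-in is replaced.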

The whole process of conversion will look like this:

1) The user asks GenABEL to convert a plink file to a GenABEL file.
2) GenABEL checks whether plink is installed. If it is not installed, 
GenABEL goes to the plink site and downloads/installs it itself (using 
the R function "download.file" from the "utils" package).
3) GenABEL runs a simple line: system('plink --file mydata --R myscript.R')
4) The Rplink function (from myscript.R) gets every SNP and stores it in 
*flv format. This function creates an flv file and then opens and closes 
it to save every single SNP.
5) Work is done.
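
Steps 2 and 3 could be sketched roughly as follows. The function names 
plink_available and convert_plink are hypothetical and not part of 
GenABEL; the download/install step is left as a placeholder because the 
download location and install procedure are not specified here.

```r
# TRUE if an executable with the given name is found on the PATH.
plink_available <- function(exe = "plink") {
  unname(nzchar(Sys.which(exe)))
}

# Hypothetical wrapper: check for plink, then run
#   plink --file <file_root> --R <script>
convert_plink <- function(file_root, script = "myscript.R", exe = "plink") {
  if (!plink_available(exe)) {
    # Placeholder for step 2: a download.file() call plus an install
    # step would go here.
    stop("plink was not found on the PATH")
  }
  # system2() returns the exit status of the command (0 on success).
  system2(exe, c("--file", file_root, "--R", script))
}
```

A call would then look like convert_plink("mydata"), which issues the 
same command line as in step 3.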

The only issue is how fast the conversion will run: how much time does 
it take to open a filevector file, store one SNP and close it? I cannot 
find a DatABEL R function for adding a SNP to an flv file. Is there a C 
DatABEL function which can do it?

best,
Maksim

