[GenABEL-dev] function for conversion a plink format file to a GenABEL format file
Maksim Struchalin
m.v.struchalin at mail.ru
Fri Nov 15 05:53:35 CET 2013
An easy way to write a function for converting a plink format file to a
GenABEL format file:
Use plink's support for 'plug-in' functions
(http://pngu.mgh.harvard.edu/~purcell/plink/rfunc.shtml). This allows us
to write a simple R script (myscript.R) that is called by plink (plink
--file mydata --R myscript.R). plink reads the file mydata (which is in
plink format) and iteratively, SNP by SNP, transfers all the data to the
script myscript.R. This script contains a function
Rplink(PHENO,GENO,CLUSTER,COVAR) which takes every SNP (the GENO
variable) and stores it in *flv format by calling DatABEL functions.
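A rough sketch of what myscript.R could look like. The function name
Rplink and its four arguments come from the plink plug-in interface; the
helper store_snp and the in-memory snp_store are hypothetical stand-ins
for whatever DatABEL call ends up writing one SNP to the flv file. I am
assuming GENO arrives as a matrix with one column per SNP, and that plink
wants back, for each SNP, the number of returned values followed by the
values themselves.

```r
# Hypothetical in-memory sink; a real script would write to the flv file here.
snp_store <- new.env()
snp_store$columns <- list()

# Hypothetical helper: stands in for the DatABEL call that saves one SNP.
store_snp <- function(genotypes) {
  snp_store$columns[[length(snp_store$columns) + 1]] <- genotypes
  invisible(NULL)
}

# Entry point called by plink (signature as in the plink plug-in docs).
Rplink <- function(PHENO, GENO, CLUSTER, COVAR) {
  per_snp <- function(g) {
    store_snp(g)
    # Return one dummy value per SNP (count of non-missing genotypes),
    # prefixed by the number of values returned for that SNP.
    c(1, sum(!is.na(g)))
  }
  # Flatten to a plain numeric vector: (n1, v1, n2, v2, ...)
  as.numeric(apply(GENO, 2, per_snp))
}
```

The returned values are not used for the conversion itself; they are only
there because the plug-in interface expects something back for every SNP.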
The whole process of conversion will look like this:
1) The user asks GenABEL to convert a plink file to a GenABEL file.
2) GenABEL checks whether plink is installed. If it is not installed,
GenABEL goes to the plink site and downloads/installs it itself (using
the R function "download.file" from the "utils" package).
3) GenABEL runs a single line: system('plink --file mydata --R myscript.R')
4) The Rplink function (from myscript.R) gets every SNP and stores it in
*flv format. This function creates an flv file and then opens and closes
it to save every single SNP.
5) The work is done.
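The GenABEL side of steps 2) and 3) could be sketched roughly as below.
The function name plink_to_genabel and its arguments are hypothetical,
not an existing GenABEL function. The download/install branch is left as
an error here, because the archive URL and layout depend on the platform
and I do not want to guess them; that is where "download.file" would go.
The dry_run argument is only there so the command line can be inspected
without plink being installed.

```r
# Hypothetical driver; name and arguments are illustrative only.
plink_to_genabel <- function(fileroot, script = "myscript.R",
                             plink = "plink", dry_run = FALSE) {
  args <- c("--file", fileroot, "--R", script)

  # Return the command line without running anything.
  if (dry_run) {
    return(paste(c(plink, args), collapse = " "))
  }

  # Step 2: check whether plink is available on the PATH.
  if (!nzchar(Sys.which(plink))) {
    # Assumption: a real version would fetch plink with download.file() here.
    stop("plink was not found on the PATH")
  }

  # Step 3: run plink, which in turn calls Rplink() from the script.
  status <- system2(plink, args)
  if (status != 0) {
    stop("plink exited with status ", status)
  }
  invisible(status)
}

# Shows the command that would be run for the example in this mail.
print(plink_to_genabel("mydata", dry_run = TRUE))
```

As far as I understand the plink documentation, the plug-in talks to R
through Rserve, so that would also need to be running before step 3).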
The only issue is how fast the conversion will run: how much time does
it take to open a filevector file, store one SNP and close it? I cannot
find a DatABEL R function for adding a SNP to a flv file. Is there a C
DatABEL function which can do it?
best,
Maksim