[GenABEL-dev] function for conversion a plink format file to a GenABEL format file

Maksim Struchalin m.v.struchalin at mail.ru
Fri Nov 22 04:44:37 CET 2013


Yes. It looks like it was a bad idea to use the plink R plug-in for converting 
plink files to *ABEL format.
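
For the allele-coded input Yury describes below ("A G" or "A/G" rather than
0/1/2), some recoding step would be needed before a numeric converter can
take the data. A minimal base R sketch of such a step follows; the function
name recode_genotypes, its arguments, and the treatment of "0 0" as missing
are assumptions for illustration, not part of DatABEL or plink.

```r
# Recode allele-pair genotypes ("A G" or "A/G") to counts of one chosen allele.
# recode_genotypes and its arguments are hypothetical names for this sketch.
recode_genotypes <- function(geno, counted_allele,
                             missing = c("0 0", "0/0", "")) {
  geno <- as.character(geno)
  # Assumption: these strings mark a missing genotype in the input
  geno[geno %in% missing] <- NA_character_
  # Split on a space or a slash so both "A G" and "A/G" are accepted
  alleles <- strsplit(geno, "[ /]")
  vapply(alleles, function(a) {
    # Anything that is not exactly two known alleles becomes NA
    if (length(a) != 2 || anyNA(a)) return(NA_integer_)
    sum(a == counted_allele)
  }, integer(1))
}

snp <- c("A G", "G/G", "A A", "0 0")
recode_genotypes(snp, counted_allele = "G")
# gives 1 2 0 NA
```

Which allele to count is left to the caller here; a real converter would
presumably take that from the marker annotation.
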
Maksim

On 18/11/2013 18:48, Yury Aulchenko wrote:
> I would say that in principle DatABEL::text2databel is the "natural" 
> way to go from text-files to DatABEL-files
>
> The problem is that 'regular' text input may be allele by allele, not 
> genotype by genotype... (e.g. data are in format "A G", or "A/G", not 
> "0" or "1" or "2").
>
> Y
>
> On Nov 15, 2013, at 17:48 PM, L.C. Karssen <lennart at karssen.org> wrote:
>
>> Hi Maksim,
>>
>> On 15-11-13 05:53, Maksim Struchalin wrote:
>>> An easy way to write a function for converting a plink format file to a
>>> GenABEL format file:
>>>
>>> Use plink's support for 'plug-in' functions
>>
>> Nice find. I didn't know that existed.
>>
>>> (http://pngu.mgh.harvard.edu/~purcell/plink/rfunc.shtml). This allows us
>>> to write a simple R script (myscript.R) which is called by plink (plink
>>> --file mydata --R myscript.R). plink reads the file mydata (which is in
>>> plink format) and iteratively, SNP by SNP, transfers all the data to the
>>> script myscript.R. This script contains a function
>>> Rplink(PHENO,GENO,CLUSTER,COVAR) which will take every SNP (GENO
>>> variable) and store it in *flv format by calling DatABEL functions.
>>>
>>> The whole process of conversion will look like this:
>>>
>>> 1) The user asks GenA to convert a plink file to a GenA file
>>> 2) GenA checks whether plink is installed. If it is not installed,
>>> then GenA goes to the plink site and downloads/installs it itself (using
>>> the R function "download.file" from the "utils" package)
>>> 3) GenA runs a simple line: system('plink --file mydata --R myscript.R')
>>> 4) The Rplink function (from myscript.R) gets every SNP and stores it in
>>> *flv format. This function creates an flv file and then opens and closes
>>> it for saving every single SNP.
>>> 5) Work is done
>>
>> I'm not sure how portable it is to download and run plink. Also, the
>> plink page says: Currently, there is only support for R-plugins for
>> Linux-based and Mac OS PLINK distributions.
>>
>>>
>>> The only issue is how fast the conversion will run: how much time does
>>> it take to open a filevector file, store one SNP and close it? I cannot
>>> find a DatABEL R function for adding a SNP to an flv file. Is there a C
>>> DatABEL function which can do it?
>>
>> Wouldn't it be easier/possible to use plink to export to text (.csv) and
>> then use filevector's txt2fvf binary (of course this could be done from
>> R using system())?
>>
>> I'm also wondering if going per SNP is really necessary. If I understand
>> it correctly the R script (myscript.R) has to have a function called:
>> Rplink <- function(PHENO,GENO,CLUSTER,COVAR)
>> where GENO is the matrix of genotypes. So we could write that into a
>> DatABEL file at once. Of course you may want to do this per chromosome
>> to reduce memory consumption (not sure how plink/R would handle large
>> data sets).
>>
>> I agree completely with Maarten that opening a filevector file for each
>> SNP will be an I/O killer.
>>
>>
>> Lennart.
>>
>>>
>>> best,
>>> Maksim
>>> _______________________________________________
>>> genabel-devel mailing list
>>> genabel-devel at lists.r-forge.r-project.org
>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
>>
>> --
>> -----------------------------------------------------------------
>> L.C. Karssen
>> Utrecht
>> The Netherlands
>>
>> lennart at karssen.org
>> http://blog.karssen.org
>>
>> Please do not send me Word or PowerPoint files!
>> See http://www.gnu.org/philosophy/no-word-attachments.nl.html
>> ------------------------------------------------------------------
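
Lennart's point above, that GENO arrives as a whole matrix and need not be
written out SNP by SNP, could be sketched roughly as follows. This is base R
only and makes several assumptions: the output path out_file is a made-up
name, the text layout (one SNP per row) is arbitrary, and the return value
follows my reading of the plink R plug-in page (for each SNP, the number of
returned values followed by the values themselves).

```r
# Hypothetical output path; not something plink or DatABEL defines.
out_file <- "genotypes.txt"

Rplink <- function(PHENO, GENO, CLUSTER, COVAR) {
  # GENO: individuals in rows, SNPs in columns, for the SNPs passed in this call.
  # Append the whole block in one write (one SNP per output row), so the file
  # is opened once per call instead of once per SNP.
  write.table(t(GENO), file = out_file, append = TRUE,
              quote = FALSE, row.names = FALSE, col.names = FALSE)
  # Return one dummy value per SNP (its non-missing genotype count),
  # each preceded by the number of values returned for that SNP (here 1).
  as.numeric(rbind(1, colSums(!is.na(GENO))))
}
```

The resulting text file would then be a candidate input for
DatABEL::text2databel, which Yury names above as the natural route from text
files to DatABEL files. This does nothing about the portability concerns
raised earlier in the thread.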
