Too slow, too difficult for the user, or both? :)<br><br>On Friday, November 22, 2013, Maksim Struchalin wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<div>Yes. Looks like it was a bad idea to
use the plink R-plugin for converting plink files to *ABEL format.<br>
Maksim<br>
<br>
On 18/11/2013 18:48, Yury Aulchenko wrote:<br>
</div>
<blockquote type="cite">
I would say that in principle DatABEL::text2databel is the
"natural" way to go from text-files to DatABEL-files
<div><br>
</div>
<div>The problem is that 'regular' text input may be allele by
allele, not genotype by genotype... (e.g. data are in format "A
G", or "A/G", not "0" or "1" or "2"). </div>
<div><br>
</div>
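<div>To illustrate the allele-by-allele issue, here is a minimal base-R sketch of recoding allele pairs into 0/1/2 counts. The helper name alleles_to_dosage and its counted_allele argument are hypothetical (not part of GenABEL or DatABEL), and it assumes "0" marks a missing allele, as in plink text files.</div>

```r
# Hypothetical helper, not a GenABEL/DatABEL function: recode allele-pair
# genotypes such as "A G" or "A/G" as the number of copies of one allele.
alleles_to_dosage <- function(genotypes, counted_allele) {
  # Split every genotype string on spaces or "/" into its alleles
  pairs <- strsplit(genotypes, "[ /]+")
  vapply(pairs, function(alleles) {
    # Assumption: "0" codes a missing allele; anything that is not
    # exactly two called alleles is returned as NA
    if (length(alleles) != 2 || any(alleles %in% c("0", NA))) {
      return(NA_integer_)
    }
    # Count how many of the two alleles match the counted allele
    sum(alleles == counted_allele)
  }, integer(1))
}

alleles_to_dosage(c("A G", "G/G", "A A", "0 0"), counted_allele = "G")
# returns 1 2 0 NA
```

<div>Which allele gets counted (minor, reference, ...) is a design choice the sketch leaves to the caller.</div>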
<div>Y<br>
<div><br>
<div>
<div>On Nov 15, 2013, at 17:48 PM, L.C. Karssen &lt;<a>lennart@karssen.org</a>&gt;
wrote:</div>
<br>
<blockquote type="cite">
<div style="font-size:12px;font-style:normal;font-variant:normal;font-weight:normal;letter-spacing:normal;line-height:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px">
Hi
Maksim,<br>
<br>
On 15-11-13 05:53, Maksim Struchalin wrote:<br>
<blockquote type="cite">An easy way to write a function
for converting a plink format file to a<br>
GenABEL format file:<br>
<br>
Use plink's support for 'plug-in' functions<br>
</blockquote>
<br>
Nice find. I didn't know that existed.<br>
<br>
<blockquote type="cite">(<a href="http://pngu.mgh.harvard.edu/%7Epurcell/plink/rfunc.shtml" target="_blank">http://pngu.mgh.harvard.edu/~purcell/plink/rfunc.shtml</a>).
This allows us<br>
to write a simple R script (myscript.R) which is
called by plink (plink<br>
--file mydata --R myscript.R). plink reads the file
mydata (which is in<br>
plink format) and iteratively, SNP by SNP, transfers all
the data to the<br>
script myscript.R. This script contains a function<br>
Rplink(PHENO,GENO,CLUSTER,COVAR) which will take every
SNP (GENO<br>
variable) and store it in *flv format by
calling DatABEL functions.<br>
<br>
The whole process of conversion will look like this:<br>
<br>
1) The user asks GenA to convert a plink file to a GenA file<br>
2) GenA checks whether plink is installed. If it is
not installed,<br>
then GenA goes to the plink site and downloads/installs it
itself (using the R<br>
function "download.file" from the "utils" package)<br>
3) GenA runs a simple line: system('plink --file mydata
--R myscript.R')<br>
4) The Rplink function (from myscript.R) gets every SNP
and stores it in *flv<br>
format. This function creates an flv file and then
opens and closes it for<br>
saving every single SNP.<br>
5) Work is done<br>
</blockquote>
<br>
I'm not sure how portable it is to download and run
plink. Also, the<br>
plink page says: Currently, there is only support for
R-plugins for<br>
Linux-based and Mac OS PLINK distributions.<br>
<br>
<blockquote type="cite"><br>
The only issue is how fast the conversion will run:
how much time does<br>
it take to open a filevector file, store one SNP and
close it? I cannot<br>
find a DatABEL R function for adding a SNP to a flv
file. Is there a C<br>
DatABEL function which can do it?<br>
</blockquote>
<br>
Wouldn't it be easier/possible to use plink to export to
text (.csv) and<br>
then use filevector's text2fvf binary (of course this
could be done from<br>
R using system())?<br>
<br>
I'm also wondering if going per SNP is really necessary.
If I understand<br>
it correctly the R script (myscript.R) has to have a
function called:<br>
Rplink <- function(PHENO,GENO,CLUSTER,COVAR)<br>
where GENO is the matrix of genotypes. So we could write
that into a<br>
DatABEL file at once. Of course you may want to do this
per chromosome<br>
to reduce memory consumption (not sure how plink/R would
handle large<br>
data sets).<br>
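<br>
A rough sketch of the "whole matrix at once" idea, in base R only. Everything beyond the Rplink signature is an assumption: the output file name geno_dump.txt is hypothetical, the block is dumped as plain text instead of calling DatABEL, and the return value assumes plink expects, per SNP, a count followed by that many values (worth checking against the plink R-plugin page).<br>

```r
# Hypothetical output path; each call appends one block of SNPs to it
out_file <- "geno_dump.txt"

Rplink <- function(PHENO, GENO, CLUSTER, COVAR) {
  # GENO is individuals x SNPs; transpose so that each SNP becomes one row
  write.table(t(GENO), file = out_file, append = TRUE,
              quote = FALSE, col.names = FALSE, row.names = FALSE, na = "NA")
  # Assumed return format: for every SNP, the number of values (1)
  # followed by a single dummy value (0)
  as.numeric(rbind(rep(1, ncol(GENO)), rep(0, ncol(GENO))))
}
```

The text file could then be converted with DatABEL::text2databel. Note that this sketch drops SNP and sample names, so those would have to be carried along separately.<br>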
<br></div></blockquote></div></div></div></blockquote></div>
</blockquote><br><br>-- <br>Yurii S. Aulchenko<br>
<br>