From matthias.wuttke at uniklinik-freiburg.de  Thu Jan 21 16:25:46 2016
From: matthias.wuttke at uniklinik-freiburg.de (Matthias Wuttke)
Date: Thu, 21 Jan 2016 16:25:46 +0100
Subject: [GenABEL-dev] GenABEL impute2databel Patch
Message-ID: <56A0F87A.6020900@uniklinik-freiburg.de>

Hi!

We imputed our genotypes to the 1000 Genomes project, phase 3. In order
to be able to work with GenABEL/ProbABEL, I was required to shorten the
long variant identifiers to less than 32 chars.

Therefore, I added an option to do so to the impute2databel function
that takes a file with new (shorter) variant identifiers. Please find
attached a patch in case you like the new option.

Best regards
Matthias
-------------- next part --------------
Index: pkg/GenABEL/R/impute2databel.R
===================================================================
--- pkg/GenABEL/R/impute2databel.R	(revision 2045)
+++ pkg/GenABEL/R/impute2databel.R	(working copy)
@@ -14,6 +14,7 @@
 #' @param old for developers' use only
 #' @param dataOutType the output data type, either "FLOAT" or "DOUBLE" (or
 #' another DatABEL/filevector type)
+#' @param snpfile file with SNP names in each line (same order as columns in IMPUTE genotype file)
 #'
 #' @return 'databel-class' object
 #'
@@ -21,7 +22,7 @@
 impute2databel <- function(genofile, samplefile, outfile,
-        makeprob = TRUE, old = FALSE, dataOutType = "FLOAT")
+        makeprob = TRUE, old = FALSE, dataOutType = "FLOAT", snpfile = NA)
 {
     if (!require(DatABEL))
         stop("this function requires the DatABEL package to be installed")
@@ -36,6 +37,13 @@
         warning("The non-float dataOutType is not fully supported; your outputs may be in 'FLOAT'...",
                 immediate. = TRUE);
+    rowNamesSetting = 2
+    if (!is.na(snpfile)) {
+        if (missing(snpfile))
+            stop("snpfile file not found")
+        rowNamesSetting = snpfile
+    }
+
     ## extract snp names (varnames)
     ## if (tmpname != "")
     ## text2databel(infile=genofile,outfile=outfile,
@@ -47,7 +55,7 @@
     tmpname <- get_temporary_file_name()
     tmp_fv <- text2databel(infile=genofile,
             outfile=tmpname,
-            rownames=2,
+            rownames=rowNamesSetting,
             skipcols=5,
             ## skiprows,
             transpose=TRUE,

From lennart at karssen.org  Mon Jan 25 11:06:42 2016
From: lennart at karssen.org (L.C. Karssen)
Date: Mon, 25 Jan 2016 11:06:42 +0100
Subject: [GenABEL-dev] GenABEL impute2databel Patch
In-Reply-To: <56A0F87A.6020900@uniklinik-freiburg.de>
References: <56A0F87A.6020900@uniklinik-freiburg.de>
Message-ID: <56A5F3B2.3060700@karssen.org>

Hi Matthias,

Thank you for submitting this patch!

On 21-01-16 16:25, Matthias Wuttke wrote:
> Hi!
>
> We imputed our genotypes to the 1000 Genomes project, phase 3. In order
> to be able to work with GenABEL/ProbABEL, I was required to shorten the
> long variant identifiers to less than 32 chars.

Interesting. I haven't looked at the code in much detail (time is
limited unfortunately), but can you tell me if this is a problem in
ProbABEL, GenABEL or both? An alternative to your patch could of course
be to allow longer variant names.

>
> Therefore, I added an option to do so to the impute2databel function
> that takes a file with new (shorter) variant identifiers. Please find
> attached a patch in case you like the new option.
>

I've committed it to SVN (r.2049), so it will be part of the next
GenABEL release.

Thanks again for your contributions to GenABEL and ProbABEL! They are
much appreciated.

Best regards,

Lennart.

> Best regards
> Matthias
>
>
>
> _______________________________________________
> genabel-devel mailing list
> genabel-devel at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
>

-- 
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
L.C. Karssen
Utrecht
The Netherlands

lennart at karssen.org
http://blog.karssen.org
GPG key ID: A88F554A
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
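
A note on the snpfile check in the patch: missing(snpfile) is tested inside the !is.na(snpfile) branch, so the stop() appears to be unreachable, because an omitted snpfile falls back to the default NA and never enters the branch. Assuming the intent was to fail when the file does not exist, the logic could be sketched as below. The helper name resolve_rownames and the max_chars limit of 31 (from "less than 32 chars" in the thread) are hypothetical and not part of GenABEL; the length and uniqueness checks are likewise additions for illustration only.

```r
## Hypothetical helper (not part of GenABEL): returns the value to hand to
## text2databel(rownames = ...), as rowNamesSetting does in the patch.
resolve_rownames <- function(snpfile = NA, max_chars = 31) {
  ## No file given: keep the original behaviour (names from column 2)
  if (is.null(snpfile) || is.na(snpfile))
    return(2)

  ## Assumed intent of the patch's check: the file must exist
  if (!file.exists(snpfile))
    stop("snpfile not found: ", snpfile)

  ## Extra sanity checks (assumptions, not in the patch)
  ids <- readLines(snpfile)
  too_long <- nchar(ids) > max_chars
  if (any(too_long))
    stop(sum(too_long), " identifier(s) longer than ", max_chars,
         " characters in ", snpfile)
  if (anyDuplicated(ids) > 0)
    stop("duplicated identifiers in ", snpfile)

  snpfile
}

## Usage: write two short identifiers to a temporary file and resolve it
tmp <- tempfile(fileext = ".txt")
writeLines(c("rs0001", "rs0002"), tmp)
rowNamesSetting <- resolve_rownames(tmp)
```

With no argument the helper returns 2, so the existing call path stays unchanged for users who do not supply a file.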