[GenABEL-dev] ISNAN problem in filevector found by Jenkins

Maksim Struchalin m.v.struchalin at mail.ru
Mon Nov 25 19:21:57 CET 2013


That's good. The warnings appear because the SNP name in the second
line of the TEST10x15.geno file is the same as in the first one. This is
fixed in the version at https://app.box.com/s/6irjkikm3mukwvtk1wgw. Can
you download it and run the test once again?
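
A quick way to check a .geno file for repeated SNP names before running
the test is to scan the name column. This is only a minimal C++ sketch,
not code from GenABEL or DatABEL: the function name duplicate_names is
made up for illustration, and it assumes whitespace-separated fields with
the SNP name in column 2 (matching "rncol = 2" in the conversion log).

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: return the names that occur more than once in the
// given 1-based, whitespace-separated column. Each repeated name is
// reported once, in order of its first repetition.
std::vector<std::string> duplicate_names(std::istream& in, std::size_t column) {
    assert(column >= 1);
    std::set<std::string> seen;      // every name read so far
    std::set<std::string> reported;  // names already added to the result
    std::vector<std::string> duplicates;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string field;
        std::size_t index = 0;
        // Read fields until 'field' holds the requested column.
        while (index < column && (fields >> field)) ++index;
        if (index < column) continue;  // too few fields on this line; skip it
        if (!seen.insert(field).second && reported.insert(field).second)
            duplicates.push_back(field);
    }
    return duplicates;
}
```

To use it, open the file with std::ifstream and call
duplicate_names(file, 2); an empty result means all names are unique.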

BTW1: The versions of GenABEL and DatABEL here are old (I generated this
example a week ago). That's why there are some warnings from other functions.
BTW2: Remove the files TEST10x15_T.geno.dose.fvd and
TEST10x15_T.geno.dose.fvi after each run. Otherwise you get an error
message (File 'TEST10x15_T.geno.dose' already exists.).
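
The cleanup can also be automated. The following is a minimal C++ sketch
under the assumption that only the .fvd and .fvi files need to go; the
helper names file_exists and remove_stale_output are hypothetical and are
not part of filevector or DatABEL.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// True if the file can be opened for reading.
bool file_exists(const std::string& path) {
    std::ifstream f(path.c_str());
    return f.good();
}

// Hypothetical helper: remove <base>.fvd and <base>.fvi if they are present
// and return how many files were actually removed.
int remove_stale_output(const std::string& base) {
    assert(!base.empty());
    const char* extensions[] = {".fvd", ".fvi"};
    int removed = 0;
    for (const char* ext : extensions) {
        const std::string path = base + ext;
        if (file_exists(path) && std::remove(path.c_str()) == 0) ++removed;
    }
    return removed;
}
```

For the test above the call would be
remove_stale_output("TEST10x15_T.geno.dose"), run before each new attempt.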

best,
Maksim


On 25/11/2013 19:28, Yury Aulchenko wrote:
> after renaming out-file-name I get (finished with warnings):
>
> > x <- impute2databel(geno="TEST10x15.geno", sample="impute.sample5", 
> out="TEST10x15_T1.geno", makeprob=FALSE, old=TRUE)
> Loading required package: DatABEL
> DatABEL v.0.9-4 (March 12, 2013) loaded
>
> Options in effect:
> --infile    = TEST10x15.geno
> --outfile   = ./tmp300070
> --skiprows  = OFF
> --skipcols  = 5
> --cnrow     = OFF
> --rncol     = ON, using column 2 of 'TEST10x15.geno'
> --transpose = ON
> --Rmatrix   = OFF
> --nanString = NA
> Number of lines in source file is 10
> Number of words in source file is 20
> skiprows = 0
> cnrow = 0
> skipcols = 5
> rncol = 2
> Rmatrix = 0
> numWords = 20
> Creating file with numRows = 10
> Creating file with numColumns = 15
> text2fvf finished.
> Loss of precision / loss of data during conversion from DOUBLE to FLOAT.
> Futher conversion warnings omitted.
> Read 4 items
> Read 4 items
> Read 4 items
> Read 20 items
> Warning messages:
> 1: In uninames(.Object@data) :
>   uninames: some column names are not unique; use 
> set_dimnames/get_dimnames for non-unique row/col names
> 2: In uninames(x@data) :
>   uninames: some column names are not unique; use 
> set_dimnames/get_dimnames for non-unique row/col names
> 3: In uninames(x@data) :
>   uninames: some column names are not unique; use 
> set_dimnames/get_dimnames for non-unique row/col names
> >
>
>
> On Nov 25, 2013, at 13:23 PM, L.C. Karssen <lennart at karssen.org> wrote:
>
>> I had the same issues as Yurii reported:
>> - compilation warnings when installing
>> - the 'file already exists' warning: this was one of the reasons why I
>> renamed the filename in the 'out=' option in test.R
>> - I also got the warning about not-unique column names, but I assumed
>> that wasn't the issue we were trying to fix here, so I didn't mention it.
>>
>> Yurii, does the script finish when you change the output file name in
>> test.R?
>>
>>
>>
>> Lennart.
>>
>> On 11/25/2013 12:26 PM, Yury Aulchenko wrote:
>>> Tried this on Mac OS X, see below
>>>
>>> On Nov 23, 2013, at 15:12 PM, Maksim Struchalin
>>> <m.v.struchalin at mail.ru> wrote:
>>>
>>>> Hi Lennart,
>>>>
>>>> https://app.box.com/s/iy41ug5qg4sbylul9oyn
>>>>
>>>> This is an example demonstrating how a GenABEL function
>>>> "impute2databel" calls a function "iteratorDA" from DatABEL. Here,
>>>> GenABEL is compiled without flv's and iterator's code (I deleted it
>>>> from src).
>>>>
>>>> Could you run the test?:
>>>> 0) Download the file test_GenABEL_iterator.tar.gz from
>>>> https://app.box.com/s/iy41ug5qg4sbylul9oyn
>>>> 1) decompress test_GenABEL_iterator.tar.gz
>>>> 2) cd test_GenABEL_iterator
>>>> 3) R CMD INSTALL DatABEL_0.9-4.tar.gz
>>>> 4) R CMD INSTALL GenABEL_1.7-7.tar.gz
>>>
>>> getting many warnings at step (4)
>>>
>>>> 5) run test.R
>>>
>>> getting
>>>
>>> > x <- impute2databel(geno="TEST10x15.geno", sample="impute.sample5",
>>> out="TEST10x15_T.geno", makeprob=FALSE, old=TRUE)
>>> Loading required package: DatABEL
>>> DatABEL v.0.9-4 (March 12, 2013) loaded
>>>
>>> Options in effect:
>>> --infile    = TEST10x15.geno
>>> --outfile   = ./tmp333314
>>> --skiprows  = OFF
>>> --skipcols  = 5
>>> --cnrow     = OFF
>>> --rncol     = ON, using column 2 of 'TEST10x15.geno'
>>> --transpose = ON
>>> --Rmatrix   = OFF
>>> --nanString = NA
>>> Number of lines in source file is 10
>>> Number of words in source file is 20
>>> skiprows = 0
>>> cnrow = 0
>>> skipcols = 5
>>> rncol = 2
>>> Rmatrix = 0
>>> numWords = 20
>>> Creating file with numRows = 10
>>> Creating file with numColumns = 15
>>> text2fvf finished.
>>> File 'TEST10x15_T.geno.dose' already exists.
>>> ERROR in Rstuff:failed in ini_empty_FileMatrix_RError in !result :
>>> invalid argument type
>>> Calls: impute2databel -> apply2dfo -> make_empty_fvf
>>> In addition: Warning message:
>>> In uninames(.Object@data) :
>>>  uninames: some column names are not unique; use
>>> set_dimnames/get_dimnames for non-unique row/col names
>>> Execution halted
>>>
>>>
>>>>
>>>> It works on my Ubuntu machine. If it also works on your Ubuntu, Windows
>>>> and Mac machines, then we can delete the symlinks to flv and databel
>>>> from GenABEL.
>>>>
>>>> best,
>>>> Максим
>>>>
>>>>
>>>> On 22/11/2013 21:34, L.C. Karssen wrote:
>>>>> Hi Максим,
>>>>>
>>>>>
>>>>> On 11/19/2013 03:17 PM, Maksim Struchalin wrote:
>>>>>> Hi Lennart,
>>>>>>
>>>>>> I see you are improving your Russian :-).
>>>>> Getting to know the Russian alphabet is step one :-).
>>>>>
>>>>>> I understand your arguments. I think we can combine our two approaches.
>>>>>> 1) We build a so/dll from filevector and let
>>>>>> ProbABEL/OmicABEL/Another_not_R_softABEL use it.
>>>>>> 2) For GenABEL and other R packages, we build DatABEL.
>>>>>>
>>>>>> The code of filevector is the same both for 1) and 2).
>>>>> But that doesn't solve the problem of having symlinks to the fvlib
>>>>> directory in our SVN tree... Which means that any update to filevector
>>>>> can make the depending package (DatABEL) become uncompilable.
>>>>>
>>>>> In the meantime I've taken the first steps towards 'libfilevector' in
>>>>> SVN, see commits 1415 and 1416. This works (at least for ProbABEL), but
>>>>> more polishing is needed.
>>>>>
>>>>>
>>>>>> We only add
>>>>>> preprocessor directives (#ifdef and so on) around the R-specific code
>>>>>> (ISNAN() versus std::isnan()). That way the compiler itself chooses
>>>>>> whether it builds the lib for R or for the OS.
>>>>>>
>>>>>> If we want to use only approach 1) for GenABEL in the future, we
>>>>>> can quickly switch to it later.
>>>>> True, for now this will work.
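
As an illustration of the #ifdef approach discussed in the quoted text,
a minimal sketch could look like the following. The macro FV_BUILD_FOR_R
and the names FV_ISNAN, fv_is_nan and fv_self_check are hypothetical (they
are not in the filevector code); the assumption is that the R package
build passes -DFV_BUILD_FOR_R and the stand-alone build does not.

```cpp
// Hypothetical guard macro: define FV_BUILD_FOR_R only when compiling
// filevector as part of an R package.
#ifdef FV_BUILD_FOR_R
#include <R_ext/Arith.h>   // R header that defines the ISNAN() macro
#define FV_ISNAN(x) ISNAN(x)
#else
#include <cmath>           // std::isnan, available since C++11
#define FV_ISNAN(x) std::isnan(x)
#endif

#include <cassert>
#include <limits>

// Single entry point that the rest of the code would call instead of
// using ISNAN() or std::isnan() directly.
inline bool fv_is_nan(double value) {
    return FV_ISNAN(value) != 0;
}

// Small usage example: NaN is detected, ordinary numbers are not.
inline void fv_self_check() {
    assert(fv_is_nan(std::numeric_limits<double>::quiet_NaN()));
    assert(!fv_is_nan(1.5));
}
```

With this layout the same source compiles for ProbABEL (no R headers
needed) and for DatABEL, and only the build flags differ.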
>>>>>
>>>>>
>>>>> Best,
>>>>>
>>>>> Lennart.
>>>>>
>>>>>> best,
>>>>>> Maksim
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 19/11/2013 16:10, L.C. Karssen wrote:
>>>>>>> Hi Максим,
>>>>>>>
>>>>>>> (trying a Russian keyboard layout, no idea if this works...).
>>>>>>>
>>>>>>> On 19-11-13 09:44, Maksim Struchalin wrote:
>>>>>>>> It seems that your solution is workable, but I see little difference
>>>>>>>> from how it is now. Now the filevector code is incorporated in each
>>>>>>>> package.
>>>>>>> This is what I would like to change, indeed. Code that is reused 
>>>>>>> by so
>>>>>>> many packages should not be copied/symlinked into the code tree 
>>>>>>> of those
>>>>>>> packages. By symlinking it as we have now, there is no proper way of
>>>>>>> specifying a version number of the filevector code. Which, in 
>>>>>>> turn means
>>>>>>> that if something changes in the filevector code all other 
>>>>>>> packages need
>>>>>>> to be changed immediately (just like what happened with your latest
>>>>>>> change). If the filevector code had been a proper library we could have
>>>>>>> simply said that ProbABEL still depends on the old filevector version
>>>>>>> and taken more time to make sure the two play nice together.
>>>>>>>
>>>>>>> Moreover, with the filevector code in a separate library the whole
>>>>>>> isnan() issue would not be a problem. We could simply use 
>>>>>>> std::isnan(),
>>>>>>> because CRAN wouldn't need to compile the .so/.dll, so no need 
>>>>>>> of ISNAN().
>>>>>>> When code is put in a library the internals don't matter as long 
>>>>>>> as the
>>>>>>> interface (function names + arguments) to the outside doesn't 
>>>>>>> change.
>>>>>>>
>>>>>>>> You propose to follow the same way but pack the filevector code
>>>>>>>> in one file (dll or so) and distribute the 9 packages from GenABEL
>>>>>>>> with the same library.
>>>>>>> Indeed. The problem with incorporating it all in DatABEL is that 
>>>>>>> non-R
>>>>>>> packages like ProbABEL and OmicABEL depend on the stuff in the fvlib
>>>>>>> directory as well. Filevector is central to (almost) all 
>>>>>>> packages in the
>>>>>>> GenABEL suite, which is why I proposed to make a library out of 
>>>>>>> it. And,
>>>>>>> as noted above, this way packages can depend on different versions of
>>>>>>> the library.
>>>>>>>
>>>>>>> We can of course discuss whether we want to distribute this .so/.dll as
>>>>>>> a separate (operating system) package or within the R packages. To me
>>>>>>> the first option is the 'correct' one, but I see that this may impose on
>>>>>>> the user (except on Windows and maybe MacOS, where the .so/.dll is
>>>>>>> included in the R package).
>>>>>>>
>>>>>>>
>>>>>>>> Last time I proposed to move filevector into DatABEL. All other
>>>>>>>> packages (GenABEL and so on) will load DatABEL in R and use the
>>>>>>>> filevector functions from DatABEL. When DatABEL is loaded through
>>>>>>>> library(DatABEL), the file DatABEL.so is loaded as well.
>>>>>>> I think this is what should be done with the DAlib directory 
>>>>>>> (another
>>>>>>> symlinked dir).
>>>>>>>
>>>>>>>> Thus, you do not need to ask users to
>>>>>>>> install additional lib because it is in DatABEL already. I 
>>>>>>>> think this is
>>>>>>>> a workable approach that will allow us to delete the filevector 
>>>>>>>> code (or
>>>>>>>> filevector so/dll) from all the packages.
>>>>>>>>
>>>>>>>>
>>>>>>>> This is a quote from the R manual on how to register functions to
>>>>>>>> make them available from DatABEL to GenABEL:
>>>>>>>>
>>>>>>>>
>>>>>>>>      _______________________________________________
>>>>>>>>
>>>>>>>>
>>>>>>>>      5.4 Registering native routines
>>>>>>>>
>>>>>>>> By 'native' routine, we mean an entry point in compiled code.
>>>>>>>>
>>>>>>>> In calls to |.C|, |.Call|, |.Fortran| and |.External|, R must 
>>>>>>>> locate the
>>>>>>>> specified native routine by looking in the appropriate shared
>>>>>>>> object/DLL. By default, R uses the operating system-specific 
>>>>>>>> dynamic
>>>>>>>> loader to lookup the symbol in all loaded DLLs and elsewhere.
>>>>>>>> Alternatively, the author of the DLL can explicitly register 
>>>>>>>> routines
>>>>>>>> with R and use a single, platform-independent mechanism for 
>>>>>>>> finding the
>>>>>>>> routines in the DLL. One can use this registration mechanism to 
>>>>>>>> provide
>>>>>>>> additional information about a routine, including the number 
>>>>>>>> and type of
>>>>>>>> the arguments, and also make it available to R programmers under a
>>>>>>>> different name. In the future, registration may be used to 
>>>>>>>> implement a
>>>>>>>> form of "secure" or limited native access.
>>>>>>>>
>>>>>>>> _____________________________________________________
>>>>>>>>
>>>>>>> Hmm, I will have to think about this. This seems to be about how 
>>>>>>> R finds
>>>>>>> out in which DLL a function is found (and maybe where the DLL is 
>>>>>>> found
>>>>>>> in the filesystem). I think this is separate from the point 
>>>>>>> below, but
>>>>>>> I'm not sure.
>>>>>>>
>>>>>>>> Your argument was from "5.8 Linking to other packages: It is not in
>>>>>>>> general possible to link a DLL in package *packA* to a DLL provided by
>>>>>>>> package *packB*". I do not quite understand what they mean by 'link'.
>>>>>>>> Maybe they mean linking a library during installation?
>>>>>>> Yes, as far as I understand, they mean linking to a library during
>>>>>>> installation/compilation.
>>>>>>>
>>>>>>>
>>>>>>> Best,
>>>>>>>
>>>>>>> Lennart.
>>>>>>>> best,
>>>>>>>> Maksim
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 19/11/2013 15:14, L.C. Karssen wrote:
>>>>>>>>> Hi Maksim,
>>>>>>>>>
>>>>>>>>> Good question... The idea is to generate a .dll file for 
>>>>>>>>> Windows, but
>>>>>>>>> I'm not sure what would be the best way to distribute that. It 
>>>>>>>>> would be
>>>>>>>>> interesting to see how other packages do that. For example, the XML
>>>>>>>>> package depends on libxml2:
>>>>>>>>> http://cran.r-project.org/web/packages/XML/index.html and the RCurl
>>>>>>>>> package depends on libcurl:
>>>>>>>>> http://cran.r-project.org/web/packages/RCurl/index.html
>>>>>>>>>
>>>>>>>>> In the XML package .zip file for Windows there is a directory 
>>>>>>>>> libs/x64
>>>>>>>>> and a directory libs/i386. Both contain XML.dll, so I think 
>>>>>>>>> that for
>>>>>>>>> Linux you simply specify a dependency on a library, whereas 
>>>>>>>>> for Windows
>>>>>>>>> the actual .dll is in the package (which is quite logical because
>>>>>>>>> Windows lacks the package repositories that most Linux distros 
>>>>>>>>> have).
>>>>>>>>> It seems that for MacOS the .tgz file also contains a lib 
>>>>>>>>> directory with
>>>>>>>>> the .so file.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>>
>>>>>>>>> Lennart.
>>>>>>>>>
>>>>>>>>> On 19-11-13 08:56, Maksim Struchalin wrote:
>>>>>>>>>> Hi Lennart,
>>>>>>>>>>
>>>>>>>>>> How will users on Windows install such a library?
>>>>>>>>>>
>>>>>>>>>> best,
>>>>>>>>>> Maksim
>>>>>>>>>>
>>>>>>>>>> On 19/11/2013 14:46, L.C. Karssen wrote:
>>>>>>>>>>> Dear all,
>>>>>>>>>>>
>>>>>>>>>>> The Jenkins setup already shows its value: After Maksim changed the
>>>>>>>>>>> call from std::isnan() to ISNAN() in fvlib's CastUtils.cpp an
>>>>>>>>>>> automatic build of ProbABEL was triggered and it failed (because
>>>>>>>>>>> ISNAN() is only defined in R's headers).
>>>>>>>>>>>
>>>>>>>>>>> I guess this is one more reason to try to convert fvlib into a real
>>>>>>>>>>> (shared) library.
>>>>>>>>>>> Does anyone have another workable solution?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Lennart.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> genabel-devel mailing list
>>>>>>>>>>> genabel-devel at lists.r-forge.r-project.org
>>>>>>>>>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel
>>
>> --
>> *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
>> L.C. Karssen
>> Utrecht
>> The Netherlands
>>
>> lennart at karssen.org
>> http://blog.karssen.org
>> GPG key ID: A88F554A
>> -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
