[GenABEL-dev] ISNAN problem in filevector found by Jenkins

L.C. Karssen lennart at karssen.org
Mon Nov 25 13:23:59 CET 2013


I had the same issues as Yurii reported:
- compilation warnings when installing
- the 'file already exists' warning: this was one of the reasons why I
changed the file name in the 'out=' option in test.R
- I also got the warning about non-unique column names, but I assumed
that wasn't the issue we were trying to fix here, so I didn't mention it.

Yurii, does the script finish when you change the output file name in
test.R?



Lennart.

On 11/25/2013 12:26 PM, Yury Aulchenko wrote:
> Tried this on Mac OS X, see below
> 
> On Nov 23, 2013, at 15:12 PM, Maksim Struchalin <m.v.struchalin at mail.ru
> <mailto:m.v.struchalin at mail.ru>> wrote:
> 
>> Hi Lennart,
>>
>> https://app.box.com/s/iy41ug5qg4sbylul9oyn
>>
>> This is an example demonstrating how a GenABEL function
>> "impute2databel" calls a function "iteratorDA" from DatABEL. Here,
>> GenABEL is compiled without flv's and iterator's code (I deleted it
>> from src).
>>
>> Could you run the test?:
>> 0) Download the file test_GenABEL_iterator.tar.gz from
>> https://app.box.com/s/iy41ug5qg4sbylul9oyn
>> 1) decompress test_GenABEL_iterator.tar.gz
>> 2) cd test_GenABEL_iterator
>> 3) R CMD INSTALL DatABEL_0.9-4.tar.gz
>> 4) R CMD INSTALL GenABEL_1.7-7.tar.gz
> 
> getting many warnings at step (4)
> 
>> 5) run test.R
> 
> getting 
> 
>> x <- impute2databel(geno="TEST10x15.geno", sample="impute.sample5",
> out="TEST10x15_T.geno", makeprob=FALSE, old=TRUE)
> Loading required package: DatABEL
> DatABEL v.0.9-4 (March 12, 2013) loaded
> 
> Options in effect: 
> --infile    = TEST10x15.geno
> --outfile   = ./tmp333314
> --skiprows  = OFF
> --skipcols  = 5
> --cnrow     = OFF
> --rncol     = ON, using column 2 of 'TEST10x15.geno'
> --transpose = ON
> --Rmatrix   = OFF
> --nanString = NA
> Number of lines in source file is 10
> Number of words in source file is 20
> skiprows = 0
> cnrow = 0
> skipcols = 5
> rncol = 2
> Rmatrix = 0
> numWords = 20
> Creating file with numRows = 10
> Creating file with numColumns = 15
> text2fvf finished.
> File 'TEST10x15_T.geno.dose' already exists.
> ERROR in Rstuff:failed in ini_empty_FileMatrix_RError in !result :
> invalid argument type
> Calls: impute2databel -> apply2dfo -> make_empty_fvf
> In addition: Warning message:
> In uninames(.Object at data) :
>   uninames: some column names are not unique; use
> set_dimnames/get_dimnames for non-unique row/col names
> Execution halted
> 
> 
>>
>> It works on my Ubuntu machine. If it also works on your Ubuntu, Windows
>> and Mac machines, then we can delete the symlinks to flv and databel
>> from GenABEL.
>>
>> best,
>> Максим
>>
>>
>> On 22/11/2013 21:34, L.C. Karssen wrote:
>>> Hi максим,
>>>
>>>
>>> On 11/19/2013 03:17 PM, Maksim Struchalin wrote:
>>>> Hi Lennart,
>>>>
>>>> I see you are improving your Russian :-).
>>> Getting to know the Russian alphabet is step one :-).
>>>
>>>> I understand your arguments. I think we can combine our two approaches.
>>>> 1) We make a so/dll from filevector and let it be used by
>>>> ProbABEL/OmicABEL/Another_not_R_softABEL.
>>>> 2) For GenABEL and other R packages, we build it into DatABEL.
>>>>
>>>> The code of filevector is the same both for 1) and 2). 
>>> But that doesn't solve the problem of having symlinks to the fvlib
>>> directory in our SVN tree... Which means that any update to filevector
>>> can make the depending package (DatABEL) become uncompilable.
>>>
>>> In the meantime I've taken the first steps towards 'libfilevector' in
>>> SVN, see commits 1415 and 1416. This works (at least for ProbABEL), but
>>> more polishing is needed.
>>>
>>>
>>>> We only add
>>>> preprocessor directives (#ifdef and so on) around the R-specific code
>>>> (ISNAN() versus std::isnan()). The compiler then selects by itself whether
>>>> it builds the lib for R or for the OS.
>>>>
>>>> If we want to use only approach 1) for GenABEL in the future, we
>>>> can quickly switch to it later.
>>> True, for now this will work.
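
A minimal sketch of the #ifdef guard discussed above, not the actual fvlib code. The macro name FV_BUILD_FOR_R and the helpers fv_isnan() and count_nan() are hypothetical; the sketch assumes the R build passes -DFV_BUILD_FOR_R and that the R headers are on the include path in that case.

```cpp
#include <cmath>
#include <cstddef>

// FV_BUILD_FOR_R is a hypothetical macro: define it (e.g. -DFV_BUILD_FOR_R)
// only when compiling the code as part of an R package.
#ifdef FV_BUILD_FOR_R
#include <R_ext/Arith.h>  // R header that provides ISNAN()
#endif

// Single NaN check that the rest of the code calls (hypothetical helper).
inline bool fv_isnan(double x) {
#ifdef FV_BUILD_FOR_R
    return ISNAN(x) != 0;   // R build: use R's own check
#else
    return std::isnan(x);   // standalone build: use the standard library
#endif
}

// Example caller: count the NaN entries in a buffer of n doubles.
inline std::size_t count_nan(const double* data, std::size_t n) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (fv_isnan(data[i])) {
            ++count;
        }
    }
    return count;
}
```

Keeping the guard inside one helper means only that function differs between the R build and the standalone build; callers such as count_nan() stay identical.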
>>>
>>>
>>> Best,
>>>
>>> Lennart.
>>>
>>>> best,
>>>> Maksim
>>>>
>>>>
>>>>
>>>> On 19/11/2013 16:10, L.C. Karssen wrote:
>>>>> Hi максим,
>>>>>
>>>>> (trying a Russian keyboard layout, no idea if this works...).
>>>>>
>>>>> On 19-11-13 09:44, Maksim Struchalin wrote:
>>>>>> It seems that your solution is workable but I see little difference from
>>>>>> what we have now. Now the filevector code is incorporated in each
>>>>>> package. 
>>>>> This is what I would like to change, indeed. Code that is reused by so
>>>>> many packages should not be copied/symlinked into the code tree of those
>>>>> packages. By symlinking it as we have now, there is no proper way of
>>>>> specifying a version number of the filevector code. Which, in turn means
>>>>> that if something changes in the filevector code all other packages need
>>>>> to be changed immediately (just like what happened with your latest
>>>>> change). If the filevector code had been a proper library we could have
>>>>> simply said that ProbABEL still depends on the old filevector version
>>>>> and take more time to make sure the two play nice together.
>>>>>
>>>>> Moreover, with the filevector code in a separate library the whole
>>>>> isnan() issue would not be a problem. We could simply use std::isnan(),
>>>>> because CRAN wouldn't need to compile the .so/.dll, so no need of ISNAN().
>>>>> When code is put in a library the internals don't matter as long as the
>>>>> interface (function names + arguments) to the outside doesn't change.
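
A small sketch of the two points made above: a library that carries a version number, and a stable interface that hides its internals. Everything in the fvsketch namespace is hypothetical and does not exist in filevector; the compatibility rule (same major number, at least the required minor number) is an assumption chosen for illustration.

```cpp
#include <cmath>

namespace fvsketch {

// Hypothetical version record for the library.
struct Version {
    int major_no;
    int minor_no;
};

// Version of the library a client is running against (made-up value).
inline Version library_version() { return Version{1, 2}; }

// Assumed rule: the major number must match and the provided minor
// number must be at least the required one.
inline bool is_compatible(Version required, Version provided) {
    return provided.major_no == required.major_no &&
           provided.minor_no >= required.minor_no;
}

// Public interface: callers depend only on this name and signature, so
// the library is free to use std::isnan() internally.
inline bool is_missing(double value) { return std::isnan(value); }

}  // namespace fvsketch
```

With such a check, a package like ProbABEL could keep requiring an older version until it has been adapted, instead of breaking on every change to the shared code.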
>>>>>
>>>>>> You propose to follow the same way but pack the filevector code
>>>>>> in one file (dll or so) and distribute 9 packages from GenABEL with the
>>>>>> same library.
>>>>> Indeed. The problem with incorporating it all in DatABEL is that non-R
>>>>> packages like ProbABEL and OmicABEL depend on the stuff in the fvlib
>>>>> directory as well. Filevector is central to (almost) all packages in the
>>>>> GenABEL suite, which is why I proposed to make a library out of it. And,
>>>>> as noted above, this way packages can depend on different versions of the
>>>>> library.
>>>>>
>>>>> We can of course discuss whether we want to distribute this .so/.dll as
>>>>> a separate (operating system) package or within the R packages. To me
>>>>> the first option is the 'correct' one, but I see that this may impose on
>>>>> the user (except on Windows and maybe MacOS, where the .so/.dll is
>>>>> included in the R package).
>>>>>
>>>>>
>>>>>> Last time I proposed to move filevector into DatABEL. All other packages
>>>>>> (GenABEL and so on) will load DatABEL in R and use filevector functions
>>>>>> from DatABEL. When DatABEL is loaded through library(DatABEL), the file
>>>>>> DatABEL.so is loaded as well.
>>>>> I think this is what should be done with the DAlib directory (another
>>>>> symlinked dir).
>>>>>
>>>>>>  Thus, you do not need to ask users to
>>>>>> install additional lib because it is in DatABEL already. I think this is
>>>>>> a workable approach that will allow us to delete the filevector code (or
>>>>>> filevector so/dll) from all the packages.
>>>>>>
>>>>>>
>>>>>> Here is a quote from the R manual on how to register functions to make
>>>>>> them available from DatABEL to GenABEL:
>>>>>>
>>>>>>
>>>>>>       _______________________________________________
>>>>>>
>>>>>>
>>>>>>       5.4 Registering native routines
>>>>>>
>>>>>> By ‘native’ routine, we mean an entry point in compiled code.
>>>>>>
>>>>>> In calls to .C, .Call, .Fortran and .External, R must locate the
>>>>>> specified native routine by looking in the appropriate shared
>>>>>> object/DLL. By default, R uses the operating system-specific dynamic
>>>>>> loader to lookup the symbol in all loaded DLLs and elsewhere.
>>>>>> Alternatively, the author of the DLL can explicitly register routines
>>>>>> with R and use a single, platform-independent mechanism for finding the
>>>>>> routines in the DLL. One can use this registration mechanism to provide
>>>>>> additional information about a routine, including the number and type of
>>>>>> the arguments, and also make it available to R programmers under a
>>>>>> different name. In the future, registration may be used to implement a
>>>>>> form of “secure” or limited native access.
>>>>>>
>>>>>> _____________________________________________________
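
To illustrate the idea in the quoted passage (looking a routine up by name in a registration table rather than through the operating system's loader), here is a toy registry. It is not R's API: R's own registration goes through R_registerRoutines() in R_ext/Rdynload.h. The names Registry, Routine and twice are hypothetical.

```cpp
#include <map>
#include <stdexcept>
#include <string>

namespace regsketch {

// A "native routine" in this toy example: takes and returns a double.
using Routine = double (*)(double);

// Toy registration table mapping a public name to an entry point.
class Registry {
public:
    // Register a routine under the name callers will use.
    void add(const std::string& name, Routine fn) { table_[name] = fn; }

    // Look a routine up by name; throw if it was never registered.
    Routine find(const std::string& name) const {
        auto it = table_.find(name);
        if (it == table_.end()) {
            throw std::out_of_range("routine not registered: " + name);
        }
        return it->second;
    }

private:
    std::map<std::string, Routine> table_;
};

// Example routine that a package could expose to other packages.
inline double twice(double x) { return 2.0 * x; }

}  // namespace regsketch
```

A caller only needs the registered name, for example registry.find("twice")(2.5), and does not need to know which shared object holds the routine.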
>>>>>>
>>>>> Hmm, I will have to think about this. This seems to be about how R finds
>>>>> out in which DLL a function is found (and maybe where the DLL is found
>>>>> in the filesystem). I think this is separate from the point below, but
>>>>> I'm not sure.
>>>>>
>>>>>> Your argument was from "5.8 Linking to other packages: It is not in
>>>>>> general possible to link a DLL in package packA to a DLL provided by
>>>>>> package packB". I do not quite understand what they mean by 'link'.
>>>>>> Maybe they mean linking a library during installation?
>>>>> Yes, as far as I understand, they mean linking to a library during
>>>>> installation/compilation.
>>>>>
>>>>>
>>>>> Best,
>>>>>
>>>>> Lennart.
>>>>>> best,
>>>>>> Maksim
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 19/11/2013 15:14, L.C. Karssen wrote:
>>>>>>> Hi Maksim,
>>>>>>>
>>>>>>> Good question... The idea is to generate a .dll file for Windows, but
>>>>>>> I'm not sure what would be the best way to distribute that. It would be
>>>>>>> interesting to see how other packages do that. For example, the XML
>>>>>>> package depends on libxml2:
>>>>>>> http://cran.r-project.org/web/packages/XML/index.html and the RCurl
>>>>>>> package depends on libcurl:
>>>>>>> http://cran.r-project.org/web/packages/RCurl/index.html
>>>>>>>
>>>>>>> In the XML package .zip file for Windows there is a directory libs/x64
>>>>>>> and a directory libs/i386. Both contain XML.dll, so I think that for
>>>>>>> Linux you simply specify a dependency on a library, whereas for Windows
>>>>>>> the actual .dll is in the package (which is quite logical because
>>>>>>> Windows lacks the package repositories that most Linux distros have).
>>>>>>> It seems that for MacOS the .tgz file also contains a lib directory with
>>>>>>> the .so file.
>>>>>>>
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>> Lennart.
>>>>>>>
>>>>>>> On 19-11-13 08:56, Maksim Struchalin wrote:
>>>>>>>> Hi Lennart,
>>>>>>>>
>>>>>>>> How will users on Windows install such a library?
>>>>>>>>
>>>>>>>> best,
>>>>>>>> Maksim
>>>>>>>>
>>>>>>>> On 19/11/2013 14:46, L.C. Karssen wrote:
>>>>>>>>> Dear all,
>>>>>>>>>
>>>>>>>>> The Jenkins setup already shows its value: After Maksim changed the call
>>>>>>>>> from std::isnan() to ISNAN() in fvlib's CastUtils.cpp an automatic build
>>>>>>>>> of ProbABEL was triggered and it failed (because ISNAN() is a macro from
>>>>>>>>> the R headers, which are not available in a non-R build).
>>>>>>>>>
>>>>>>>>> I guess this is one more reason to try to convert fvlib into a real
>>>>>>>>> (shared) library.
>>>>>>>>> Does anyone have another workable solution?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Lennart.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> genabel-devel mailing list
>>>>>>>>> genabel-devel at lists.r-forge.r-project.org
>>>>>>>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/genabel-devel

-- 
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
L.C. Karssen
Utrecht
The Netherlands

lennart at karssen.org
http://blog.karssen.org
GPG key ID: A88F554A
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
