[Rcpp-devel] Handling NAs in RcppMatrix

Sun Feb 7 14:11:48 CET 2010

D'oh, I should have guessed it was an integer/double issue -- I
actually tried to check by testing something like c(1.0:2.2,NA), but
this of course still produced an integer.

So as a general question -- would you recommend using the new API
(Rcpp::CharacterVector, Rcpp::NumericVector) over RcppMatrix et al?..
Are there any particularly useful docs / code samples illustrating the
use of it?..

On Sun, Feb 7, 2010 at 4:33 AM, Romain Francois
<romain.francois at dbmail.com> wrote:
> I've logged this as a bug in Rcpp's bug tracker :
> https://r-forge.r-project.org/tracker/?atid=637&group_id=155&func=browse
>
> Romain
>
> On 02/07/2010 10:15 AM, Romain Francois wrote:
>>
>> Hello,
>>
>> I think the problem is that c(1:2, NA) produces an __integer__ vector,
>> and RcppVector<double>  does not take care of coercion properly.
>>
>>  >  typeof( c(1:2,NA) )
>> [1] "integer"
>>  >  typeof( c(1,2,NA) )
>> [1] "double"
>>
>> The "garbage value" you get I suppose is -2^31, which is how R
>> represents NA for integer vectors. Because RcppVector<>  of RcppMatrix<>
>> does not deal with coercion, you just get essentially a cast to -2^31 as
>> a double.
>>
>> The good news is that the new API does take care of coercion, so when
>> you create a Rcpp::NumericVector and feed it with an integer vector, the
>> integer NA is coerced to a double NA.
>>
>> Also note that despite the name NumericVector, the class also offers
>> matrix-like indexing. So if you pass a matrix to NumericVector, you can
>> use matri indexing down in C++. See for example the
>> 'test.NumericVector.matrix.indexing' unit test in :
>>
>>  >  system.file( "unitTests", "runit.NumericVector.R", package = "Rcpp" )
>> [1] "/usr/local/lib/R/library/Rcpp/unitTests/runit.NumericVector.R"
>>
>> Also, as from Rcpp 0.7.1 you don't need the
>> try/catch/copyMessageToR/Rf_error idom anymore because if you don't
>> catch an exception in C++, it is recycled as an R exception, so you can
>> catch it in R instead.
>>
>> Romain
>>
>> On 02/07/2010 07:42 AM, Leo Alekseyev wrote:
>>>
>>> The behavior I'm seeing is a bit bizarre.  The NaN conversion seems to
>>> fail for RcppMatrix and RcppVector, but not for NumericVector.  The
>>> most interesting bit is this: if I run the test code on e.g.
>>> c(NA,1,2), RcppVector works (NA is printed as nan), whereas if I use
>>> c(NA,1:2), instead of nan it prints a garbage value.
>>>
>>> Here's the code I was using (just a bunch of Rprintfs, really):
>>>
>>> RcppExport SEXP rcpp_test3(SEXP N1, SEXP V1, SEXP M1, SEXP parms) {
>>>
>>>    SEXP rl=R_NilValue;
>>>    char* exceptionMesg=NULL;
>>>
>>>    try {
>>>
>>>      RcppMatrix<double>   m1(M1);// double mtx
>>>      RcppVector<double>   v1(V1);// double vector
>>>      RcppParams rparam(parms);
>>>      RcppResultSet rs;
>>>      Rcpp::RObject n1sexp(N1);
>>>      double n1 = n1sexp.asDouble();
>>>      Rcpp::NumericVector nv1(V1);
>>>
>>>      Rprintf("Printing number:\n%f\n",n1);
>>>      Rprintf("Printing matrix:\n");
>>>      for(int i=0; i<   m1.rows(); i++){
>>>        for(int j=0; j<   m1.cols(); j++)
>>>          Rprintf("(%d,%d):%f ",i,j,m1(i,j));
>>>        Rprintf("\n");
>>>      }
>>>      Rprintf("Printing vector:\n");
>>>      for(int i=0; i<   v1.size(); i++)
>>>        Rprintf("%f, ",v1(i));
>>>      Rprintf("\n");
>>>      Rprintf("Printing Numeric vector:\n");
>>>      for(int i=0; i<   nv1.size(); i++)
>>>        Rprintf("%f, ",nv1(i));
>>>      Rprintf("\n");
>>>
>>>      rl = rs.getReturnList();
>>>
>>>    } catch(std::exception&   ex) {
>>>      exceptionMesg = copyMessageToR(ex.what());
>>>    } catch(...) {
>>>      exceptionMesg = copyMessageToR("unknown reason");
>>>    }
>>>
>>>    if (exceptionMesg != NULL)
>>>      Rf_error(exceptionMesg);
>>>
>>>    return rl;
>>> }
>>>
>>>
>>>
>>> On Sat, Feb 6, 2010 at 11:19 PM, Dirk Eddelbuettel<edd at debian.org>
>>> wrote:
>>>>
>>>> On 6 February 2010 at 22:08, Leo Alekseyev wrote:
>>>> | Thanks for the response.  Your new unit test runs, but I still see the
>>>> | problem (running 0.7.4 installed from CRAN).  Consider this version of
>>>> | the test -- the only difference is that it uses a 2x2 matrix and
>>>> | checks for NAs:
>>>> |
>>>> | test.RcppMatrix.double.na<- function() {
>>>> |   src<- 'RcppMatrix<double>   m(x);
>>>> |            RcppResultSet rs;
>>>> |            rs.add("na_11",  R_IsNA(m(0,0)));
>>>> |            rs.add("na_12",  R_IsNA(m(0,1)));
>>>> |            rs.add("na_21",  R_IsNA(m(1,0)));
>>>> |            rs.add("na_22",  R_IsNA(m(1,1)));
>>>> |            return rs.getReturnList();';
>>>> |   funx<- cfunction(signature(x="numeric"), src, Rcpp=TRUE)
>>>> |   M<- matrix(1:4,2,2,byrow=TRUE)
>>>> |   M[2,1]<- NA
>>>> |   checkEquals(funx(x=M),
>>>> |               list(na_11=0, na_12=0, na_21=1, na_22=0),
>>>> |               msg = "RcppMatrix.double.na")
>>>> | }
>>>> |
>>>> | This fails for me:
>>>> |
>>>> | Error in checkEquals(funx(x = M), list(na_11 = 0, na_12 = 0, na_21 =
>>>> 1,  :
>>>> |   Component 3: Mean absolute difference: 1RcppMatrix.double.na
>>>> | test.RcppMatrix.double.na.nan: (1 checks) ... OK (0.92 seconds)
>>>>
>>>> Yes, I see that too. Not sure yet why.
>>>>
>>>> Did you try any of the other types like Rcpp::NumericVector?  Is there
>>>> anything specific to RcppMatrix from what you cam see?
>>>>
>>>> | As far as the compilation error messages -- looks like they were
>>>> | indeed byproducts of the CLINK_CPPFLAGS and didn't affect anything.
>>>>
>>>> Ok.
>>>>
>>>> Dirk
>>>>
>>>> | Thanks,
>>>> | --Leo
>>>> |
>>>> |
>>>> | On Sat, Feb 6, 2010 at 8:01 PM, Dirk Eddelbuettel<edd at debian.org>
>>>> wrote:
>>>> |>
>>>> |>   Hi Leo,
>>>> |>
>>>> |>   On 6 February 2010 at 17:56, Leo Alekseyev wrote:
>>>> |>   | Howdy folks,
>>>> |>   | I started playing around with Rcpp recently, and so far haven't
>>>> been
>>>> |>   | able to figure out how to detect NAs in a matrix that I pass to
>>>> the
>>>> |>   | C++ code.
>>>> |>   | My test code looks something like this:
>>>> |>   | RcppExport SEXP rcpp_test(SEXP N1, SEXP V1, SEXP M1, SEXP parms)
>>>> {
>>>> |>   | .......
>>>> |>   |     RcppMatrix<double>   m1(M1);// double mtx based on M1
>>>> |>   |     RcppVector<double>   v1(V1);// double vector based on V1
>>>> |>   |     RcppParams rparam(parms);       // parameter from R based on
>>>> parms
>>>> |>   |     RcppResultSet rs;
>>>> |>   |     Rcpp::RObject n1sexp(N1);
>>>> |>   |
>>>> |>   |     double n1 = n1sexp.asDouble();
>>>> |>   |     Rprintf("The value of isna is %d\n",R_IsNA(n1));
>>>> |>   |
>>>> |>   |     double element1 = v1(0);
>>>> |>   |     if(R_IsNA(element1))
>>>> |>   |       Rprintf("Found NA in vector!\n");
>>>> |>   |
>>>> |>   |     double m00 = m1(0,0);
>>>> |>   |     if(R_IsNA(m00))
>>>> |>   |       Rprintf("Found NA in matrix!\n");
>>>> |>   | ................
>>>> |>
>>>> |>   That looks like the right approach.
>>>> |>
>>>> |>   So to give this some meat on the bone and some comparability, I
>>>> added a new
>>>> |>   unit test:
>>>> |>
>>>> |>   test.RcppMatrix.double.na.nan<- function() {
>>>> |>       src<- 'RcppMatrix<double>   m(x);
>>>> |>               RcppResultSet rs;
>>>> |>               rs.add("na_21",  R_IsNA(m(1,0)));
>>>> |>               rs.add("na_22",  R_IsNA(m(1,1)));
>>>> |>               rs.add("nan_31", R_IsNaN(m(2,0)));
>>>> |>               rs.add("nan_32", R_IsNaN(m(2,1)));
>>>> |>               return rs.getReturnList();';
>>>> |>       funx<- cfunction(signature(x="numeric"), src, Rcpp=TRUE)
>>>> |>       M<- matrix(1:6,3,2,byrow=TRUE)
>>>> |>       M[2,1]<- NA
>>>> |>       M[3,1]<- NaN
>>>> |>       checkEquals(funx(x=M),
>>>> |>                   list(na_21=1, na_22=0, nan_31=1, nan_32=0),
>>>> |>                   msg = "RcppMatrix.double.na.nan")
>>>> |>   }
>>>> |>
>>>> |>   This defines a function 'test.RcppMatrix.double.na.nan'. In it, we
>>>> create a
>>>> |>   function funx that evaluates the C++ code in src -- it tests
>>>> positions (2,1),
>>>> |>   (2,2), (3,1) and (3,2).  We then create this matrix
>>>> |>
>>>> |>     R>   M<- matrix(1:6,3,2,byrow=TRUE)
>>>> |>     R>   M[2,1]<- NA
>>>> |>     R>   M[3,1]<- NaN
>>>> |>     R>   M
>>>> |>          [,1] [,2]
>>>> |>     [1,]    1    2
>>>> |>     [2,]   NA    4
>>>> |>     [3,]  NaN    6
>>>> |>     R>
>>>> |>
>>>> |>   and the compare with the result specified in the checkEquals call.
>>>>  And that
>>>> |>   pans out:
>>>> |>
>>>> |>     R>        checkEquals(funx(x=M),
>>>> |>     +                 list(na_21=1, na_22=0, nan_31=1, nan_32=0),
>>>> |>     +                 msg = "RcppMatrix.double.na.nan")
>>>> |>     [1] TRUE
>>>> |>     R>
>>>> |>
>>>> |>   What version were you running?  I am at current SVN so somewhere
>>>> past 0.7.4
>>>> |>   and not quite release 0.7.5.
>>>> |>
>>>> |>   | The R_IsNA() appears to work fine on scalars and vectors, but
>>>> fails to
>>>> |>   | detect NA present in a matrix.  Is this a bug?.. feature?..  Is
>>>> the
>>>> |>   | handling of missing values documented anywhere other than the R
>>>> |>   | Extensions manual?..
>>>> |>
>>>> |>   We do not transform anything.  R tests for NA and NaN with (IIRC) a
>>>> |>   combination of IEEE754 (for NA) and a special value for NaN.  That
>>>> works here
>>>> |>   as we use the R tests for ot.
>>>> |>
>>>> |>   | On a (probably) unrelated note, I get the following error when
>>>> |>   | compiling my C++ code:
>>>> |>   | Error in RcppCxx0xFlags() : could not find function
>>>> "capture.output"
>>>> |>   | Calls:<Anonymous>   ->   cat ->   RcppCxx0xFlags
>>>> |>
>>>> |>   Not here either:
>>>> |>
>>>> |>     R>   Rcpp:::RcppCxx0xFlags()
>>>> |>     [1] "-std=c++0x"
>>>> |>     R>
>>>> |>
>>>> |>   | However, the code seems to compile fine.  The compilation script
>>>> is simply
>>>> |>   | #!/bin/bash
>>>> |>   | export PKG_CPPFLAGS="-I. "$(r -e "Rcpp:::CxxFlags()" )
>>>> |>   | export PKG_LIBS=$(r -e "Rcpp:::LdFlags()" )
>>>> |>   | export CLINK_CPPFLAGS=$(r -e "Rcpp:::Cxx0xFlags()" )
>>>> |>   | R CMD SHLIB "$@"
>>>> |>
>>>> |>   You may not need the third for CLINK_CPPFLAGS. We experimented with
>>>> C++0x but
>>>> |>   there are side effects on other OSs so we parked this for now.
>>>> |>
>>>> |>   | Complete output from compilation is pasted below -- is this an
>>>> error I
>>>> |>   | should be concerned about?..
>>>> |>   |  ./buildAndRun.sh rcpp_test.cpp --clean
>>>> |>   | Error in RcppCxx0xFlags() : could not find function
>>>> "capture.output"
>>>> |>   | Calls:<Anonymous>   ->   cat ->   RcppCxx0xFlags
>>>> |>   | Execution halted
>>>> |>   | g++ -I/usr/share/R/include
>>>> -I/usr/local/lib/R/site-library/Rcpp/lib
>>>> |>   |  -fpic  -g -O2 -c rcpp_test.cpp -o rcpp_test.o
>>>> |>   | Error in RcppCxx0xFlags() : could not find function
>>>> "capture.output"
>>>> |>   | Calls:<Anonymous>   ->   cat ->   RcppCxx0xFlags
>>>> |>   | Execution halted
>>>> |>   | Error in RcppCxx0xFlags() : could not find function
>>>> "capture.output"
>>>> |>   | Calls:<Anonymous>   ->   cat ->   RcppCxx0xFlags
>>>> |>   | Execution halted
>>>> |>   | g++ -shared -o rcpp_test.so rcpp_test.o
>>>> |>   | -L/usr/local/lib/R/site-library/Rcpp/lib -lRcpp
>>>> |>   | -Wl,-rpath,/usr/local/lib/R/site-library/Rcpp/lib
>>>> -L/usr/lib/R/lib -lR
>>>> |>   | Error in RcppCxx0xFlags() : could not find function
>>>> "capture.output"
>>>> |>   | Calls:<Anonymous>   ->   cat ->   RcppCxx0xFlags
>>>> |>   | Execution halted
>>>> |>   |
>>>> |>   |
>>>> |>   | Thanks in advance for any help (and please let me know if this is
>>>> not
>>>> |>   | the correct mailing list for these sorts of questions!)
>>>> |>
>>>> |>   Very much the right list! Thanks for asking here.
>>>> |>
>>>> |>   If you post your full shell script / test code I can have a go at
>>>> testing it.
>>>> |>
>>>> |>   Dirk
>
>
>
> --
> Romain Francois
> Professional R Enthusiast
> +33(0) 6 28 91 30 30
> http://romainfrancois.blog.free.fr
> |- http://tr.im/MPYc : RProtoBuf: protocol buffers for R
> |- http://tr.im/KfKn : Rcpp 0.7.2
> `- http://tr.im/JOlc : External pointers with Rcpp
>
>
>