[Rcpp-devel] Handling NAs in RcppMatrix

Dirk Eddelbuettel edd at debian.org
Sun Feb 7 18:54:03 CET 2010


On 7 February 2010 at 14:58, francoisromain at free.fr wrote:
| Yes. The new api is our current focus. 

Right. It is simply 'much newer' so there are more examples for the old
API. Which we will not withdraw, and which is less likely to undergo changes,
but yes -- the new one is better.
 
| You can find many real examples as part of our unit tests (in the "unitTests" installed directory of the package). 
| 
| Some other examples in our blogs
| 
| A draft article in the "papers" sub directory of the rforge project of rcpp. 

Yup.

Cheers, Dirk

| -----Original Message-----
| From: Leo Alekseyev <dnquark at gmail.com>
| Date: Sun, 7 Feb 2010 08:11:48 
| To: Romain Francois<romain.francois at dbmail.com>
| Cc: <rcpp-devel at lists.r-forge.r-project.org>
| Subject: Re: [Rcpp-devel] Handling NAs in RcppMatrix
| 
| D'oh, I should have guessed it was an integer/double issue -- I
| actually tried to check by testing something like c(1.0:2.2,NA), but
| this of course still produced an integer.
| 
| So as a general question -- would you recommend using the new API
| (Rcpp::CharacterVector, Rcpp::NumericVector) over RcppMatrix et al?..
| Are there any particularly useful docs / code samples illustrating the
| use of it?..
| 
| 
| On Sun, Feb 7, 2010 at 4:33 AM, Romain Francois
| <romain.francois at dbmail.com> wrote:
| > I've logged this as a bug in Rcpp's bug tracker :
| > https://r-forge.r-project.org/tracker/?atid=637&group_id=155&func=browse
| >
| > Romain
| >
| > On 02/07/2010 10:15 AM, Romain Francois wrote:
| >>
| >> Hello,
| >>
| >> I think the problem is that c(1:2, NA) produces an__integer__ vector,
| >> and RcppVector<double>  does not take care of coercion properly.
| >>
| >>  >  typeof( c(1:2,NA) )
| >> [1] "integer"
| >>  >  typeof( c(1,2,NA) )
| >> [1] "double"
| >>
| >> The "garbage value" you get I suppose is -2^31, which is how R
| >> represents NA for integer vectors. Because RcppVector<>  of RcppMatrix<>
| >> does not deal with coercion, you just get essentially a cast to -2^31 as
| >> a double.
| >>
| >> The good news is that the new API does take care of coercion, so when
| >> you create a Rcpp::NumericVector and feed it with an integer vector, the
| >> integer NA is coerced to a double NA.
| >>
| >> Also note that despite the name NumericVector, the class also offers
| >> matrix-like indexing. So if you pass a matrix to NumericVector, you can
| >> use matri indexing down in C++. See for example the
| >> 'test.NumericVector.matrix.indexing' unit test in :
| >>
| >>  >  system.file( "unitTests", "runit.NumericVector.R", package = "Rcpp" )
| >> [1] "/usr/local/lib/R/library/Rcpp/unitTests/runit.NumericVector.R"
| >>
| >> Also, as from Rcpp 0.7.1 you don't need the
| >> try/catch/copyMessageToR/Rf_error idom anymore because if you don't
| >> catch an exception in C++, it is recycled as an R exception, so you can
| >> catch it in R instead.
| >>
| >> Romain
| >>
| >> On 02/07/2010 07:42 AM, Leo Alekseyev wrote:
| >>>
| >>> The behavior I'm seeing is a bit bizarre.  The NaN conversion seems to
| >>> fail for RcppMatrix and RcppVector, but not for NumericVector.  The
| >>> most interesting bit is this: if I run the test code on e.g.
| >>> c(NA,1,2), RcppVector works (NA is printed as nan), whereas if I use
| >>> c(NA,1:2), instead of nan it prints a garbage value.
| >>>
| >>> Here's the code I was using (just a bunch of Rprintfs, really):
| >>>
| >>> RcppExport SEXP rcpp_test3(SEXP N1, SEXP V1, SEXP M1, SEXP parms) {
| >>>
| >>>    SEXP rl=R_NilValue;
| >>>    char* exceptionMesg=NULL;
| >>>
| >>>    try {
| >>>
| >>>      RcppMatrix<double>   m1(M1);// double mtx
| >>>      RcppVector<double>   v1(V1);// double vector
| >>>      RcppParams rparam(parms);
| >>>      RcppResultSet rs;
| >>>      Rcpp::RObject n1sexp(N1);
| >>>      double n1 = n1sexp.asDouble();
| >>>      Rcpp::NumericVector nv1(V1);
| >>>
| >>>      Rprintf("Printing number:\n%f\n",n1);
| >>>      Rprintf("Printing matrix:\n");
| >>>      for(int i=0; i<   m1.rows(); i++){
| >>>        for(int j=0; j<   m1.cols(); j++)
| >>>          Rprintf("(%d,%d):%f ",i,j,m1(i,j));
| >>>        Rprintf("\n");
| >>>      }
| >>>      Rprintf("Printing vector:\n");
| >>>      for(int i=0; i<   v1.size(); i++)
| >>>        Rprintf("%f, ",v1(i));
| >>>      Rprintf("\n");
| >>>      Rprintf("Printing Numeric vector:\n");
| >>>      for(int i=0; i<   nv1.size(); i++)
| >>>        Rprintf("%f, ",nv1(i));
| >>>      Rprintf("\n");
| >>>
| >>>      rl = rs.getReturnList();
| >>>
| >>>    } catch(std::exception&   ex) {
| >>>      exceptionMesg = copyMessageToR(ex.what());
| >>>    } catch(...) {
| >>>      exceptionMesg = copyMessageToR("unknown reason");
| >>>    }
| >>>
| >>>    if (exceptionMesg != NULL)
| >>>      Rf_error(exceptionMesg);
| >>>
| >>>    return rl;
| >>> }
| >>>
| >>>
| >>>
| >>> On Sat, Feb 6, 2010 at 11:19 PM, Dirk Eddelbuettel<edd at debian.org>
| >>> wrote:
| >>>>
| >>>> On 6 February 2010 at 22:08, Leo Alekseyev wrote:
| >>>> | Thanks for the response.  Your new unit test runs, but I still see the
| >>>> | problem (running 0.7.4 installed from CRAN).  Consider this version of
| >>>> | the test -- the only difference is that it uses a 2x2 matrix and
| >>>> | checks for NAs:
| >>>> |
| >>>> | test.RcppMatrix.double.na<- function() {
| >>>> |   src<- 'RcppMatrix<double>   m(x);
| >>>> |            RcppResultSet rs;
| >>>> |            rs.add("na_11",  R_IsNA(m(0,0)));
| >>>> |            rs.add("na_12",  R_IsNA(m(0,1)));
| >>>> |            rs.add("na_21",  R_IsNA(m(1,0)));
| >>>> |            rs.add("na_22",  R_IsNA(m(1,1)));
| >>>> |            return rs.getReturnList();';
| >>>> |   funx<- cfunction(signature(x="numeric"), src, Rcpp=TRUE)
| >>>> |   M<- matrix(1:4,2,2,byrow=TRUE)
| >>>> |   M[2,1]<- NA
| >>>> |   checkEquals(funx(x=M),
| >>>> |               list(na_11=0, na_12=0, na_21=1, na_22=0),
| >>>> |               msg = "RcppMatrix.double.na")
| >>>> | }
| >>>> |
| >>>> | This fails for me:
| >>>> |
| >>>> | Error in checkEquals(funx(x = M), list(na_11 = 0, na_12 = 0, na_21 =
| >>>> 1,  :
| >>>> |   Component 3: Mean absolute difference: 1RcppMatrix.double.na
| >>>> | test.RcppMatrix.double.na.nan: (1 checks) ... OK (0.92 seconds)
| >>>>
| >>>> Yes, I see that too. Not sure yet why.
| >>>>
| >>>> Did you try any of the other types like Rcpp::NumericVector?  Is there
| >>>> anything specific to RcppMatrix from what you cam see?
| >>>>
| >>>> | As far as the compilation error messages -- looks like they were
| >>>> | indeed byproducts of the CLINK_CPPFLAGS and didn't affect anything.
| >>>>
| >>>> Ok.
| >>>>
| >>>> Dirk
| >>>>
| >>>> | Thanks,
| >>>> | --Leo
| >>>> |
| >>>> |
| >>>> | On Sat, Feb 6, 2010 at 8:01 PM, Dirk Eddelbuettel<edd at debian.org>
| >>>> wrote:
| >>>> |>
| >>>> |>   Hi Leo,
| >>>> |>
| >>>> |>   On 6 February 2010 at 17:56, Leo Alekseyev wrote:
| >>>> |>   | Howdy folks,
| >>>> |>   | I started playing around with Rcpp recently, and so far haven't
| >>>> been
| >>>> |>   | able to figure out how to detect NAs in a matrix that I pass to
| >>>> the
| >>>> |>   | C++ code.
| >>>> |>   | My test code looks something like this:
| >>>> |>   | RcppExport SEXP rcpp_test(SEXP N1, SEXP V1, SEXP M1, SEXP parms)
| >>>> {
| >>>> |>   | .......
| >>>> |>   |     RcppMatrix<double>   m1(M1);// double mtx based on M1
| >>>> |>   |     RcppVector<double>   v1(V1);// double vector based on V1
| >>>> |>   |     RcppParams rparam(parms);       // parameter from R based on
| >>>> parms
| >>>> |>   |     RcppResultSet rs;
| >>>> |>   |     Rcpp::RObject n1sexp(N1);
| >>>> |>   |
| >>>> |>   |     double n1 = n1sexp.asDouble();
| >>>> |>   |     Rprintf("The value of isna is %d\n",R_IsNA(n1));
| >>>> |>   |
| >>>> |>   |     double element1 = v1(0);
| >>>> |>   |     if(R_IsNA(element1))
| >>>> |>   |       Rprintf("Found NA in vector!\n");
| >>>> |>   |
| >>>> |>   |     double m00 = m1(0,0);
| >>>> |>   |     if(R_IsNA(m00))
| >>>> |>   |       Rprintf("Found NA in matrix!\n");
| >>>> |>   | ................
| >>>> |>
| >>>> |>   That looks like the right approach.
| >>>> |>
| >>>> |>   So to give this some meat on the bone and some comparability, I
| >>>> added a new
| >>>> |>   unit test:
| >>>> |>
| >>>> |>   test.RcppMatrix.double.na.nan<- function() {
| >>>> |>       src<- 'RcppMatrix<double>   m(x);
| >>>> |>               RcppResultSet rs;
| >>>> |>               rs.add("na_21",  R_IsNA(m(1,0)));
| >>>> |>               rs.add("na_22",  R_IsNA(m(1,1)));
| >>>> |>               rs.add("nan_31", R_IsNaN(m(2,0)));
| >>>> |>               rs.add("nan_32", R_IsNaN(m(2,1)));
| >>>> |>               return rs.getReturnList();';
| >>>> |>       funx<- cfunction(signature(x="numeric"), src, Rcpp=TRUE)
| >>>> |>       M<- matrix(1:6,3,2,byrow=TRUE)
| >>>> |>       M[2,1]<- NA
| >>>> |>       M[3,1]<- NaN
| >>>> |>       checkEquals(funx(x=M),
| >>>> |>                   list(na_21=1, na_22=0, nan_31=1, nan_32=0),
| >>>> |>                   msg = "RcppMatrix.double.na.nan")
| >>>> |>   }
| >>>> |>
| >>>> |>   This defines a function 'test.RcppMatrix.double.na.nan'. In it, we
| >>>> create a
| >>>> |>   function funx that evaluates the C++ code in src -- it tests
| >>>> positions (2,1),
| >>>> |>   (2,2), (3,1) and (3,2).  We then create this matrix
| >>>> |>
| >>>> |>     R>   M<- matrix(1:6,3,2,byrow=TRUE)
| >>>> |>     R>   M[2,1]<- NA
| >>>> |>     R>   M[3,1]<- NaN
| >>>> |>     R>   M
| >>>> |>          [,1] [,2]
| >>>> |>     [1,]    1    2
| >>>> |>     [2,]   NA    4
| >>>> |>     [3,]  NaN    6
| >>>> |>     R>
| >>>> |>
| >>>> |>   and the compare with the result specified in the checkEquals call.
| >>>>  And that
| >>>> |>   pans out:
| >>>> |>
| >>>> |>     R>        checkEquals(funx(x=M),
| >>>> |>     +                 list(na_21=1, na_22=0, nan_31=1, nan_32=0),
| >>>> |>     +                 msg = "RcppMatrix.double.na.nan")
| >>>> |>     [1] TRUE
| >>>> |>     R>
| >>>> |>
| >>>> |>   What version were you running?  I am at current SVN so somewhere
| >>>> past 0.7.4
| >>>> |>   and not quite release 0.7.5.
| >>>> |>
| >>>> |>   | The R_IsNA() appears to work fine on scalars and vectors, but
| >>>> fails to
| >>>> |>   | detect NA present in a matrix.  Is this a bug?.. feature?..  Is
| >>>> the
| >>>> |>   | handling of missing values documented anywhere other than the R
| >>>> |>   | Extensions manual?..
| >>>> |>
| >>>> |>   We do not transform anything.  R tests for NA and NaN with (IIRC) a
| >>>> |>   combination of IEEE754 (for NA) and a special value for NaN.  That
| >>>> works here
| >>>> |>   as we use the R tests for ot.
| >>>> |>
| >>>> |>   | On a (probably) unrelated note, I get the following error when
| >>>> |>   | compiling my C++ code:
| >>>> |>   | Error in RcppCxx0xFlags() : could not find function
| >>>> "capture.output"
| >>>> |>   | Calls:<Anonymous>   ->   cat ->   RcppCxx0xFlags
| >>>> |>
| >>>> |>   Not here either:
| >>>> |>
| >>>> |>     R>   Rcpp:::RcppCxx0xFlags()
| >>>> |>     [1] "-std=c++0x"
| >>>> |>     R>
| >>>> |>
| >>>> |>   | However, the code seems to compile fine.  The compilation script
| >>>> is simply
| >>>> |>   | #!/bin/bash
| >>>> |>   | export PKG_CPPFLAGS="-I. "$(r -e "Rcpp:::CxxFlags()" )
| >>>> |>   | export PKG_LIBS=$(r -e "Rcpp:::LdFlags()" )
| >>>> |>   | export CLINK_CPPFLAGS=$(r -e "Rcpp:::Cxx0xFlags()" )
| >>>> |>   | R CMD SHLIB "$@"
| >>>> |>
| >>>> |>   You may not need the third for CLINK_CPPFLAGS. We experimented with
| >>>> C++0x but
| >>>> |>   there are side effects on other OSs so we parked this for now.
| >>>> |>
| >>>> |>   | Complete output from compilation is pasted below -- is this an
| >>>> error I
| >>>> |>   | should be concerned about?..
| >>>> |>   |  ./buildAndRun.sh rcpp_test.cpp --clean
| >>>> |>   | Error in RcppCxx0xFlags() : could not find function
| >>>> "capture.output"
| >>>> |>   | Calls:<Anonymous>   ->   cat ->   RcppCxx0xFlags
| >>>> |>   | Execution halted
| >>>> |>   | g++ -I/usr/share/R/include
| >>>> -I/usr/local/lib/R/site-library/Rcpp/lib
| >>>> |>   |  -fpic  -g -O2 -c rcpp_test.cpp -o rcpp_test.o
| >>>> |>   | Error in RcppCxx0xFlags() : could not find function
| >>>> "capture.output"
| >>>> |>   | Calls:<Anonymous>   ->   cat ->   RcppCxx0xFlags
| >>>> |>   | Execution halted
| >>>> |>   | Error in RcppCxx0xFlags() : could not find function
| >>>> "capture.output"
| >>>> |>   | Calls:<Anonymous>   ->   cat ->   RcppCxx0xFlags
| >>>> |>   | Execution halted
| >>>> |>   | g++ -shared -o rcpp_test.so rcpp_test.o
| >>>> |>   | -L/usr/local/lib/R/site-library/Rcpp/lib -lRcpp
| >>>> |>   | -Wl,-rpath,/usr/local/lib/R/site-library/Rcpp/lib
| >>>> -L/usr/lib/R/lib -lR
| >>>> |>   | Error in RcppCxx0xFlags() : could not find function
| >>>> "capture.output"
| >>>> |>   | Calls:<Anonymous>   ->   cat ->   RcppCxx0xFlags
| >>>> |>   | Execution halted
| >>>> |>   |
| >>>> |>   |
| >>>> |>   | Thanks in advance for any help (and please let me know if this is
| >>>> not
| >>>> |>   | the correct mailing list for these sorts of questions!)
| >>>> |>
| >>>> |>   Very much the right list! Thanks for asking here.
| >>>> |>
| >>>> |>   If you post your full shell script / test code I can have a go at
| >>>> testing it.
| >>>> |>
| >>>> |>   Dirk
| >
| >
| >
| > --
| > Romain Francois
| > Professional R Enthusiast
| > +33(0) 6 28 91 30 30
| > http://romainfrancois.blog.free.fr
| > |- http://tr.im/MPYc : RProtoBuf: protocol buffers for R
| > |- http://tr.im/KfKn : Rcpp 0.7.2
| > `- http://tr.im/JOlc : External pointers with Rcpp
| >
| >
| >
| _______________________________________________
| Rcpp-devel mailing list
| Rcpp-devel at lists.r-forge.r-project.org
| https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
| _______________________________________________
| Rcpp-devel mailing list
| Rcpp-devel at lists.r-forge.r-project.org
| https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel

-- 
  Registration is open for the 2nd International conference R / Finance 2010
  See http://www.RinFinance.com for details, and see you in Chicago in April!


More information about the Rcpp-devel mailing list