[Rcpp-devel] Segfault, is it because of iterators/pointers?

Romain Francois romain at r-enthusiasts.com
Wed Feb 12 14:34:03 CET 2014


On 12 Feb 2014, at 13:36, Alessandro Mammana <mammana at molgen.mpg.de> wrote:

> Ah wait, my bad (as always T.T), I found a much simpler explanation:
> 
> colset <- sample(3e7-nr, 1e7)
> storage.mode(colset)
> [1] "integer"
> storage.mode(colset-1)
> [1] "double"
> 
> So when I was unwrapping colset I allocated new memory in Rcpp to
> convert from double to integer, which was no longer valid when I went
> out of scope.
> I think it is a bit dangerous that you never know if you are
> allocating memory or just wrapping R objects when parsing arguments in
> Rcpp.
> Is there a way of ensuring that NOTHING gets copied when parsing
> arguments? Can you throw an exception if the type you try to cast to
> is not the one you expect?
> You might imagine that with large datasets this is important.
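
To make the mechanism described above concrete, here is a toy model in plain C++ (no Rcpp involved). Value, as_int, as_int_strict and demo are made-up names for illustration only, and this is not how R or Rcpp store vectors. The point is just that a coercing accessor returns a pointer into a fresh buffer when the type does not match, while a strict accessor refuses instead of copying.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Toy stand-in for an R vector: a type tag plus the matching storage.
// (Hypothetical; not how Rcpp or R represent vectors.)
struct Value {
    enum Type { Int, Double };
    Type type;
    std::vector<int> ints;
    std::vector<double> doubles;
};

// Coercing access: points into `v` when the type already matches;
// otherwise converts into `scratch` and points into that new buffer.
// The returned pointer is only valid while its backing vector is alive.
inline const int* as_int(const Value& v, std::vector<int>& scratch) {
    if (v.type == Value::Int) return v.ints.data();
    scratch.assign(v.doubles.size(), 0);
    for (std::size_t i = 0; i < v.doubles.size(); ++i)
        scratch[i] = static_cast<int>(v.doubles[i]);
    return scratch.data();
}

// Strict access: never allocates, throws on a type mismatch instead.
inline const int* as_int_strict(const Value& v) {
    if (v.type != Value::Int) throw std::invalid_argument("not compatible");
    return v.ints.data();
}

// Usage: the double input ends up in a different buffer than the original.
inline void demo() {
    Value d{Value::Double, {}, {0.0, 1.0, 2.0}};
    std::vector<int> scratch;
    const int* p = as_int(d, scratch);
    assert(p == scratch.data());  // a converted copy was made
    assert(p[2] == 2);
}
```

If scratch were a local of a helper function, p would dangle as soon as that helper returned, which is the situation with colset-1 being a double.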

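Silent coercion is there by design: when the R type does not match the requested vector type, Rcpp converts it, which means a copy. Rcpp does not give you a "strict" mode.

One thing you can do is wrap the vector type in a class that checks the type before taking the SEXP, something like this: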

#include <Rcpp.h>
using namespace Rcpp ;

template <typename T>
class Strict : public T {
public:
  Strict( SEXP x ) {
    if( TYPEOF(x) != T::r_type::value )
      stop( "not compatible" ) ;
    T::Storage::set__(x) ;
  }
    
} ;

// [[Rcpp::export]]
int foo( Strict<NumericVector> v ){
  return v.size() ;
}

You’d get, e.g.:

> foo(rnorm(10))
[1] 10

> foo(1:10)
Error in eval(expr, envir, enclos) : not compatible
Calls: sourceCpp ... source -> withVisible -> eval -> eval -> foo -> <Anonymous>
Execution halted
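
On the original question of keeping the pointers valid: one option is to let the struct hold the vectors themselves rather than raw pointers, and derive the pointers on demand. Below is a minimal sketch of that idea. OwningGapMat, col_sums and make_example are hypothetical names, and the example is instantiated with std::vector<int> so that it compiles without R. With Rcpp one would presumably use Rcpp::IntegerVector as IntVec, since an Rcpp vector keeps its underlying R object protected while it is alive, but that substitution is untested here.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical owning variant of GapMat: it stores the vectors themselves,
// so the buffers live at least as long as the struct does.
template <typename IntVec>
struct OwningGapMat {
    IntVec vec;     // flat data
    IntVec colset;  // 0-based start offset of each column within vec
    int nrow;

    OwningGapMat(IntVec vec_, IntVec colset_, int nrow_)
        : vec(std::move(vec_)), colset(std::move(colset_)), nrow(nrow_) {}

    int ncol() const { return static_cast<int>(colset.size()); }

    // The pointer is derived from storage this object owns (vec must be non-empty).
    const int* colptr(int col) const {
        assert(col >= 0 && col < ncol());
        return &vec[0] + colset[col];
    }
};

// Sums nrow consecutive entries starting at each column offset.
template <typename IntVec>
std::vector<int> col_sums(const OwningGapMat<IntVec>& mat) {
    std::vector<int> res(static_cast<std::size_t>(mat.ncol()), 0);
    for (int i = 0; i < mat.ncol(); ++i) {
        const int* col = mat.colptr(i);
        for (int j = 0; j < mat.nrow; ++j) res[i] += col[j];
    }
    return res;
}

// Builds the struct from locals that die on return; the struct stays valid
// because it holds its own copies rather than pointers into the locals.
inline OwningGapMat<std::vector<int>> make_example() {
    std::vector<int> vec = {1, 2, 3, 4, 5, 6};
    std::vector<int> colset = {0, 2, 4};
    return OwningGapMat<std::vector<int>>(vec, colset, 2);
}
```

Note that this does not prevent the coercion: a double colset would still be converted, just into storage that the struct owns. Combining it with a type check like Strict above would avoid the copy altogether.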



> Sorry for bothering and thanks again,
> Ale
> 
> 
> On Wed, Feb 12, 2014 at 1:10 PM, Dirk Eddelbuettel <edd at debian.org> wrote:
>> 
>> On 12 February 2014 at 11:47, Alessandro Mammana wrote:
>> | Ok I was able to find the code causing the bug. So it looks like the
>> 
>> Thanks for the added detail.
>> 
>> | pointers you get from an Rcpp::Vector using .begin() become invalid
>> | after the Rcpp::Vector goes out of scope (and this makes sense),
>> | what I do not understand is that this Rcpp::Vector was allocated in R
>> | and should still be "living" during the execution of the Rcpp call
>> | (that's why I wasn't expecting the pointer to be invalid).
>> |
>> | This is the exact code (the one above is probably fine):
>> | @@@@@@@@@@@@@@ in CPP @@@@@@@@@@@@@@
>> |
>> | struct GapMat {
>> |     int* ptr;
>> |     int* colset;
>> |     int nrow;
>> |     int ncol;
>> |
>> |
>> |     inline int* colptr(int col){
>> |         return ptr + colset[col];
>> |     }
>> |
>> |     GapMat(){}
>> |
>> |     GapMat(int* _ptr, int* _colset, int _nrow, int _ncol):
>> |         ptr(_ptr), colset(_colset), nrow(_nrow), ncol(_ncol){}
>> | };
>> |
>> |
>> | GapMat getGapMat(Rcpp::List gapmat){
>> |     IntegerVector vec = gapmat["vec"];
>> |     IntegerVector pos = gapmat["colset"];
>> |     int nrow = gapmat["nrow"];
>> |
>> |     return GapMat(vec.begin(), pos.begin(), nrow, pos.length());
>> | }
>> |
>> | // [[Rcpp::export]]
>> | IntegerVector colSumsGapMat(Rcpp::List gapmat){
>> |
>> |     GapMat mat = getGapMat(gapmat);
>> |     IntegerVector res(mat.ncol);
>> |
>> |     for (int i = 0; i < mat.ncol; ++i){
>> |         for (int j = 0; j < mat.nrow; ++j){
>> |             res[i] += mat.colptr(i)[j];
>> |         }
>> |     }
>> |
>> |     return res;
>> | }
>> |
>> | @@@@@@@@@@@@@@ in R (with gdb debugger as suggested by Dirk) @@@@@@@@@@@@@@
>> | library(Rcpp)
>> | sourceCpp("scratchpad.cpp")
>> |
>> | vec <- rnbinom(3e7, mu=0.1, size=1); storage.mode(vec) <- "integer"
>> | nr <- 80
>> |
>> | colset <- sample(3e7-nr, 1e7)
>> | foo <- vec[colset] # this is only to trigger some obscure garbage collection mechanisms...
>> |
>> | for (i in 1:10){
>> |     colset <- sample(3e7-nr, 1e7)
>> |     gapmat <- list(vec=vec, nrow=nr, colset=colset-1)
>> |     cs <- colSumsGapMat(gapmat)
>> |     print(sum(cs))
>> | }
>> |
>> | [1] 80000000
>> | [1] 80000000
>> | [1] 80016890
>> | [1] 80008144
>> | [1] 80016022
>> | [1] 80021609
>> |
>> | Program received signal SIGSEGV, Segmentation fault.
>> | 0x00007ffff18a5455 in GapMat::colptr (this=0x7fffffffc120, col=0) at scratchpad.cpp:295
>> | 295            return ptr + colset[col];
>> |
>> | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
>> |
>> | Why did it happen? What should I do to make sure that my pointers
>> | remain valid? My goal is to convert safely some vectors or matrices
>> | that "exist" in R to some pointers, how can I do that?
>> 
>> Not sure. It looks fine at first glance. But then it's early in the morning
>> and I have had very little coffee yet...
>> 
>> Maybe the fact that you tickle the gc() via vec[colset] has something to do
>> with it, maybe it has not.  Maybe I would try the decomposition of the List
>> object inside the colSumsGapMat() function to keep it simpler.  Or if you
>> _really_ want an external object to iterate over, memcpy it out.
>> 
>> With really large objects, you may be stressing parts of the code that have
>> not been stressed the same way.  If it breaks, you do get to keep both pieces.
>> 
>> Dirk
>> 
>> --
>> Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
> 
> 
> 
> -- 
> Alessandro Mammana, PhD Student
> Max Planck Institute for Molecular Genetics
> Ihnestraße 63-73
> D-14195 Berlin, Germany
> _______________________________________________
> Rcpp-devel mailing list
> Rcpp-devel at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel


