[Rcpp-devel] Segfault, is it because of iterators/pointers?
Romain Francois
romain at r-enthusiasts.com
Wed Feb 12 14:34:03 CET 2014
On 12 Feb 2014, at 13:36, Alessandro Mammana <mammana at molgen.mpg.de> wrote:
> Ah wait, my bad (as always T.T), I found a much simpler explanation:
>
> colset <- sample(3e7-nr, 1e7)
> storage.mode(colset)
> [1] "integer"
> storage.mode(colset-1)
> [1] "double"
>
> So when I was unwrapping colset I allocated new memory in Rcpp to
> convert from double to integer, which was no longer valid when I went
> out of scope.
> I think it is a bit dangerous that you never know if you are
> allocating memory or just wrapping R objects when parsing arguments in
> Rcpp.
> Is there a way of ensuring that NOTHING gets copied when parsing
> arguments? Can you throw an exception if the type you try to cast to
> is not the one you expect?
> You might imagine that with large datasets this is important.
Silent coercion is there by design; Rcpp does not offer a "strict" mode.
What you can do is write a small wrapper that refuses the wrong type, something like this:
#include <Rcpp.h>
using namespace Rcpp ;

// Wraps an Rcpp vector type T, but refuses any input whose R type
// differs from T's instead of silently coercing (and copying) it.
template <typename T>
class Strict : public T {
public:
    Strict( SEXP x ) {
        if( TYPEOF(x) != T::r_type::value )
            stop( "not compatible" ) ;
        T::Storage::set__(x) ;
    }
} ;

// [[Rcpp::export]]
int foo( Strict<NumericVector> v ){
    return v.size() ;
}
You’d get e.g.
> foo(rnorm(10))
[1] 10
> foo(1:10)
Error in eval(expr, envir, enclos) : not compatible
Calls: sourceCpp ... source -> withVisible -> eval -> eval -> foo -> <Anonymous>
Execution halted
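As for the original segfault, the lifetime issue can be shown without Rcpp at all. Below is a minimal plain C++ sketch, not your actual code: coerce_to_int, OwningGapMat, make_gapmat and col_sums are hypothetical names, and std::vector stands in for R vectors. The point is only that a coerced copy lives exactly as long as something owns it, so the struct should hold the storage rather than a raw pointer into it.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for coercion: turning doubles into ints cannot
// reuse the input buffer, so a fresh buffer has to be allocated.
std::shared_ptr<std::vector<int> > coerce_to_int(const std::vector<double>& x) {
    std::shared_ptr<std::vector<int> > out(new std::vector<int>(x.size()));
    for (std::size_t i = 0; i < x.size(); ++i)
        (*out)[i] = static_cast<int>(x[i]);
    return out;
}

// Owning variant of GapMat: the struct shares ownership of both buffers,
// so colptr() stays valid for as long as the struct itself is alive.
struct OwningGapMat {
    std::shared_ptr<std::vector<int> > vec;
    std::shared_ptr<std::vector<int> > colset;
    int nrow;

    int ncol() const { return static_cast<int>(colset->size()); }

    const int* colptr(int col) const {
        assert(col >= 0 && col < ncol());
        return vec->data() + (*colset)[col];
    }
};

// Builds the struct. The coerced copy of colset outlives this function
// because the returned struct holds a reference to it. (vec is copied
// here only to keep the sketch self-contained.)
OwningGapMat make_gapmat(const std::vector<int>& vec,
                         const std::vector<double>& colset, int nrow) {
    OwningGapMat m;
    m.vec = std::make_shared<std::vector<int> >(vec);
    m.colset = coerce_to_int(colset);
    m.nrow = nrow;
    return m;
}

// Sums nrow consecutive entries starting at each column offset.
std::vector<int> col_sums(const OwningGapMat& m) {
    std::vector<int> res(m.ncol(), 0);
    for (int i = 0; i < m.ncol(); ++i)
        for (int j = 0; j < m.nrow; ++j)
            res[i] += m.colptr(i)[j];
    return res;
}
```

In Rcpp terms, the analogous change would presumably be to keep the IntegerVector objects themselves as members of the struct instead of the int* obtained from .begin(), so that whatever Rcpp allocated during coercion stays protected while the struct is in use.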
> Sorry for bothering and thanks again,
> Ale
>
>
> On Wed, Feb 12, 2014 at 1:10 PM, Dirk Eddelbuettel <edd at debian.org> wrote:
>>
>> On 12 February 2014 at 11:47, Alessandro Mammana wrote:
>> | Ok I was able to find the code causing the bug. So it looks like the
>>
>> Thanks for the added detail.
>>
>> | pointers you get from an Rcpp::Vector using .begin() become invalid
>> | after the Rcpp::Vector goes out of scope (and this makes sense),
>> | what I do not understand is that this Rcpp::Vector was allocated in R
>> | and should still be "living" during the execution of the Rcpp call
>> | (that's why I wasn't expecting the pointer to be invalid).
>> |
>> | This is the exact code (the one above is probably fine):
>> | @@@@@@@@@@@@@@ in CPP @@@@@@@@@@@@@@
>> |
>> | struct GapMat {
>> |     int* ptr;
>> |     int* colset;
>> |     int nrow;
>> |     int ncol;
>> |
>> |     inline int* colptr(int col){
>> |         return ptr + colset[col];
>> |     }
>> |
>> |     GapMat(){}
>> |
>> |     GapMat(int* _ptr, int* _colset, int _nrow, int _ncol):
>> |         ptr(_ptr), colset(_colset), nrow(_nrow), ncol(_ncol){}
>> | };
>> |
>> |
>> | GapMat getGapMat(Rcpp::List gapmat){
>> |     IntegerVector vec = gapmat["vec"];
>> |     IntegerVector pos = gapmat["colset"];
>> |     int nrow = gapmat["nrow"];
>> |
>> |     return GapMat(vec.begin(), pos.begin(), nrow, pos.length());
>> | }
>> |
>> | // [[Rcpp::export]]
>> | IntegerVector colSumsGapMat(Rcpp::List gapmat){
>> |
>> |     GapMat mat = getGapMat(gapmat);
>> |     IntegerVector res(mat.ncol);
>> |
>> |     for (int i = 0; i < mat.ncol; ++i){
>> |         for (int j = 0; j < mat.nrow; ++j){
>> |             res[i] += mat.colptr(i)[j];
>> |         }
>> |     }
>> |
>> |     return res;
>> | }
>> |
>> | @@@@@@@@@@@@@@ in R (with gdb debugger as suggested by Dirk) @@@@@@@@@@@@@@
>> | library(Rcpp)
>> | sourceCpp("scratchpad.cpp")
>> |
>> | vec <- rnbinom(3e7, mu=0.1, size=1); storage.mode(vec) <- "integer"
>> | nr <- 80
>> |
>> | colset <- sample(3e7-nr, 1e7)
>> | foo <- vec[colset] # this is only to trigger some obscure garbage collection mechanisms...
>> |
>> | for (i in 1:10){
>> |     colset <- sample(3e7-nr, 1e7)
>> |     gapmat <- list(vec=vec, nrow=nr, colset=colset-1)
>> |     cs <- colSumsGapMat(gapmat)
>> |     print(sum(cs))
>> | }
>> |
>> | [1] 80000000
>> | [1] 80000000
>> | [1] 80016890
>> | [1] 80008144
>> | [1] 80016022
>> | [1] 80021609
>> |
>> | Program received signal SIGSEGV, Segmentation fault.
>> | 0x00007ffff18a5455 in GapMat::colptr (this=0x7fffffffc120, col=0) at scratchpad.cpp:295
>> | 295 return ptr + colset[col];
>> |
>> | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
>> |
>> | Why did it happen? What should I do to make sure that my pointers
>> | remain valid? My goal is to convert safely some vectors or matrices
>> | that "exist" in R to some pointers, how can I do that?
>>
>> Not sure. It looks fine at first glance. But then it's early in the morning
>> and I have had very little coffee yet...
>>
>> Maybe the fact that you tickle the gc() via vec[colset] has something to do
>> with it, maybe it has not. Maybe I would try the decomposition of the List
>> object inside the colSumsGapMat() function to keep it simpler. Or if you
>> _really_ want an external object to iterate over, memcpy it out.
>>
>> With really large objects, you may be stressing parts of the code that have
>> not been stressed the same way. If it breaks, you do get to keep both pieces.
>>
>> Dirk
>>
>> --
>> Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
>
>
>
> --
> Alessandro Mammana, PhD Student
> Max Planck Institute for Molecular Genetics
> Ihnestraße 63-73
> D-14195 Berlin, Germany