[Rcpp-devel] Possible unprotected memory problems

Dirk Eddelbuettel edd at debian.org
Thu Jul 21 21:02:43 CEST 2011


On 21 July 2011 at 14:15, Steve Lianoglou wrote:
| On Thu, Jul 21, 2011 at 2:04 PM, Douglas Bates <bates at stat.wisc.edu> wrote:
| I'm glad you brought this up!
| 
| I've been meaning to ask if there is a way to do this successfully, or
| is it impossible?
| 
| Where "it" is the ability to not copy the contents of (a potentially
| large) numeric vector that we pass into a C/++ function, but rather
| just "pass the pointer/data off" to the C side of the equation, and
| let that worry about GC'ing the data when appropriate.

As Doug said in his reply, we do that all the time.  Here is an example where
we do it twice: fastLm.cpp in the RcppArmadillo sources:


extern "C" SEXP fastLm(SEXP Xs, SEXP ys) {

    try {
        Rcpp::NumericVector yr(ys);                     // creates Rcpp vector from SEXP
        Rcpp::NumericMatrix Xr(Xs);                     // creates Rcpp matrix from SEXP
        int n = Xr.nrow(), k = Xr.ncol();
        arma::mat X(Xr.begin(), n, k, false);           // reuses memory and avoids extra copy
        arma::colvec y(yr.begin(), yr.size(), false);   // same for the response vector

        [....]

We first create two Rcpp objects, and these are lightweight wrappers around
the underlying SEXP. No copying.

We then use the begin() accessor, styled after the STL, to access the same
memory, and pass it along with the dimensions and a boolean (false) that
signals to Armadillo that "it can trust us" and can form its object in a
similarly lightweight manner.

So now we have X and y as Armadillo objects, and still nothing got copied.
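
To make the "nothing got copied" point concrete, here is a rough sketch in
plain C++, with no R, Rcpp or Armadillo involved. VectorView, make_view and
make_copy are made-up names for illustration only; they are not part of any
of those libraries. The view plays the role of the objects built with the
copy flag set to false, and make_copy plays the role of a copying conversion.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning view: it stores a pointer and a length and
// never allocates or frees memory itself.
struct VectorView {
    double*     data;
    std::size_t size;

    double& operator[](std::size_t i) {
        assert(i < size);   // bounds check in debug builds only
        return data[i];
    }
};

// Wrap an existing buffer without copying. The caller must keep `owner`
// alive (and must not resize it) for as long as the view is in use.
inline VectorView make_view(std::vector<double>& owner) {
    return VectorView{owner.data(), owner.size()};
}

// Copying alternative: allocates new storage and duplicates every element.
inline std::vector<double> make_copy(const std::vector<double>& owner) {
    return std::vector<double>(owner.begin(), owner.end());
}
```

A write through the view changes the original buffer, while a write to the
copy does not. The price of the view is that whoever owns the memory has to
outlive it, which is exactly why it matters who protects the data.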

(In fact, we also have RcppArmadillo methods for as<>() but those are not as
smart and copy, much to my dismay. I had added those in fastLm() only to see
the performance drop somewhat due to the copying.  The two-step shown here is
better ....)
 
| In theory, I guess it would be like having an unbalanced
| PROTECT/UNPROTECT going on.

We don't call PROTECT / UNPROTECT ourselves, but a bit gets set that has the
same effect. I would have to look up the details as it has been a while....
 
| The "hand off" of the data/pointer to a C library would be like
| calling PROTECT. After your C function returns control back to R, it
| would still claim ownership/usage of the data. Things would hum along

I think you have it backwards.  If you create an object in C++ and hand it to
R, you typically do not expect to see that object ever again in C++ -- and
hence you let R go about its business and even gc it.

Only when you have functions that go back and forth, or long-lived objects,
do you tell R to leave the object alone -- and that is commonly done with
external pointers, or Rcpp::XPtr here.
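
As a loose analogy for the long-lived case, again in plain C++ and not using
Rcpp::XPtr itself: a handle keeps the object alive across calls, and a
finalizer runs once the last holder lets go. Model, ModelHandle, make_model,
step_model and finalized_count are hypothetical names. Note the assumption
baked into the sketch: std::shared_ptr uses reference counting, whereas an
external pointer's finalizer runs when R's garbage collector reclaims it.

```cpp
#include <cassert>
#include <memory>

// Hypothetical long-lived object that is used across several calls.
struct Model {
    int calls = 0;
    int step() { return ++calls; }
};

// Counts finalizer runs so that the object's lifetime can be observed.
inline int& finalized_count() {
    static int n = 0;
    return n;
}

// The handle stands in for an external pointer in this analogy.
using ModelHandle = std::shared_ptr<Model>;

// Create the object together with a finalizer (custom deleter).
inline ModelHandle make_model() {
    return ModelHandle(new Model(), [](Model* m) {
        ++finalized_count();   // record that the finalizer ran
        delete m;
    });
}

// A later call goes through the handle to reach the same object.
inline int step_model(const ModelHandle& h) {
    assert(h);                 // the handle must still refer to an object
    return h->step();
}
```

As long as any handle exists, the object stays put and repeated calls see
the same state; the finalizer runs exactly once, after the last handle is
gone.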

| "as usual", but the data in that part of memory wouldn't be GC'd by R
| until your C library decides to call its UNPROTECT on that some point
| later, at which point the normal R GC functionality would happen when
| it happens.
| 
| Is that even possible?

Sure: 

  R> fortune("Yoda")

  Evelyn Hall: I would like to know how (if) I can extract 
  some of the information from the summary of my nlme.
  Simon Blomberg: This is R. There is no if. Only how.
     -- Evelyn Hall and Simon 'Yoda' Blomberg
        R-help (April 2005)

 

:)

I may have misunderstood your last paragraph, though. If that is the case,
try again.  100 degree weather here in Illinois has its impact....

Dirk

-- 
Gauss once played himself in a zero-sum game and won $50.
                      -- #11 at http://www.gaussfacts.com

