[Rcpp-devel] Best way to return raw array

Dirk Eddelbuettel edd at debian.org
Fri Sep 2 18:13:35 CEST 2011


On 2 September 2011 at 10:42, Douglas Bates wrote:
| On Fri, Sep 2, 2011 at 7:19 AM, Dirk Eddelbuettel <edd at debian.org> wrote:
| >
| > On 2 September 2011 at 11:10, Darren Cook wrote:
| > | > | I've extended Christian Gunning's speed test with an STL and C/C++
| > | > | version; I was about to post but then I got a bit stuck with using
| > | > | Rcpp::wrap() for a raw block of memory. I'm using this:
| > | > |
| > | > | src1cpp<-'
| > | > |   int nn=as<int>(n);
| > | > |   double *p=new double[nn];
| > | > |   ...
| > | > |   NumericVector ret(p,p+nn);
| > | > |   delete p;
| > | > |   return ret;
| > | > | '
| > | >
| > | > That strikes me as plain wrong code.
| > |
| > | Hello Dirk,
| > | Perhaps I can squeeze an answer out of you by changing it to this:
| > |
| > | double *p=third_party_function(nn);
| > | NumericVector ret(p,p+nn);
| > | delete p;
| > | return ret;
| >
| > Still strikes me as wrong; look e.g. at RcppGSL to see how we deal with a C API.
| >
| > Maybe this would do
| >
| >  double *p=third_party_function(nn);
| >  NumericVector ret(nn);           // new memory
| >  copy(ret.begin(), ret.end(), p); // untested
| >  delete p;
| >  return ret;
| 
| Not only untested but also wrong :-).  The argument order of std::copy is
| the reverse of memcpy, etc., so you are copying from ret to p in this
| code.  Instead you want (unless I am making an embarrassing error)
| 
| copy(p, p + nn, ret.begin());

Oops, and thanks for waving the cluebat. I had in fact first typed it using
memcpy() which is worse because you do not really know that the sizeof() is
identical.  std::copy() is better, but I missed the argument reversal.
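
For reference, a minimal sketch of the corrected copy in plain C++. It makes two
assumptions that are not from the thread: std::vector<double> stands in for
Rcpp::NumericVector so the snippet compiles without R, and
third_party_function() is a hypothetical stub that allocates with new[] (which
is why it is released with delete[] here; a block obtained from malloc() would
need free() instead).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the legacy function discussed in the thread:
// returns a new[]-allocated block of nn doubles that the caller owns.
double* third_party_function(std::size_t nn) {
    double* p = new double[nn];
    for (std::size_t i = 0; i < nn; ++i) {
        p[i] = 0.5 * static_cast<double>(i);
    }
    return p;
}

// std::vector<double> is an assumption standing in for Rcpp::NumericVector.
std::vector<double> copy_result(std::size_t nn) {
    double* p = third_party_function(nn);
    assert(p != nullptr);
    std::vector<double> ret(nn);        // new memory owned by the container
    std::copy(p, p + nn, ret.begin());  // std::copy(first, last, destination)
    // By contrast, memcpy takes the destination first and a byte count:
    //   std::memcpy(ret.data(), p, nn * sizeof(double));
    delete[] p;                         // new[] is paired with delete[]
    return ret;
}
```

The source block is released only after its contents have been copied, so the
returned container never refers to memory it does not own.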
 
| I agree with Darren to a certain extent that there could be occasions
| where you want to work with a double[] instead of a
| std::vector<double> or Rcpp::NumericVector.  There is no doubt that
| access to elements in a double[] will be done as quickly as the
| compiler can manage, whereas the STL and STL-like containers have the
| very helpful layer of abstraction provided by iterators that may get
| in the way of a request to "just give me the address of the i'th
| element, damn it".
| 
| Of course the scenario shown above is an even better reason to know
| how to install the contents of a double[] into an Rcpp::NumericVector.
| 
| What puzzles me is why Darren's original version doesn't work.  I
| believe the constructor
| 
| Rcpp::NumericVector ret(p,p+nn);
| 
| allocates a new vector and copies the contents into it.  In fact, I
| would argue that it must do so because you can't count on the storage
| from p to p+nn having been allocated by R.

If it does then the approach may indeed work. 

But I read it as Darren "hoping and praying" that he could just take a random
C vector and assume it would become part of R's memory management "just by
wishing it would".

And he was then surprised that he got random errors.  I strongly suspect the
errors are related to the very forbidden mixing of memory.

| The inline definition of that constructor (line 248 of
| Rcpp/inst/include/Rcpp/vector/Vector.h) calls assign which calls wrap
| which will allocate a new SEXP and storage for the vector in this
| case, I believe.

That holds when the arguments are iterators.  In

      double *p=third_party_function(nn);
      NumericVector ret(p,p+nn);

we have just pointers.  Are you sure those get automagically cast to
iterators? 

Dirk
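
On the pointer question, a small sketch in standard C++ only: a raw pointer
already models a random-access iterator, so a range constructor written for
iterators accepts (p, p+nn) without any cast. std::vector<double> is used here
as an assumed stand-in for Rcpp::NumericVector, and from_block() and
copy_is_independent() are hypothetical helper names; whether the Rcpp
constructor behaves the same way is exactly what is being asked above and is
not shown by this snippet.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <type_traits>
#include <vector>

// In standard C++, double* has iterator traits of its own.
static_assert(
    std::is_same<std::iterator_traits<double*>::iterator_category,
                 std::random_access_iterator_tag>::value,
    "double* is a random-access iterator");

// Hypothetical helper: the range constructor allocates fresh storage and
// copies [p, p + nn) into it.
std::vector<double> from_block(const double* p, std::size_t nn) {
    assert(p != nullptr || nn == 0);
    return std::vector<double>(p, p + nn);
}

// Hypothetical check: the copy survives changes to, and release of, the
// source block.
bool copy_is_independent() {
    double* p = new double[3]{1.0, 2.0, 3.0};
    std::vector<double> ret = from_block(p, 3);
    p[0] = -1.0;   // modify the source after copying
    delete[] p;    // release the source; ret keeps its own storage
    return ret.size() == 3 && ret[0] == 1.0 && ret[2] == 3.0;
}
```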
 
| > | where third_party_function() is C legacy code that is documented as
| > | returning a block of memory of size nn that the client should take
| > | ownership of.
| >
| > Yes -- "client should take ownership of" is paramount, and for that we need
| > memory managed by R.  Rcpp data structures do that, just doing a random C
| > level allocation does not.
| >
| > | How do I return it?
| > |
| > | (I took a look at the convolve examples but they all build up the result
| > | in a Rcpp object. I cannot see an example where you have the result
| > | ready-made in a block of memory and just need to return it.)
| >
| > Maybe there is reason for that? Consider my last email... ;-)
| >
| > | > c) The whole point of what we do with Rcpp is to NOT have to deal
| > | > with new / delete and or malloc / free.  Even if you think it's cool
| > | > and know how to do it in plain C, it is simply against the whole spirit
| > | > ...STL idioms are really much better.
| > |
| > | You'll enjoy my timing post then (as the STL does not just equal the raw
| > | array version, it beats it).
| > |
| > | But I think we see the raison d'etre of Rcpp differently; for me it is:
| > |   * Optimizing key R code;
| > |   * Interfacing with 3rd party C/C++ libraries;
| > |   * Doing the above two while bypassing the ugly verbose code of the
| > | usual way to write R extensions.
| > |
| > | Or, in a sound bite: "Rcpp is not just for C++ newbies" ;-)
| >
| > Obviously agreed on all points, but there is no need to mislead the newbies,
| > and to poison them with bad C habits just because that's what may have
| > happened in your and my rough youth.
| >
| > Dirk
| >
| > --
| > Two new Rcpp master classes for R and C++ integration scheduled for
| > New York (Sep 24) and San Francisco (Oct 8), more details are at
| > http://dirk.eddelbuettel.com/blog/2011/08/04#rcpp_classes_2011-09_and_2011-10
| > http://www.revolutionanalytics.com/products/training/public/rcpp-master-class.php


