[Rcpp-devel] Forcing a shallow versus deep copy

Dirk Eddelbuettel edd at debian.org
Fri Jul 12 07:42:03 CEST 2013


On 11 July 2013 at 19:21, Gabor Grothendieck wrote:
| 1. Just to be clear, what we have been discussing here is not just how to
| avoid copying but how to avoid copying while using as and wrap
| or approaches that automatically generate as and wrap.  I was already
| aware of how to avoid copying using Armadillo and how to use Armadillo types
| as arguments and return values to autogen as and wrap.  The problem is
| not that, but that these two things cannot be done at once: it's either/or.

I must still be misunderstanding, as this still reads to me as if you
suspect that we somehow keep layers that make extra copies.

We're not. And I've known you long enough to know that you are not likely to
suspect this either.  So what is it then?

As Romain said, some of the choices have to do with the representation on both
the R and C++ side -- for Rcpp itself we can be lightweight and efficient via
proxy classes, but this does not mean we can do this for _any arbitrary C++
class_ coming from another project, such as Armadillo.  RcppArmadillo already
does pretty well, and code review may make it better.  We do not know of any
fat to cut, or we'd cut it ourselves.  We care about a few things, and
performance is clearly among them.
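
To make the proxy-versus-arbitrary-class distinction concrete, here is a
minimal standalone sketch in plain C++.  It is not Rcpp or Armadillo code:
VectorView and OwningVector are hypothetical names, and a std::vector stands
in for memory owned by R.  The view holds only a pointer and a length, so
building it copies nothing; the owning class allocates its own storage, which
is what a class from another project may do by design.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning "proxy": keeps a pointer and a length only,
// so constructing it never copies the underlying data (shallow).
struct VectorView {
    double* data;
    std::size_t size;
    double& operator[](std::size_t i) { return data[i]; }
};

// Hypothetical class "from another project": it owns its storage, so
// constructing it from foreign memory allocates and copies (deep).
struct OwningVector {
    std::vector<double> storage;
    OwningVector(const double* src, std::size_t n) : storage(src, src + n) {}
    double& operator[](std::size_t i) { return storage[i]; }
};

// True if a write through the view shows up in the source buffer.
bool view_aliases_source() {
    std::vector<double> r_memory{1.0, 2.0, 3.0};  // stand-in for R-owned memory
    VectorView v{r_memory.data(), r_memory.size()};
    v[0] = 42.0;
    return r_memory[0] == 42.0;
}

// True if a write through the owning copy shows up in the source buffer.
bool copy_aliases_source() {
    std::vector<double> r_memory{1.0, 2.0, 3.0};  // stand-in for R-owned memory
    OwningVector c(r_memory.data(), r_memory.size());
    c[0] = 42.0;
    return r_memory[0] == 42.0;
}
```

Here view_aliases_source() returns true and copy_aliases_source() returns
false: the view shares the caller's memory, while the owning class works on
its own copy.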
 
| 2. Regarding the question of performance impact, there are two situations
| which should be distinguished:
| 
| i. We call C++ from R and it does some processing and then returns and
| we don't call it again. In that case it's likely that copying or not won't
| make a big difference, or at least it won't if the actual C++ computation
| time is large compared to the time spent in copying.
| 
| ii. We factor out the inner loop of the code and only recode that in C++
| and repeatedly call it many times.  In that case the copying is multiplied
| by the number of iterations and might very well have a significant impact.

In case ii) I'd try to use a different design and make it more like i): You
generally do not want to call down from R to object code a bazillion times as
there is always some overhead, and multiplying even something rather
efficient by a veryBigNumber can make small times large in the aggregate.
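
A rough standalone sketch of that arithmetic, in plain C++ with no R
involved: cross_boundary() is a hypothetical stand-in for whatever copy
happens when data moves from R to C++, and the counter just tallies how many
elements were copied under each design.  All names here are made up for
illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Tally of elements copied at the (hypothetical) R-to-C++ boundary.
static std::size_t elements_copied = 0;

// Stand-in for one boundary crossing: the input is deep-copied on entry.
std::vector<double> cross_boundary(const std::vector<double>& x) {
    elements_copied += x.size();
    return x;  // returns a copy of x
}

// Design ii: only the inner step is in C++; every call crosses the boundary.
double step(const std::vector<double>& x) {
    std::vector<double> local = cross_boundary(x);
    double s = 0.0;
    for (double v : local) s += v;
    return s;
}

// Design i: the whole loop is in C++; the boundary is crossed once.
double all_steps(const std::vector<double>& x, std::size_t iterations) {
    std::vector<double> local = cross_boundary(x);
    double total = 0.0;
    for (std::size_t i = 0; i < iterations; ++i) {
        for (double v : local) total += v;
    }
    return total;
}

// Elements copied when the caller loops and calls step() each time.
std::size_t copies_when_looping_outside(std::size_t n, std::size_t iterations) {
    elements_copied = 0;
    std::vector<double> x(n, 1.0);
    double total = 0.0;
    for (std::size_t i = 0; i < iterations; ++i) total += step(x);
    (void)total;
    return elements_copied;
}

// Elements copied when the loop lives inside all_steps().
std::size_t copies_when_looping_inside(std::size_t n, std::size_t iterations) {
    elements_copied = 0;
    std::vector<double> x(n, 1.0);
    (void)all_steps(x, iterations);
    return elements_copied;
}
```

With n = 10 and 1000 iterations, looping outside copies 10000 elements and
looping inside copies 10.  The per-call cost is unchanged; only the
multiplier differs.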

Dirk

| 
| On Thu, Jul 11, 2013 at 6:55 PM, Dirk Eddelbuettel <edd at debian.org> wrote:
| >
| > Everybody has this existing example in their copy of RcppArmadillo.
| >
| > I am running it here from SVN rather than the installed directory, but this
| > should not make a difference. Machine is my not-overly-powerful thinkpad used
| > for traveling:
| >
| > edd at don:~/svn/rcpp/pkg/RcppArmadillo/inst/examples$ r fastLm.r
| > Loading required package: methods
| >
| > Attaching package: ‘Rcpp’
| >
| > The following object is masked from ‘package:inline’:
| >
| >     registerPlugin
| >
| >                        test replications relative elapsed user.self sys.self
| > 2         fLmTwoCasts(X, y)         5000    1.000   0.184     0.204    0.164
| > 1          fLmOneCast(X, y)         5000    1.011   0.186     0.200    0.172
| > 4   fastLmPureDotCall(X, y)         5000    1.141   0.210     0.236    0.184
| > 3          fastLmPure(X, y)         5000    2.027   0.373     0.412    0.332
| > 6              lm.fit(X, y)         5000    2.685   0.494     0.528    0.456
| > 5 fastLm(frm, data = trees)         5000   36.380   6.694     7.332    6.028
| > 7     lm(frm, data = trees)         5000   42.734   7.863     8.628    7.068
| > edd at don:~/svn/rcpp/pkg/RcppArmadillo/inst/examples$
| >
| > What we are talking about here is the difference between 'fLmTwoCasts' and
| > 'fLmOneCast'.  If you use larger objects, the difference will be larger.  But
| > the relative differences are tiny.
| >
| > It would be nice to make this more elegant, and I look forward to Romain's
| > proposals, but methinks that we may well have bigger fish to fry.
| >
| > Dirk, still in Sydney
| >
| > --
| > Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
| 
| 
| 
| --
| Statistics & Software Consulting
| GKX Group, GKX Associates Inc.
| tel: 1-877-GKX-GROUP
| email: ggrothendieck at gmail.com

-- 
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com

