[Rcpp-devel] accelerate XTS manipulations with Rcpp

Dirk Eddelbuettel edd at debian.org
Wed Sep 11 13:51:00 CEST 2019


Sorry, forgot about replying here for a few days.

On 4 September 2019 at 13:13, Vladimir Morozov wrote:
| Dear Rcpp experts.
| I'm absolutely new to Rcpp and RcppXts.
| 
| I want to speed up my R function that generates and rbinds a few new rows
| to an XTS object. This function is called often and I think it takes a
| lot of time.
| 
| Could you please tell me in general if the following switch to Rcpp will
| speed-up my work (currently all my work is in R).
| 
| I have a function function_R(time T, value X) in R that takes a
| POSIXct time and integer value X, performs some time-consuming manipulations,
| then modifies global objects XTS series1 through XTS seriesN. Specifically,
| I rbind some new rows to each of series1 through seriesN in global
| environment.
| To avoid copying XTS series between the function and the global environment,
| the function attempts to modify the seriesX in the global environment, and
| not return XTS on function return (due to how R works maybe it's a moot
| point, as Dirk already alluded...)
| 
| This function is rather slow.
| 
| To accelerate it I want to write function_RCPP(time T, value X) in C++.
| I think it will perform internal manipulations of time T and value X a little
| faster.
| 
| I want to achieve the biggest speed-up by doing rbind in C++ using package
| RcppXts.
| I plan to use the following function in RcppXts package:
|     function("xtsRbind",
|              &xtsRbind,
|              List::create(Named("x"), Named("y"), Named("dup")),
|              "Combine two xts objects row-wise");
| 
| Then I want to use Rcpp function assign( name, x ) to assign the created
| object back to GlobalEnvironment.
| 
| This way my other tasks in R can access these XTS series1 through seriesN.
| 
| Do you think the above way is a good way to grow dynamical XTS series inside
| Rcpp?
| Do you think it will provide significant acceleration compared to pure R?

I sense that you have the same correct intuitive feeling here: the "growing"
is the bottleneck.

R is fantastic at what it does, and uses a relatively simple architecture.
This means for example that vectors are simple contiguous 'chunks' of memory
(plus some optimization, but let's ignore that) so "growing" always means
full copies.  That is not good when it happens on every call.
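
To put a rough number on the cost, here is a minimal sketch in plain C++ (no
Rcpp needed) that mimics growing by full copy and counts how many elements get
copied. The function names are made up for illustration, and this is a toy
model of the copying, not a measurement of what R or xts actually does
internally.

```cpp
#include <cstddef>
#include <vector>

// Append one value the way a fixed-size contiguous block has to:
// allocate a block one element longer, copy everything over, add the value.
// Returns the number of existing elements that had to be copied.
std::size_t append_with_full_copy(std::vector<double>& v, double x) {
    std::vector<double> grown;
    grown.reserve(v.size() + 1);
    grown.assign(v.begin(), v.end());   // full copy of the old contents
    grown.push_back(x);
    std::size_t copied = v.size();
    v.swap(grown);                      // v now holds the longer block
    return copied;
}

// Total elements copied when growing from empty to n, one element at a time.
std::size_t copies_for_growth(std::size_t n) {
    std::vector<double> v;
    std::size_t total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += append_with_full_copy(v, static_cast<double>(i));
    }
    return total;
}
```

Under this model, growing to n elements one at a time copies
0 + 1 + ... + (n-1) = n(n-1)/2 elements in total, so the work grows
quadratically with the final length.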

So I tend to grow my data structures on the C++ side as STL objects, often
vectors. Once done, I return as Rcpp / R vectors, often in a data.frame. That
works pretty well.
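
A minimal sketch of that pattern in plain C++ follows. The RowBuffer struct,
its column names and collect_rows() are hypothetical and only meant to show
the shape of the idea: accumulate the rows in std::vector columns, and convert
once at the end. The values filled in are placeholders.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical column-wise buffer for (time, value) rows.
struct RowBuffer {
    std::vector<double> time;   // e.g. seconds since the epoch, as in POSIXct
    std::vector<int>    value;

    void reserve(std::size_t n) { time.reserve(n); value.reserve(n); }
    void append(double t, int x) { time.push_back(t); value.push_back(x); }
    std::size_t size() const { return time.size(); }
};

// Build all rows on the C++ side; nothing is handed back until the end.
RowBuffer collect_rows(double start, std::size_t n) {
    RowBuffer buf;
    buf.reserve(n);             // optional: avoids reallocation when n is known
    for (std::size_t i = 0; i < n; ++i) {
        // Placeholder data: one row per second, value = 2 * row index.
        buf.append(start + static_cast<double>(i), static_cast<int>(i) * 2);
    }
    return buf;
}
```

Inside an Rcpp function, the two columns could then presumably be returned in
one go, for example via Rcpp::DataFrame::create(Rcpp::Named("time") = buf.time,
Rcpp::Named("value") = buf.value). That step is left out above so the sketch
compiles without Rcpp.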

Dirk

-- 
http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org

