[Rcpp-devel] Modules and Boost and larger data sets

Dirk Eddelbuettel edd at debian.org
Fri Sep 6 14:20:01 CEST 2013


On 6 September 2013 at 13:46, Simon Zehnder wrote:
| Dear Rcpp-Users and Rcpp-Devels,
| 
| this goes especially to Dirk and Romain, the developers of RcppBDT. 

Well, it's mostly me as far as RcppBDT goes, with numerous invaluable assists
from Romain.  The released version is far behind the SVN version;
unfortunately the SVN version is far from release-ready.
 
| I am currently writing a package for market microstructure data -
| usually large tick datasets with trade times and security symbols.

Interesting. I do that for a living too.

| I read about Modules in the Rcpp book, and when I started out as usual with
| S4 classes in R, Modules came to mind. As I usually operate on datasets of
| around 1 million rows, I am wondering whether an implementation via Modules
| would be the better one (better with regard to performance) - in

That is not usually the motivation for modules.

"Straight up" functions, coded via inline or attributes, will be as fast.

| comparison to the usual S4 class implementation directly in R. With Modules

"The usual S4 class implementation"?  

I have done R for over a decade and I still hardly use S4, so "the usual" is,
errmm, "unusual".

| I am able to define all functions on the datasets in C++ - which I expect
| to be faster. Sorting and filtering the data by date/time are of course
| among the main tasks to be covered.

I have some trouble with the logic of your argument, but accept the end
result that Boost Date.Time is good for dates and times. :)
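
To make the sorting and filtering task concrete, here is a minimal sketch in
plain standard C++ (no Rcpp, no Boost). The Tick struct, its fields, and the
function names sort_by_time and filter_window are hypothetical, chosen only
for illustration; timestamps are assumed to be integer microseconds since the
epoch. These are ordinary free functions, the kind that could be exposed to R
without Modules.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical tick record: time in microseconds since the epoch
// (an assumption for this sketch), security symbol, trade price.
struct Tick {
    std::int64_t time_us;
    std::string symbol;
    double price;
};

// Sort ticks by time. stable_sort keeps the original relative order of
// ticks that share a timestamp.
void sort_by_time(std::vector<Tick>& ticks) {
    std::stable_sort(ticks.begin(), ticks.end(),
                     [](const Tick& a, const Tick& b) {
                         return a.time_us < b.time_us;
                     });
}

// Return the ticks with from_us <= time_us < to_us.
// Requires `ticks` to be sorted by time already, so that two binary
// searches are enough to find the window.
std::vector<Tick> filter_window(const std::vector<Tick>& ticks,
                                std::int64_t from_us, std::int64_t to_us) {
    assert(from_us <= to_us);
    auto before = [](const Tick& t, std::int64_t v) { return t.time_us < v; };
    auto first = std::lower_bound(ticks.begin(), ticks.end(), from_us, before);
    auto last = std::lower_bound(first, ticks.end(), to_us, before);
    return std::vector<Tick>(first, last);
}
```

Calling sort_by_time once and then filter_window repeatedly keeps each window
lookup logarithmic in the number of ticks, plus the cost of copying the
result.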
 
| In the DESCRIPTION file of RcppBDT I read that the Boost header files for
| Date.Time must be included.

"On the system on which RcppBDT is to be compiled" -- different from where it
is used (Windows, say). There is _no run-time dependency_.

| As I have to choose one library for date/time formats in C++, Boost just
| seems so appropriate. But users in the market microstructure community
| cannot be expected to install Boost on their systems.

Sorry, but one has nothing to do with the other.

Also please look at the CRAN package BH -- it _provides_ Boost headers for
this very purpose. Several packages already use it.

| So, I would like to provide Boost already within the package.

Just don't do it. Seriously. Use a "Depends: BH"

| As everything you two do makes sense, I think I have not yet grasped the
| reason why Boost is not shipped right alongside RcppBDT. Is there
| something that prevents me from doing this?

It's inefficient. We don't ship the headers of the C library either. 

It's just a Depends. 

Better to hand off to the system, and with R we can do so (at least for pure
template headers) via the BH package we created.
 
| I am very thankful for thoughts and opinions on my idea and my question. 

Sure, no problem.

Dirk

-- 
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com

