[Rcpp-devel] RcppArmadillo
Simon Zehnder
szehnder at uni-bonn.de
Sun Feb 3 15:26:11 CET 2013
Hi Dirk,
The info about R's single-threaded heritance is very interesting. Something I haven't read about so far. The poor PhD student though! :) That must have been tough work.
Regarding the OpenMP then in C++ by working with arma::mats generated via Rcpp::as<> in loops is something that will work surely.
To gain even more performance on specific matrix/vector operations, one should avoid any non-local memory access at the different cores. For this I would rather generate a new arma::mat object.
But before, the memory should be allocated in the same manner as the threads will work later on. That is we run a #pragma for loop over the initialisation of a
double array (for example) and pass the resulting array to the arma::mat to use as its underlying data array.
Then the threads working on different cores must only access local memory attached to their cores.
So, the step I am not aware of yet is the one that passes the double array to the arma::mat via .memptr(). I do not know if this function can only be
used for getting or as well for setting? On first sight I would guess it is possible to pass the array towards the arma::mat object. Just get the pointer
via memptr() set it to the pointer of our double array and then arma::mat should manipulate values in these memory addresses.
What do you think? Should it work?
Best
Simon
On Feb 3, 2013, at 3:06 PM, Dirk Eddelbuettel <edd at debian.org> wrote:
>
> Hi Simon,
>
> On 3 February 2013 at 14:23, Simon Zehnder wrote:
> | Thanks Dirk!
> |
> | I worked a lot on it already yesterday.
> |
> | Regarding OpenMP: I haven't yet found the underlying data structure for SEXPs but there should be something in the R Internals Guide.
>
> I think you misunderstood. Let me be more clear:
>
> 1) R is single-threaded. Always has been, probably always will be.
>
> (As the story as far as I know goes, someone now in R Core and rather
> well-known for deep understanding of R / S internals spent a good chunk
> of his PhD research time trying to alter that and could not, even while
> working under the direction of one the R / S founder.)
>
> 2) So you need to 'set a mutex' when you leave R, invoke parallel code which
> you can get via OpenMP from the compiler, remove the mutex and return to R.
>
> In fact based on the work Luke Tierney did years ago in pnmath and pnmath0,
> parts of R's libraries now do this (very carefully) themselves. It is
> expected that more parts of R will as the multicore world beckons, but
> this is hard work given the overall constraint.
>
> 3) For the reasons outlined above, there cannot be SEXP type as there is no
> native multithreading.
>
> | As the first-touch-principle has to be applied when the memory is allocated, I would rather copy the data from an SEXP to an arma::mat and run the '#pragma' over the arma::mat. So for allocating I have to pass an arma::mat an array, that is already allocated via first-touch. memptr() should do this work. I hope, though, that it does not copy the array :)
>
> As we said before, that is all fine.
>
> You can get memory from R, take the SEXP down to C++ level via Rcpp and
> inject it into Armadillo types. The most efficient way is the two-step we
> use in the fastLm.cpp example which does all that without copying by invoking
> the most appropriate constructor from Armadillo:
>
> extern "C" SEXP fastLm(SEXP Xs, SEXP ys) {
>
> try {
> Rcpp::NumericVector yr(ys); // creates Rcpp vector from SEXP
> Rcpp::NumericMatrix Xr(Xs); // creates Rcpp matrix from SEXP
> int n = Xr.nrow(), k = Xr.ncol();
> arma::mat X(Xr.begin(), n, k, false); // reuses memory and avoids extra copy
> arma::colvec y(yr.begin(), yr.size(), false);
>
> [....]
>
> At this point you can go crazy and OpenMP away to your heart's content. Just
> do not call back into R while in an OpenMP threaded context or some kittens
> may get killed somewhere. Or R will likely crash. At least one these is true.
>
> Ok?
>
> Dirk
>
> | Best
> |
> | Simon
> |
> | On Feb 2, 2013, at 4:17 PM, Dirk Eddelbuettel <edd at debian.org> wrote:
> |
> | >
> | > Hi Simon,
> | >
> | > Welcome to the list.
> | >
> | > On 2 February 2013 at 14:47, Simon Zehnder wrote:
> | > | Dear Rcpp-Devels,
> | > |
> | > | this list was suggested to me by Dirk Eddelbüttel in regard to a question using C++ Extensions in relation with the Armadillo library.
> | > |
> | > | At first I have to make compliments to the developers of Rcpp/RcppArmadillo. Dirk, Francois, this is a marvelous work! As someone programming a lot in C++ and using R Extensions regularly, it is cleaning away all this cumbersome programming dirt connected to SEXPs.
> | >
> | > Thanks.
> | >
> | > | There are still remaining questions for me, which can be surely answered by the subscribers to this list:
> | > |
> | > | 1. I saw the Rcpp.package.skeleton function and I ask myself, if a similar offer is made for RcppArmadillo for including automatically the Armadillo library in a package?
> | >
> | > Sure -- there is RcppArmadillo.package.skeleton() in the RcppArmadillo
> | > package. Dito for RcppGSL and RcppEigen.
> | >
> | > | 2. If I want to use Armadillo solely in the C++ files, not via inline functions as shown in the RcppArmadillo paper, should I use solely the Rcpp package? No, right? The RcppArmadillo package provides objects wrappers for Armadillo objects to be passed to R?
> | >
> | > If you want just C++ from R, use Rcpp.
> | >
> | > If you want C++ and Armadillo from R, use RcppArmadillo. It depends on Rcpp
> | > and brings it in.
> | >
> | > Etc pp for other libraries. Rcpp and RcppArmadillo should provide the
> | > scaffolding to build upon.
> | >
> | > | 3. My package is based on S4 classes and I saw the S4 class wrapping in the Rcpp package. I miss an example on this. Can you refer to any document or website for this issue?
> | >
> | > Look at the Rcpp documentation, including the recently introduced Rcpp
> | > Gallery at http://gallery.rcpp.org for some examples. I don't use S4 all
> | > that much so I do not write many examples, but eg in RcppEigen context a few
> | > more are found (as some of that work was motivated bty sparse matrices which
> | > already have an S4 representation in R).
> | >
> | > | 4. Further: What is your experience regarding performance with S4 classes and OOP in C++: Does it make a difference mapping the S4 class to a struct in C++ or using directly the attributes of the S4 class (like vectors, etc.) as Armadillo vectors etc. in C++?
> | > | As I work a lot on the HPC in Aachen/Germany together with some of the contributors to the OpenMP API, I am highly influenced by the parsimonious approach, i.e. use only basic objects in C++ to get high performance (although I know, that one of the main work now in the OpenMP API is the extension to complex/user-defined objects inside the #pragmas).
> | >
> | > As I said, I do not use S4 all that much. But I am in favour of OOP :)
> | >
> | > | 5. Using OpenMP with RcppArmadillo: Up to now I used almost exclusively the Scythe Statistical Library (http://scythe.wustl.edu), which is pretty fast. I encountered lately problems with it, using parallel computing (OpenMP). Also important in this regard is the possibility to apply a 'first-touch-principle' where all my approaches failed in Scythe, due to the object structure.
> | > | Now, I would like to use RccpArmadillo with OpenMP and
> | > | a) I want to get best performance: how would I proceed?
> | >
> | > Just do it. We have OpenMP examples in the Rcpp docs. Ensure you have locks
> | > from R, do not call back, and it tends to just work to be "multithreaded in
> | > chunks" at the C++ level..
> | >
> | > Scythe was very important and a first C++ library for R when it came out. I
> | > could imagine that Armadillo is faster, but I have not seen comparisons.
> | >
> | > | b) I want to apply the 'first-touch-principle': where do I apply it? What is the internal data structure in Rcpp-/RcppArmadillo-Objects that allocates the memory?
> | >
> | > It's whatever you would do with native R objects at the C level, only easier :)
> | >
> | > | I am very excited now to start work with the Rcpp/RcppArmadillo package in my own one, which is at least planned to be pushed to CRAN one day.
> | >
> | > We look forward to your contributions. These are open projects, so if you
> | > can think of patches to code or documentation, please let us know.
> | >
> | > | I am looking forward to your answers
> | >
> | > Hope this helps.
> | >
> | > Dirk
> | >
> | > |
> | > | Best
> | > |
> | > | Simon
> | > | _______________________________________________
> | > | Rcpp-devel mailing list
> | > | Rcpp-devel at lists.r-forge.r-project.org
> | > | https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
> | >
> | > --
> | > Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
> |
>
> --
> Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
More information about the Rcpp-devel
mailing list