[Rcpp-devel] New self-contained OpenMP example (Was: Stack imbalance warning when using Rcpp and OpenMP)

Dirk Eddelbuettel edd at debian.org
Fri Aug 26 02:32:45 CEST 2011


Michael,

On 19 August 2011 at 14:09, Michael Braun wrote:
| Hi.  Just one last clarifying question on this issue before I dive back in.
| 
| Suppose I declared a new Rcpp::List object in my C++ code, and copied the list elements from either the SEXP or the original Rcpp::List.  Since the new memory is allocated in C++, would I still have the same problem because of the way Rcpp allocated the memory?  Or would the copy be thread-safe?
| 
| Similarly, what if I were to create an STL container of Rcpp::Lists, and operate on each element of the container in parallel?  Same problem?
| 
| From your helpful responses, it seems like the best alternative is to explicitly copy the contents of each SEXP in the list to a totally non-Rcpp object. I'm just wondering if keeping some of the data in the original classes might still work.
| 
| And finally, since I am still relatively new to C++, are there any standard classes that might make more sense than others?  I'm considering an STL vector of either Eigen or Armadillo matrices, for example.  Good idea, or bad?

OpenMP is tricky. I would definitely recommend reading up on tutorials _just
on the C++ side_ and working with some self-contained C++ examples.

Only once you feel you have a reasonable handle on that, move on to Rcpp and
OpenMP. That combination brings new issues, such as the `Stack imbalance'
warning you have seen, which can arise when you return prematurely while
other threads still chew on R data structures.

Long story short, I just committed a new self-contained example to the Rcpp
source, which you can look at (and copy) via the URL

  https://r-forge.r-project.org/scm/viewvc.php/pkg/Rcpp/inst/examples/OpenMP/OpenMPandInline.r?view=markup&root=rcpp

It simply takes a vector of size two million, set up as the sequence from
1 .. N, and then computes the log of each element.  In other words, it is
pretty light on the actual computation.
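
For readers who want to try the idea on the C++ side alone first, here is a
minimal sketch of the same computation without any R or Rcpp types. It is not
the code from the example file; the names seq_1_to_n and parallel_log are
made up for illustration. It assumes a compiler with OpenMP support (for
example g++ with -fopenmp); without that flag the pragma is ignored and the
loop simply runs serially.

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper: build the sequence 1, 2, ..., n as doubles.
std::vector<double> seq_1_to_n(long n) {
    std::vector<double> x(n > 0 ? n : 0);
    for (long i = 0; i < n; ++i) {
        x[i] = static_cast<double>(i + 1);
    }
    return x;
}

// Hypothetical helper: element-wise log over a plain C++ vector.
// Every iteration reads x[i] and writes out[i] only, so the iterations
// are independent and can be divided among threads.
std::vector<double> parallel_log(const std::vector<double>& x) {
    std::vector<double> out(x.size());
    const long n = static_cast<long>(x.size());
    // Ignored (the loop runs serially) when compiled without OpenMP.
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        out[i] = std::log(x[i]);
    }
    return out;
}
```

Calling parallel_log(seq_1_to_n(2000000)) mirrors the size used here. All
data lives in std::vector, so no R-managed memory is touched inside the
parallel region, and the function returns only after the loop has finished
on every thread.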

The results bear this out. On my (standard i7, four cores hyperthreaded) box:

edd at max:~$ r ~/svn/rcpp/pkg/Rcpp/inst/examples/OpenMP/OpenMPandInline.r
Loading required package: methods
              test replications elapsed relative user.self sys.self
2     funOpenMP(z)          100   3.219 1.000000     25.26     0.07
3 funSerialRcpp(z)          100   9.030 2.805219      9.43     0.32
4  funSugarRcpp(z)          100   9.423 2.927307      9.06     0.35
1     funSerial(z)          100   9.601 2.982603      9.59     0.00
edd at max:~$ 

So OpenMP 'wins', but the gain is sublinear at a factor of about three on
four hyperthreaded cores. The fair comparison is to method 'funSerial', which
also uses a C++ vector. This indicates some communications overhead between
the threads.

Rcpp sugar has no real leg up on manual loops, but is the shortest
implementation in two lines. Looping over an Rcpp vector is a little faster
than looping over a C++ STL vector (which incurs a copy).

Hope this helps,  Dirk


-- 
Two new Rcpp master classes for R and C++ integration scheduled for 
New York (Sep 24) and San Francisco (Oct 8), more details are at
http://dirk.eddelbuettel.com/blog/2011/08/04#rcpp_classes_2011-09_and_2011-10
http://www.revolutionanalytics.com/products/training/public/rcpp-master-class.php

