[Rcpp-devel] parallel distance matrix calculation

Jonathan Olmsted jolmsted at princeton.edu
Sat Jul 12 06:41:29 CEST 2014


My attempt at marginal usefulness (compared to DE and JJE's comments) would
be to offer some code examples that I've got using OpenMP-based
parallelism. There are examples in the gallery, of course. But I've got a
number of others for Euclidean distance and one for an EM algorithm. If you
are interested, just shoot me an email.

Somewhere in all that I have some benchmarks showing how the time costs
depend on where you place the pragmas. But if you are tied to
RcppParallel or you are using a compiler that doesn't support OpenMP, then
I've got nothing for you :-(
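
To give a flavor of the pragma-placement question, here is a minimal sketch
(not one of the benchmarked versions). It assumes the data has already been
copied into a plain column-major buffer, so nothing inside the parallel
region touches R or Rcpp; the name dist_omp and its signature are made up for
illustration. The pragma sits on the outer loop, so there is a single
parallel region for the whole computation. Moving it to the inner j loop
would instead open a parallel region once per row.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Pairwise Euclidean distances between the rows of an n x p matrix stored
// column-major (R's layout): element (i, k) lives at x[i + k * n].
// Returns the n x n distance matrix, also column-major.
// Hypothetical helper for illustration; compile with -fopenmp to run in
// parallel (without it the pragma is ignored and the loop runs serially).
std::vector<double> dist_omp(const std::vector<double>& x, int n, int p) {
    const std::size_t N = static_cast<std::size_t>(n);
    std::vector<double> d(N * N, 0.0);
    const double* px = x.data();
    double* pd = d.data();

    // Each (i, j) pair with i < j is handled by exactly one thread, and it
    // writes only to cells (i, j) and (j, i), so no two threads share a cell.
    // schedule(dynamic) because row i has only n - i - 1 pairs left to do.
    #pragma omp parallel for schedule(dynamic)
    for (int i = 0; i < n; ++i) {
        for (int j = i + 1; j < n; ++j) {
            double ss = 0.0;
            for (int k = 0; k < p; ++k) {
                const double diff = px[i + k * N] - px[j + k * N];
                ss += diff * diff;
            }
            const double dij = std::sqrt(ss);
            pd[i + j * N] = dij;  // upper triangle
            pd[j + i * N] = dij;  // mirrored into the lower triangle
        }
    }
    return d;
}
```

Calling dist_omp(x, 3, 2) with x = {0, 3, 6, 0, 4, 8}, i.e. the points
(0,0), (3,4) and (6,8), should give distances 5, 10 and 5 for the three pairs.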

And then there is the issue of whether you find it easier to think about
the code in terms of the RcppParallel abstractions or something else.
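
On JJ's point (quoted below) about computing offsets from matrix.begin():
the arithmetic is short once you remember that R matrices are column-major.
Here is a rough sketch of the kind of lightweight helper he mentions. The
struct ColMajorView and its members are hypothetical names, not anything
that exists in Rcpp or RcppParallel, and it assumes the raw pointer was
captured on the main thread before any parallel work started.

```cpp
#include <cstddef>

// Hypothetical read-only view over a column-major buffer (R's matrix layout).
// It holds only a raw pointer and dimensions, so using it from a worker
// thread never calls into R or Rcpp.
struct ColMajorView {
    const double* data;  // e.g. the pointer obtained from begin() up front
    std::size_t nrow;
    std::size_t ncol;

    // Element (i, j) lives at offset i + j * nrow.
    double at(std::size_t i, std::size_t j) const {
        return data[i + j * nrow];
    }

    // A column is contiguous: nrow values starting at data + j * nrow.
    const double* col_begin(std::size_t j) const { return data + j * nrow; }
    const double* col_end(std::size_t j) const { return data + (j + 1) * nrow; }

    // A row is strided: consecutive elements of row i are nrow doubles apart.
    double row_sum(std::size_t i) const {
        double s = 0.0;
        for (std::size_t j = 0; j < ncol; ++j) {
            s += data[i + j * nrow];
        }
        return s;
    }
};
```

For a 2 x 3 matrix stored as {1, 2, 3, 4, 5, 6}, at(0, 1) is 3, column 1
spans the two values {3, 4}, and row_sum(0) is 1 + 3 + 5 = 9.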


On Fri, Jul 11, 2014 at 10:01 PM, Dirk Eddelbuettel <edd at debian.org> wrote:

> On 11 July 2014 at 21:19, JJ Allaire wrote:
> | (2) The premise of RcppParallel is that you are reading and writing directly
> | into C arrays in background threads (it's not safe to call into R and therefore
> | not really safe to call into Rcpp). So to interact with a Matrix/Vector you
> | need to calculate the appropriate offsets from matrix.begin() to get the slices
> | (rows/columns) of the matrix you want.
> |
> | #2 is based on my conservative assumption about what's thread-safe in Rcpp.
> | Romain may tell us that it's perfectly safe to call vector iterators and
> | matrix::operator(,) from a background thread but I don't want to assume that's
> | okay without confirmation. If those things _aren't_ okay (or might not be okay
> | in the future) then we either need to provide good examples for offsetting into
> | vectors and matrixes or perhaps provide some lightweight helper classes or
> | functions for doing the same.
>
> When I was working for a while with OpenMP, I found it easier (in terms of my
> own mental task switches) to simply not assume anything R-related in the
> parallel code.  It is too easy to briefly forget the multithreaded context
> and use an idiom which may involve R data structures -- and the possibility
> of subsequent data corruption, or worse, is simply not worth it.
>
> So yes, as you suggest, "in theory" we can use lightweight wrappers. "In
> practice", this may be fraught with nasty surprises.
>
> Parallel programming is hard. Which to me is yet another reason to start
> defensively.  But we'll all learn more as we move along, so thanks again to
> you for RcppParallel / TBB and all that for getting us going!
>
> Dirk
> --
> http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org


J.P. Olmsted

029 Corwin
Politics Department
Princeton University
Princeton, NJ 08544

t: 609.258.6202
f: 609.258.1110
jolmsted at princeton.edu