[Rcpp-devel] parallel distance matrix calculation

Jonathan Olmsted jolmsted at princeton.edu
Sat Jul 12 06:41:29 CEST 2014


James,

My attempt at marginal usefulness (compared to DE and JJA's comments) would
be to offer some code examples that I've got using OpenMP-based
parallelism. There are examples in the gallery, of course. But I've got a
number of others for Euclidean distance and one for an EM algorithm. If you
are interested, just shoot me an email.

Somewhere in all that I have some benchmarks showing the time costs
depending on where you use the pragmas. But, if you are tied to
RcppParallel or you are using a compiler that doesn't support OpenMP, then
I've got nothing for you :-(
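
To make the "where you use the pragmas" point concrete, here is a minimal
sketch (not one of the examples or benchmarks mentioned above). It assumes a
column-major n x p input, as R stores matrices, and touches only plain C++
arrays inside the parallel region. The name dist_rows and the choice of
schedule(dynamic) are illustrative assumptions, not taken from any package.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Pairwise Euclidean distances between the rows of an n x p matrix.
// x is column-major (as in R): element (i, k) lives at x[i + k * n].
// Returns an n x n column-major distance matrix.
// dist_rows is a hypothetical name used only for this sketch.
std::vector<double> dist_rows(const std::vector<double>& x,
                              std::size_t n, std::size_t p) {
    assert(x.size() == n * p);
    std::vector<double> d(n * n, 0.0);

    // Work on raw pointers and signed indices inside the parallel region.
    const double* in = x.data();
    double* out = d.data();
    const long n_l = static_cast<long>(n);
    const long p_l = static_cast<long>(p);

    // The pragma sits on the outer loop only. Iteration i writes cells
    // (i, j) and (j, i) for j > i, so no two iterations write the same cell.
    // Later rows have less work, hence the dynamic schedule (an assumption).
    #pragma omp parallel for schedule(dynamic)
    for (long i = 0; i < n_l; ++i) {
        for (long j = i + 1; j < n_l; ++j) {
            double ss = 0.0;
            for (long k = 0; k < p_l; ++k) {
                const double diff = in[i + k * n_l] - in[j + k * n_l];
                ss += diff * diff;
            }
            const double dij = std::sqrt(ss);
            out[i + j * n_l] = dij;  // upper triangle
            out[j + i * n_l] = dij;  // mirrored lower triangle
        }
    }
    return d;
}
```

With GCC or Clang this needs -fopenmp to actually run in parallel; without
the flag the pragma is ignored and the loop runs serially with the same
result.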

And, then there is the issue of whether you find it easier to think about
the code in terms of the RcppParallel abstractions or something else.
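
For what it's worth, the offset arithmetic JJ describes in the quoted text
below is small once you remember the data is column-major. A rough sketch,
assuming a raw pointer to the first element (what matrix.begin() would give
you for a numeric matrix) captured before any threads start; column_begin
and row_sum are hypothetical helper names, not part of Rcpp or RcppParallel.

```cpp
#include <cassert>
#include <cstddef>

// Column j of an nrow x ncol column-major matrix is one contiguous block
// of nrow doubles starting at begin + j * nrow.
inline const double* column_begin(const double* begin,
                                  std::size_t nrow, std::size_t j) {
    return begin + j * nrow;
}

// Row i is not contiguous: its elements are nrow apart, so walk it with a
// stride. Summing the row is just an example of such a strided walk.
double row_sum(const double* begin, std::size_t nrow, std::size_t ncol,
               std::size_t i) {
    assert(i < nrow);
    double total = 0.0;
    for (std::size_t j = 0; j < ncol; ++j) {
        total += begin[i + j * nrow];  // element (i, j)
    }
    return total;
}
```

Nothing here calls into R or Rcpp, which is the point: the pointer and the
dimensions are read once on the main thread, and workers only do arithmetic
on them.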

-Jonathan


On Fri, Jul 11, 2014 at 10:01 PM, Dirk Eddelbuettel <edd at debian.org> wrote:

>
> On 11 July 2014 at 21:19, JJ Allaire wrote:
> | (2) The premise of RcppParallel is that you are reading and writing directly
> | into C arrays in background threads (it's not safe to call into R and therefore
> | not really safe to call into Rcpp). So to interact with a Matrix/Vector you
> | need to calculate the appropriate offsets from matrix.begin() to get the slices
> | (rows/columns) of the matrix you want.
> |
> | #2 is based on my conservative assumption about what's thread-safe in Rcpp.
> | Romain may tell us that it's perfectly safe to call vector iterators and
> | matrix::operator(,) from a background thread but I don't want to assume that's
> | okay without confirmation. If those things _aren't_ okay (or might not be okay
> | in the future) then we either need to provide good examples for offsetting into
> | vectors and matrixes or perhaps provide some lightweight helper classes or
> | functions for doing the same.
>
> When I was working for a while with OpenMP, I found it easier (in terms of my
> own mental task switches) to simply not assume anything R-related in the
> parallel code.  It is too easy to briefly forget the multithreaded context
> and use an idiom which may involve R data structures -- and the possibility
> of subsequent data corruption, or worse, is simply not worth it.
>
> So yes, as you suggest, "in theory" we can use lightweight wrappers. "In
> practice", this may be fraught with nasty surprises.
>
> Parallel programming is hard. Which to me is yet another reason to start
> defensively.  But we'll all learn more as we move along, so thanks again to
> you for RcppParallel / TBB and all that for getting us going!
>
> Dirk
>
> --
> http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org
> _______________________________________________
> Rcpp-devel mailing list
> Rcpp-devel at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>



-- 

J.P. Olmsted

029 Corwin
Politics Department
Princeton University
Princeton, NJ 08544

t: 609.258.6202
f: 609.258.1110
jolmsted at princeton.edu
olmjo.com