[Rcpp-devel] Populate a Matrix in Parallel (Text Version)
Joseph Wood
jwood000 at gmail.com
Fri Sep 28 03:26:28 CEST 2018
Thank you Jeff for your reply,
I understand that parallelism is not a magic wand. Have you read my
original post? I have managed to parallelize generating permutations
by taking advantage of the fact that I can generate the ith
permutation via nthPerm. My question is about making this thread safe
not if it is possible.
I encourage you to run my code above.
Thanks again.
Joseph Wood
On Thu, Sep 27, 2018 at 8:59 PM Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
>
> You cannot parallelize a serial calculation. Sorry, parallelism is not a magic wand that you can wave at any problem. If you find portions of your calculations are independent, then you can parallelize those portions and do the rest serially.
>
> On September 27, 2018 5:04:52 PM PDT, Joseph Wood <jwood000 at gmail.com> wrote:
> >Hey Ralf and Alexis,
> >
> >Thank you so much for your replies. All of the resources and
> >information are very much appreciated.
> >
> >Alexis, yes I have looked at many examples including the one you
> >linked to dealing with matrix transform. I'm really glad you linked
> >that, because this is a perfect example of why this will not work for
> >my situation. I will first note that all of the examples I have seen
> >thus far, are carrying out operations that are independent of the
> >other entries in the matrix.
> >
> >My situation is fundamentally different. The algorithm that fills the
> >matrix does so in a way that relies on the previous row and more
> >importantly, has a starter vector that isn't apart of the populate
> >process. That is, in the example I gave, you will notice I get the ith
> >entry point by calling nthPerm, and once I call PopulateMatrix, the
> >matrix is populated calling std::next_permutation which is dependent
> >on the previous permutation. The std::next_permutation is just an
> >example, but perfectly represents my challenge here. This algorithm
> >can't produce the 10th permutation without first being fed the 9th
> >permutation. In the matrix transform example, the square root of the
> >10th row has nothing to do with the 9th row whatsoever.
> >
> >I hope this clarifies my question above, and I'm sorry for not
> >explaining more fully.
> >
> >Again, I really appreciate the advice.
> >
> >Joseph Wood
> >On Wed, Sep 26, 2018 at 1:05 PM Alexis Sarda <alexis.sarda at gmail.com>
> >wrote:
> >>
> >> I'm not sure I understand the problem. You've found RcppParallel,
> >have you looked at the examples? There's one with handling matrices
> >here:
> >>
> >> http://gallery.rcpp.org/articles/parallel-matrix-transform/
> >>
> >> In the call to parallelFor() you can change the values you iterate
> >over and change your logic based on it.
> >> In my package I do a lot of cross-distance matrix calculations with
> >RcppParallel, see e.g.:
> >>
> >>
> >https://github.com/asardaes/dtwclust/blob/master/src/distmat/fillers.cpp
> >> https://github.com/asardaes/dtwclust/blob/master/CONTRIBUTING.md
> >>
> >> Regards,
> >> Alexis.
> >>
> >>
> >> On Wed, Sep 26, 2018 at 6:37 PM Ralf Stubner
> ><ralf.stubner at daqana.com> wrote:
> >>>
> >>> I haven’t read your message in detail, but the second example from
> >here might be helpful:
> >https://cran.r-project.org/web/packages/dqrng/vignettes/parallel.html
> >>>
> >>> There a matrix is filled in parallel by splitting the columns among
> >the threads. Splitting by columns is helpful since matrices in R are
> >stored that way. You could use the same method to obtain the transpose
> >of your matrix.
> >>>
> >>> Greetings
> >>> Ralf
> >>>
> >>> —
> >>>
> >>> Ralf Stubner
> >>>
> >>> Senior Software Engineer / Trainer
> >>>
> >>>
> >>>
> >>> daqana GmbH
> >>>
> >>> Dortustraße 48
> >>>
> >>> 14467 Potsdam
> >>>
> >>>
> >>>
> >>> T: +49 331 23 70 81 66
> >>>
> >>> F: +49 331 23 70 81 67
> >>>
> >>> M: +49 162 20 91 196
> >>>
> >>> Mail: ralf.stubner at r-institute.com
> >>>
> >>>
> >>>
> >>> Sitz: Potsdam
> >>>
> >>> Register: AG Potsdam HRB 27966 P
> >>>
> >>> Ust.-IdNr: DE300072622
> >>>
> >>> Geschäftsführer: Prof. Dr. Dr. Karl-Kuno Kunze
> >>>
> >>>
> >>>
> >>>
> >>> Am 26.09.2018 um 18:17 schrieb Joseph Wood <jwood000 at gmail.com>:
> >>>
> >>> As the subject states, this question is regarding populating a
> >matrix
> >>> in parallel. I am currently reading "C++ Concurrency in Action:
> >>> Practical Multithreading" as I'd like to take some algorithms I have
> >>> to the next level. I have looked at the RcppParallel package, but
> >the
> >>> features offered there do not seem to apply to this situation. I
> >will
> >>> explain my reasoning further down. First, we set up our scenario.
> >>>
> >>> 1. We have an empty matrix with the number of rows equal to numRows
> >>> 2. We are able to generate the ith row of the matrix at will
> >>> 3. Our underlying subroutine populates the matrix from any
> >>> particular starting point one by one.
> >>>
> >>> This scenario easily extends to a parallel setup. We have each entry
> >>> in our matrix being visited exactly one time by only one thread. The
> >>> idea is that if we have m threads, we can split up the work so that
> >>> each thread is responsible for populating roughly (numRows / m)
> >number
> >>> of rows of our matrix.
> >>>
> >>> Here is a simplified example that represents my real situation (In
> >my
> >>> project I don't have the cpp11 plugin as I take care of this in
> >>> Makevars file with CXX_STD = CXX11):
> >>>
> >>> #include <Rcpp.h>
> >>> #include <thread>
> >>>
> >>> // [[Rcpp::plugins(cpp11)]]
> >>>
> >>> int myFactorial(int n) {
> >>> int res = 1;
> >>> for (int i = 1; i <= n; ++i)
> >>> res *= i;
> >>>
> >>> return res;
> >>> }
> >>>
> >>> std::vector<int> nthPerm(int n, int index) {
> >>> int temp = myFactorial(n);
> >>> std::vector<int> indexVec(n), res(n);
> >>> std::iota(indexVec.begin(), indexVec.end(), 1);
> >>>
> >>> for (int k = 0, r = n; k < n; ++k, --r) {
> >>> temp /= r;
> >>> int j = (int) index / temp;
> >>> res[k] = indexVec[j];
> >>> index -= (temp * j);
> >>> indexVec.erase(indexVec.begin() + j);
> >>> }
> >>>
> >>> return res;
> >>> }
> >>>
> >>> // Simplified version for demonstration only. The real subroutines
> >>> // that carry out this task are more complicated
> >>> void PopulateMatrix(Rcpp::IntegerMatrix permuteMatrix,
> >>> std::vector<int> z, int count, int nRows, int n)
> >{
> >>> for (; count < nRows; ++count) {
> >>> for (int j = 0; j < n; ++j)
> >>> permuteMatrix(count, j) = z[j];
> >>>
> >>> std::next_permutation(z.begin(), z.end());
> >>> }
> >>> }
> >>>
> >>> // [[Rcpp::export]]
> >>> SEXP ParallelPerms(int n, int userThrds = 0) {
> >>> int nThreads = std::thread::hardware_concurrency() - 1;
> >>> std::vector<std::thread> myThreads;
> >>> nThreads = (userThrds > 0) ? std::min(userThrds, nThreads) :
> >nThreads;
> >>>
> >>> int step = 0, numRows = myFactorial(n);
> >>> int stepSize = numRows / nThreads;
> >>> int nextStep = stepSize;
> >>> std::vector<int> z(n);
> >>> std::iota(z.begin(), z.end(), 1);
> >>>
> >>> Rcpp::IntegerMatrix myMat = Rcpp::no_init_matrix(numRows, n);
> >>>
> >>> for (std::size_t j = 0; j < (nThreads - 1); ++j) {
> >>> myThreads.emplace_back(PopulateMatrix, std::ref(myMat), z,
> >>> step, nextStep, n);
> >>> step += stepSize;
> >>> nextStep += stepSize;
> >>> z = nthPerm(n, step);
> >>> }
> >>>
> >>> // Guarantee that we get all the rows. N.B. We are passing
> >numRows
> >>> // instead of nextStep... they aren't guaranteed to be the same
> >>> myThreads.emplace_back(PopulateMatrix, std::ref(myMat), z, step,
> >>> numRows, n);
> >>>
> >>> for (auto& thr: myThreads)
> >>> thr.join();
> >>>
> >>> return myMat;
> >>> }
> >>>
> >>> I have read that Rcpp objects are not thread safe as they make
> >>> unpredictable calls to the garbage collector
> >>> (https://github.com/RcppCore/RcppParallel/issues/17), however Romain
> >>> Francois states:
> >>>
> >>> "As soon as you don't use references for Rcpp types, you are not
> >safe.
> >>> If you use references, it all depends on what you do with them."
> >>>
> >>> I have a couple of questions regarding this.
> >>>
> >>> My initial thought was I thought Rcpp objects were passed by
> >reference
> >>> by default. Secondly, if this isn't the case, is it as simple as
> >>> adding an ampersand to all of my Rcpp objects in the function
> >>> parameters?
> >>>
> >>> The project that I'm implementing this in (RcppAlgos) builds fine on
> >>> win-builder as well as all of the various rhub checks with no errors
> >>> (even check_with_sanitizers() and check_with_valgrind()). When I
> >>> submitted v 2.1.0 to CRAN, there were sporadic build errors on some
> >of
> >>> the linux platforms. By sporadic, I mean sometimes it passes, and
> >>> other times it would fail with the error : segfault from C stack
> >>> overflow. The current version (v 2.2.0) still has the argument for
> >>> parallel computing, but it doesn't do anything. It is only there for
> >>> backwards compatibility.
> >>>
> >>> When I initially submitted, it should be noted that I did not have
> >my
> >>> matrices wrapped with std::ref in the call to create new threads, so
> >>> I'm not sure what effect this will have on those builds if I were to
> >>> submit to CRAN now. I will say that after I saw the errors on the
> >>> CRAN checks, I immediately sought to replicate them. I was
> >successful
> >>> in extreme situations. For example, if I called the parallel
> >>> implementation thousands of times I could generate the error. I
> >would
> >>> put these extreme tests in my tests folder and check them in a unit
> >>> test environment so as not to crash my r session.
> >>>
> >>> I then sought out a more robust solution to my situation and found
> >>> that thread function arguments are by default pass by value, and if
> >>> you have a particular variable that is expected to be passed by
> >>> reference, then you must add std::ref (See
> >>> https://en.cppreference.com/w/cpp/thread/thread/thread). I have done
> >>> this and have noticed that I can't generate the issues with the
> >>> extreme tests above. HOWEVER, If I call it say 50000 times 10 times
> >in
> >>> a row, I can sometimes generate an issue (not necessarily the
> >segfault
> >>> above.. most of the time it is a stack imbalance warning... the
> >>> warning you get when the number of UNPROTECTS doesn't match the
> >number
> >>> of PROTECTS in the R C API).
> >>>
> >>> I then revisited the RcppParallel package to see if there were any
> >>> solutions there. I know that RcppParallel implements fully thread
> >safe
> >>> objects like RMatrix<T>, however I can't see a way to set up a
> >Worker
> >>> to populate my matrix. I guess the issue I see is that if you look
> >at
> >>> my set-up above, we first get the starter vector with a call to
> >>> nthPerm from the parent function, then I pass this to PopulateMatrix
> >>> which proceeds to populate rows of my matrix for a given number of
> >>> rows. With the examples I've seen, there is no dependency on an
> >>> external function to get a specific entry point. I thought about
> >>> bypassing the RcppParallel::Worker altogether and simply use the
> >>> RMatrix<T> object, however I don't think this is how that object is
> >to
> >>> be utilized. For example, I have not seen any RcppParallel examples
> >>> that preallocate an RMatrix<T> object.
> >>>
> >>> My question is, is there a way to make my current set up
> >thread-safe?
> >>> If not, is it possible at all to simply populate an object in
> >parallel
> >>> safely?
> >>>
> >>> Alternatively, if my question seems a bit naive, a nudge in the
> >right
> >>> direction would be greatly appreciated. I don't mind going back to
> >>> square one, if need be.
> >>>
> >>> Thanks,
> >>> Joseph Wood
> >>> _______________________________________________
> >>> Rcpp-devel mailing list
> >>> Rcpp-devel at lists.r-forge.r-project.org
> >>>
> >https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
> >>>
> >>> _______________________________________________
> >>> Rcpp-devel mailing list
> >>> Rcpp-devel at lists.r-forge.r-project.org
> >>>
> >https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
> >_______________________________________________
> >Rcpp-devel mailing list
> >Rcpp-devel at lists.r-forge.r-project.org
> >https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>
> --
> Sent from my phone. Please excuse my brevity.
More information about the Rcpp-devel
mailing list