[Rcpp-devel] When calling same Rcpp function several times different results are returned

Vaidas Zemlys mpiktas at gmail.com
Wed Jul 15 14:37:27 CEST 2015


Hi,

I’ve finally managed to produce code that works. I hope. The key was to
leave output variable out of constructor:

#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::depends(RcppParallel)]]
#include <RcppParallel.h>

using namespace RcppParallel;
struct SumsInGroups5: public Worker
{
    const RVector<double> v;
    const RVector<int> g;

    std::vector<double> s;

    SumsInGroups5(const NumericVector v, const IntegerVector g): v(v),
g(g),  s(*std::max_element(g.begin(), g.end()) + 1, 0.0){ }

    SumsInGroups5(const SumsInGroups5& p, Split): v(p.v), g(p.g),
s(*std::max_element(g.begin(), g.end()) + 1, 0.0) {}

    void operator()(std::size_t begin, std::size_t end) {
        for (std::size_t i = begin; i < end; ++i) {
            s[g[i]]+=v[i];
        }

    }

    void join(const SumsInGroups5& rhs) {
        for(std::size_t i = 0; i < s.size(); i++) {
            s[i] += rhs.s[i];
        }
    }
};

// [[Rcpp::export]]
NumericVector sg5(NumericVector v, IntegerVector g) {
    SumsInGroups5 p(v, g);
    parallelReduce(0, v.length(), p);
    return wrap(p.s);
}

/*** R
a <- 1:10
g <- c(rep(0,5),rep(1,5))


bb <- lapply(1:10000,function(x)sg5(a,g))
cc<-do.call("rbind",bb)
table(cc[,1])
table(cc[,2])
*/

Looking at the example
http://gallery.rcpp.org/articles/parallel-distance-matrix/, the ouput is
initialized in constructor and everything works. So if anybody can explain
why in this case leaving out the output from the constructor made the code
work I would be all ears.

Vaidotas Zemlys


Le mar. 14 juil. 2015 à 15:22, Danas Zuokas <danas.zuokas at gmail.com> a
écrit :

> I have tried this with no luck.
>
> 2015-07-14 14:34 GMT+03:00 Romain Francois <romain at r-enthusiasts.com>:
>
>> Or just use a std::vector<double> for sg. That should do it.
>>
>> Romain
>>
>> Envoyé de mon iPhone
>>
>> > Le 14 juil. 2015 à 13:21, JJ Allaire <jj.allaire at gmail.com> a écrit :
>> >
>> > The examples in the RcppParallel documentation assume that access to
>> > vectors and matrixes are *aligned* (i.e. fall into neat buckets
>> > whereby reading and writing doesn't overlap between worker instances).
>> > Your example appears to access arbitrary elements of sg (depending on
>> > what's passed in gi) which probably creates overlapping reads/writes.
>> > You should also study the documentation for join carefully.
>> >
>> > There's nothing incorrect about RcppParallel's behavior here, rather
>> > you need to think more carefully about the access patterns of your
>> > data and how they might conflict. You may need to introduce locking to
>> > overcome the conflicts, which in turn could kill the performance
>> > benefit you gain from parallelism. No easy answers here :-\
>> >
>> >> On Tue, Jul 14, 2015 at 7:15 AM, Danas Zuokas <danas.zuokas at gmail.com>
>> wrote:
>> >> Yes it is the same question on SO and I did consider RHertel's
>> comments.
>> >> But this problem (sums by group id) is not parallelFor it is
>> parallelReduce:
>> >> I split vector, calculate sums and then aggregate those sums.
>> >> Please correct me if I am wrong.
>> >>
>> >> 2015-07-14 13:54 GMT+03:00 Dirk Eddelbuettel <edd at debian.org>:
>> >>>
>> >>>
>> >>> On 14 July 2015 at 09:25, Danas Zuokas wrote:
>> >>> | I have written parallel implementation of sums in groups using
>> >>> RcppParallel.
>> >>>
>> >>> Isn't this the same question as
>> >>>
>> >>>
>> http://stackoverflow.com/questions/31318419/when-calling-same-rcpp-function-several-times-different-results-are-returned
>> >>>
>> >>> You got some excellent comments there by SO user 'RHertel'. Did you
>> >>> consider those?
>> >>>
>> >>> Dirk
>> >>>
>> >>> | // [[Rcpp::depends(RcppParallel)]]
>> >>> | #include <Rcpp.h>
>> >>> | #include <RcppParallel.h>
>> >>> | using namespace Rcpp;
>> >>> | using namespace RcppParallel;
>> >>> |
>> >>> | struct SumsG: public Worker
>> >>> | {
>> >>> |   const RVector<double> v;
>> >>> |   const RVector<int> gi;
>> >>> |
>> >>> |   RVector<double> sg;
>> >>> |
>> >>> |   SumsG(const NumericVector v, const IntegerVector gi, NumericVector
>> >>> sg): v(v), gi(gi), sg(sg) {}
>> >>> |   SumsG(const SumsG& p, Split): v(p.v), gi(p.gi), sg(p.sg) {}
>> >>> |
>> >>> |   void operator()(std::size_t begin, std::size_t end) {
>> >>> |     for (std::size_t i = begin; i < end; i++) {
>> >>> |       sg[gi[i]] += v[i];
>> >>> |     }
>> >>> |   }
>> >>> |
>> >>> |   void join(const SumsG& p) {
>> >>> |     for(std::size_t i = 0; i < sg.length(); i++) {
>> >>> |       sg[i] += p.sg[i];
>> >>> |     }
>> >>> |   }
>> >>> | };
>> >>> |
>> >>> | // [[Rcpp::export]]
>> >>> | List sumsingroups(NumericVector v, IntegerVector gi, int ni) {
>> >>> |   NumericVector sg(ni);
>> >>> |   SumsG p(v, gi, sg);
>> >>> |   parallelReduce(0, v.length(), p);
>> >>> |
>> >>> |   return List::create(_["sg"] = p.sg);
>> >>> | }
>> >>> |
>> >>> | It compiles using Rcpp::sourceCpp. Now when I call it from R
>> >>> sumsingroups(1:10,
>> >>> | rep(0:1, each = 5), 2) several times I get the right answer (15 40)
>> and
>> >>> then
>> >>> | something different (usually some multiplicative of the right
>> answer).
>> >>> Running
>> >>> |
>> >>> |
>> >>> | res <- sumsingroups(1:10, rep(0:1, each = 5), 2)
>> >>> | for(i in 1:1000) {
>> >>> |     tmp <- sumsingroups(1:10, rep(0:1, each = 5), 2)
>> >>> |     if(res[[1]][1] != tmp[[1]][1]) break
>> >>> |     Sys.sleep(0.1)
>> >>> | }
>> >>> |
>> >>> | breaks at random iteration returning
>> >>> |
>> >>> | $sg
>> >>> | [1]  60 160
>> >>> |
>> >>> | or
>> >>> |
>> >>> | $sg
>> >>> | [1] 30 80
>> >>> |
>> >>> | I am new to Rcpp and RcppParallel and do not know what could cause
>> such
>> >>> | behavior.
>> >>> |
>> >>> | Things that did not help:
>> >>> |
>> >>> |  1. Added for (std::size_t i = 0; i < sg.length(); i++) sg[i] = 0;
>> to
>> >>> both of
>> >>> |     constructors.
>> >>> |  2. Changed names so that they are different in Worker definition
>> and in
>> >>> |     function implementation.
>> >>> |
>> >>> | _______________________________________________
>> >>> | Rcpp-devel mailing list
>> >>> | Rcpp-devel at lists.r-forge.r-project.org
>> >>> |
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>> >>>
>> >>> --
>> >>> http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org
>> >>
>> >>
>> >>
>> >> _______________________________________________
>> >> Rcpp-devel mailing list
>> >> Rcpp-devel at lists.r-forge.r-project.org
>> >>
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>> > _______________________________________________
>> > Rcpp-devel mailing list
>> > Rcpp-devel at lists.r-forge.r-project.org
>> > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>>
>
> _______________________________________________
> Rcpp-devel mailing list
> Rcpp-devel at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/rcpp-devel/attachments/20150715/cacc03ef/attachment.html>


More information about the Rcpp-devel mailing list