[Rcpp-devel] Rcpp function results are different on different systems

Sokol Serguei serguei.sokol at gmail.com
Thu Aug 19 19:01:04 CEST 2021


Le 19/08/2021 à 17:41, Sokol Serguei a écrit :
> Le 19/08/2021 à 17:04, Naeem Khoshnevis a écrit :
>>
>> Thank you so much, everyone, for responding to this email.
>>
>>
>> Dirk,
>>
>>   * I didn't think about testing _equality_ of doubles because the
>>     numbers are significantly different (e.g., instead of 0.5,
>>     chooses 1.5). However, that is a valid point, and I should be
>>     aware of that.
>>   * You are right about the serial runs. Whenever I deactivate
>>     OpenMP, the results are correct.
>>
>>
>> Serguei,
>>
>>   * Thanks for the comments. Yes, I agree. It seems outer is a better
>>     option. We have started with outer. However, outer builds the
>>     entire matrix of differences first, then finds the minimum index.
>>     In our application, it requires 200 GB of memory to build that
>>     matrix.
>>
> This problem was hardly foreseeable from examples in hand. But I still 
> think that a pure R solution can be a runner:
>
>    ac=a*sc
>    bc=b*sc
>    out=vapply(seq_along(bc), function(i) 
> which.min(abs(ac-bc[i])+cd[i]), integer(1))
>
Further optimizing: "+cd[i]" does not change the result of which.min() 
so it can be dropped. The same reasoning can be applied to "*sc".
As the problem is reduced to searching the index of the closest value in 
'a' to 'b[i]', it can be solved in O(log(n)) time (instead of O(n)) by 
binary search if 'a' can be sorted before operations. If 'b' can also be 
sorted, then the whole timing can be further reduced. E.g. we can use 
findInterval() implementing such optimal searching. However, its result 
should be post-processed to find value indexes instead of interval ones 
and get back to original indexes instead of sorted ones. Something like:

    oa=order(a)
    ob=order(b)
    ao=a[oa]
    bo=b[ob]
    i=findInterval(bo, ao) # main work is done here
    ii=i+ifelse(i < n & ao[i+1]-bo < bo-ao[i], 1, 0)
    out=oa[ii]
    out[ob]=out

Serguei.

> Best,
> Serguei.
>
>>   * Rcpp does the job with around 10 MB. That is why I switched to
>>     Rcpp. Please let me know your thoughts.
>>
>>
>>       Iñaki,
>>
>>   * Thanks for your suggestion. Yes, the problem is shared values,
>>     and it resolved the issue. I really appreciate it.
>>
>>
>> Best regards,
>> Naeem
>>
>> On Thu, Aug 19, 2021 at 4:56 AM Iñaki Ucar <iucar at fedoraproject.org 
>> <mailto:iucar at fedoraproject.org>> wrote:
>>
>>     On Thu, 19 Aug 2021 at 04:53, Dirk Eddelbuettel <edd at debian.org
>>     <mailto:edd at debian.org>> wrote:
>>     >
>>     >
>>     > Naeem,
>>     >
>>     > I would simplify, simplify, simplify -- as 'Rcpp FAQ 7.31'
>>     reminds us all,
>>     > testing _equality_ of doubles is challenging anyway.
>>     >
>>     > Besides, it may make sense to would ascertain first you get
>>     what you want in
>>     > _purely serial modes_ and then move to OpenMP.
>>
>>     Exactly. Serial execution should be fine. I.e., if you set the number
>>     of threads to 1, then all platforms will return the same result.
>>     However, you have defined a number of variables outside the parallel
>>     region, and then you modify them inside the parallel region. OpenMP
>>     takes those variables as shared by default, which leads to the
>>     unexpected results you are getting. You need to tell OpenMP that
>>     those
>>     variables are threadprivate. Or you could just define them inside the
>>     parallel region, so that OpenMP knows that they are private without
>>     additional hints.
>>
>>     -- 
>>     Iñaki Úcar
>>
>>
>> _______________________________________________
>> Rcpp-devel mailing list
>> Rcpp-devel at lists.r-forge.r-project.org
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/rcpp-devel/attachments/20210819/b2f9b433/attachment-0001.html>


More information about the Rcpp-devel mailing list