[Rcpp-devel] Rcpp function results are different on different systems

Serguei Sokol serguei.sokol at gmail.com
Fri Aug 20 09:47:26 CEST 2021


Le 19/08/2021 à 19:01, Sokol Serguei a écrit :
> Le 19/08/2021 à 17:41, Sokol Serguei a écrit :
>> Le 19/08/2021 à 17:04, Naeem Khoshnevis a écrit :
>>>
>>> Thank you so much, everyone, for responding to this email.
>>>
>>>
>>> Dirk,
>>>
>>>   * I didn't think about testing _equality_ of doubles because the
>>>     numbers are significantly different (e.g., instead of 0.5,
>>>     chooses 1.5). However, that is a valid point, and I should be
>>>     aware of that.
>>>   * You are right about the serial runs. Whenever I deactivate
>>>     OpenMP, the results are correct.
>>>
>>>
>>> Serguei,
>>>
>>>   * Thanks for the comments. Yes, I agree. It seems outer is a better
>>>     option. We have started with outer. However, outer builds the
>>>     entire matrix of differences first, then finds the minimum index.
>>>     In our application, it requires 200 GB of memory to build that
>>>     matrix.
>>>
>> This problem was hardly foreseeable from examples in hand. But I still 
>> think that a pure R solution can be a runner:
>>
>>    ac=a*sc
>>    bc=b*sc
>>    out=vapply(seq_along(bc), function(i) 
>> which.min(abs(ac-bc[i])+cd[i]), integer(1))
>>
> Further optimizing: "+cd[i]" does not change the result of which.min() 
> so it can be dropped. The same reasoning can be applied to "*sc".
> As the problem is reduced to searching the index of the closest value in 
> 'a' to 'b[i]', it can be solved in O(log(n)) time (instead of O(n)) by 
> binary search if 'a' can be sorted before operations. If 'b' can also be 
> sorted, then the whole timing can be further reduced. E.g. we can use 
> findInterval() implementing such optimal searching. However, its result 
> should be post-processed to find value indexes instead of interval ones 
> and get back to original indexes instead of sorted ones. Something like:
> 
>     oa=order(a)
>     ob=order(b)
>     ao=a[oa]
>     bo=b[ob]
>     i=findInterval(bo, ao) # main work is done here
>     ii=i+ifelse(i < n & ao[i+1]-bo < bo-ao[i], 1, 0)
oops... Here 'n' is length(b). It was defined in my environment, so the 
snipet worked as expected when I copy/pasted it. But it may be not your 
case. Sorry.

Best,
Serguei.

>     out=oa[ii]
>     out[ob]=out
> 
> Serguei.
> 
>> Best,
>> Serguei.
>>
>>>   * Rcpp does the job with around 10 MB. That is why I switched to
>>>     Rcpp. Please let me know your thoughts.
>>>
>>>
>>>       Iñaki,
>>>
>>>   * Thanks for your suggestion. Yes, the problem is shared values,
>>>     and it resolved the issue. I really appreciate it.
>>>
>>>
>>> Best regards,
>>> Naeem
>>>
>>> On Thu, Aug 19, 2021 at 4:56 AM Iñaki Ucar <iucar at fedoraproject.org 
>>> <mailto:iucar at fedoraproject.org>> wrote:
>>>
>>>     On Thu, 19 Aug 2021 at 04:53, Dirk Eddelbuettel <edd at debian.org
>>>     <mailto:edd at debian.org>> wrote:
>>>     >
>>>     >
>>>     > Naeem,
>>>     >
>>>     > I would simplify, simplify, simplify -- as 'Rcpp FAQ 7.31'
>>>     reminds us all,
>>>     > testing _equality_ of doubles is challenging anyway.
>>>     >
>>>     > Besides, it may make sense to would ascertain first you get
>>>     what you want in
>>>     > _purely serial modes_ and then move to OpenMP.
>>>
>>>     Exactly. Serial execution should be fine. I.e., if you set the number
>>>     of threads to 1, then all platforms will return the same result.
>>>     However, you have defined a number of variables outside the parallel
>>>     region, and then you modify them inside the parallel region. OpenMP
>>>     takes those variables as shared by default, which leads to the
>>>     unexpected results you are getting. You need to tell OpenMP that
>>>     those
>>>     variables are threadprivate. Or you could just define them inside the
>>>     parallel region, so that OpenMP knows that they are private without
>>>     additional hints.
>>>
>>>     -- 
>>>     Iñaki Úcar
>>>
>>>
>>> _______________________________________________
>>> Rcpp-devel mailing list
>>> Rcpp-devel at lists.r-forge.r-project.org
>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>>
>>
> 



More information about the Rcpp-devel mailing list