[Rcpp-devel] Question on performance and strategy

Jordi Molins jordi.molins.coronado at gmail.com
Sat Sep 22 20:41:33 CEST 2018


I have access to a machine (not a desktop) with quite a few CPUs and quite
a few GPUs. If, for example, there are 100 CPU cores and 100,000 GPU cores,
I guess I could run a foreach over those 100 CPU cores for an R function,
and if that R function calls RcppArrayFire, each call could use 1,000 GPU
cores, so that all 100,000 GPU cores are kept busy (roughly as in the
sketch below), no? Or is everything more complex than that?
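
Just to be concrete, here is a minimal sketch of what I have in mind;
gpu_kernel() is a hypothetical RcppArrayFire-backed function compiled
beforehand with Rcpp::sourceCpp(), and the counts are made up:

library(foreach)
library(doParallel)

registerDoParallel(cores = 100)   ## forked workers on Linux, one per CPU core

## gpu_kernel() stands in for the RcppArrayFire-backed routine; how many
## GPU cores each call occupies is decided by ArrayFire, not by foreach
params  <- replicate(100, rnorm(1e4), simplify = FALSE)
results <- foreach(p = params) %dopar% gpu_kernel(p)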

Jordi Molins i Coronado
+34 69 38 000 59


On Sat, Sep 22, 2018 at 7:35 PM Dirk Eddelbuettel <edd at debian.org> wrote:

>
> On 22 September 2018 at 17:52, Jordi Molins wrote:
> | In relation to doing "CPU x GPU": what would happen if I have 3 variables
> | to be parallelized (independent from each other, no interdependencies) and
> | then I create an R function, using RcppArrayFire, to GPU-parallelize two of
> | them. Then, I use foreach (or similar) in R to CPU-parallelize the third
> | one (and for each variable of the third one, the R function is called, and
> | then internally, RcppArrayFire uses GPUs).
>
> Just because you want to access ONE GPU device N times does not make it N
> GPUs.
>
> And as you have only one GPU, if you call it N times "in parallel" (we
> know: time sliced) you get contention.
>
> No different from having OpenBLAS or Intel MKL using ALL your cores for
> matrix algebra.  If you call that from any of the R process parallel
> helpers you get contention.  All this is well documented.
>
> Dirk
>
> --
> http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org
>
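
To make the contention point above concrete, here is a minimal sketch of
the usual remedy when the parallel workers themselves trigger multithreaded
BLAS; RhpcBLASctl and the worker count are illustrative assumptions on my
part, not something from the thread:

library(parallel)
library(RhpcBLASctl)

blas_set_num_threads(1)   ## one BLAS thread per worker, so 8 workers use 8 cores

## without the line above, 8 workers times an all-core OpenBLAS/MKL would
## oversubscribe the machine and the runs contend with each other
mats <- replicate(8, matrix(rnorm(500 * 500), 500), simplify = FALSE)
res  <- mclapply(mats, crossprod, mc.cores = 8)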