[Rcpp-devel] Fwd: Rcpp too good to be true?

Jeffrey Pollock jeffpollock9 at gmail.com
Wed Dec 14 01:16:53 CET 2011


Massive apologies.... I had indeed made a mistake by messing up my gobal
environment! Here are the correct benchmarks, only a ~70x speedup.....(and
in the previous benchmark the "Rcpp" function was slower too, was actually
an old really inefficient R function I had written)

> n <- 5e4
> lam <- 5
> nu <- 1
> rbind(table(rcomp(n, lam, nu))[1:10] / n, table(Rrcomp(n, lam, nu))[1:10]
/ n, dpois(0:9, lam))
               0          1          2         3         4
5         6
[1,] 0.007220000 0.03274000 0.08472000 0.1394000 0.1776200 0.1745400
0.1451200
[2,] 0.006220000 0.03342000 0.08604000 0.1396800 0.1763200 0.1735800
0.1471000
[3,] 0.006737947 0.03368973 0.08422434 0.1403739 0.1754674 0.1754674
0.1462228
             7          8          9
[1,] 0.1064800 0.06638000 0.03574000
[2,] 0.1041400 0.06554000 0.03654000
[3,] 0.1044449 0.06527804 0.03626558
> benchmark(rcomp(n, lam, nu), Rrcomp(n, lam, nu), replications = 100,
order = "relative")
                test replications elapsed relative user.self sys.self
1  rcomp(n, lam, nu)          100    0.26  1.00000      0.24     0.03
2 Rrcomp(n, lam, nu)          100   18.77 72.19231     18.59     0.02

---------- Forwarded message ----------
From: Jeffrey Pollock <jeffpollock9 at gmail.com>
Date: Tue, Dec 13, 2011 at 11:48 PM
Subject: Rcpp too good to be true?
To: r-devel at r-project.org


Hello all,

I've been working on a package to do various things related to the
Conway-Maxwell-Poisson distribution and wanted to be able to make fast
random draws from the distribution. My R code was running quite slow so I
decided to give Rcpp a bash. I had used it before but only for extremely
basic stuff and always using inline. This time I decided to give making a
proper package a go.

First of all I should say that this was incredibly easy due to
Rcpp.package.skeleton() and the countless answers to quesions online and
documentation!

Secondly, I'm worried that my speedup has been so massive (over 500x !!!)
that I think I've made a mistake, hence my post here.

Here is all my code, if someone has a minute to point out anything wrong
(or even if its correct and there is room for improvement, im pretty new to
this) it would be much appreciated. I've had a simple look at the results
and they look fine, but seriously, 500x faster?!

function in R;
library(compiler)

Rrcomp <- cmpfun(
        function(n, lam, nu, max = 100L) {
            ans <- integer(n)
            dist <- dcomp(0:max, lam, nu, max)
            u <- runif(n)
            for (i in 1:n) {
                count <- 0L
                pr <- dist[1L]
                while (pr < u[i]) {
                    count <- count + 1L
                    pr <- pr + dist[count + 1L]
                }
                ans[i] <- count
            }
            return(ans)
        }
)

dcomp <- function(y, lam, nu, max = 100L) {
    Z <- function(lam, nu, max) {
        sum <- 0L
        for(i in 0L:max) {
            sum <- sum + lam^i / factorial(i)^nu
        }
        return(sum)
    }
    return(lam^y / (factorial(y)^nu * Z(lam, nu, max)))
}

function in Rcpp;
header file;

#include <Rcpp.h>

RcppExport SEXP rcomp(SEXP n_, SEXP dist_);

source file;

#include "rcomp.h"

SEXP rcomp(SEXP n_, SEXP dist_) {
    using namespace Rcpp ;

    int n = as<int>(n_);
    NumericVector dist(dist_);

    NumericVector ans(n);
    int count;
    double pr;
    RNGScope scope;
    NumericVector u = runif(n);

    for (int i = 0; i < n; ++i) {
        count = 0;
        pr = dist[0];
        while (pr < u[i]) {
            count++;
            pr += dist[count];
        }
        ans[i] = count;
    }
    return ans;
}

R call;

rcomp <- function(n, lam, nu, max = 100){
    dist <- dcomp(0:max, lam, nu, max)
    .Call("rcomp", n = n, dist = dist, PACKAGE = "glmcomp")
}

Here are some results;
> n <- 50000
> lam <- 5
> nu <- 1
> rbind(table(rcomp(n, lam, nu))[1:10] / n, table(Rrcomp(n, lam, nu))[1:10]
/ n, dpois(0:9, lam))
               0          1          2         3         4
5         6
[1,] 0.006440000 0.03124000 0.08452000 0.1392200 0.1747800 0.1755200
0.1490000
[2,] 0.006660000 0.03232000 0.08412000 0.1425400 0.1779600 0.1748400
0.1445600
[3,] 0.006737947 0.03368973 0.08422434 0.1403739 0.1754674 0.1754674
0.1462228
             7          8          9
[1,] 0.1063000 0.06538000 0.03534000
[2,] 0.1039800 0.06492000 0.03624000
[3,] 0.1044449 0.06527804 0.03626558

(for nu = 1 the com-poisson distribution is the same as normal poisson)

> benchmark(rcomp(n, lam, nu), Rrcomp(n, lam, nu), replications = 10, order
= "relative")
                test replications elapsed relative user.self sys.self
2 Rrcomp(n, lam, nu)           10    2.03   1.0000      1.96     0.00
1  rcomp(n, lam, nu)           10 1172.51 577.5911   1164.50     0.08

Thanks in advance if anyone has any time to have a look at this :)

Jeff
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/rcpp-devel/attachments/20111214/acbe9d5f/attachment.htm>


More information about the Rcpp-devel mailing list