[Rcpp-devel] loading a sourceCpp-ed function

Antonio Piccolboni antonio at piccolboni.info
Tue Jan 29 22:20:07 CET 2013


Hi,
I defined a small C++ function, sourced it with sourceCpp, and tested it; it
works great. I develop a package that interfaces R and Hadoop, and I need to
use the same function in the map or reduce parameter of a mapreduce call.
This means that the definition of the function has to be saved, distributed
to a cluster, and loaded and executed in multiple R instances. This works
fine with R functions, but not with sourceCpp-ed functions. In fact, the same
error occurs after simply restarting the interpreter in RStudio, so I am
pretty sure it is nothing specific to my code; I give the context only to
explain why this matters to me and to the users of my package. This is a
session in RStudio:

> sourceCpp(file="rmr2/pkg/src/psum.cpp")
> psum(list(1:4, 1:5))
[1] 10 15

Restarting R session...

> psum(list(1:4, 1:5))
Error in .External(list(name = "InternalFunction_invoke", address =
<pointer: 0x0>,  :
  NULL value passed as symbol address

The C++ code, probably irrelevant to this issue:

#include <vector>
#include <Rcpp.h>

// Sum each element of a list of numeric vectors; returns one sum per element.
// [[Rcpp::export]]
std::vector<double> psum(Rcpp::List xx) {
  std::vector<double> results(xx.size());
  for (int i = 0; i < xx.size(); i++) {
    std::vector<double> x = Rcpp::as<std::vector<double> >(xx[i]);
    for (std::size_t j = 0; j < x.size(); j++) {
      results[i] += x[j];
    }
  }
  return results;
}
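
For anyone who wants to check or time the summation logic without going
through R, here is a minimal standalone sketch of the same computation. It
assumes plain nested std::vector input in place of Rcpp::List, and psum_plain
is a hypothetical name used only for illustration; it is not part of rmr2 or
Rcpp.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical standalone analogue of psum: one sum per inner vector.
// Assumes plain std::vector input instead of Rcpp::List.
std::vector<double> psum_plain(const std::vector<std::vector<double> >& xx) {
  std::vector<double> results;
  results.reserve(xx.size());
  for (std::size_t i = 0; i < xx.size(); i++) {
    // Accumulate in double, starting from 0.0.
    results.push_back(std::accumulate(xx[i].begin(), xx[i].end(), 0.0));
  }
  return results;
}
```

Called on the two vectors from the session above (1 to 4 and 1 to 5), this
should give 10 and 15, matching the R output.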


I remember something of this sort also happening with cxxfunction, and
someone recommending creating a package for C extensions that aren't a
one-off thing. But I would like to make the case that this is not a
satisfactory solution, because it raises the bar for users who can write a C
function but may not be ready to write a complete package. For instance,
they may just want to replace a

sapply(data, sum)

with the above function to see what kind of speed boost they can get. If
they are doing this in the context of a mapreduce job, they can't, as they
will get the above error. They need to write a package around that single
function to test their idea, which seems a bit too much. If they are not
using RHadoop but, say, developing in RStudio, every time they rebuild a
package they have to re-source their C code: not a show-stopper, but an
inconvenience. Am I missing something? Suggestions? Thanks


Antonio