If (2) is possible, why not provide a default path? Leaving the R object broken does not seem like the best user experience. A persistent directory would also help me on the RHadoop end, since I could pack the directory and broadcast it to the cluster using the distributed cache feature of RHadoop. But I may also be imagining a need where there is none: how many R users are willing to write ad hoc C functions for one-off use? Anyone putting in the work to write C may as well be willing to organize that work in a package. Thanks<div>
<br></div><div><br></div><div>Antonio<br><br><div class="gmail_quote">On Tue, Jan 29, 2013 at 1:48 PM, JJ Allaire <span dir="ltr"><<a href="mailto:jj.allaire@gmail.com" target="_blank">jj.allaire@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Antonio,<br>
<br>
You are correct in your analysis: currently C++ functions do not<br>
persist across sessions and require recompilation. The answer is of<br>
course to create a package, but as you point out for ad-hoc work this<br>
might be considered too heavyweight.<br>
<br>
Two possible longer term solutions exist:<br>
<br>
(1) Save the contents of the shared library along with the function<br>
(this is trickier to do properly than it sounds)<br>
<br>
(2) More likely, we could allow you to optionally provide an external<br>
(non-temporary) directory for build output. Since that directory will<br>
survive across sessions, so would your C++ function.<br>
<br>
J.J.<br>
<div><div class="h5"><br>
On Tue, Jan 29, 2013 at 4:20 PM, Antonio Piccolboni<br>
<<a href="mailto:antonio@piccolboni.info">antonio@piccolboni.info</a>> wrote:<br>
> Hi,<br>
> I defined a little C++ function and sourced it with sourceCpp, tested, works<br>
> great. It so happens that I develop a package to interface R and Hadoop, and<br>
> I need to use the same function in the map or reduce parameter of a<br>
> mapreduce call. This means that the definition of the function will have to<br>
> be saved, distributed to a cluster and loaded and executed in multiple R<br>
> instances. This works fine with R functions, but not sourceCpp-ed functions.<br>
> In fact, the same error happens just by restarting the interpreter in<br>
> RStudio, so I am pretty sure there is nothing specific to my code, but I<br>
> wanted to give you the context to explain why this matters to me and the<br>
> users of my package. This is a session in RStudio:<br>
><br>
>> sourceCpp(file="rmr2/pkg/src/psum.cpp")<br>
>> psum(list(1:4, 1:5))<br>
> [1] 10 15<br>
><br>
> Restarting R session...<br>
><br>
>> psum(list(1:4, 1:5))<br>
> Error in .External(list(name = "InternalFunction_invoke", address =<br>
> <pointer: 0x0>, :<br>
> NULL value passed as symbol address<br>
><br>
> The C++ code, probably irrelevant to this issue:<br>
><br>
> #include <vector><br>
> #include <Rcpp.h><br>
><br>
> // [[Rcpp::export]]<br>
> std::vector<double> psum(Rcpp::List xx) {<br>
> std::vector<double> results(xx.size());<br>
> for(int i = 0; i < xx.size(); i ++) {<br>
> std::vector<double> x = Rcpp::as<std::vector<double> >(xx[i]);<br>
> for(int j = 0; j < x.size(); j++) {<br>
> results[i] += x[j];<br>
> }<br>
> }<br>
> return results;<br>
> }<br>
><br>
><br>
> I remember something of this sort happening also with cxxfunction and that<br>
> someone recommended to create a package for C extensions that aren't a<br>
> one-off thing. But I would like to make the case that this is not a<br>
> satisfactory solution because it raises the bar for users who can write a C<br>
> function but may not be ready to write a complete package. For instance, they<br>
> may just want to replace a<br>
><br>
> sapply(data, sum)<br>
><br>
> with the above function to see what kind of speed boost they can get. If<br>
> they are doing this in the context of a mapreduce job, they can't as they<br>
> will get the above error. They need to write a package around that single<br>
> function to test their idea. That seems a bit too much. If they are not<br>
> using RHadoop but, say, developing in RStudio, every time they rebuild a<br>
> package they have to re-source their C code: not a show-stopper, but an<br>
> inconvenience. Am I missing something? Suggestions? Thanks<br>
><br>
><br>
> Antonio<br>
><br>
</div></div>> _______________________________________________<br>
> Rcpp-devel mailing list<br>
> <a href="mailto:Rcpp-devel@lists.r-forge.r-project.org">Rcpp-devel@lists.r-forge.r-project.org</a><br>
> <a href="https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel" target="_blank">https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel</a><br>
</blockquote></div><br></div>
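For readers who want to check the summing logic of the quoted psum() without an R session, here is a minimal sketch in plain C++. It assumes std::vector<std::vector<double> > as a stand-in for Rcpp::List, and the name psum_plain is hypothetical: it is not part of Rcpp or rmr2. It does not address the persistence problem discussed in this thread; it only mirrors what sapply(data, sum) computes.

```cpp
#include <cstddef>
#include <vector>

// Plain-C++ analogue of the quoted psum(): one sum per inner vector.
// Assumption: std::vector<std::vector<double> > stands in for Rcpp::List
// so that the sketch compiles without R or Rcpp installed.
std::vector<double> psum_plain(const std::vector<std::vector<double> >& xx) {
    // The size constructor value-initializes every element to 0.0.
    std::vector<double> results(xx.size());
    for (std::size_t i = 0; i < xx.size(); ++i) {
        for (std::size_t j = 0; j < xx[i].size(); ++j) {
            results[i] += xx[i][j];
        }
    }
    return results;
}
```

Called with the inner vectors {1, 2, 3, 4} and {1, 2, 3, 4, 5}, it returns 10 and 15, which matches the R session quoted above.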