[Rcpp-devel] loading a sourceCpp-ed function

Jiqiang Guo guojq28 at gmail.com
Wed Jan 30 00:47:47 CET 2013


If (2) is possible, I do not think (1) is any harder.  Actually, I
implemented something along those lines in
http://cran.r-project.org/web/packages/cxxfunplus/index.html
But nobody seemed interested in it, and the usual advice is to create a
package instead of using inline (or, I guess, sourceCpp now), so I have not
thought about improving it.

My idea is to store the compiled binary (the shared library) as a raw
vector inside an R object, which can then be saved across sessions.  When
the function is called, we first check whether the library is loaded.  If
not, we write the raw data to a temporary file and load that.
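
A minimal sketch of that load-on-demand logic, written in C++ only for
illustration (the real thing would sit at the R level and finish with
something like dyn.load).  StoredLibrary, read_bytes, write_bytes and
ensure_loaded are hypothetical names, and the loader callback is an
assumption standing in for whatever actually loads the shared library.

```cpp
#include <cstdio>
#include <fstream>
#include <functional>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical holder for a compiled library kept as raw bytes.
struct StoredLibrary {
    std::vector<unsigned char> bytes;  // contents of the shared library file
    bool loaded = false;               // has it been loaded in this session?
};

// Read a whole file into memory (what one would do right after compiling).
// Returns an empty vector if the file cannot be opened.
std::vector<unsigned char> read_bytes(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::vector<unsigned char>(std::istreambuf_iterator<char>(in),
                                      std::istreambuf_iterator<char>());
}

// Write the stored bytes to `path`; returns true on success.
bool write_bytes(const std::string& path,
                 const std::vector<unsigned char>& bytes) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    out.write(reinterpret_cast<const char*>(bytes.data()),
              static_cast<std::streamsize>(bytes.size()));
    return static_cast<bool>(out);
}

// Load on first use: if the library is not loaded yet, restore the file
// from the stored bytes and hand its path to `loader` (an assumption: the
// caller supplies the actual loading step).
bool ensure_loaded(StoredLibrary& lib, const std::string& tmp_path,
                   const std::function<bool(const std::string&)>& loader) {
    if (lib.loaded) return true;
    if (!write_bytes(tmp_path, lib.bytes)) return false;
    lib.loaded = loader(tmp_path);
    return lib.loaded;
}
```

The point of the loaded flag is that the temporary file is written, and
the loader called, at most once per session; later calls return
immediately.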

--
Jiqiang

On Tue, Jan 29, 2013 at 6:35 PM, Antonio Piccolboni <antonio at piccolboni.info> wrote:

> If (2) is possible, I wonder why not have a default path. It seems to me
> that leaving the R object broken is not the best user experience. It would
> also help me on the RHadoop end, since I could pack the directory and
> broadcast it to the cluster using the distributed cache feature of RHadoop.
> But I may also be imagining a need where there is none: how many R users
> are willing to write ad hoc C functions for one-off use? If one is putting
> in the work to write in C, one may as well be willing to organize that
> work in a package. Thanks
>
>
> Antonio
>
>
> On Tue, Jan 29, 2013 at 1:48 PM, JJ Allaire <jj.allaire at gmail.com> wrote:
>
>> Hi Antonio,
>>
>> You are correct in your analysis: currently C++ functions do not
>> persist across sessions and require recompilation. The answer is of
>> course to create a package, but as you point out for ad-hoc work this
>> might be considered too heavyweight.
>>
>> Two possible longer term solutions exist:
>>
>> (1) Save the contents of the shared library along with the function
>> (this is trickier to do properly than it sounds)
>>
>> (2) More likely, we could allow you to optionally provide an external
>> (non-temporary) directory for build output. Since that directory will
>> survive across sessions, your C++ function would as well.
>>
>> J.J.
>>
>> On Tue, Jan 29, 2013 at 4:20 PM, Antonio Piccolboni
>> <antonio at piccolboni.info> wrote:
>> > Hi,
>> > I defined a little C++ function and sourced it with sourceCpp, tested,
>> > works great. It so happens that I develop a package to interface R and
>> > Hadoop, and I need to use the same function in the map or reduce
>> > parameter of a mapreduce call. This means that the definition of the
>> > function will have to be saved, distributed to a cluster and loaded and
>> > executed in multiple R instances. This works fine with R functions, but
>> > not sourceCpp-ed functions. In fact, the same error happens just by
>> > restarting the interpreter in RStudio, so I am pretty sure there is
>> > nothing specific to my code, but I wanted to give you the context to
>> > explain why this matters to me and the users of my package. This is a
>> > session in RStudio:
>> >
>> >> sourceCpp(file="rmr2/pkg/src/psum.cpp")
>> >> psum(list(1:4, 1:5))
>> > [1] 10 15
>> >
>> > Restarting R session...
>> >
>> >> psum(list(1:4, 1:5))
>> > Error in .External(list(name = "InternalFunction_invoke", address =
>> > <pointer: 0x0>,  :
>> >   NULL value passed as symbol address
>> >
>> > The C++ code, probably irrelevant to this issue:
>> >
>> > #include <vector>
>> > #include <Rcpp.h>
>> >
>> > // [[Rcpp::export]]
>> > std::vector<double> psum(Rcpp::List xx) {
>> >   std::vector<double> results(xx.size());
>> >   for (int i = 0; i < xx.size(); i++) {
>> >     std::vector<double> x = Rcpp::as<std::vector<double> >(xx[i]);
>> >     for (int j = 0; j < x.size(); j++) {
>> >       results[i] += x[j];
>> >     }
>> >   }
>> >   return results;
>> > }
>> >
>> >
>> > I remember something of this sort happening also with cxxfunction and
>> > that someone recommended creating a package for C extensions that
>> > aren't a one-off thing. But I would like to make the case that this is
>> > not a satisfactory solution because it raises the bar for users who can
>> > write a C function but may not be ready to write a complete package.
>> > For instance, they may just want to replace a
>> >
>> > sapply(data, sum)
>> >
>> > with the above function to see what kind of speed boost they can get. If
>> > they are doing this in the context of a mapreduce job, they can't, as
>> > they will get the above error. They need to write a package around that
>> > single function to test their idea. That seems a bit too much. If they
>> > are not using RHadoop but, say, developing in RStudio, every time they
>> > rebuild a package they have to re-source their C code, not a
>> > show-stopper but an inconvenience. Am I missing something? Suggestions?
>> > Thanks
>> >
>> >
>> > Antonio
>> >
>> > _______________________________________________
>> > Rcpp-devel mailing list
>> > Rcpp-devel at lists.r-forge.r-project.org
>> > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>>