[Rcpp-devel] Registering a custom delete_finalizer for my XPtrs

Steve Lianoglou mailinglist.honeypot at gmail.com
Mon Jul 4 16:09:44 CEST 2011


Hi Dirk,

On Mon, Jul 4, 2011 at 4:00 AM, Dirk Eddelbuettel <edd at debian.org> wrote:
>
> Hi Steve,
>
> On 3 July 2011 at 19:02, Steve Lianoglou wrote:
> | Greetings,
> |
> | Preamble: My C++ is quite ... hmmm, does anybody know any good
> | euphemisms for "weak"? So, sorry if this (or questions that will
> | likely follow) is too basic.
> |
> | I've been using Rcpp to help make a more R-like wrapper library to the
> | shogun-toolbox[1]. So far, it's been pretty fun, and (to my surprise
>
> Cool! You mean 'more R-like' in the sense of augmenting / replacing the
> wrapper they have? Which, I guess, uses Swig or something like Swig as Shogun
> wraps to R, Octave, Python, ...

Yeah, I'm actually writing a third/custom wrapper that talks straight
to the "core" shogun c++ library (not a SWIG wrapped something).

shogun already has two ways it can be accessed through R -- using
their r_static, or r_modular libraries you can build from a
shogun-toolbox download, but:

(i) From what I understand, r_static is "the most stable," (in the R
world) but also a bit limited in the functionality it exposes (eg.
types of kernels available to the user), as well as only allowing you
to work with one "machine" at a time. It's a consistent interface in
that it is the same for R, Octave, MATLAB, etc. but feels quite
foreign for someone trying to build predictive models in R, see:

http://www.shogun-toolbox.org/doc/staticinterfaces.html#staticrinterf_sec

Practically -- taking the example here:

http://www.fml.tuebingen.mpg.de/raetsch/suppl/shogun/RExamples

[self advertisement]
One builds an SVM with a gaussian kernel using a `traindat` matrix and
`trainlab` vector with their interface like so:

sg("send_command","loglevel ALL")
sg("set_features", "TRAIN", traindat)
sg("set_labels", "TRAIN", trainlab)
sg("send_command", "set_kernel GAUSSIAN REAL 40 1")
sg("send_command", "init_kernel TRAIN")
sg("send_command", "new_svm LIGHT")
sg("send_command", "c 10.0")
sg("send_command", "svm_train")

With my library, it's more of what an R user would expect, eg:

svm <- SVM(traindat, trainlab, kernel='gaussian', width=40, C=10)
predict(svm, ...)

> I guess you also looked into altering their wrapper?

I did -- I also thought about just wrapping their wrappers, but (i) I
don't think I could make it as flexible as I would like; (ii) there is
actually some problem with swig in R-land that (I think) causes a
slow/long memory leak due to how they (shogun) control the lifecycle
of their objects (they've implemented some type of semi-manual
reference counting GC); (iii) who wants to write a glorified parser?
:-); and (iv) it's a chance for me to get more comfortable with both
C++ and interfacing it with R, which I think is an important skill
that isn't all that sharp in my toolbox ... so, I'm also trying to
sharpen my saw.

Lastly -- the standard shogun interfaces aren't put together in a way
that allows them to be packaged and sent out to CRAN, whereas it
looks like mine is (for the most part -- I still haven't been able to
test an R CMD INSTALL of my package on a Windows box) ... so ... the
"machine-learning for the masses" manifesto, and all that jazz.

[/self advertisement]

> | :-) I've been able to get basic things working, such as building a
> | handful of different types of SVMs. A testament to the quality of both
> | Rcpp and the shogun-toolbox code, since, as I said, my C++ isn't "the
> | best".
> |
> | I know I'm not making the most out of using Rcpp, and wanted to write
> | my code in a more Rcpp-inspired manner before I get too deep. (I
> | actually don't know if Rcpp-modules would work here, but maybe will
> | try that in the future as well).
> |
> | So, in my C++ code, I wire different objects together, and return a
> | pointer to the shogun object that I just made. Currently I've got a
> | simple function that wraps pointers to shogun objects into externalptr
> | SEXP's and registers my custom finalizer, which basically does what
> | you'd expect, eg:
> |
> | SEXP SG2SEXP(shogun::CSGObject *o) {
> |     SEXP xp = R_MakeExternalPtr(o, R_NilValue, R_NilValue);
> |     R_RegisterCFinalizer(xp, _shogun_ref_count_down);
> |     return xp;
> | }
>
> That seems quite right. I would have to check details as I don't use R's
> external pointers (or even our XPtr) all that frequently.
>
> You could use our XPtr class instead too.  I do so in RcppDE to wrap a
> pointer to a user-supplied C function.

I do use it (XPtr mojo) when I'm "unwrapping" a pointer to an object
that was sent back to C++ from R, but I don't know how to use it the
first time -- ie, when I finish building the shogun object for the
first time and package it up to send back to R-land (for later use)
since I need this custom 'on delete' function.

I poked around in the RcppDE source but couldn't find an example of
that particular use case -- perhaps I missed it?

> | What I thought a more Rcpp-inspired thing to do is to instead
> | instantiate pointers to shogun objects using
> | Rcpp::XPtr<SomeShogunObject> and just rely on the auto-wrapping to make
> | things "more clean", maybe like:
> |
> | // ...
> | Rcpp::XPtr<SomeShogunObject> so(new SomeShogunObject(what,ever), true);
> | so->do_something(special);
> | // ...
> | return so;
>
> But where would the logic to deal with 'SomeShogunObject' come from?

I've already written it: the "destruction" of all shogun objects is
the same, and is already handled by my `_shogun_ref_count_down`
function, which I am currently registering like so:

R_RegisterCFinalizer(xp, _shogun_ref_count_down);
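
For context, here is a rough standalone sketch of what a ref-count-down
finalizer of this kind does. It deliberately avoids the R and shogun
headers: `RefCounted`, `ref_count_down`, and `live_objects` are made-up
stand-ins rather than the real `CSGObject` or my actual
`_shogun_ref_count_down`, and I'm assuming the library deletes an object
itself once its count drops to zero.

```cpp
// Hypothetical stand-in for a reference-counted library base class
// (playing the role of shogun::CSGObject); not the real shogun API.
static int live_objects = 0;  // number of objects not yet destroyed

class RefCounted {
public:
    RefCounted() : count_(0) { ++live_objects; }
    virtual ~RefCounted() { --live_objects; }

    // Take a reference; returns the new count.
    int ref() { return ++count_; }

    // Drop a reference; the object deletes itself when none remain.
    int unref() {
        int remaining = --count_;
        if (remaining <= 0) {
            delete this;
        }
        return remaining;  // local copy, safe to return after deletion
    }

private:
    int count_;
};

// Finalizer in the spirit of a "ref count down" function: it gives up
// the reference held on behalf of R instead of calling delete directly.
// Returns the remaining count so the effect can be observed.
int ref_count_down(RefCounted* obj) {
    if (obj == nullptr) {
        return 0;
    }
    return obj->unref();
}
```

The point is that the finalizer only releases R's share of the object,
so anything else in the library still holding a reference keeps it
alive, which a plain `delete` would not respect.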

> | But I don't want Rcpp's "vanilla" delete_finalizer to be invoked on my
> | object's R-side destruction -- I want whatever I've defined in my
> | `_shogun_ref_count_down` function to be used instead.
> |
> | (this is where my n00b-ness becomes self-evident):
>
> No n00bness here. You are trying pretty advanced stuff.
>
> I would always go pedestrian first. Make contact with Shogun (as you did).
> Call functions, return results (as you did).  See that it does sensible stuff
> in sensible time, establish a baseline to compare against.  Only then go
> crazy :)
>
> Because the persistence stuff is harder.  It helps to know the other project's
> internals well enough.  So my advice would be to keep the fancy stuff for
> 'version 2.0'.  Unless you have plenty of time to learn and try fancy things.

Yeah -- it's good advice, actually ...

> | Since XPtr::setDeleteFinalizer doesn't take a parameter for the
> | finalizer function to use, I thought the "expected" way to achieve
> | this is through template specialization of delete_finalizer in my
> | package. Since every object in the shogun-toolbox extends CSGObject, I
> | thought this would be easy enough and maybe do something like:
> |
> | template<>
> | void delete_finalizer<shogun::CSGObject>(SEXP p) { /* something special */ }
> |
> | But that doesn't work/compile -- I guess because this specialization
> | and Rcpp's delete_finalizer definition aren't in the same compilation
> | units, as described here:
> |
> | http://www.parashift.com/c++-faq-lite/templates.html#faq-35.12
> |
> | -- or maybe I have to stick it in the Rcpp namespace somehow, like the
> | custom wrap/as trick?
>
> You're doing good.  There may be a way to use Rcpp along with this -- just
> like the Rcpp-package vignette shows with the 'three ways to do wrap' (and as).
>
> I would have to think through if we really had an example where the entire
> class hierarchy descends from a 'base' object. I don't think this fits the
> Armadillo or Eigen mold.  But it is a good thing to try---I think that one
> day someone will want to do the same for the wonderful Qt library too.
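
Thinking about my specialization attempt a bit more, two plain C++
rules seem relevant regardless of what Rcpp does: (a) an explicit
specialization has to be declared in the namespace of the primary
template, and (b) a specialization written for a base class is not
picked for derived classes, so specializing on the base alone would not
cover the hierarchy. A minimal standalone sketch follows; `mocklib`,
`Base`, `Derived`, `finalize`, and `finalize_any` are all made-up
stand-ins, and none of this is Rcpp or shogun code.

```cpp
#include <string>
#include <type_traits>

namespace mocklib {  // hypothetical stand-in for a library namespace

// Primary template: the "vanilla" behaviour.
template <typename T>
std::string finalize(T* p) {
    delete p;
    return "delete";
}

}  // namespace mocklib

struct Base { virtual ~Base() {} };  // plays the role of a common base class
struct Derived : Base {};

namespace mocklib {

// The specialization is reopened inside the primary template's namespace.
template <>
std::string finalize<Base>(Base* p) {
    delete p;  // a real version would decrement a reference count instead
    return "ref_count_down";
}

}  // namespace mocklib

// Tag dispatch: route every type derived from Base to the Base version.
inline std::string finalize_impl(Base* p, std::true_type) {
    return mocklib::finalize<Base>(p);
}

template <typename T>
std::string finalize_impl(T* p, std::false_type) {
    return mocklib::finalize<T>(p);
}

template <typename T>
std::string finalize_any(T* p) {
    return finalize_impl(p, std::is_base_of<Base, T>());
}
```

Here `mocklib::finalize(new Derived)` deduces `T = Derived` and lands in
the primary template, whereas `finalize_any(new Derived)` reaches the
`Base` specialization through the `std::is_base_of` check.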

Would it be possible to overload the XPtr::setDeleteFinalizer function
so it has a version that takes a function pointer? So I could do
something like:

Rcpp::XPtr<SomeShogunObject> so(new SomeShogunObject(what,ever), false);
so.setDeleteFinalizer(&my_finalizer_function);

(I don't know if that's the correct syntax to pass functions around(?))
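
To make the request concrete, here is a small standalone sketch of what
I have in mind. `Handle`, `Widget`, `release`, and `finalized_count` are
hypothetical names made up for illustration; this is not how
Rcpp::XPtr is implemented, just the shape of a setter that accepts a
function pointer.

```cpp
// Hypothetical handle class; a stand-in for illustration, not Rcpp::XPtr.
template <typename T>
class Handle {
public:
    typedef void (*Finalizer)(T*);

    explicit Handle(T* p) : ptr_(p), finalizer_(nullptr) {}

    // The proposed overload: the caller supplies the finalizer.
    void setDeleteFinalizer(Finalizer f) { finalizer_ = f; }

    // Stands in for the moment the garbage collector reclaims the handle.
    // The finalizer runs at most once.
    void release() {
        if (ptr_ != nullptr && finalizer_ != nullptr) {
            finalizer_(ptr_);
        }
        ptr_ = nullptr;
    }

    // Forwards to the wrapped object, as smart pointers usually do.
    T* operator->() const { return ptr_; }

private:
    T* ptr_;
    Finalizer finalizer_;
};

struct Widget { int value; };

static int finalized_count = 0;  // lets the finalizer call be observed

void my_finalizer_function(Widget* w) {
    ++finalized_count;
    delete w;
}
```

With this, `h.setDeleteFinalizer(&my_finalizer_function)` and
`h.setDeleteFinalizer(my_finalizer_function)` do the same thing, since a
function name converts to a function pointer. The setter is called with
`.` on the handle itself, because `->` forwards to the wrapped object.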

> | So -- long story short ... is there a more Rcpp/spiffy way for me to
> | register my own delete_finalizer function, or should I just keep going
> | with my SG2SEXP function for now, and save the Rcpp-swagger for when I
> | see if Rcpp-modules is a good fit (and just use its *.finalizer()
> | business)?
>
> In a way Rcpp modules is orthogonal, and also more restricted (but easier for
> simple connections to libraries).

Indeed -- as I was reading over it more, I don't think RcppModules
would fit here since (i) there are lots of overloaded methods for
shogun classes I'd need to "wire"; and (ii) shogun has pretty deep
class hierarchies (and I think "inheritance" is part of the "Future
extensions" of the RcppModules, anyway).
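
On point (i), the friction is a general C++ one as far as I understand
it: when a binding layer deduces the method type through a template,
`&Machine::train` on its own is ambiguous if `train` is overloaded, so
each overload needs its type spelled out with a cast. A standalone
sketch with made-up names (`Machine`, `invoke`, `wire_train`); this is
not the Rcpp modules API.

```cpp
#include <string>

// Hypothetical class with an overloaded method; not a shogun class.
struct Machine {
    std::string train() { return "train()"; }
    std::string train(int /*iterations*/) { return "train(int)"; }
};

// One pointer-to-member type per overload.
typedef std::string (Machine::*TrainNoArgs)();
typedef std::string (Machine::*TrainWithInt)(int);

// Generic "binding" helpers that deduce the method type.
template <typename Method>
std::string invoke(Machine& m, Method method) {
    return (m.*method)();
}

template <typename Method>
std::string invoke(Machine& m, Method method, int arg) {
    return (m.*method)(arg);
}

// invoke(m, &Machine::train) would not compile: Method cannot be deduced
// from an overloaded name. The cast selects one overload explicitly.
std::string wire_train(Machine& m) {
    return invoke(m, static_cast<TrainNoArgs>(&Machine::train));
}

std::string wire_train(Machine& m, int iterations) {
    return invoke(m, static_cast<TrainWithInt>(&Machine::train), iterations);
}
```

Multiply that by the number of overloaded methods across the shogun
classes and the amount of hand "wiring" adds up quickly.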

> Hope this helps,  Dirk

Yup -- thanks for taking the time to chew on the email.

Thanks,
-steve



-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

