[Rcpp-devel] What is the best practice to expose a C st ructure from 3rd party library into R?
Dirk Eddelbuettel
edd at debian.org
Tue Jul 23 05:22:38 CEST 2013
Hi Wush,
Nice work on Rhiredis. I had some time to look more closely and have some
comments below. In sum, you may have made things too complicated. But hey,
you have a full working example which isn't too bad! Well done.
On 22 July 2013 at 06:57, Dirk Eddelbuettel wrote:
| On 22 July 2013 at 13:19, romain at r-enthusiasts.com wrote:
| | Le 2013-07-22 10:12, Wush Wu a écrit :
| | > I wrote a wrapper of hiredis, which is a minimalistic C client for
| | > the Redis database. Its name is `Rhiredis` and is much faster than
| | > rredis, an existed redis client of R. Please
| | > see http://rpubs.com/wush978/rhiredis [1] for details.
[ Actually, part the speed difference comes from you using base64enc which is
C-based and faster than what rredis has. If you first create a base64 object,
and then send that object directly with Rhiredis, the difference is smaller
but of course still significant. ]
| | > However, I still want to know if there is a better approach. I tried
| | > three approaches. Here is my opinions.
| | >
| | > # Preliminary
| | >
| | > Given a 3rd party C library, it usually provides a C structure and
| | > functions of memory management. For example:
| | >
| | > ```c
| | >
| | > struct A {
| | > int flag;
| | > // ...
| | > };
| | >
| | > A* initA();
| | > void freeA(A* a);
| | > ```
| | >
| | > The structure `A` need to be stored into a R object and pass to
| | > another function.
| | > Also, the user might need to access `A.flag`. For simplicity, there
| | > is
| | > only one field in this example. However, there might be a number of
| | > fields in practice.
I think this is too complicated.
All you need is 'static pointer' to the redic context object.
Then create one function to instantiate the object, and reuse it.
| | > # XPtr
| | >
| | > Thanks for the user guide of Rcpp modules , this is the first
| | > approach I tried.
| | >
| | > We could expose `A` into R as follow:
| | >
| | > ```cpp
| | > Rcpp::XPtr<A, freeA> a(initA());
| | > ```
| | >
| | > However, we should write helper function to accessing `A.flag` for
| | > every fields.
Same.
You don't need to bring the redis context back to R. So no need for an
XPtr object.
| | > # RCPP_MODULE
| | >
| | > The guids of Rcpp modules also provides another approach:
| | >
| | > ```cpp
| | >
| | > RCPP_EXPOSED_CLASS(A)
| | >
| | > RCPP_MODULE(A) {
| | > class_<A>("A")
| | > .field("flag", &A::flag)
| | > ;
| | > }
| | >
| | > //@export
| | > //[[Rcpp::export]]
| | > SEXP init() {
| | > BEGIN_RCPP
| | > return wrap(*initA());
| | > END_RCPP
| | > }
| | > ```
| | >
| | > This will produce a S4 class named `A` in R which stores C structure
| | > `A`.
| | >
| | > However, it also produces memory leak because no `freeA` is called.
| | >
| | > Adding `.finalizer(freeA)` in `RCPP_MODULE` will cause an error of
| | > freeing memory twice.
| |
| | Gotta look into why this happens.
I suspect this may be due to your use of Boost smart_ptr so you may be
freeing something that might already be free'ed.
| | > # Embed `A` into C++ class and expose the class with RCPP_MODULE
| | >
| | > This approach is implemented in `Rhiredis`.
| | >
| | > Finally, I still need to write helper function to expose the field of
| | > `A`. But the user could access these flag in R with operator `$`.
| | >
| | > Note that I still need a function to extract the pointer of `A` from
| | > exposed S4 object:
| | >
| | > ```cpp
| | >
| | > template<class T>
| | > T* extract_ptr(SEXP s) {
| | > Rcpp::S4 s4(s);
| | > Rcpp::Environment env(s4);
| | > Rcpp::XPtr<T> xptr(env.get(".pointer"));
| | > return static_cast<T*>(R_ExternalPtrAddr(xptr));
| | > }
| | > ```
I do not understand why you would need all that.
| | > Please give me some suggestion if you know a better or a different
| | > approach.
| |
| | I would essentially do what you do: use RCPP_MODULE to expose a C++
| | class so that the C++ class manages scoping : constructor, destructor,
| | etc ...
| |
| | class A_cpp {
| | public:
| | A_cpp( ) : obj( initA() ){}
| | ~A_cpp(){ freeA(obj); obj = NULL ; }
| |
| | int get_flag(){ return obj->flag ; }
| | void set_flag( int x ){ obj->flag = x ; }
| |
| | private:
| | A* obj ;
| | } ;
I may do something like that tomorrow evening.
For now, I did something _much_ simpler still. We need single C++ source
file. Some key features:
-- no more Boost shared_ptr
-- no more Rcpp::XPtr
-- "empty" constructor and destructor, nothing to be done
-- no member variables
-- one worker function which takes string (with a redis command), runs it
and returns the result string
-- the worker function checks if the redis context pointer is non-NULL; in
case it is NULL (eg first use) it creates a context
-----------------------------------------------------------------------------
#include <Rcpp.h>
#include <hiredis/hiredis.h> // on Ubuntu file /usr/include/hiredis/hiredis.h
// We use a static pointer which makes the variable persistent across calls
// A more formal C++ idiom would be a singleton class, but this is simpler
static redisContext *prc = NULL;
// A simple and lightweight class -- without member variables
// We could add some member variables to cache the last call, status, ...
//
class Redis {
private:
void checkAndInit() {
if (prc == NULL) {
prc = redisConnect("127.0.0.1", 6379);
// should test for error here, could even throw() under Rcpp, ...
Rcpp::Rcout << "Context created\n";
} else {
Rcpp::Rcout << "Reusing context\n";
}
}
public:
Redis() { /* Rcpp::Rcout << "In ctor\n"; */ }
~Redis() { /* Rcpp::Rcout << "In dtor\n"; */ }
std::string execCommand(std::string cmd) {
checkAndInit(); // important: ensure prc is set up, re-use if so
redisReply *reply = static_cast<redisReply*>(redisCommand(prc, cmd.c_str()));
std::string res(reply->str);
freeReplyObject(reply);
return(res);
}
void disconnect() {
redisFree(prc);
prc = NULL; // just to be on the safe side
}
// could create new functions to (re-)connect with given host and port etc pp
};
-----------------------------------------------------------------------------
Couple this with a simple function (which is so simple that I wrote it
"straight" without even Rcpp attributes. One string in, one string out:
-----------------------------------------------------------------------------
// for now, single worker function
extern "C" SEXP execRedisCommand(SEXP p) {
Redis redis;
std::string txt = Rcpp::as<std::string>(p);
std::string res = redis.execCommand(txt);
return Rcpp::wrap(res);
}
-----------------------------------------------------------------------------
(We could create a second function to close redis and call disconnect();
and we could 'getter' function for the state variables inside the context.)
This function is then used by a single R function:
-----------------------------------------------------------------------------
## minimal R interface to redis using the hiredis library
redis <- function(cmd) {
## some sanity checking on cmd would be nice too...
.Call("execRedisCommand", cmd, package="Rhiredis")
}
-----------------------------------------------------------------------------
And that is all. I added a simple tester:
edd at don:~/git/Rhiredis$ Rscript tests/simpleRedisClient.R
Context created
[1] "PONG"
Reusing context
[1] "PONG"
Reusing context
[1] "PONG"
Reusing context
[1] "OK"
Reusing context
[1] "42"
edd at don:~/git/Rhiredis$
As you can see from the verbose text from checkAndInit(), we only need to
create the redis instance once.
I may still turn this into something based on Rcpp modules. Maybe tomorrow.
The code is for now in a quick fork of your Rhiredis in my github account.
Hope this helps, Dirk
--
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
More information about the Rcpp-devel
mailing list