[Rcpp-devel] What is the best practice to expose a C st ructure from 3rd party library into R?

Wush Wu wush978 at gmail.com
Tue Jul 23 17:52:54 CEST 2013


Hi all,

Thanks for your reply.

I'm very happy to contribute to Rcpp Gallery.

I wrote some wrappers of C structure from 3rd party library with Rcpp, and
these issues kept bothering me. A good example will make the life easier.

2013/7/23 Dirk Eddelbuettel <edd at debian.org>

>
> Hi Wush,
>
> Nice work on Rhiredis.  I had some time to look more closely and have some
> comments below.  In sum, you may have made things too complicated.  But
> hey,
> you have a full working example which isn't too bad!  Well done.
>
> On 22 July 2013 at 06:57, Dirk Eddelbuettel wrote:
> | On 22 July 2013 at 13:19, romain at r-enthusiasts.com wrote:
> | | Le 2013-07-22 10:12, Wush Wu a écrit :
> | | > I wrote a wrapper of hiredis, which is a minimalistic C client for
> | | > the Redis database. Its name is `Rhiredis` and is much faster than
> | | > rredis, an existed redis client of R. Please
> | | > see http://rpubs.com/wush978/rhiredis [1] for details.
>
> [ Actually, part the speed difference comes from you using base64enc which
> is
> C-based and faster than what rredis has. If you first create a base64
> object,
> and then send that object directly with Rhiredis, the difference is smaller
> but of course still significant. ]
>

Well, I didn't fully understand why rredis is so slow. As shown in my
benchmark, the serialization of R object spends a little time(about 0.01
second), so I guess the main difference is the efficiency of transferring
data between R and redis.

By the way, I talked to another redis user at Taiwan R User Group. He
suggested me to make the API compatible with rredis so that Rhiredis could
replace rredis in doRedis, which is one of parallel backend of `foreach`.


>
> | | > However, I still want to know if there is a better approach. I tried
> | | > three approaches. Here is my opinions.
> | | >
> | | > # Preliminary
> | | >
> | | > Given a 3rd party C library, it usually provides a C structure and
> | | > functions of memory management. For example:
> | | >
> | | > ```c
> | | >
> | | > struct A {
> | | >     int flag;
> | | >     // ...
> | | > };
> | | >
> | | > A* initA();
> | | > void freeA(A* a);
> | | > ```
> | | >
> | | > The structure `A` need to be stored into a R object and pass to
> | | > another function.
> | | > Also, the user might need to access `A.flag`. For simplicity, there
> | | > is
> | | > only one field in this example. However, there might be a number of
> | | > fields in practice.
>
> I think this is too complicated.
>
> All you need is 'static pointer' to the redic context object.
>
> Then create one function to instantiate the object, and reuse it.
>
>
Sometimes I need to connect multiple redis server simultaneously, so I need
to expose the connection object to make me select the server in R.

It is convenient to use a global object if the user only needs one
connection. Therefore, I suggest to expose connection object to R, and let
R put the last constructed object into `options` and pass it to the API as
the default argument. This is how I implement Rhiredis.

However, thanks for the idea of singleton which I have never thought
before. Indeed, it is simpler.

| | > However, it also produces memory leak because no `freeA` is called.
> | | >
> | | > Adding `.finalizer(freeA)` in `RCPP_MODULE` will cause an error of
> | | > freeing memory twice.
> | |
> | | Gotta look into why this happens.
>
>
I might misuse the `finalizer`. The `freeA` free the memory which will be
freed by R later. However, I cannot find another place to call the `freeA`
if I directly expose `A` with `RCPP_MODULE`.


> I suspect this may be due to your use of Boost smart_ptr so you may be
> freeing something that might already be free'ed.
>
>
The code in the repository is the third approach. Sorry that I didn't
explain it clearly in the first mail.

The reason of using smart pointer is subtle.

At first, I embeded the connection structure into a C++ class like what
Dirk showed below. I'll explain the problem there.

| | > # Embed `A` into C++ class and expose the class with RCPP_MODULE
> | | >
> | | > This approach is implemented in `Rhiredis`.
> | | >
> | | > Finally, I still need to write helper function to expose the field of
> | | > `A`. But the user could access these flag in R with operator `$`.
> | | >
> | | > Note that I still need a function to extract the pointer of `A` from
> | | > exposed S4 object:
> | | >
> | | > ```cpp
> | | >
> | | > template<class T>
> | | > T* extract_ptr(SEXP s) {
> | | >  Rcpp::S4 s4(s);
> | | >  Rcpp::Environment env(s4);
> | | >  Rcpp::XPtr<T> xptr(env.get(".pointer"));
> | | >  return static_cast<T*>(R_ExternalPtrAddr(xptr));
> | | > }
> | | > ```
>
> I do not understand why you would need all that.
>
>
Take `redisCommand` for example. Since I didn't make the connection object
to be a singleton, R need to pass a connection object to `redisCommand`.
The object is exposed by `RCPP_MODULE`, so I need to extract it from
`RCPP_MODULE`.


| | > Please give me some suggestion if you know a better or a different
> | | > approach.
> | |
> | | I would essentially do what you do: use RCPP_MODULE to expose a C++
> | | class so that the C++ class manages scoping : constructor, destructor,
> | | etc ...
> | |
> | | class A_cpp {
> | | public:
> | |      A_cpp( ) : obj( initA() ){}
> | |      ~A_cpp(){ freeA(obj); obj = NULL ; }
> | |
> | |      int get_flag(){ return obj->flag ; }
> | |      void set_flag( int x ){ obj->flag = x ; }
> | |
> | | private:
> | |      A* obj ;
> | | } ;
>
>
In my case, a double free runtime error occurred. The reason is due to the
copy constructor of class A_cpp, i.e. `A_cpp(const A_cpp& src);`.

This implementation will corrupt memory if:

```cpp
  A_cpp a1;
  A_cpp a2(a1);
```

The compiler will free the space of a1::obj twice. The first time is at
`a1::~A_cpp` and the second time is at `a2::~A_cpp`.

That's why I use a smart pointer to make class A_cpp copyable.

The implementation of RCPP_MODULE contains a copy construction. A
compile-time error occurred if I close the copy constructor manually:

```cpp
class A_cpp {
  //...
private:
  A_cpp(const A_cpp&);
  void operator=(const A_cpp&);
}
```

The code is for now in a quick fork of your Rhiredis in my github account.
>
> Hope this helps,  Dirk
>
>
 --
> Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
>

Thank you, Dirk.

A singleton does make things easier. However, I still want to know how to
expose C structure with provided `free`-like function because sometimes we
cannot use singleton.

Wush
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/rcpp-devel/attachments/20130723/4ad9d976/attachment.html>


More information about the Rcpp-devel mailing list