[Rprotobuf-yada] google protocol buffers

Romain François francoisromain at free.fr
Tue Oct 27 17:55:30 CET 2009


[repost on the mailing list]

Hi,

I have been travelling, so I'm discovering the thread right now 
(although I somehow started it ...).

I think the use I want to make of protocol buffers is much closer to 
Dirk's proof of concept than to Saptarshi's method. Basically, I'd want to 
be able to read/write any protocol message (given the proto file) from/to R.

For example, with this proto file from Google:

message Person {
   required int32 id = 1;
   required string name = 2;
   optional string email = 3;
}


I'd want to do this in R:

p <- new( P("Person.proto") )
p$id <- 3
p$name <- "Romain"
p$email <- "francoisromain at free.fr"

p <- readProto( P("Person.proto"), con ) # con is some connection
writeProto( p, con ) # con is some output connection

p <- within( new( P("Person.proto") ), {
	id <- 4
	name <- "Dirk"
	email <- "edd at debian.org"
} )

# ... and so on.

For the record, "P" here is the protocol buffer analogue of the new "J" 
function in rJava:
d <- new( J("java.lang.Double"), 10 )
We even made "new" generic in rJava to support things like this.
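
To make the idea concrete, here is a minimal sketch of how the p$name 
syntax could be supported with S4 methods. The class name "ProtoMessage" 
and its slots are assumptions made up for illustration, not an existing 
API, and the fields live in a plain R list rather than in a real protocol 
buffer message.

```r
library(methods)

# Hypothetical class: 'type' names the message type, 'fields' holds values.
setClass("ProtoMessage",
         representation(type = "character", fields = "list"))

# p$name reads a field from the underlying list.
setMethod("$", "ProtoMessage", function(x, name) x@fields[[name]])

# p$name <- value stores a field and returns the modified object.
setReplaceMethod("$", "ProtoMessage", function(x, name, value) {
  x@fields[[name]] <- value
  x
})

p <- new("ProtoMessage", type = "Person")
p$id <- 3L
p$name <- "Romain"
```

This version has the usual R copy semantics. An object backed by an 
external pointer would presumably behave more like a reference.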


I have a lot of catching up to do before we can be anywhere close to this. 
The way things are done in C++/Java (not sure about Python) is to 
generate code that then gets compiled to support the sort of 
reflection shown above. There should also be some way to parse the proto 
message format at runtime, but again I need to do some serious reading 
first, both on protocol buffers and on R (I guess we need things like 
external pointers, ... )
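
As a rough sketch of the runtime idea only, the field declarations of a 
flat message such as Person can be pulled out of the proto text with a 
regular expression in base R. The function name "parseProtoFields" is 
hypothetical, and this is not a real parser for the .proto grammar: it 
ignores nested messages, enums, options, comments and so on.

```r
# Hypothetical helper: extract field descriptors from the lines of a flat
# proto message definition. Lines that are not field declarations are skipped.
parseProtoFields <- function(lines) {
  # Matches declarations such as "required int32 id = 1;"
  pattern <- paste0(
    "^[[:space:]]*(required|optional|repeated)[[:space:]]+",
    "([[:alnum:]_.]+)[[:space:]]+([[:alnum:]_]+)",
    "[[:space:]]*=[[:space:]]*([0-9]+)[[:space:]]*;"
  )
  hits <- regmatches(lines, regexec(pattern, lines))
  # A match yields the full text plus four captured groups.
  hits <- Filter(function(m) length(m) == 5L, hits)
  data.frame(
    label = vapply(hits, `[`, character(1), 2L),
    type  = vapply(hits, `[`, character(1), 3L),
    name  = vapply(hits, `[`, character(1), 4L),
    tag   = as.integer(vapply(hits, `[`, character(1), 5L)),
    stringsAsFactors = FALSE
  )
}

proto <- c(
  "message Person {",
  "   required int32 id = 1;",
  "   required string name = 2;",
  "   optional string email = 3;",
  "}"
)
fields <- parseProtoFields(proto)
```

A table like this would be enough to check field names and types when 
assigning to a message from R, although a real implementation would more 
likely rely on the descriptors provided by the protocol buffer library 
itself.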


I think protocol buffers are cool, and my first reaction when I saw 
Saptarshi's first email on the mailing list was that I could very well 
make a new hybrid OO system for R.


Another plus is that it could be another push for applying Jeff's 
connection patch to R, so that packages can manipulate connections on the 
C side.

Romain

On 10/26/2009 04:15 PM, Saptarshi Guha wrote:
>>
>> | I had the same problem, I think protobuf installed its .pc (I did use
>> | pb 2.2) file somewhere in a *64 directory and pkg_config
>> | did not pick it up, I had to then modify PKG_CONFIG_PATH.
>> | My experience, in cross platform installations(I haven't had a
>> | memorable experience using autoconf)
>> | files is *limited* at best and I usually perform hacks to make
>> | things work.
>> | If you have recommendations on improving the configure.in file, we
>> | could certainly incorporate them.
>>
>> I can take care of configure.in etc. I have a few packages on r-forge and
>> other places that use it for library search and config just like we
>> need it.
>> I have found it to be quite reliable for that.
>>
>> Our main problem will be that pkg-config is 'too new' so we still need the
>> manual search for headers and libs. But I can add that rather easily based
>> on how some of the other packages interfacing C code do it.
>>
>
> Great. I'd like to see how it works.
>
>> What about Java though? Is a Depends on rJava plus an assumption of R being
>> Java-conf'ed good enough so that we can rely on javac etc ?

To answer the question: rJava only needs a JRE to run (so only java, not 
javac) because all of its Java code is precompiled, so a Depends on rJava 
does not guarantee that javac is available.

What do we need Java for? Don't we only need to read/write protocol 
buffers to/from R? If someone needs to read them in Java then I suppose 
they would use the standard Java API.

If we want to have some demo code that shows the full loop, for example 
writing a message from R and reading it into Java, then I would suggest 
doing it as a separate R package.


> Is there a Depends on rJava? I don't think we need the javac compiler at all.
> I am nearly done writing a Java wrapper around the proto functions and
> can currently exchange vectors (scalar and lists) along with their
> attributes (thus including factors, matrices and, I hope, data.frames)
> between R and Java. However, the R package can simply include the jar file
> (thus no compiling). Not sure what the best approach is.
>
> However, I suppose if R is Java-conf'ed, that should be good enough for a
> javac being present.
> I'll have to remove the ant-based build scripts.

There is an "ant" R package on CRAN that contains ant as well as a few R 
extensions, so that you can call R from within the ant script.

http://romainfrancois.blog.free.fr/index.php?post/2009/09/08/new-R-package-%3A-ant

>> That looks very promising indeed, even to the python newb that I am. What
>> about nested data structures? Can you loop over those in Python in one
>> command too?
>>
>
> I *guess so* but haven't ventured very far down this line. Over the last few
> months I've been juggling R, C++ and Java and my (average) Python skills have
> slipped.
> The Java wrapper can loop over nested structures.
>
>
>> In case you don't have an account on r-forge.r-project.org yet, please
>> consider registering one. If you fill out the form today, chances are the
>> good guys in Vienna will have added you by tomorrow.
>>
>
> Thanks for the suggestions, done.
>
> Regards
> Saptarshi


-- 
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://tr.im/BcPw : celebrating R commit #50000
|- http://tr.im/ztCu : RGG #158:161: examples of package IDPmisc
`- http://tr.im/yw8E : New R package : sos


