[Rprotobuf-commits] r913 - in pkg: . inst inst/proto inst/unitTests

noreply at r-forge.r-project.org
Wed Nov 26 22:13:54 CET 2014


Author: murray
Date: 2014-11-26 22:13:54 +0100 (Wed, 26 Nov 2014)
New Revision: 913

Modified:
   pkg/ChangeLog
   pkg/inst/NEWS.Rd
   pkg/inst/proto/rexp.proto
   pkg/inst/unitTests/runit.serialize_pb.R
Log:
Address referee feedback by adding support for serializing function,
language, and environment objects with serialize_pb.  This is of
limited practical use, since these objects are R language constructs
that only make sense within R, but I agree it will make our
exposition in section 6 clearer and the functionality more complete.

Add a unit test verifying that all 106 built-in datasets in R can be
round-trip serialized/unserialized into protocol buffers without
error.
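
The fallback described above can be illustrated without RProtoBuf at all. The following is a minimal sketch in base R, not the package's implementation: the helper names to_raw_bytes and from_raw_bytes are hypothetical, and the sketch only shows that R's native serialize() turns a function, a language object, or an environment into a raw vector, which is the kind of value a protocol buffer bytes field can hold.

```r
# Hypothetical helpers (not part of RProtoBuf): wrap base R's native
# serializer so an arbitrary R object becomes a raw vector and back.
to_raw_bytes <- function(obj) serialize(obj, connection = NULL)
from_raw_bytes <- function(bytes) unserialize(bytes)

# One object of each of the three types named in the log message.
square <- function(x) x^2            # function
expr <- quote(a + b * 2)             # language object
env <- new.env()                     # environment
assign("answer", 42L, envir = env)

# Round-trip each object through its raw-byte representation.
square_copy <- from_raw_bytes(to_raw_bytes(square))
expr_copy <- from_raw_bytes(to_raw_bytes(expr))
env_copy <- from_raw_bytes(to_raw_bytes(env))
```

Note that an unserialized environment is a new environment with the same contents, so comparing it to the original by reference is not meaningful; comparing the values it holds is.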



Modified: pkg/ChangeLog
===================================================================
--- pkg/ChangeLog	2014-11-26 03:09:13 UTC (rev 912)
+++ pkg/ChangeLog	2014-11-26 21:13:54 UTC (rev 913)
@@ -1,3 +1,21 @@
+2014-11-26  Murray Stokely  <mstokely at google.com>
+
+	Address feedback from anonymous reviewer for JSS to make this
+	package more complete:
+
+	* inst/unitTests/runit.serialize_pb.R: Add a test to verify that
+	we can serialize all 100+ built-in datasets with R and get an
+	identical object to the original once unserialized.
+
+	* R/rexp_obj.R: Serialize function, language, and environment
+	objects by just falling back to R's native serialization and using
+	raw bytes to store them.  This at least lets us round-trip encode
+	all native R types, even though these three only make sense in the
+	context of R.  Greatly simplify the can_serialize_pb function.
+
+	* inst/proto/rexp.proto: Add support for function, language, and
+	  environment objects.
+
 2014-11-25  Dirk Eddelbuettel  <edd at debian.org>
 
 	* vignettes/RProtoBuf-intro.Rnw: Applied a few corrections spotted by

Modified: pkg/inst/NEWS.Rd
===================================================================
--- pkg/inst/NEWS.Rd	2014-11-26 03:09:13 UTC (rev 912)
+++ pkg/inst/NEWS.Rd	2014-11-26 21:13:54 UTC (rev 913)
@@ -18,7 +18,12 @@
     \item Update the default print methods to use
     \code{cat()} with \code{fill=TRUE} instead of \code{show()} to eliminate the confusing
     \code{[1]} since the classes in \cpkg{RProtoBuf} are not vectorized.
-    \item Add unit tests.
+    \item Add support for serializing function, language, and
+      environment objects with \code{serialize_pb} and
+      \code{unserialize_pb} by falling back to R's native
+      serialization, so that all 100+ datasets built into R can be
+      serialized into a protocol buffer.
+    \item Add unit tests for all of the above.
 }
 
 \section{Changes in RProtoBuf version 0.4.1 (2014-03-25)}{

Modified: pkg/inst/proto/rexp.proto
===================================================================
--- pkg/inst/proto/rexp.proto	2014-11-26 03:09:13 UTC (rev 912)
+++ pkg/inst/proto/rexp.proto	2014-11-26 21:13:54 UTC (rev 913)
@@ -1,8 +1,15 @@
 // Originally written by Saptarshi Guha for RHIPE (http://www.rhipe.org)
-// Released under Apache License 2.0, and reused with permission here  
+// Released under Apache License 2.0, and reused with permission here
+// Extended in November 2014 with new types to support encoding
+// language, environment, and function types from R.
 
 package rexp;
 
+option java_package = "org.godhuli.rhipe";
+option java_outer_classname = "REXPProtos";
+
+// TODO(mstokely): Refine this using the new protobuf 2.6 oneof field
+// for unions.
 message REXP {
   enum RClass {
     STRING = 0;
@@ -13,6 +20,9 @@
     LIST = 5;
     LOGICAL = 6;
     NULLTYPE = 7;
+    LANGUAGE = 8;
+    ENVIRONMENT = 9;
+    FUNCTION = 10;
   }
   enum RBOOLEAN {
     F=0;
@@ -20,7 +30,7 @@
     NA=2;
   }
 
-  required RClass rclass = 1 ; 
+  required RClass rclass = 1;
   repeated double realValue = 2 [packed=true];
   repeated sint32 intValue = 3 [packed=true];
   repeated RBOOLEAN booleanValue = 4;
@@ -32,6 +42,9 @@
 
   repeated string attrName = 11;
   repeated REXP attrValue = 12;
+  optional bytes languageValue = 13;
+  optional bytes environmentValue = 14;
+  optional bytes functionValue = 15;
 }
 message STRING {
   optional string strval = 1;
@@ -41,4 +54,3 @@
   optional double real = 1 [default=0];
   required double imag = 2;
 }
-

Modified: pkg/inst/unitTests/runit.serialize_pb.R
===================================================================
--- pkg/inst/unitTests/runit.serialize_pb.R	2014-11-26 03:09:13 UTC (rev 912)
+++ pkg/inst/unitTests/runit.serialize_pb.R	2014-11-26 21:13:54 UTC (rev 913)
@@ -3,11 +3,11 @@
 test.serialize_pb <- function() {
   #verify that rexp.proto is loaded
   RProtoBuf:::pb(rexp.REXP)
-  
+
   #serialize a nested list
   x <- list(foo=cars, bar=Titanic)
   checkEquals(unserialize_pb(serialize_pb(x, NULL)), x)
-  
+
   #a bit of everything, copied from jsonlite package
   set.seed('123')
   myobject <- list(
@@ -22,6 +22,20 @@
     somemissings = c(1,2,NA,NaN,5, Inf, 7 -Inf, 9, NA),
     myrawvec = charToRaw('This is a test')
   );
-  
+
   checkEquals(unserialize_pb(serialize_pb(myobject, NULL)), myobject)
 }
+
+test.serialize_pb.alldatasets <- function() {
+  datasets <- as.data.frame(data(package="datasets")$results)
+  datasets$name <- sub("\\s+.*$", "", datasets$Item)
+
+  encoded.datasets <- sapply(datasets$name,
+      function(x) serialize_pb(get(x), NULL))
+
+  unserialized.datasets <- sapply(encoded.datasets, unserialize_pb)
+
+  checkTrue(all(sapply(names(unserialized.datasets),
+                       function(name) identical(get(name),
+                                                unserialized.datasets[[name]]))))
+}
\ No newline at end of file
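
The dataset enumeration used by the new unit test can be tried on its own. The sketch below is an assumption-laden stand-in rather than the test itself: it substitutes base R's serialize() and unserialize() for serialize_pb() and unserialize_pb() so that it runs without RProtoBuf, and the variable names (dataset_names, round_trips, failed) are illustrative only. The number of datasets found depends on the installed R version.

```r
library(datasets)

# data() returns a character matrix; the "Item" column holds entries
# such as "cars" or "BJsales.lead (BJsales)".
items <- data(package = "datasets")$results[, "Item"]

# Keep only the object name, dropping any trailing " (file)" part.
dataset_names <- unique(sub("\\s+.*$", "", as.character(items)))

# Round-trip every dataset and record whether the copy is identical.
# Base serialize()/unserialize() stand in for the protobuf functions.
round_trips <- vapply(dataset_names, function(name) {
  original <- get(name, envir = as.environment("package:datasets"))
  identical(unserialize(serialize(original, NULL)), original)
}, logical(1))

# Names of any datasets that did not survive the round trip.
failed <- names(round_trips)[!round_trips]
```

Using vapply() with a declared logical(1) result avoids the type-dependent simplification that sapply() performs, which makes the result easier to check in a test.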


