[Rprotobuf-commits] r913 - in pkg: . inst inst/proto inst/unitTests
noreply at r-forge.r-project.org
Wed Nov 26 22:13:54 CET 2014
Author: murray
Date: 2014-11-26 22:13:54 +0100 (Wed, 26 Nov 2014)
New Revision: 913
Modified:
pkg/ChangeLog
pkg/inst/NEWS.Rd
pkg/inst/proto/rexp.proto
pkg/inst/unitTests/runit.serialize_pb.R
Log:
Address referee feedback by adding support for serializing function,
language, and environment objects with serialize_pb. This is not
particularly useful outside R, since these are mostly R language
constructs, but I agree it will make our exposition clearer in
section 6 and make this functionality feel more complete.
Add a unit test verifying that all 100+ built-in datasets in R can be
round-trip serialized/unserialized into protocol buffers without
error.
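
For readers skimming this log, the fallback described above can be sketched in base R alone. This is only an illustration under stated assumptions, not the code in R/rexp_obj.R: the helper names to_bytes and from_bytes are hypothetical, and the raw vector stands in for what a protocol buffer "bytes" field would carry.

```r
# Hypothetical helpers (not part of RProtoBuf): fall back to R's native
# serialization for objects that have no natural protocol buffer form.
to_bytes <- function(obj) serialize(obj, connection = NULL)  # returns a raw vector
from_bytes <- function(bytes) unserialize(bytes)

# One example of each of the three object kinds named in the log.
square <- function(x) x^2          # function
expr <- quote(a + b)               # language object
env <- new.env()                   # environment
assign("answer", 42, envir = env)

# Round-trip each object through raw bytes.
square2 <- from_bytes(to_bytes(square))
expr2 <- from_bytes(to_bytes(expr))
env2 <- from_bytes(to_bytes(env))
```

Note that an unserialized environment is a new environment holding copies of the original bindings, so it is compared by contents rather than by reference.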
Modified: pkg/ChangeLog
===================================================================
--- pkg/ChangeLog 2014-11-26 03:09:13 UTC (rev 912)
+++ pkg/ChangeLog 2014-11-26 21:13:54 UTC (rev 913)
@@ -1,3 +1,21 @@
+2014-11-26 Murray Stokely <mstokely at google.com>
+
+ Address feedback from anonymous reviewer for JSS to make this
+ package more complete:
+
+ * inst/unitTests/runit.serialize_pb.R: Add a test to verify that
+ we can serialize all 100+ datasets built into R and get an
+ identical object to the original once unserialized.
+
+ * R/rexp_obj.R: Serialize function, language, and environment
+ objects by just falling back to R's native serialization and using
+ raw bytes to store them. This at least lets us round-trip encode
+ all native R types, even though these three only make sense in the
+ context of R. Greatly simplify the can_serialize_pb function.
+
+ * inst/proto/rexp.proto: Add support for function, language, and
+ environment objects.
+
2014-11-25 Dirk Eddelbuettel <edd at debian.org>
* vignettes/RProtoBuf-intro.Rnw: Applied a few corrections spotted by
Modified: pkg/inst/NEWS.Rd
===================================================================
--- pkg/inst/NEWS.Rd 2014-11-26 03:09:13 UTC (rev 912)
+++ pkg/inst/NEWS.Rd 2014-11-26 21:13:54 UTC (rev 913)
@@ -18,7 +18,12 @@
\item Update the default print methods to use
\code{cat()} with \code{fill=TRUE} instead of \code{show()} to eliminate the confusing
\code{[1]} since the classes in \cpkg{RProtoBuf} are not vectorized.
- \item Add unit tests.
+ \item Add support for serializing function, language, and
+ environment objects with \code{serialize_pb} and
+ \code{unserialize_pb} by falling back to R's native
+ serialization, making it easy to serialize all 100+ of the
+ datasets built into R into a protocol buffer.
+ \item Add unit tests for all of the above.
}
\section{Changes in RProtoBuf version 0.4.1 (2014-03-25)}{
Modified: pkg/inst/proto/rexp.proto
===================================================================
--- pkg/inst/proto/rexp.proto 2014-11-26 03:09:13 UTC (rev 912)
+++ pkg/inst/proto/rexp.proto 2014-11-26 21:13:54 UTC (rev 913)
@@ -1,8 +1,15 @@
// Originally written by Saptarshi Guha for RHIPE (http://www.rhipe.org)
-// Released under Apache License 2.0, and reused with permission here
+// Released under Apache License 2.0, and reused with permission here
+// Extended in November 2014 with new types to support encoding
+// language, environment, and function types from R.
package rexp;
+option java_package = "org.godhuli.rhipe";
+option java_outer_classname = "REXPProtos";
+
+// TODO(mstokely): Refine this using the new protobuf 2.6 oneof field
+// for unions.
message REXP {
enum RClass {
STRING = 0;
@@ -13,6 +20,9 @@
LIST = 5;
LOGICAL = 6;
NULLTYPE = 7;
+ LANGUAGE = 8;
+ ENVIRONMENT = 9;
+ FUNCTION = 10;
}
enum RBOOLEAN {
F=0;
@@ -20,7 +30,7 @@
NA=2;
}
- required RClass rclass = 1 ;
+ required RClass rclass = 1;
repeated double realValue = 2 [packed=true];
repeated sint32 intValue = 3 [packed=true];
repeated RBOOLEAN booleanValue = 4;
@@ -32,6 +42,9 @@
repeated string attrName = 11;
repeated REXP attrValue = 12;
+ optional bytes languageValue = 13;
+ optional bytes environmentValue = 14;
+ optional bytes functionValue = 15;
}
message STRING {
optional string strval = 1;
@@ -41,4 +54,3 @@
optional double real = 1 [default=0];
required double imag = 2;
}
-
Modified: pkg/inst/unitTests/runit.serialize_pb.R
===================================================================
--- pkg/inst/unitTests/runit.serialize_pb.R 2014-11-26 03:09:13 UTC (rev 912)
+++ pkg/inst/unitTests/runit.serialize_pb.R 2014-11-26 21:13:54 UTC (rev 913)
@@ -3,11 +3,11 @@
test.serialize_pb <- function() {
#verify that rexp.proto is loaded
RProtoBuf:::pb(rexp.REXP)
-
+
#serialize a nested list
x <- list(foo=cars, bar=Titanic)
checkEquals(unserialize_pb(serialize_pb(x, NULL)), x)
-
+
#a bit of everything, copied from jsonlite package
set.seed('123')
myobject <- list(
@@ -22,6 +22,20 @@
somemissings = c(1,2,NA,NaN,5, Inf, 7 -Inf, 9, NA),
myrawvec = charToRaw('This is a test')
);
-
+
checkEquals(unserialize_pb(serialize_pb(myobject, NULL)), myobject)
}
+
+test.serialize_pb.alldatasets <- function() {
+ datasets <- as.data.frame(data(package="datasets")$results)
+ datasets$name <- sub("\\s+.*$", "", datasets$Item)
+
+ encoded.datasets <- sapply(datasets$name,
+ function(x) serialize_pb(get(x), NULL))
+
+ unserialized.datasets <- sapply(encoded.datasets, unserialize_pb)
+
+ checkTrue(all(sapply(names(unserialized.datasets),
+ function(name) identical(get(name),
+ unserialized.datasets[[name]]))))
+}
\ No newline at end of file