[Rprotobuf-commits] r932 - papers/jss
noreply at r-forge.r-project.org
noreply at r-forge.r-project.org
Mon Dec 15 19:52:34 CET 2014
Author: murray
Date: 2014-12-15 19:52:34 +0100 (Mon, 15 Dec 2014)
New Revision: 932
Modified:
papers/jss/response-to-reviewers.tex
Log:
Add more point to point replies. Still working. I can mostly finish this up today.
Modified: papers/jss/response-to-reviewers.tex
===================================================================
--- papers/jss/response-to-reviewers.tex 2014-12-15 03:01:40 UTC (rev 931)
+++ papers/jss/response-to-reviewers.tex 2014-12-15 18:52:34 UTC (rev 932)
@@ -72,14 +72,20 @@
bulleted list giving a high-level overview, then encourage the reader
to refer to the documentation for further details. Similarly, Tables
3-5 belong in the documentation, not in a vignette/paper.}
-\reply{Done. TO BE EXPANDED}
+\reply{Done. RProtoBuf was designed and implemented before RC were
+ available, and this is noted in a footnote now. Explanation of how
+ they are made mutable haas been added. Better explanation of the
+ two styles and '\$' as been added, while no longer using the
+ confusing term
+ 'pseudo-method' anywhere. Moved Tables 3-5 into the documentation
+ and out of the paper, as suggested.}
\pointRaised{Comment 4}{Section 7 is weak. I think the important message is that RProtobuf is
being used in practice at large scale for for large data, and is
useful for communicating between R and Python. How can you make that
message stronger while avoiding (for the purposes of this paper) the
relatively unimportant details of the map-reduce setup?}
-\reply{TBD}
+\reply{Done. Rewritten with more motivation taking into account this feedback.}
\subsubsection*{R to/from Protobuf translation}
@@ -87,15 +93,29 @@
much simpler if instead of Message, you provided a "vectorised"
Messages class (this would also make the interface more consistent and
hence the package easier to use).}
-\reply{TBD}
+\reply{This is an area for future work and is a space explored in
+ another package called Motobuf by other authors.}
\pointRaised{Comment 6}{Along these lines, I think it would make sense to combine sections 5
and 6 and discuss translation challenges in both direction
simultaneously. At the minimum, add the equivalent for Table 9 that
shows how important R classes are converted to their protobuf
equivalents.}
-\reply{TBD}
+\reply{We have updated these sections to make it clearer that the main
+ distinction is between schema-based datastructures (section 5) and
+ schema-less use where a catch-all .proto is used (section 6).
+ Neither section is meant to focus on only a single direction of the
+ conversion, but how conversion works when you have a schema or not.
+ How important R classes are converted to their protobuf equivalents
+ isn't super useful as a C++, Java, or Python program is unlikely to
+ want to read in an R data.frame exactly as it is defined. Much more
+ likely is an application-specific message format is defined between the
+ two services, such as the HistogramTools example in the next section.
+ Much more detail has been added to an interesting part of section 6 --
+ which datasets exactly are better served with RProtoBuf than
+ base::serialize and why?}
+
\pointRaised{Comment 7}{You should discuss how missing values are handled for strings and
integers, and why enums are not equivalent to factors. I think you
could make explicit how coercion of factors, dates, times and matrices
@@ -108,7 +128,16 @@
show how long it takes to serialise data frames using both RProtoBuf
and R's native serialisation. Is there a performance penalty to using
protobufs?}
-\reply{TBD}
+\reply{Table 10 has been replaced with a plot, the outliers are
+ labeled, and the text now includes some interesting explanation
+ about the outliers. Page 4 explains that the R implementation of
+ protocol buffers uses reflection to make operations slower but makes
+ it more convenient for interactive data analysis. None of the
+ built-in datasets are large enough for performance to really come up
+ as an issue, and for any serialization method examples could be
+ found that significantly favor one over another, so we don't think
+ there will be benefit to adding anything here.
+}
\subsubsection*{RObjectTables magic}
@@ -116,7 +145,8 @@
good fit for an infrastructure package and it's not clear what
advantages it has over explicitly loading a protobuf definition into
an object.}
-\reply{TBD}
+\reply{More information about the advantages and disadvantages of this
+ approach have been added.}
\pointRaised{Comment 10}{Using global state makes understanding code much harder. In Table 1,
it's not obvious where \texttt{tutorial.Person} comes from. Is it loaded by
@@ -125,19 +155,23 @@
as well as \texttt{HistogramTools}? This needs more explanation, and a
comment on the implications of this approach on CRAN packages and
namespaces.}
-\reply{TBD}
+\reply{We followed this recommendation and added explanation for how
+\texttt{tutorial.Person} is loaded, specifically : \emph{A small number of message types are imported when the
+package is first loaded, including the tutorial.Person type we saw in
+the last section.} We removed the superfluous attach of \texttt{RProtoBuf}.}
\pointRaised{Comment 11}{
I'd prefer you eliminate this magic from the magic, but failing that,
you need a good explanation of why.}
-\reply{TBD}
+\reply{We've added more explanation about this.}
\subsubsection*{Code comments}
\pointRaised{Comment 12}{Using \texttt{file.create()} to determine the absolute path seems like a bad idea.}
-\reply{TBD}
+\reply{We followed this recommendation and removed two instances of
+ \texttt{file.create()} for this purpose with calls to
+ \texttt{normalizePath} with \texttt{mustWork=FALSE}.}
-
\subsubsection*{Minor niggles}
\pointRaised{Comment 13}{Don't refer to the message passing style of OO as traditional.}
More information about the Rprotobuf-commits
mailing list