[Roxygen-devel] S4 implementation of @usage

Thu Aug 30 17:57:46 CEST 2012

  >> Hadley Wickham <hadley at rice.edu>
  >> on Thu, 30 Aug 2012 08:57:41 -0500 wrote:

  >> I have done my best here https://gist.github.com/3516476
  > Thanks - that's useful. A few comments/questions:

  > * I don't think the spec for processTag is quite right yet - currently
  > you have it returning a transformed object of the same class as the
  > input - but that only works for a limited number of tags.  Many tags
  > also need to modify other tags in the rocblock.  For example the "@intro"
  > tag (a virtual tag added to the start of every rocblock) needs to
  > modify the title, description and details. The @docType tag needs to
  > modify a lot of other pieces.

How about this: a tag can perform an action on 3 levels, namely the
local tag level, the object documentation (block) level, and the
package level. For each of these actions you have a generic dispatched
on the *tag* object:

setGeneric("prepareTag", function(tag)  standardGeneric("prepareTag"),
           useAsDefault = function(tag) tag)

setGeneric("prepareDoc", function(tag, roxydoc)  standardGeneric("prepareDoc"),
           useAsDefault = function(tag, roxydoc) roxydoc)

setGeneric("preparePackage", function(tag, roxypackage)  standardGeneric("preparePackage"),
           useAsDefault = function(tag, roxypackage) roxypackage)

So you have 3 core objects: roxyPackage, which holds a list of
roxyBlocks; roxyBlock, which comprises a roxyDoc and the documented
object; and finally roxyDoc, which comprises roxyTags.

preparePackage returns roxyPackage
prepareDoc returns roxyDoc
prepareTag returns roxyTag

Pretty simple, isn't it? Only special tags have to declare prepareDoc
and preparePackage.

So roxygenize will iterate 3 times over all tags, calling prepareTag on
the first iteration, then prepareDoc, and finally preparePackage.
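
To make the three passes concrete, here is a minimal sketch. The class
definitions, the slot names (tags, docs), the titleTag subclass and the
prepare_all helper are my assumptions for illustration only, not
anything that exists in roxygen:

```r
library(methods)

# Hypothetical classes, for illustration only.
setClass("roxyTag", representation(name = "character", value = "character"))
setClass("roxyDoc", representation(tags = "list"))
setClass("roxyPackage", representation(docs = "list"))

# The three generics, each defaulting to "change nothing".
setGeneric("prepareTag", function(tag) standardGeneric("prepareTag"),
           useAsDefault = function(tag) tag)
setGeneric("prepareDoc", function(tag, roxydoc) standardGeneric("prepareDoc"),
           useAsDefault = function(tag, roxydoc) roxydoc)
setGeneric("preparePackage",
           function(tag, roxypackage) standardGeneric("preparePackage"),
           useAsDefault = function(tag, roxypackage) roxypackage)

# A made-up special tag that only needs the local level.
setClass("titleTag", contains = "roxyTag")
setMethod("prepareTag", "titleTag", function(tag) {
  tag@value <- gsub("^\\s+|\\s+$", "", tag@value)  # strip outer whitespace
  tag
})

# Hypothetical driver: three passes over all tags.
prepare_all <- function(pkg) {
  docs <- pkg@docs
  # Pass 1: each tag is transformed on its own.
  for (i in seq_along(docs)) {
    doc <- docs[[i]]
    doc@tags <- lapply(doc@tags, prepareTag)
    docs[[i]] <- doc
  }
  # Pass 2: each tag may return a modified version of its own doc.
  for (i in seq_along(docs)) {
    doc <- docs[[i]]
    for (tag in doc@tags) doc <- prepareDoc(tag, doc)
    docs[[i]] <- doc
  }
  pkg@docs <- docs
  # Pass 3: each tag may return a modified version of the whole package.
  for (doc in docs) {
    for (tag in doc@tags) pkg <- preparePackage(tag, pkg)
  }
  pkg
}
```

Every pass returns a new object, so nothing is modified by side effect.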

(The above is to give an idea and fix the terminology for what
follows. The final implementation will have only one roxyPrepare
method, dispatched on both arguments: (tag, missing), (tag, roxyDoc),
(tag, roxyPackage). Hopefully I am clear enough here.)
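
A rough sketch of that single generic, again with assumed class
definitions. The argument name "context" and the docTypeTag example
(including the "keyword" tag it adds) are made up for illustration:

```r
library(methods)

# Hypothetical classes, for illustration only.
setClass("roxyTag", representation(name = "character", value = "character"))
setClass("roxyDoc", representation(tags = "list"))
setClass("roxyPackage", representation(docs = "list"))

# One generic; the class of the second argument selects the level.
setGeneric("roxyPrepare",
           function(tag, context) standardGeneric("roxyPrepare"))

# Tag level: no second argument supplied, the tag is returned.
setMethod("roxyPrepare", signature("roxyTag", "missing"),
          function(tag, context) tag)
# Doc level: by default the doc is returned unchanged.
setMethod("roxyPrepare", signature("roxyTag", "roxyDoc"),
          function(tag, context) context)
# Package level: by default the package is returned unchanged.
setMethod("roxyPrepare", signature("roxyTag", "roxyPackage"),
          function(tag, context) context)

# A made-up special tag that needs to touch its doc.
setClass("docTypeTag", contains = "roxyTag")
setMethod("roxyPrepare", signature("docTypeTag", "roxyDoc"),
          function(tag, context) {
            extra <- new("roxyTag", name = "keyword", value = tag@value)
            context@tags <- c(context@tags, list(extra))
            context
          })
```

Ordinary tags inherit all three defaults; only docTypeTag declares
anything beyond them.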

Otherwise you need some global exchange: store the roxyBlock globally
and allow the tag's prepareTag method to modify it by side effect. Ugly
and not the R-ish way.

  > setClass("GlobalTag", contains = "Tag",
  >   slots = "RocBlock" # pointer back to the rocblock that contained them
  > )

pointer? That's not that easy in R, is it? You mean dropping to C and
installing real C pointers to objects?

An alternative could be to make roxyDoc an S4 environment. Each tag
can then have a @parent slot holding its parent environment. The same
goes for the roxyPackage object: it can be an environment, and
roxyBlock can have a slot pointing to its parent roxyPackage object.

Then prepareTag can have access to its parent roxyDoc and roxyPackage.

As compared to the preparePackage/Doc/Tag approach, the modifications are
done by side effect. Not that nice IMO.
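
For comparison, a small sketch of the side-effect variant. To keep it
short it stores a plain environment in the parent slot instead of
defining a class that extends "environment"; the introTag class and the
"title" variable are assumptions for illustration:

```r
library(methods)

# Hypothetical tag class with a reference to its parent doc.
setClass("roxyTag", representation(name = "character",
                                   value = "character",
                                   parent = "environment"))
setClass("introTag", contains = "roxyTag")

setGeneric("prepareTag", function(tag) standardGeneric("prepareTag"))

setMethod("prepareTag", "introTag", function(tag) {
  # Side effect: the parent environment is changed in place,
  # nothing about the doc is returned to the caller.
  assign("title", tag@value, envir = tag@parent)
  tag
})

# The doc is an environment, so every tag sees the same object.
doc <- new.env()
assign("title", "", envir = doc)

intro <- new("introTag", name = "intro",
             value = "Add two numbers", parent = doc)
invisible(prepareTag(intro))
get("title", envir = doc)
```

The caller never receives a new doc here; the change is only visible by
inspecting the environment afterwards.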

  > * I think I need to explain the output model a little bit more - the
  > output needs to be by tag, and return an intermediate representation
  > of the output object. More like:

  > setGeneric("outRd", function(tag, rocblock) standardGeneric("outRd"))
  > setMethod("outRd", "roxyTag", function(tag, rocblock) {
  >   filename <- getTag(rocblock, "rdname")
  >   rdCommand(tag@name, tag@value, path = filename)
  > })

  > This is because the output methods can't write directly to disk - they
  > may need to do some aggregation:

  > * multiple rocblocks may output to the same Rd file with @rdname
  > * all outputs to NAMESPACE need to be sorted and have duplicates removed.

Hm, I was thinking that outRd should return a string containing an Rd
representation of the tag and should have nothing to do with the file.
It is the task of a special function (write_rd_file) to aggregate all
the tags from a roxyDoc object and write them into a file.

In other words, write_rd_file calls outRd on each tag in the roxyDoc
and decides how and where to place it.
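
Roughly like this. It is only a sketch: the class definitions and the
naive "\name{value}" rendering are assumptions, and a real write_rd_file
would also have to order and group the sections:

```r
library(methods)

# Hypothetical classes, for illustration only.
setClass("roxyTag", representation(name = "character", value = "character"))
setClass("roxyDoc", representation(tags = "list"))

# outRd knows nothing about files: one tag in, one string out.
setGeneric("outRd", function(tag) standardGeneric("outRd"))
setMethod("outRd", "roxyTag", function(tag) {
  sprintf("\\%s{%s}", tag@name, tag@value)
})

# A plain function does the aggregation and the writing.
write_rd_file <- function(roxydoc, path) {
  lines <- vapply(roxydoc@tags, outRd, character(1))
  writeLines(lines, path)
  invisible(lines)
}

doc <- new("roxyDoc", tags = list(
  new("roxyTag", name = "name", value = "add"),
  new("roxyTag", name = "title", value = "Add two numbers")))
rd_path <- tempfile(fileext = ".Rd")
write_rd_file(doc, rd_path)
```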

From what you say, it seems that outRd should also have a global
perspective. This looks like a redundancy to me. The preparation stage
(prepareTag, prepareDoc and preparePackage) should handle all these
global dependencies.

Take for example @rdname. The method preparePackage(rdTag, roxypackage)
should return a new roxyPackage object in which all roxyDoc objects
with the same @rdname are unified into one roxyDoc object. So by the
end of all the prepareXXX stages, the roxyPackage contains only one
roxyDoc object per output file.
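
A sketch of what that unification could look like, written as a plain
function for brevity rather than as the preparePackage method itself.
The classes and the helper names rdname_of and merge_by_rdname are
assumptions:

```r
library(methods)

# Hypothetical classes, for illustration only.
setClass("roxyTag", representation(name = "character", value = "character"))
setClass("roxyDoc", representation(tags = "list"))
setClass("roxyPackage", representation(docs = "list"))

# Value of the first rdname tag in a doc, or NA if there is none.
rdname_of <- function(doc) {
  for (tag in doc@tags) {
    if (identical(tag@name, "rdname")) return(tag@value)
  }
  NA_character_
}

# Returns a new package with one roxyDoc per distinct rdname.
merge_by_rdname <- function(pkg) {
  keys <- vapply(pkg@docs, rdname_of, character(1))
  # Docs without an rdname get a unique key, so they are never merged.
  none <- is.na(keys)
  keys[none] <- paste0(".doc", which(none))
  merged <- lapply(unique(keys), function(k) {
    group <- pkg@docs[keys == k]
    new("roxyDoc", tags = do.call(c, lapply(group, function(d) d@tags)))
  })
  new("roxyPackage", docs = merged)
}

mk <- function(name, value) new("roxyTag", name = name, value = value)
pkg <- new("roxyPackage", docs = list(
  new("roxyDoc", tags = list(mk("rdname", "arith"), mk("title", "Add"))),
  new("roxyDoc", tags = list(mk("title", "Standalone"))),
  new("roxyDoc", tags = list(mk("rdname", "arith"), mk("title", "Subtract")))))
merged <- merge_by_rdname(pkg)
length(merged@docs)
```

Resolving duplicated tags inside a merged doc (two titles, say) is left
out of the sketch.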

  > Another possible approach could be:

  > setClass("OutputRd")
  > setGeneric("makeOutput", function(input, output) standardGeneric("makeOutput"))
  > setMethod("makeOutput", c("TagParam", "OutputRd"), ...)
  > setMethod("makeOutput", c("TagImport", "OutputNamespace"), ...)
  > setMethod("makeOutput", c("TagIncludes", "OutputDescription"), ...)

  > setMethod("makeOutput", c("RoxyBlock", "Output"), function(input, output) {
  >   lapply(input@tags, makeOutput, output = output)
  > })
  > setMethod("makeOutput", c("RoxyPackage", "Output"), function(input, output) {
  >   unlist(lapply(input@blocks, makeOutput, output = output), recursive = FALSE)
  > })

  > setGeneric("writeOutput", function(output, data) standardGeneric("writeOutput"))
  > setMethod("writeOutput", c("OutputNamespace", "list"), function(output, data) {
  >   lines <- sort(unique(unlist(data)))
  >   write_if_different(lines, "NAMESPACE")
  > })

  > That would make it easier for the user to specify which sorts of
  > output they want because they could just provide a list of Output
  > objects to roxygenise.

Interesting, but it doesn't feel natural to me. It specifies a *type* of
an output by the type of an *input* object which you create specifically
for this purpose (to indicate the type of output). That's tough ;).

I still don't understand why you would need a method to generate a
namespace and description. Isn't this a global action? That is, the
write_namespace function should take all the objects (the roxyPackage)
as input, iterate through all the roxyDoc objects, look into the
@export field and finally write a namespace file. Similarly for the
description.
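
Something along these lines, as a plain function with no S4 dispatch
involved. The classes are assumed as before, exports are modelled as
tags named "export", and namespace_lines is a made-up helper:

```r
library(methods)

# Hypothetical classes, for illustration only.
setClass("roxyTag", representation(name = "character", value = "character"))
setClass("roxyDoc", representation(tags = "list"))
setClass("roxyPackage", representation(docs = "list"))

# Collect one export() directive per export tag, sorted, no duplicates.
namespace_lines <- function(pkg) {
  exports <- character(0)
  for (doc in pkg@docs) {
    for (tag in doc@tags) {
      if (identical(tag@name, "export")) {
        exports <- c(exports, sprintf("export(%s)", tag@value))
      }
    }
  }
  sort(unique(exports))
}

# The global action: one function, one file.
write_namespace <- function(pkg, path = "NAMESPACE") {
  writeLines(namespace_lines(pkg), path)
}

mk <- function(name, value) new("roxyTag", name = name, value = value)
pkg <- new("roxyPackage", docs = list(
  new("roxyDoc", tags = list(mk("title", "Subtract"), mk("export", "sub"))),
  new("roxyDoc", tags = list(mk("export", "add"))),
  new("roxyDoc", tags = list(mk("export", "add")))))
ns_path <- tempfile()
write_namespace(pkg, ns_path)
```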

And why would users want to modify the default namespace generator?

S4 should be used only for those parts of the package which impose
different behavior for different objects. All the rest are simple
functions. It looks to me that you are really overzealous in trying to
use S4 for everything.

A conceptual note. Roxygen is a documentation generator, so the output
is in one-to-one correspondence with the file format (rd, text, html
etc). Making a namespace or description output is unnatural and seems
to be an unnecessary confusion.

I think this traces back to the roclet concept in the first version of
roxygen. I could never understand the point of collate_roclet and
namespace_roclet. According to the documentation they are objects that
specify an action. This is confusing, as an action is usually
associated with a function or a method.

So instead of 

       roxygenize( ..., roclets = c("collate", "namespace", "rd"))

this would have been much simpler:

       roxygenize( ..., collate = TRUE, namespace = TRUE, output = rd)

I am definitely not seeing the full picture here, but I am pretty sure
that whatever the reason behind those decisions was, it could have
been done in a standard R-ish way. There is really no need to confuse
the user with new pseudo-class objects like roclets or roccers or
whatever. Functions, methods, classes and objects: that is the standard
R language.

     Vitalie