[Roxygen-devel] roxygen3

Wed Aug 29 01:46:51 CEST 2012

  >> Hadley Wickham <hadley at rice.edu>
  >> on Tue, 28 Aug 2012 15:19:46 -0500 wrote:
  >> 
  >> Wouldn't it be better to inspect the evaluation environment for the
  >> traces of the evaluation and then dispatch on the objects discovered?
  >> Then the code
  >> 
  >> aaa <- local({ ... compute object ..})
  >> 
  >> will correctly dispatch on aaa and won't be ignored.

  HW> It does do that - but you need to parse the call because there are a
  HW> number of calls that produce global side effects (e.g. creating
  HW> classes and methods). You could try doing it after the fact (e.g. by
  HW> using S4 introspection to find all the objects to document), but then
  HW> it's much more difficult to match the documentation block with the
  HW> object.

Not that difficult. S4 always leaves traces in the evaluation
environment, namely objects whose names match:

  methods:::.TableMetaPattern()
  [1] "^[.]__T__"
  methods:::.ClassMetaPattern()
  [1] "^[.]__C__"

Inspecting those, you know exactly what was installed. 
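
A minimal sketch of that inspection, assuming the code is evaluated at
top level so that the global environment plays the role of the
evaluation environment. The names DemoTag and demoParse are made up for
illustration, and the two patterns are hard-coded from the output above
rather than taken from the internal functions:

```r
library(methods)

# Snapshot the environment before evaluating the code to be documented.
before <- ls(globalenv(), all.names = TRUE)

# Hypothetical code under inspection: one class, one generic, one method.
setClass("DemoTag", representation(name = "character"))
setGeneric("demoParse", function(tag) standardGeneric("demoParse"))
setMethod("demoParse", "DemoTag", function(tag) tag@name)

# Everything the evaluation left behind, including hidden metadata objects.
created <- setdiff(ls(globalenv(), all.names = TRUE), before)

# Class definitions and method tables, matched by the patterns shown above.
class_meta <- grep("^[.]__C__", created, value = TRUE)
table_meta <- grep("^[.]__T__", created, value = TRUE)

print(class_meta)
print(table_meta)
```

The class and method names can then be recovered by stripping the
prefixes from class_meta and table_meta.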

S3 methods are registered locally in the ".__S3MethodsTable__." object. So
no trouble at all. I have done this before and can provide the necessary
code.

It looks like internal hackery, but it really is not. This
implementation will hardly ever change, and if it does, it will be easy to
adapt.

  >> It will be possible to create documentation for a bunch of objects at
  >> the same time. For example
  >> 
  >> local({ a <- generate_object_a()
  >> b <- generate_object_b()})
  >> 
  >> will document both a and b.

  HW> Could you flesh out this example a bit more?  I don't understand why
  HW> you'd want to document objects that aren't evaluated by the user.

Ah sorry, that was stupid. I meant 

   eval({ a <- generate_object_a()
          b <- generate_object_b()})

Roxygen could adopt a convention: if two declarations immediately
follow each other, they should be documented in the same roxy-doc and
the same Rd file:

foo <- function(a) ..
boo <- function(a) ..

will put both foo and boo in the same file. Currently one needs two
documentation blocks and an rdname tag, if I am not mistaken.
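
For comparison, a sketch of the two-block form that the convention would
replace. The function bodies and descriptions are placeholders, and the
Rd name "foo" is just an illustrative choice:

```r
##' Foo and boo
##'
##' Both functions end up in the same Rd file via @rdname.
##'
##' @param a input value (placeholder description)
##' @rdname foo
foo <- function(a) a

##' @rdname foo
boo <- function(a) a
```

With the proposed convention, the second block would not be needed.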

  >> >> I don't see the need for every tag to be class aware just a few.
  >> 
  >> This is a complication. You end up implementing your own OO
  >> system. For example,  from roccer-.r:
  >> 
  >> #' The roccer object is a key component in roxygen3 - it defines the behaviour
  >> #' of a tag with a \code{parser} and a \code{output} write.
  >> 
  >> Why would you need an roccer if you already have "classes" and "methods"
  >> to define the behavior of the tag?

  HW> I'm not sure I get your point - the roccer _is_ the object that
  HW> represents the tag.

You meant roccer as an abstract encapsulation of the behavior of the
tag. That is

  structure(list(name = name, parser = parser, output = output),  class = "roccer")

An abstract notion of a "tag" has a representation with a name and two
methods which describe the behavior of a tag. This is precisely the task
of a class/method system.

The add_roccer and roccer functions are basically a replacement for
setClass and setMethod.

In S4, instead of roccer + add_roccer + basic_roccer + etc., you might do:

   setClass("RoxyTag", list(name = "character"))
   setGeneric("roxyParse", function(tag) NULL)
   setGeneric("roxyRd", function(tag) NULL) 

For every tag:

   setClass("RoxyFamily", contains = "RoxyTag")
   setMethod("roxyParse", signature = "RoxyFamily", 
             def = ... )

instead of 

parse_family <- function() ...
roc_family <- roccer("family", 
  roc_parser(tag = text_tag(), all = parse_family))

It looks like you want a simple interface, but it ends up being a cross
between S3 and an internal roxygen system for keeping track of objects
(i.e. tags). It is a sort of _roxyClasses_ approach, and it looks like you
haven't yet gotten to the inheritance and extension mechanism.

(Actually, I think I am starting to understand why you proposed to split
the package and keep the tags separately: to simplify the extension of
tags. Is that it? If the tags are S4 classes, then this is not a
problem. Any package can extend the system!)

To wrap up, quite a bit of the current code is essentially OO
bookkeeping, and can be completely eliminated by delegating the work to S4.
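
A slightly fuller, runnable sketch of the S4 layout outlined above. The
"text" slot, the RoxyTitle class and all method bodies are assumptions
made for illustration, not roxygen3 code; in particular, the family
method only emits a placeholder Rd comment:

```r
library(methods)

# Hypothetical virtual base class: each tag carries its name and raw text.
setClass("RoxyTag",
         representation("VIRTUAL", name = "character", text = "character"))

setGeneric("roxyParse", function(tag) standardGeneric("roxyParse"))
setGeneric("roxyRd", function(tag) standardGeneric("roxyRd"))

# Default behaviour shared by all tags: trim the text, wrap it in \name{}.
setMethod("roxyParse", "RoxyTag",
          function(tag) gsub("^\\s+|\\s+$", "", tag@text))
setMethod("roxyRd", "RoxyTag",
          function(tag) sprintf("\\%s{%s}", tag@name, roxyParse(tag)))

# A tag that needs nothing beyond the defaults.
setClass("RoxyTitle", contains = "RoxyTag",
         prototype = prototype(name = "title"))

# A tag that overrides only the Rd step (placeholder output).
setClass("RoxyFamily", contains = "RoxyTag",
         prototype = prototype(name = "family"))
setMethod("roxyRd", "RoxyFamily",
          function(tag) sprintf("%% family: %s", roxyParse(tag)))

title_rd  <- roxyRd(new("RoxyTitle", text = "  Add two numbers "))
family_rd <- roxyRd(new("RoxyFamily", text = "arithmetic"))
cat(title_rd, family_rd, sep = "\n")
```

Here RoxyTitle needs no methods of its own, while RoxyFamily replaces a
single step and inherits the rest.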

  >> You can just have a virtual S4 class "roxy_tag". Then subclass
  >> "roxy_tag_oxygen" and have all other tags derive from that. Most of them
  >> will probably have only two slots, "name" and "text", but some like
  >> "slots" tag will have more.
  >> 
  >> Then you can have "roxy_split_oxigen(object, doc)" generic dispatched on
  >> object which would split the string 'doc' into tags. Each tag is an
  >> object.  Then another generic "roxy_parse(tag)" to actually parse the
  >> tag. Another generic "roxy_rd(tag)" to generate rd entry, and yet
  >> another generic "roxy_template(tag)" to generate template. And so on.

  HW> I think this is more in line with how roxygen2 works. You can't have
  HW> methods that just work with a single documentation block + object
  HW> (rocblock for short) at a time, because some tags work more globally
  HW> (e.g. @family, @include, @inheritParams).  That's more of a comment on
  HW> your suggested function names rather than using S4 - but it does have
  HW> a big impact on the API.

I didn't think about that. I barely understand how roxygen works
yet :). You have the rockblock_parser class already. I guess it's just a
question of S3 or S4 then.

Actually, the parsing generic (roxyParse or whatever) can by default
take two arguments, the object and the whole bunch of rocblocks. Each
tag will decide for itself whether to use the rocblocks argument or not.
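
A rough sketch of such a two-argument generic. The representation of
rocblocks used here (a named list with one list of tag objects per
documented object) is purely an assumption, as are the class names:

```r
library(methods)

setClass("RoxyTag",
         representation("VIRTUAL", name = "character", text = "character"))
setClass("RoxyTitle", contains = "RoxyTag")
setClass("RoxyFamily", contains = "RoxyTag")

# The generic always receives the full set of rocblocks.
setGeneric("roxyParse",
           function(tag, rocblocks) standardGeneric("roxyParse"))

# Local tags simply ignore the rocblocks argument.
setMethod("roxyParse", "RoxyTag", function(tag, rocblocks) tag@text)

# A global tag scans all rocblocks for members of the same family.
setMethod("roxyParse", "RoxyFamily", function(tag, rocblocks) {
  in_family <- vapply(rocblocks, function(block) {
    any(vapply(block, function(x) {
      is(x, "RoxyFamily") && identical(x@text, tag@text)
    }, logical(1)))
  }, logical(1))
  names(rocblocks)[in_family]
})

blocks <- list(
  foo = list(new("RoxyTitle", name = "title", text = "Foo"),
             new("RoxyFamily", name = "family", text = "helpers")),
  boo = list(new("RoxyFamily", name = "family", text = "helpers")),
  bar = list(new("RoxyTitle", name = "title", text = "Bar"))
)

members <- roxyParse(blocks$foo[[2]], blocks)  # uses rocblocks
title   <- roxyParse(blocks$foo[[1]], blocks)  # ignores rocblocks
```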

I guess you are already doing something similar, but I am a bit confused
about why the distinction between parse_rocblocks and roccer$parser is
necessary.

  >> The end user can getClass("roxy_tag") to see all the tags which
  >> are available. Same applies to methods.

  HW> Hmmm, that would be nice.

Especially if other packages extend the tag system :)
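
As a sketch of that kind of introspection, assuming a hypothetical
virtual class RoxyTag with two subclasses (none of these names exist in
roxygen):

```r
library(methods)

setClass("RoxyTag",
         representation("VIRTUAL", name = "character", text = "character"))
setClass("RoxyTitle", contains = "RoxyTag")
setClass("RoxyFamily", contains = "RoxyTag")

setGeneric("roxyRd", function(tag) standardGeneric("roxyRd"))
setMethod("roxyRd", "RoxyTitle",
          function(tag) sprintf("\\title{%s}", tag@text))

# All tag classes currently known to extend RoxyTag.
available_tags <- names(getClass("RoxyTag")@subclasses)

# Which of them define their own roxyRd method (inherited ones excluded).
has_rd <- vapply(available_tags,
                 function(cl) existsMethod("roxyRd", cl),
                 logical(1))
print(has_rd)
```

Tags defined by other loaded packages would show up in the same listing.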

  >> All of this looks simple, consistent and transparent to me. In order to
  >> extend roxygen, one would not need to dig into the code and learn how it
  >> works and try to find workarounds to implement features which are not
  >> there. But, instead, just start writing methods and classes directly.

  HW> I think there are two issues at play:

  HW> * whether to use S3 or S4
  HW> * the design of the object system

  HW> I think the object system can definitely be improved (and your
  HW> discussion is really helpful), but I'm not convinced that using S4
  HW> over S3 brings enough advantages to make it worthwhile.  It would be
  HW> very useful if you could lay out what the main advantages of S4 in
  HW> your mind are in this situation.  That would help me think it through.
  HW> (Two advantages: it would force me to make roxygen S4 support a lot
  HW> better, and would force me to use S4 on a larger project ;)

  HW> My feeling is that generally R users are more familiar and comfortable
  HW> with S3 rather than S4.  So it might make it less likely to get
  HW> contributions.

Now roxy users have to learn the roxyClasses system ;). And by building new
packages on S3 you are actually contributing to the rooting and rotting of
S3. It's surprising that people are so stuck with it. S4 is so simple;
there are only two main functions, setClass and setMethod. Nobody needs to
know more.

I can hardly add anything new to the well-known advantages of S4. Here
are a couple of obvious thoughts:

  - S4 is the R standard and R-core encourages using it. S3 is virtually
    subsumed by S4 right now.

  - Extension across packages is completely handled in the background,
    and it takes a huge load off your shoulders. There are myriads of
    classes and objects which people might want to document in a
    different way: RefClasses, Rjava, Cpp, proto, etc. Roxygen should
    not care about them. Each package should define its own tags,
    parsers, Rd converters etc.

  - S4 is actually a good system -- multiple inheritance and multiple
    dispatch, something implemented by only a handful of languages.

  - Roxygen might need multiple dispatch/inheritance, even if it is not
    apparent right now.

  - Type checks are done automatically. 

  - Building new tags on top of others is easy. Inheriting the behavior
    is automatic.

  - Converting the current system of roccers to S4 is trivial. You
    handle almost everything as list structures, which will become S4
    objects with slots.

  - Explaining how to extend roxygen won't take more than half a page:

         Reading source file -> Splitting into rocblocks -> Parsing with
         roxyParse -> Converting to Rd with roxyRd. Please write your
         roxyParse and roxyRd methods.

  - People will finally learn some S4 :)

There must be more, but it's getting too late. 
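
Two of the points above, automatic type checks and inherited behavior,
in a small sketch. The classes RoxyTitle and RoxyDraftTitle are
hypothetical and exist only to show the mechanism:

```r
library(methods)

setClass("RoxyTag",
         representation("VIRTUAL", name = "character", text = "character"))
setGeneric("roxyRd", function(tag) standardGeneric("roxyRd"))
setMethod("roxyRd", "RoxyTag",
          function(tag) sprintf("\\%s{%s}", tag@name, tag@text))

setClass("RoxyTitle", contains = "RoxyTag",
         prototype = prototype(name = "title"))

# A new tag built on top of an existing one: it reuses the inherited
# method and only appends an Rd comment line.
setClass("RoxyDraftTitle", contains = "RoxyTitle")
setMethod("roxyRd", "RoxyDraftTitle", function(tag) {
  c(callNextMethod(), "% draft title, revise before release")
})

plain <- roxyRd(new("RoxyTitle", text = "Add numbers"))
draft <- roxyRd(new("RoxyDraftTitle", text = "Add numbers"))

# Slot types are checked when the object is created: a numeric value in
# the character slot "text" is rejected without any hand-written check.
bad <- tryCatch(new("RoxyTitle", text = 42), error = function(e) e)
```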

  >> >> Can you give an example?  Generally, if you can automatically generate
  >> >> the template, why can't you automatically generate the Rd directly?
  >> >>
  >> 
  >> What I meant here is the following. Suppose you have a function
  >> declaration:
  >> 
  >> foo <- function(a = 4, b = 34){
  >> a + b
  >> }
  >> 
  >> To start documenting the function, a user might want to insert a
  >> skeleton for a documentation (template) like following
  >> 
  >> ##' ..description
  >> ##'
  >> ##' @title
  >> ##' @param a
  >> ##' @param b
  >> ##' @return
  >> ##' @author User Name
  >> ##' @examples
  >> 
  >> Depending on the editor, this might be bound to a key.

  HW> Ah, I see. And you see this being the role of roxygen to generate,
  HW> rather than the editor?  

Right, but if one editor has already done that (ESS for example), why
not reuse the code in an editor-independent way?

  HW> It seems like there's a lot that people could disagree on (##' vs
  HW> #', do you need @author) etc.

It could be customized in options$roxygen or the like. If roxygen
establishes a uniform customization interface, then editors will be
forced to pick it up from there.
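
A minimal sketch of an editor-independent template generator for plain
functions. The function roxy_template and its prefix and author
arguments are hypothetical; they stand in for whatever the customization
interface would provide:

```r
# Build a documentation skeleton from the formals of a function.
roxy_template <- function(fun, prefix = "##'", author = NULL) {
  stopifnot(is.function(fun))
  params <- names(formals(fun))
  tags <- c("@title",
            if (length(params)) paste("@param", params),
            "@return",
            if (!is.null(author)) paste("@author", author),
            "@examples")
  lines <- paste(prefix, c("..description", "", tags))
  sub("\\s+$", "", lines)  # drop trailing blanks on empty lines
}

foo <- function(a = 4, b = 34) {
  a + b
}

skeleton <- roxy_template(foo, author = "User Name")
writeLines(skeleton)
```

An editor would only need to call this and insert the result above the
declaration.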

  >> There might be also a roxy_update(OBJECT, OLD_TEXT) method which would
  >> take OLD_TEXT and output a modified version of it to account for changes
  >> in OBJECT. For example, if you have documented parameters a and b above,
  >> and then decided to rename b into c, then roxy_update will just change
  >> @param b into @param c. An editor can bind this to a key.

  HW> I think that's a nice idea, but a lot of work!

I would leave that to users and editors. ESS does the updating well for
functions, and the functionality will come pretty fast for other
objects, once the proper interface is in place.
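
For functions, the renaming case could be sketched roughly as below.
This is not how ESS does it; the function roxy_update and its
deliberately conservative rule (rename only when exactly one documented
parameter has disappeared and exactly one new one has appeared) are
assumptions for illustration:

```r
# Update @param lines in old_text to follow a renamed argument of fun.
roxy_update <- function(fun, old_text) {
  pattern <- "^(\\s*#+'\\s*@param\\s+)(\\S+)"
  is_param <- grepl(pattern, old_text)
  documented <- sub(paste0(pattern, ".*$"), "\\2", old_text[is_param])
  current <- names(formals(fun))

  stale <- setdiff(documented, current)  # documented, no longer an argument
  added <- setdiff(current, documented)  # argument, not yet documented

  # Only the unambiguous case is handled: one name replaced by another.
  if (length(stale) == 1 && length(added) == 1) {
    idx <- which(is_param)[documented == stale]
    old_text[idx] <- sub(pattern, paste0("\\1", added), old_text[idx])
  }
  old_text
}

old_text <- c("##' @title Add numbers",
              "##' @param a first number",
              "##' @param b second number")
foo <- function(a = 4, c = 34) a + c

new_text <- roxy_update(foo, old_text)
writeLines(new_text)
```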

      Vitalie.