[Roxygen-devel] roxygen3

Tue Aug 28 22:19:46 CEST 2012

>   HW> I've also been wondering about splitting "roxygen3" into two packages
>   HW> - one that defines all the basic objects etc and one that creates all
>   HW> the tags.
>
> An add-on package would have to load both right? So what is the
> advantage of the split then?

* When you look at the documentation, there's less confusion between
what a user needs, and what a developer needs

* You can use the roxygen framework without buying into any of my
documentation philosophy.

> Ok, I looked into it.
>
> It's still the pseudo dispatch on textual representation of the object
> definition. That is you parse the "foo <- function(" and dispatch
> (object_from_call) on "function", setGeneric(foo) is dispatched on
> "setGeneric" etc.
>
> Wouldn't it be better to inspect the evaluation environment for the
> traces of the evaluation and then dispatch on the objects discovered?
> Then the code
>
> aaa <- local({ ... compute object ..})
>
> will correctly dispatch on aaa and won't be ignored.

It does do that - but you need to parse the call because there are a
number of calls that produce global side effects (e.g. creating
classes and methods). You could try doing it after the fact (e.g. by
using S4 introspection to find all the objects to document), but then
it's much more difficult to match the documentation block with the
object.

The pseudo-S3 dispatch isn't particularly elegant, but it seemed like
a good 90% solution.
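
To make the idea concrete, here is a rough sketch of dispatching on the
parsed call. It is not the actual object_from_call code; the function
name object_type_from_call and the type labels it returns are made up
for illustration.

```r
# Hypothetical sketch: classify a top-level expression by the function it
# calls. Names and return values are assumptions, not roxygen3's API.
object_type_from_call <- function(expr) {
  if (!is.call(expr)) return("unknown")
  head <- as.character(expr[[1]])[1]
  if (head %in% c("<-", "=")) {
    # Assignment: inspect the right-hand side to see what is being created.
    rhs <- expr[[3]]
    if (is.call(rhs) && identical(rhs[[1]], as.name("function"))) {
      return("function")
    }
    # Anything else (including local({...})) cannot be classified from
    # the text alone, so it falls into a catch-all bucket.
    return("data")
  }
  # Calls evaluated for their global side effects.
  switch(head,
    setClass = "s4class",
    setGeneric = "s4generic",
    setMethod = "s4method",
    "unknown"
  )
}

object_type_from_call(quote(foo <- function(x) x))
object_type_from_call(quote(setGeneric("foo", function(x) standardGeneric("foo"))))
object_type_from_call(quote(aaa <- local({ 1 })))
```

The last call shows the limitation you describe: from the text alone,
the result of local() can only be put in a catch-all bucket, whereas the
setGeneric() call is matched to its block without any introspection.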

> It will be possible to create documentation for a bunch of objects at
> the same time. For example
>
> local({ a <- generate_object_a()
>         b <- generate_object_b()})
>
> will document both a and b.

Could you flesh out this example a bit more?  I don't understand why
you'd want to document objects that aren't evaluated by the user.

>   >> I don't see the need for every tag to be class aware just a few.
>
> This is a complication. You end up implementing your own OO
> system. For example, from roccer-.r:
>
>      #' The roccer object is a key component in roxygen3 - it defines the behaviour
>      #' of a tag with a \code{parser} and a \code{output} write.
>
> Why would you need an roccer if you already have "classes" and "methods"
> to define the behavior of the tag?

I'm not sure I get your point - the roccer _is_ the object that
represents the tag.

> You can just have a virtual S4 class "roxy_tag". Then subclass
> "roxy_tag_oxygen" and have all other tags derive from that. Most of them
> will probably have only two slots, "name" and "text", but some like
> "slots" tag will have more.
>
> Then you can have "roxy_split_oxigen(object, doc)" generic dispatched on
> object which would split the string 'doc' into tags. Each tag is an
> object.  Then another generic "roxy_parse(tag)" to actually parse the
> tag. Another generic "roxy_rd(tag)" to generate rd entry, and yet
> another generic "roxy_template(tag)" to generate template. And so on.

I think this is more in line with how roxygen2 works. You can't have
methods that just work with a single documentation block + object
(rocblock for short) at a time, because some tags work more globally
(e.g. @family, @include, @inheritParams).  That's more of a comment on
your suggested function names rather than using S4 - but it does have
a big impact on the API.
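
A small sketch of why a per-block generic is not enough for something
like @family. The block structure and the function family_see_also are
hypothetical, invented only for this example.

```r
# Hypothetical rocblocks: each is just an object name plus its @family tag.
blocks <- list(
  list(name = "read_a", family = "readers"),
  list(name = "read_b", family = "readers"),
  list(name = "write_a", family = "writers")
)

# A function that sees one block at a time cannot know the other family
# members, so this step takes the whole list of blocks.
family_see_also <- function(blocks) {
  names <- vapply(blocks, function(b) b$name, character(1))
  families <- vapply(blocks, function(b) b$family, character(1))
  lapply(seq_along(blocks), function(i) {
    # Every other object in the same family becomes a cross-reference.
    setdiff(names[families == families[i]], names[i])
  })
}

see_also <- family_see_also(blocks)
```

So whatever the generics are called, at least some of them need a
signature that accepts all the rocblocks, not just one tag.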

> The end user can getClass("roxy_tag") to see all the tags which
> are available. Same applies to methods.

Hmmm, that would be nice.

> All of this looks simple, consistent and transparent to me. In order to
> extend roxygen, one would not need to dig into the code and learn how it
> works and try to find workarounds to implement features which are not
> there. But, instead, just start writing methods and classes directly.

I think there are two issues at play:

* whether to use S3 or S4
* the design of the object system

I think the object system can definitely be improved (and your
discussion is really helpful), but I'm not convinced that using S4
over S3 brings enough advantages to make it worthwhile.  It would be
very useful if you could lay out what the main advantages of S4 in
your mind are in this situation.  That would help me think it through.
(Two advantages: it would force me to make roxygen S4 support a lot
better, and would force me to use S4 on a larger project ;)

My feeling is that generally R users are more familiar and comfortable
with S3 rather than S4.  So it might make it less likely to get
contributions.

>   >> Can you give an example?  Generally, if you can automatically generate
>   >> the template, why can't you automatically generate the Rd directly?
>   >>
>
> What I meant here is the following. Suppose you have a function
> declaration:
>
>      foo <- function(a = 4, b = 34){
>        a + b
>      }
>
> To start documenting the function, a user might want to insert a
> documentation skeleton (template) like the following
>
>      ##' ..description
>      ##'
>      ##' @title
>      ##' @param a
>      ##' @param b
>      ##' @return
>      ##' @author User Name
>      ##' @examples
>
> Depending on the editor, this might be bound to a key.

Ah, I see. And you see generating this as the role of roxygen, rather
than the editor?  It seems like there's a lot that people could
disagree on (##' vs #', whether you need @author, etc.).
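
If roxygen did generate it, a minimal sketch might look like the
following. roxy_skeleton is a made-up name, not an existing function,
and the comment prefix is an argument precisely because people disagree
about it. I've left @author out for the same reason.

```r
# Hypothetical helper: build a documentation skeleton from a function's
# formal arguments. Not part of any released roxygen version.
roxy_skeleton <- function(f, prefix = "#'") {
  params <- names(formals(f))
  lines <- c(
    "..description",
    "",
    "@title",
    if (length(params)) paste("@param", params),
    "@return",
    "@examples"
  )
  # trimws() drops the trailing space on the otherwise empty separator line.
  trimws(paste(prefix, lines), which = "right")
}

foo <- function(a = 4, b = 34) {
  a + b
}
writeLines(roxy_skeleton(foo, prefix = "##'"))
```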

> There might be also a roxy_update(OBJECT, OLD_TEXT) method which would
> take OLD_TEXT and output a modified version of it to account for changes
> in OBJECT. For example, if you have documented parameters a and b above,
> and then decided to rename b into c, then roxy_update will just change
> @param b into @param c. An editor can bind this to a key.

I think that's a nice idea, but a lot of work!
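
The easy case can be sketched in a few lines, assuming the arguments
were only renamed and every one of them already has a @param line in
the same order. The name roxy_update_params is hypothetical, and the
function deliberately gives up on anything harder (added, removed or
reordered arguments), which is where the real work would be.

```r
# Hypothetical sketch: rename @param entries positionally to match the
# current formals. Returns the text unchanged if the counts differ.
roxy_update_params <- function(f, old_text) {
  new_names <- names(formals(f))
  is_param <- grepl("@param\\s+\\S+", old_text)
  if (!any(is_param) || sum(is_param) != length(new_names)) {
    return(old_text)
  }
  old_text[is_param] <- mapply(
    # Replace only the parameter name, keeping its description intact.
    function(line, name) sub("(@param\\s+)\\S+", paste0("\\1", name), line),
    old_text[is_param], new_names,
    USE.NAMES = FALSE
  )
  old_text
}

old_text <- c("##' Add two numbers", "##' @param a first number", "##' @param b second number")
foo <- function(a = 4, c = 34) a + c
new_text <- roxy_update_params(foo, old_text)
```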

Hadley

-- 
Assistant Professor
Department of Statistics / Rice University
http://had.co.nz/