[Roxygen-devel] Parsing roxgen blocks

Vitalie Spinu spinuvit at gmail.com
Wed Aug 29 18:11:35 CEST 2012


  >> Hadley Wickham <hadley at rice.edu>
  >> on Wed, 29 Aug 2012 08:54:57 -0500 wrote:

  >> Each code chunk should be evaluated in a new environment. Then the
  >> environment is inspected and all the new objects/classes/methods are
  >> returned.

  > So you have to evaluate everything twice? Or do you copy over the
  > results once you've run it? (Is that even possible for S4?)
  > Otherwise, how do you make sure that the code actually runs?

  > i.e. how do you evaluate this code?

  > a <- 1
  > b <- 2
  > a + b

Indeed, this is a slight complication, which is solved (is it?) by
stacking the evaluation environments. That is, each new code chunk is
evaluated in an environment whose parent is the environment from the
previous evaluation.
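
A minimal sketch of the stacking idea, using the a/b example above.
The helper name eval_stacked is made up for illustration and is not part
of roxygen; I also assume that listing the bindings of each fresh
environment is an acceptable definition of "new objects".

```r
# Hypothetical helper (not roxygen API): evaluate each top-level
# expression in a fresh environment whose parent is the environment
# used for the previous expression.
eval_stacked <- function(exprs, base = globalenv()) {
  parent <- base
  found <- vector("list", length(exprs))
  for (i in seq_along(exprs)) {
    env <- new.env(parent = parent)
    eval(exprs[[i]], env)
    # Only bindings created by this expression live directly in `env`;
    # earlier ones are still reachable through the parent chain.
    found[[i]] <- ls(env, all.names = TRUE)
    parent <- env
  }
  list(new_objects = found, last_env = parent)
}

exprs <- parse(text = "a <- 1\nb <- 2\na + b")
res <- eval_stacked(exprs)
res$new_objects
```

The third expression can see a and b through the parent chain, so the
code runs once and still tells you which chunk created which object.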

Another option is to make snapshots for each evaluation and compare
those after each invocation.
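
The snapshot variant could look roughly like this. Again only a sketch:
snapshot and changed_names are hypothetical names, and I assume that
comparing bindings with identical() is good enough to spot changes.

```r
# Hypothetical helper: capture every binding of an environment as a named list.
snapshot <- function(env) mget(ls(env, all.names = TRUE), envir = env)

# Names that are new, or whose value differs, between two snapshots.
changed_names <- function(before, after) {
  is_changed <- vapply(names(after), function(nm) {
    !(nm %in% names(before)) || !identical(before[[nm]], after[[nm]])
  }, logical(1))
  names(after)[is_changed]
}

env <- new.env()
exprs <- parse(text = "a <- 1\nb <- 2\na <- a + b")
changes <- lapply(exprs, function(expr) {
  before <- snapshot(env)
  eval(expr, env)  # everything is evaluated in one shared environment
  changed_names(before, snapshot(env))
})
changes
```

Unlike the stacked environments, this also notices when an existing
object is overwritten, at the price of one comparison per expression.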

  > That approach also starts to get complicated with S4 because it's
  > picky about how you create the environment.

I will revisit this. As far as I remember the internals are pretty
straightforward.

  >> There might be an environment (or list) in roxygen namespace holding all
  >> the object guessers (roxy_env_inspectors?). Each of them is called and
  >> should return a list of new objects detected.
  >> 
  >> The basic one just looks for normally assigned objects,
  >> roxy_env_inspector.S3 looks for S3 tables *in* the evaluation
  >> environment, roxy_env_inspector.S4_classes looks in class table and gets
  >> the classes defined, roxy_env_inspector.S4_methods searches the S4
  >> table.

  > I just don't see the big advantage of this over parsing the call.
  > This approach will be much more expensive because now you have to run
  > multiple tests after every single expression.  

Is the bottleneck in the code evaluation? It seems to me that all of
that is insignificant compared to the parsing and Rd generation.

  > (Also I'm pretty sure your approach for S3 won't work, because those
  > S3 tables are created by namespace definitions, not by evaluating a
  > function)

Hmm, indeed, I have been overoptimistic. But S3 methods are just
functions, and automatic detection is not possible anyway. The user
has to declare them as S3 methods regardless, so nothing is lost with
respect to the current implementation.
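
To illustrate why inspecting the environment cannot tell S3 methods
apart, here is a small sketch. Both function names are invented for the
example.

```r
env <- new.env()
eval(quote({
  # Intended as the print method for class "money".
  print.money <- function(x, ...) cat("$", format(unclass(x)), "\n")
  # An ordinary helper that merely has dots in its name.
  all.equal.ish <- function(x, y) isTRUE(all.equal(x, y))
}), env)

# After evaluation both are plain closures; nothing marks one as a method.
kinds <- vapply(ls(env), function(nm) class(get(nm, envir = env)), character(1))
kinds
```

The second name could be read as a helper, as all.equal for class "ish",
or as all for class "equal.ish", so an explicit declaration is needed
either way.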

  >> If a package declares some weird side-effect initialization (which is
  >> pretty rare) it should define roxy_env_inspector.Wierdo_thing and add it
  >> to the roxy_env_inspectors environment (which should not be sealed in the
  >> package).

  > So now you are defining your own object system ;)  

A very simple one: just functions, no dispatch, no objects. You will need
something similar anyhow, as there are no objects at that stage, and no
OO system can be used.
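
Roughly what I have in mind, as a sketch only. The registry
roxy_env_inspectors and the helper inspect_env are proposals, not
existing roxygen objects, and the ".__C__" prefix is where setClass()
keeps its class metadata.

```r
# Hypothetical registry of inspectors: plain functions in an environment.
roxy_env_inspectors <- new.env()

# Basic inspector: ordinary (non-hidden) bindings.
roxy_env_inspectors$basic <- function(env) {
  mget(ls(env), envir = env)
}

# S4 class inspector: class definitions are stored under ".__C__<name>".
roxy_env_inspectors$S4_classes <- function(env) {
  nms <- grep("^\\.__C__", ls(env, all.names = TRUE), value = TRUE)
  mget(nms, envir = env)
}

# Call every registered inspector on the evaluation environment.
inspect_env <- function(env, inspectors = roxy_env_inspectors) {
  lapply(mget(ls(inspectors), envir = inspectors), function(f) f(env))
}

env <- new.env()
eval(quote(x <- 42), env)
found <- inspect_env(env)
found$basic
```

A package with unusual side effects would add its own function to the
registry, for example roxy_env_inspectors$Wierdo_thing, without touching
the base code.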

  > Why shouldn't these functions just be single-method classes
  > that inherit from RoxyDetector or similar?

I guess you mean the object_from_call.foo dispatch mechanism here.

The benefit of the RoxyDetector is that in 99.99% of cases the
built-in detectors will do the job. A developer who would like
custom documentation for an object of class X doesn't need to bother at
all with object_from_call and pseudo-dispatch.

Whatever the textual representation by which object X is generated, be it
X <- new(..), X <- X.constructor(), createObjectX(), or X <- eval(...),
the result will always be the same. It is much easier for the end user,
who is not forced to use a specific declaration for roxygen to work. All
that matters is the end object(s) which the code generates.
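
A small sketch of that point. The constructor make_point is invented for
the example: three different spellings of the same definition leave the
same object in the evaluation environment.

```r
# Hypothetical constructor used only for this illustration.
make_point <- function() structure(list(x = 0, y = 0), class = "point")

sources <- c(
  "X <- make_point()",
  "assign('X', structure(list(x = 0, y = 0), class = 'point'))",
  "X <- eval(quote(make_point()))"
)

# Evaluate each variant in its own environment and collect what it created.
results <- lapply(sources, function(src) {
  env <- new.env()
  eval(parse(text = src), env)
  mget(ls(env), envir = env)
})

identical(results[[1]], results[[2]]) && identical(results[[1]], results[[3]])
```

A call parser would need a separate rule for each of the three lines,
while an inspector sees the same X every time.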

In the current implementation, whenever someone would like to use a
different call syntax, the base code has to be modified.

In the 0.01% of weird cases with side effects, a developer will have to
write a detector function, which is a much simpler concept than
pseudo-dispatch on the call name.

I still find the object_from_call mechanism a bit tricky: it's not only
the call name but also the object name that plays a role. So the call
parser has to be very smart, not only about how to detect the call name,
but also about how to detect the name of the object. Am I missing anything?
Where does the 'name' in the last call come from:

╭──────── #20 ─ roxygen3/R/object-from-call.r 
│ object_from_call <- function(call, env) {
│   if (is.null(call)) return()
│   
│   # Find function, then use match.call to construct complete call
│   f <- eval(call[[1]], env)
│   if (!is.primitive(f)) {
│     call <- match.call(eval(call[[1]], env), call)
│   }
│   
│   fun_name <- deparse(call[[1]])
│   f <- find_fun(str_c("object_from_call.", fun_name))
│ 
│   if (is.null(f)) return(NULL)
│   f(call, name, env)
│ }
╰──────── #34 
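
To show what I mean by the call name and the object name living in
different places, here is a sketch on two quoted expressions. It says
nothing about how roxygen3 actually obtains 'name'; createObjectX is the
made-up constructor from above.

```r
# An ordinary assignment: the top-level call is `<-`, the object name is
# its first argument, and the constructor call is its second argument.
expr <- quote(X <- createObjectX(n = 3))
assign_fun  <- as.character(expr[[1]])   # "<-"
object_name <- as.character(expr[[2]])   # "X"
call_name   <- deparse(expr[[3]][[1]])   # "createObjectX"

# A side-effect declaration: there is no assignment at all, and the
# object name is a string argument of the call itself.
side <- quote(setClass("X", representation(x = "numeric")))
side_call_name   <- deparse(side[[1]])       # "setClass"
side_object_name <- as.character(side[[2]])  # "X"
```

So the parser needs to know, per call name, where to look for the name
of the object.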


  >> >> a <- setMethod("plot", "numeric", function(x, ...) {})
  >> >> str(a)
  >> >  chr "plot"
  >> 
  >> > Which means you'd have to compare states of all the S4 class/method
  >> > tables before and after each call.  That seems slow and error prone to
  >> > me.
  >> 
  >> Each call to the meta functions (setClass, setMethod, etc.) creates a
  >> table *in* the evaluation env, if it is a top environment (the environment
  >> should be explicitly designated as a top environment for this to work).

  > I have had many problems trying to do this right for devtools, so I'm
  > not so sure it's that simple.

I have done this with ess-developer
(http://ess.r-project.org/Manual/ess.html#ESS-developer) which allows
seamless evaluation of the code directly into the package environment
and namespace instead of the .GlobalEnv. So far so good, no problems at
all.
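
A quick way to check the claim about class metadata, as a sketch under
two assumptions: that a `.packageName` binding is enough to mark an
environment as a top environment, and that passing `where` explicitly
behaves like evaluating the call inside that environment. The class and
package names are invented.

```r
library(methods)

env <- new.env()
# Assumption: this binding designates `env` as a top-level environment.
assign(".packageName", "sketchpkg", envir = env)

before <- ls(env, all.names = TRUE)
setClass("RoxySketchPoint", representation(x = "numeric"), where = env)
new_names <- setdiff(ls(env, all.names = TRUE), before)

# The class definition is stored in `env` under its metadata name.
classMetaName("RoxySketchPoint") %in% new_names
```

An S4 inspector would then only need to look for such metadata names in
the evaluation environment.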

Unless I have missed something very basic, I am pretty optimistic about
this whole story :)

      Vitalie

