[datatable-help] Weird behavior with S4 subclasses of data.table after loading RCurl

Jeffrey Arnold jeffrey.arnold at gmail.com
Mon Oct 1 19:39:00 CEST 2012


(Sorry, Steve; I realized that I originally replied to you instead of the
list)

Okay, the good news is that I know what's going on; the bad news is that
there exists no fix that doesn't break the existing syntax of data.table.

RCurl was a red herring.  I'm almost certain that loading any package that
adds new S4 "[" methods will trigger this behavior, e.g. "Matrix".
The reason that my code never worked when you ran it is probably
that you had already loaded such a package before running my code.

What is happening is due to the difference in the way that S3 and S4 check
signatures before method dispatch, combined with R's use of lazy evaluation.
Since S3 only checks the first argument, it never evaluates j, which allows
data.table to do its cool things with expressions in that argument.
Because the S4 "[" method checks the classes of x, i, and j before
dispatching, it must evaluate j.  If j is an expression, it either throws an
error or, worse, silently evaluates the expression in the calling frame
instead of within the data.table.
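
To illustrate the difference, here is a minimal sketch that is independent of
data.table.  The class names LazyS3 and EagerS4 and the symbol undefined_name
are made up for the illustration, and the S4 method is registered on
j="character" only so that the method table has a signature covering j.  As I
understand the dispatch rules, the S3 call never touches j, while the S4 call
has to evaluate it to select a method and therefore fails.

```r
library(methods)

## S3: dispatch looks only at class(x), so `j` remains an unevaluated
## promise that the method can capture with substitute().
`[.LazyS3` <- function(x, i, j, ...) deparse(substitute(j))
s3_obj <- structure(list(), class = "LazyS3")          # hypothetical class
s3_result <- s3_obj[, undefined_name + 1]              # j is never evaluated

## S4: once a "[" method with a signature on j exists, dispatch needs the
## class of `j`, which forces evaluation in the caller's frame.
setClass("EagerS4", representation(v = "numeric"))     # hypothetical class
setMethod("[", signature(x = "EagerS4", i = "missing", j = "character"),
          function(x, i, j, ..., drop = TRUE) j)
s4_obj <- new("EagerS4", v = 1)
s4_result <- tryCatch(s4_obj[, undefined_name + 1],    # undefined_name does not exist
                      error = function(e) e)

print(s3_result)
print(inherits(s4_result, "error"))
```

The S3 call returns the deparsed expression, whereas the S4 call signals an
error before any method body runs.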

As far as I can tell, there is no way to fix this without altering the
syntax of data.table.

I do have the following workaround.  Adding the following S4 methods allows
the use of quoted expressions in j for S4 classes inheriting from
data.table that act like unquoted expressions for the S3 data.table.

setMethod("[", c(x="data.table", i="ANY", j="ANY"),
          function(x, i, j, ...) callNextMethod(...))
setMethod("[", c(x="data.table", j="language"),
          function(x, i, j, ...) data.table(x)[j=eval(j), ...])

E.g.

> library("Matrix")
> setClass("DataTable2", contains="data.table")
> setMethod("[", c(x="data.table", i="ANY", j="ANY"),
+           function(x, i, j, ...) callNextMethod(...))
[1] "["
> setMethod("[", c(x="data.table", j="language"),
+           function(x, i, j, ...) data.table(x)[j=eval(j), ...])
[1] "["
> ## This still doesn't work.
> DT2[,v]
Error: object 'v' not found
> ## This does work
> DT2[,quote(v)]
[1] 1 2 3 4 5 6 7 8 9
> DT2[,quote(sum(v))]
[1] 45


I hadn't realized that I was doing something unintended when I started, or
maybe I wouldn't have :-)  R now supports S4 classes inheriting from S3
classes pretty well, so it seemed like a good idea at the time.  The S4
class I am actually writing is for storing / manipulating MCMC samples. One
way to do that is to have a data.frame-like object with specific columns,
e.g. "chain", "iteration", "parameter", ..., and then add functions that
take advantage of this known structure.  I want to inherit from
data.frame directly so that the class can make use of all the generic
functions defined for data.frame.  It is more intuitive to use object[...]
rather than object@someSlotName[...].  That all works great, except that
these samples can get pretty big, so, of course, I want the performance of
data.table :-) if I can have it.
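
For concreteness, a rough sketch of the kind of class described above, using
plain data.frame rather than data.table.  The class name McmcSamples, the
"value" column, and the validity check are assumptions made for the
illustration, not the actual class being written.

```r
library(methods)

## Hypothetical S4 class that inherits from data.frame and requires the
## columns named in the text to be present.
setClass("McmcSamples", contains = "data.frame",
         validity = function(object) {
           required <- c("chain", "iteration", "parameter")
           absent <- setdiff(required, names(object))
           if (length(absent) > 0)
             paste("missing columns:", paste(absent, collapse = ", "))
           else
             TRUE
         })

## "value" is an assumed extra column holding the sampled values.
samples <- new("McmcSamples",
               data.frame(chain = c(1L, 1L, 2L, 2L),
                          iteration = c(1L, 2L, 1L, 2L),
                          parameter = "beta",
                          value = c(0.10, 0.20, 0.15, 0.25)))

## An object without the required columns is rejected at construction time.
bad <- tryCatch(new("McmcSamples", data.frame(x = 1)),
                error = function(e) e)

print(is(samples, "data.frame"))
print(inherits(bad, "error"))
```

Because the object is itself a data.frame, there is no slot to reach through,
which is the object[...] versus object@someSlotName[...] point above.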