[datatable-help] data.table inherits from data.frame [was: Auto-convert characters to factors when settings keys?]

Short, Tom TShort at epri.com
Mon Jun 28 18:37:13 CEST 2010


One thing I caught is that in setting the row.names attribute, you use:

attr(x,"row.names") = integer()

I don't think that's right for data.frames. I think you need something like:

attr(x,"row.names") = .set_row_names( nrow(x) )

or 

attr(x,"row.names") = 1:nrow(x) 

.set_row_names is defined as:

function (n) 
if (n > 0) c(NA_integer_, -n) else integer(0L)
<environment: namespace:base>
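
To make the difference concrete, here is a minimal sketch in base R (no data.table needed). The object x is a hand-built stand-in, not data.table's actual constructor code, and taking the row count from the first column's length is my assumption for the example:

```r
# A bare list given the data.frame class by hand (hypothetical example object;
# this is not data.table's actual constructor code).
x <- list(a = 1:3, b = c("u", "v", "w"))
class(x) <- "data.frame"

# With empty row names, base R derives the row count from the attribute,
# so it reports zero rows even though each column has three elements.
attr(x, "row.names") <- integer()
rows_empty <- nrow(x)      # 0

# Assumption: take the row count from the first column's length.
n <- length(x[[1L]])

# Compact "automatic" row names: stored internally as c(NA_integer_, -n).
attr(x, "row.names") <- .set_row_names(n)
rows_compact <- nrow(x)    # 3

# A negative value from .row_names_info() marks automatic row names.
info <- .row_names_info(x) # -3
```

Note that 1:n only behaves for n > 0, since 1:0 gives c(1L, 0L); .set_row_names (or seq_len) handles the zero-row case.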

- Tom


> -----Original Message-----
> From: datatable-help-bounces at lists.r-forge.r-project.org 
> [mailto:datatable-help-bounces at lists.r-forge.r-project.org] 
> On Behalf Of mdowle at mdowle.plus.com
> Sent: Monday, June 28, 2010 06:03
> To: mailinglist.honeypot at gmail.com
> Cc: datatable-help at lists.r-forge.r-project.org
> Subject: [datatable-help] data.table inherits from data.frame 
> [was: Auto-convert characters to factors when settings keys?]
> 
> 
> The subject of this thread got misleading too. Changing that now.
> Was: Auto-convert characters to factors when settings keys?
> 
> 
> Thanks Steve,
> 
> There are two types of users of data.table. Ones who know 
> they are using it, and ones that don't. This change is for 
> the latter. If a base function that only accepts data.frame, 
> such as subset for example, does dt[,c('b','c')] inside it, 
> then that actually does work and returns the columns, not c('b','c').
> 
> For example, at the end of subset.data.frame there is :
> 
>     x[r, vars, drop = drop]
> 
> and that will use [.data.frame even though x is data.table.  
> subset is a base function and as such isn't a data.table aware user.
> 
> However, when a user (such as you or I) uses data.table, then 
> we can use its features such as i and j expressions of column 
> names, joins using x[y][z] syntax, etc.  No changes there.  
> If we want dt[,c("b","c")] to return the columns, then we 
> will still have to convert to data.frame, as before.  That's 
> because we work in our user workspace which is data.table aware.
> 
> It depends on where [.data.table was called from.
> 
> It's just so that packages (e.g. ggplot), and base functions, 
> can work with data.table more easily now, without removing 
> any of the advantages of the [.data.table syntax, and without 
> requiring conversion.
> 
> Makes more sense now hopefully ?
> 
> Matthew
> 
> 
> > Hi,
> >
> > On Sun, Jun 27, 2010 at 6:47 PM, Matthew Dowle 
> > <mdowle at mdowle.plus.com>
> > wrote:
> >> I went back to try again with S3 inheritance, discussed further up in
> >> this thread.  Just committed as it seems, so far, to work.
> >>
> >> * data.table now inherits from data.frame i.e. class =
> >> c("data.table","data.frame")
> >> * is.data.frame() now returns TRUE for data.table
> >> * data.table should now be compatible with functions and packages 
> >> that _only_ accept data.frame.
> >
> > Perhaps I lost the point of this conversation somewhere along the way,
> > but this change makes it *technically* compatible since a data.table
> > passes an is.data.frame test, but it doesn't work in ways that are
> > perfectly acceptable for some function accepting a data.frame to work,
> > ie:
> >
> > R> library(data.table)
> > R> dt <- data.table(a=1:5, b=letters[1:5], c=sample(1:100, 5)) 
> > R> dt[,c('b', 'c')]
> > [1] "b" "c"
> >
> > instead of
> >
> > R> df <- as.data.frame(dt)
> > R> df[,c('b', 'c')]
> >   b  c
> > 1 a 18
> > 2 b 11
> > 3 c  2
> > 4 d 50
> > 5 e 96
> >
> > Unless you were talking about making more changes to make a data.table
> > act more like a data.frame, I'm not sure allowing the user to ignore
> > the differences between data.table/frames is really a win/win
> > situation.
> >
> > Sorry if I missed something.
> > -steve
> >
> > --
> > Steve Lianoglou
> > Graduate Student: Computational Systems Biology  | Memorial 
> > Sloan-Kettering Cancer Center  | Weill Medical College of Cornell 
> > University Contact Info: http://cbio.mskcc.org/~lianos/contact
> >
> 
> 
> 
> 
> 
> _______________________________________________
> datatable-help mailing list
> datatable-help at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
> 

