[datatable-help] data.table inherits from data.frame [was:Auto-convert characters to factors when settings keys?]

Short, Tom TShort at epri.com
Mon Jun 28 18:08:56 CEST 2010


Harish, you'll still (usually) get back a data.frame in that situation. There may be cases where a function only manipulates the original object and so keeps it as a data.table. Auto-converting return values would be a challenge.
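
As a rough sketch of what that means in practice: the snippet below assumes a made-up helper, double_x(), standing in for any function that builds and returns a fresh data.frame (it is not part of data.table or any other package). The return value has to be converted back explicitly with as.data.table() if a data.table is wanted.

```r
library(data.table)

dt <- data.table(id = 1:3, x = c(2.5, 3.5, 4.5))

# Hypothetical helper (an assumption for illustration only): it builds
# a brand-new data.frame from the columns of whatever it is given.
double_x <- function(d) data.frame(id = d$id, x2 = d$x * 2)

res <- double_x(dt)
is.data.table(res)         # FALSE: a plain data.frame came back

res <- as.data.table(res)  # explicit conversion of the return value
is.data.table(res)         # TRUE
```

Whether a particular function such as reshape() or melt() behaves like this helper depends on how it constructs its result, so it is worth checking with is.data.table() rather than assuming.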

- Tom


 

> -----Original Message-----
> From: datatable-help-bounces at lists.r-forge.r-project.org 
> [mailto:datatable-help-bounces at lists.r-forge.r-project.org] 
> On Behalf Of Harish
> Sent: Monday, June 28, 2010 11:57
> To: datatable-help at lists.r-forge.r-project.org
> Subject: Re: [datatable-help] data.table inherits from 
> data.frame [was:Auto-convert characters to factors when 
> settings keys?]
> 
> Removing the requirement for explicit conversions simplifies 
> things a bit.  Thanks.
> 
> You commented on arguments passed into a function.  What 
> happens to return values?  For example, the reshape functions 
> (reshape(), cast(), melt(), etc.) return a data.frame.  Are 
> those automatically treated as data.table in a workspace 
> that is data.table aware?
> 
> 
> Harish
> 
> 
> --- On Mon, 6/28/10, mdowle at mdowle.plus.com 
> <mdowle at mdowle.plus.com> wrote:
> 
> > From: mdowle at mdowle.plus.com <mdowle at mdowle.plus.com>
> > Subject: [datatable-help] data.table inherits from data.frame [was: 
> > Auto-convert characters to factors when settings keys?]
> > To: mailinglist.honeypot at gmail.com
> > Cc: datatable-help at lists.r-forge.r-project.org
> > Date: Monday, June 28, 2010, 3:03 AM
> > 
> > The subject of this thread got misleading too. Changing that now.
> > Was: Auto-convert characters to factors when settings keys?
> > 
> > 
> > Thanks Steve,
> > 
> > There are two types of users of data.table. Ones who know they are 
> > using it, and ones that don't. This change is for the latter. If a 
> > base function that only accepts data.frame, such as subset for 
> > example, does dt[,c('b','c')] inside it, then that actually does work 
> > and returns the columns, not c('b','c').
> > 
> > For example, at the end of subset.data.frame there is :
> > 
> >     x[r, vars, drop = drop]
> > 
> > and that will use [.data.frame even though x is data.table.  subset is 
> > a base function and as such isn't a data.table aware user.
> > 
> > However, when a user (such as you or I) uses data.table, then we can 
> > use its features such as i and j expressions of column names, joins 
> > using x[y][z] syntax, etc.  No changes there.  If we want 
> > dt[,c("b","c")] to return the columns, then we will still have to 
> > convert to data.frame, as before.  That's because we work in our user 
> > workspace which is data.table aware.
> > 
> > It depends on where [.data.table was called from.
> > 
> > It's just so that packages (e.g. ggplot), and base functions, can work 
> > with data.table more easily now, without removing any of the 
> > advantages of the [.data.table syntax, and without requiring 
> > conversion.
> > 
> > Makes more sense now hopefully ?
> > 
> > Matthew
> > 
> > 
> > > Hi,
> > >
> > > On Sun, Jun 27, 2010 at 6:47 PM, Matthew Dowle 
> > > <mdowle at mdowle.plus.com>
> > > wrote:
> > > >> I went back to try again with S3 inheritance, discussed further up in
> > > >> this thread.  Just committed as it seems, so far, to work.
> > > >>
> > > >> * data.table now inherits from data.frame i.e. class =
> > > >> c("data.table","data.frame")
> > > >> * is.data.frame() now returns TRUE for data.table
> > > >> * data.table should now be compatible with functions and packages that
> > > >> _only_ accept data.frame.
> > >
> > > > Perhaps I lost the point of this conversation somewhere along the way,
> > > > but this change makes it *technically* compatible since a data.table
> > > > passes an is.data.frame test, but it doesn't work in ways that are
> > > > perfectly acceptable for some function accepting a data.frame to work,
> > > > ie:
> > >
> > > R> library(data.table)
> > > > R> dt <- data.table(a=1:5, b=letters[1:5], c=sample(1:100, 5))
> > > R> dt[,c('b', 'c')]
> > > [1] "b" "c"
> > >
> > > instead of
> > >
> > > R> df <- as.data.frame(dt)
> > > R> df[,c('b', 'c')]
> > >   b  c
> > > 1 a 18
> > > 2 b 11
> > > 3 c  2
> > > 4 d 50
> > > 5 e 96
> > >
> > > > Unless you were talking about making more changes to make a data.table
> > > > act more like a data.frame, I'm not sure allowing the user to ignore
> > > > the differences between data.table/frames is really a win/win
> > > > situation.
> > >
> > > Sorry if I missed something.
> > > -steve
> > >
> > > --
> > > Steve Lianoglou
> > > Graduate Student: Computational Systems Biology
> > >  | Memorial Sloan-Kettering Cancer Center
> > > >  | Weill Medical College of Cornell University
> > > > Contact Info: http://cbio.mskcc.org/~lianos/contact
> > >
> > 
> > 
> > 
> > 
> > 
> > _______________________________________________
> > datatable-help mailing list
> > datatable-help at lists.r-forge.r-project.org
> > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
> > 

