[datatable-help] data.table inherits from data.frame [was: Auto-convert characters to factors when settings keys?]

mdowle at mdowle.plus.com mdowle at mdowle.plus.com
Mon Jun 28 19:50:27 CEST 2010


Glad you found that, meant to come back to it ;)

.set_row_names sounds like the way to go then, will do. It looks like it
skips allocating 1:nrow(x), which is the memory cost I was trying to
avoid. Neat.
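
For anyone following along, here is a minimal sketch of what the compact
form looks like. It uses only base R; the list `x` and its contents are
made-up toy data, not anything from data.table's source.

```r
# Build a data.frame by hand from a plain list (hypothetical toy data).
x <- list(a = 1:5, b = letters[1:5])
n <- length(x[[1L]])

# Compact form: .set_row_names(n) yields c(NA, -n) instead of the vector 1:n.
attr(x, "row.names") <- .set_row_names(n)
class(x) <- "data.frame"

# type 0L asks for the row.names attribute as it is stored internally.
compact <- .row_names_info(x, 0L)

# The row count is derived from the compact form.
rows <- nrow(x)
```

Note that attr(x, "row.names") expands the compact form on read, so
.row_names_info() is the way to inspect what is actually stored.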


> One thing I caught is that in setting the row.names attribute, you use:
>
> attr(x,"row.names") = integer()
>
> I don't think that's right for data.frames. I think you need something
> like:
>
> attr(x,"row.names") = .set_row_names( nrow(x) )
>
> or
>
> attr(x,"row.names") = 1:nrow(x)
>
> .set_row_names is defined as:
>
> function (n)
> if (n > 0) c(NA_integer_, -n) else integer(0L)
> <environment: namespace:base>
>
> - Tom
>
>
>> -----Original Message-----
>> From: datatable-help-bounces at lists.r-forge.r-project.org
>> [mailto:datatable-help-bounces at lists.r-forge.r-project.org]
>> On Behalf Of mdowle at mdowle.plus.com
>> Sent: Monday, June 28, 2010 06:03
>> To: mailinglist.honeypot at gmail.com
>> Cc: datatable-help at lists.r-forge.r-project.org
>> Subject: [datatable-help] data.table inherits from data.frame
>> [was: Auto-convert characters to factors when settings keys?]
>>
>>
>> The subject of this thread got misleading too. Changing that now.
>> Was: Auto-convert characters to factors when settings keys?
>>
>>
>> Thanks Steve,
>>
>> There are two types of users of data.table. Ones who know
>> they are using it, and ones that don't. This change is for
>> the latter. If a base function that only accepts data.frame,
>> such as subset for example, does dt[,c('b','c')] inside it,
>> then that actually does work and returns the columns, not c('b','c').
>>
>> For example, at the end of subset.data.frame there is :
>>
>>     x[r, vars, drop = drop]
>>
>> and that will use [.data.frame even though x is data.table.
>> subset is a base function and as such isn't a data.table aware user.
>>
>> However, when a user (such as you or I) uses data.table, then
>> we can use its features such as i and j expressions of column
>> names, joins using x[y][z] syntax, etc.  No changes there.
>> If we want dt[,c("b","c")] to return the columns, then we
>> will still have to convert to data.frame, as before.  That's
>> because we work in our user workspace which is data.table aware.
>>
>> It depends on where [.data.table was called from.
>>
>> It's just so that packages (e.g. ggplot), and base functions,
>> can work with data.table more easily now, without removing
>> any of the advantages of the [.data.table syntax, and without
>> requiring conversion.
>>
>> Makes more sense now hopefully ?
>>
>> Matthew
>>
>>
>> > Hi,
>> >
>> > On Sun, Jun 27, 2010 at 6:47 PM, Matthew Dowle
>> > <mdowle at mdowle.plus.com>
>> > wrote:
>> >> I went back to try again with S3 inheritance, discussed further up
>> >> in this thread.  Just committed as it seems, so far, to work.
>> >>
>> >> * data.table now inherits from data.frame i.e. class =
>> >> c("data.table","data.frame")
>> >> * is.data.frame() now returns TRUE for data.table
>> >> * data.table should now be compatible with functions and packages
>> >> that _only_ accept data.frame.
>> >
>> > Perhaps I lost the point of this conversation somewhere along the
>> > way, but this change makes it *technically* compatible since a
>> > data.table passes an is.data.frame test, but it doesn't work in ways
>> > that are perfectly acceptable for some function accepting a
>> > data.frame to work, ie:
>> >
>> > R> library(data.table)
>> > R> dt <- data.table(a=1:5, b=letters[1:5], c=sample(1:100, 5))
>> > R> dt[,c('b', 'c')]
>> > [1] "b" "c"
>> >
>> > instead of
>> >
>> > R> df <- as.data.frame(dt)
>> > R> df[,c('b', 'c')]
>> >   b  c
>> > 1 a 18
>> > 2 b 11
>> > 3 c  2
>> > 4 d 50
>> > 5 e 96
>> >
>> > Unless you were talking about making more changes to make a
>> > data.table act more like a data.frame, I'm not sure allowing the
>> > user to ignore the differences between data.table/frames is really
>> > a win/win situation.
>> >
>> > Sorry if I missed something.
>> > -steve
>> >
>> > --
>> > Steve Lianoglou
>> > Graduate Student: Computational Systems Biology  | Memorial
>> > Sloan-Kettering Cancer Center  | Weill Medical College of Cornell
>> > University Contact Info: http://cbio.mskcc.org/~lianos/contact
>> >
>>
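
The S3 inheritance discussed in the quoted thread can be illustrated with
a rough base-R sketch. The class name "toy" is hypothetical and stands in
for data.table; unlike data.table, it defines no `[` method of its own,
so this only shows the fallback to data.frame behaviour, not the
caller-dependent dispatch described above.

```r
# Hypothetical toy class; only the class vector matters here.
x <- data.frame(a = 1:5, b = letters[1:5], c = 5:1,
                stringsAsFactors = FALSE)
class(x) <- c("toy", "data.frame")

# is.data.frame() only checks that the object inherits from "data.frame".
is_df <- is.data.frame(x)

# No `[.toy` method exists, so S3 dispatch falls through to `[.data.frame`
# and the named columns are returned.
sub <- x[, c("b", "c")]
```

Because data.table does define its own `[` method, the result there
depends on which method ends up handling the call, as the thread explains.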