[datatable-help] Suggest a cool feature: Use data.table like a sorted / indexed data.list?

Tom Short tshort.rlists at gmail.com
Fri Sep 17 23:00:27 CEST 2010


Branson, that's been on the wishlist for a while:

https://r-forge.r-project.org/tracker/index.php?func=detail&aid=202&group_id=240&atid=978

It hasn't been an urgent enough need for anyone to dig into it. You
can always use one data table to index a list. It may take more memory
and a bit more bookkeeping for the user, but it's not that hard.
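
A rough sketch of what I mean, assuming a current data.table where
setkey() and J() behave as shown; `results`, `idx`, and `pos` are
made-up names for illustration, not anything built in. The table holds
only the key and each record's position in an ordinary list, so the
variable-size values never have to fit into a column:

```r
library(data.table)

# Hypothetical variable-size outputs, one list element per record.
results <- list(1:6, 1, "?", c(2.5, 3.5))

# Keyed data.table holding only the key A and the record's position
# in `results` (this is the extra bookkeeping).
idx <- data.table(A = c(3L, 1L, 2L, 1L), pos = seq_along(results))
setkey(idx, A)

# Keyed lookup of the positions for A == 1, then plain list
# subsetting to pull out the matching elements.
hits <- results[idx[J(1L), pos]]
```

The price is that `results` and `idx` have to be kept in step by hand
whenever records are added, removed, or reordered.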

- Tom

On Fri, Sep 17, 2010 at 1:47 PM, Branson Owen <branson.owen at gmail.com> wrote:
> I believe you are already aware of what I describe below. I just want
> to add some suggestions here.
>
> My understanding is that a data.frame is a list of column VECTORs, and
> so is a data.table. What I just learned is that a data.frame column
> can now be a list of anything?
>
>> DF = data.frame(A = 1:3, B = rnorm(3))
>> DF$C = data.frame(a=1:3,b=rnorm(3))
>> DF$D = list(i=1:6, j = 1,k="?")
>> print(DF)
>
>   A         B C.a        C.b                D
> 1 1 -0.949565   1 -0.5815717 1, 2, 3, 4, 5, 6
> 2 2 -1.903233   2 -0.5087712                1
> 3 3  1.559566   3  1.4596933                ?
>
>> class(DF$C)
> [1] "data.frame"
>
>> class(DF$D)
> [1] "list"
>
> This is very cool to me! I can think of many benefits from this feature.
>
> A very common example: D is a function of B with variable output
> size, and I want to do fast grouping or sorting based on key A.
> Before I knew this, I would have had to save them as separate objects,
> which adds complexity to my code. So far this is only coding and
> management sugar; there is no performance benefit yet.
>
> But I think data.table can make a difference here, just as it does
> for data.frame! There is no sorted / indexed list object yet, right?
> If my variable-size outputs have millions of elements, any
> aggregating operation on such a loosely structured object will be
> painful. Technically, data.table could turn it into a sorted list that
> benefits from data.table's high performance and syntax.
>
> I ran some tests using a data.table as a data.list, but most of the
> syntax that works for data.frame doesn't work for data.table.
>
> I would expect this to be an easy feature to add, since data.frame
> already supports it fairly smoothly. Just a suggestion. *^^*
>
> Best regards,