[datatable-help] A new package multitable (data.list) remind me of a long existing feature request #202 and discussion thread

Matthew Dowle mdowle at mdowle.plus.com
Sun Oct 2 02:58:17 CEST 2011


Hi ... 

On Thu, 2011-09-29 at 13:33 -0500, Branson Owen wrote:
> Dear all, 
> I just saw a new package 'multitable' (data.list) published today.
> Here is the selected description: 
> 
> "Many data sets do not easily fit into a single data frame, such as
> inherently multiple-table data (e.g. fourth-corner problem and other
> trait-based data sets). Storing such data in a single data frame
> results in either large numbers of meaningless missing values or
> storage of redundant information. The multitable package introduces
> new data storage objects called data.lists, which are extensions of
> data.frames. As data.lists can be coerced to data.frames, they can be
> used with all R functions that accept an object that is coercible to a
> data.frame (e.g. lm; plot; lme; and many more). The multitable package
> also provides several mechanisms for simplifying the manipulation of
> data.list objects."
> 
> [data.list] http://cran.r-project.org/web/packages/multitable/index.html

I read through the vignette; it looks very interesting.

> [Feature Request
> #202] https://r-forge.r-project.org/tracker/index.php?func=detail&aid=202&group_id=240&atid=978
> [Discussion]
> http://r.789695.n4.nabble.com/Suggest-a-cool-feature-Use-data-table-like-a-sorted-indexed-data-list-td2544213.html

I thought that thread was largely resolved. v1.5.2 came after it (I
think) and made grouping work on list() columns. #202 was then narrowed
to just 'matrix as columns'. You can already store a matrix of any size
(or a data.table, or anything else) as an element of a list() column in
a data.table. Not sure if that helps at all. multitable seems to be
about combining a database of tables into one object, i.e. something
different?

> DT = data.table(a=1:3)
> DT$obj = list(matrix(1:3,3),data.table(letters[1:4],1:4),data.table(letters[1:2],1:6))
> DT[1,obj]
[[1]]
     [,1]
[1,]    1
[2,]    2
[3,]    3

> DT[2,obj]
[[1]]
     V1 V2
[1,]  a  1
[2,]  b  2
[3,]  c  3
[4,]  d  4

> setkey(DT,a)
> DT[J(3),obj,mult="first"]   # keyed data.list?
[[1]]
     V1 V2
[1,]  a  1
[2,]  b  2
[3,]  a  3
[4,]  b  4
[5,]  a  5
[6,]  b  6
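
Each lookup above returns a length-one list, so one more [[1]] is needed to get at the stored object itself. A minimal sketch of that step, assuming a recent data.table; the two-row table and the name m are made up for illustration and are not from the session above:

```r
library(data.table)

# Hypothetical table: one list() column holding objects of different shapes.
DT = data.table(a = 1:2)
DT[, obj := list(list(matrix(1:3, 3), letters[1:4]))]  # inner list = the column's cells
setkey(DT, a)

# The keyed lookup returns a length-one list; [[1]] unwraps the stored object.
m = DT[J(1L), obj][[1]]
dim(m)
```

The nested list() matters: with a single list(), := would read the two objects as two separate columns rather than two cells of one column.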


> [Idea] I am very excited when I see this package, and can't help
> imagining the possibility of data.table and data.list mutually
> leveraging each other, i.e. data.list can leverage data.table's
> beautiful syntax and extremely high performance; while data.table can
> leverage data.list's flexible object model (the long existing feature
> request #202).

> To data.table core team, will multitable (data.list) package make it
> easy to implement feature #202?

I'm not sure it will. The matrix that #202 is about would have the same
number of rows as the data.table. That might be too restrictive for
multitable. All the examples I know of so far can be implemented in
long or wide format with NAs (or no rows), or by joining tables together.
multitable seems to encapsulate the tables and their relationships in
one object, i.e. a level above data.table?
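
To make the long-format-plus-joins alternative concrete, here is a rough sketch of a fourth-corner style data set held as three ordinary tables. All table names, column names and values (sites, species, abund, temp, bodysize) are invented for illustration, and this is not how multitable stores things internally:

```r
library(data.table)

# Hypothetical toy data: one table per "level" of the problem.
sites   = data.table(site = c("s1", "s2"), temp = c(10.5, 14.2))
species = data.table(sp = c("a", "b", "c"), bodysize = c(1.2, 3.4, 0.8))

# Long format: one row per observed (site, species) pair.
# Pairs that were never observed simply have no row, so no NA padding is needed.
abund = data.table(site = c("s1", "s1", "s2"),
                   sp   = c("a", "b", "c"),
                   n    = c(5L, 2L, 7L))

# Attach site-level and species-level variables by joining on the shared columns.
long = merge(abund, sites, by = "site")
long = merge(long, species, by = "sp")
long
```

The joined result is a single flat table that lm(), plot() and friends can take directly, at the cost of repeating temp and bodysize on every matching row.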

> 
> I don't know whether the multitable's author is aware of data.table.
> I will write to make suggestions to the multitable author:
> at least, I think it should be an easy change for him to make
> data.list an extension of data.table instead of data.frame, and will
> make multitable package directly inherit data.table's power. Any
> thoughts before I approach Steve? or it would be better to let
> data.table team raise this idea to him? :)

It would be great if you could approach him. Thanks. It would be good
to leverage the synergies.

> 
> 
> Best regards,
> 
> 



