[datatable-help] Discrepancy between as.data.frame & as.data.table when handling nested lists

Matthew Dowle mdowle at mdowle.plus.com
Thu Aug 8 06:48:47 CEST 2013


On 08/08/13 05:11, Eduard Antonyan wrote:
> This seems like a pretty natural interpretation of list->data.table to 
> me, although I think it would be nice to get a warning here:
>
>     X = list(a = list(1,2), b = list(1,2,3))
>
>     as.data.table(X)
Good point, that should be a recycle-remainder warning.
>
> especially since this simply refuses to do anything:
>
>     data.table(a = c(1,2), b = c(1,2,3))
Ah but that's for consistency with data.frame ;)

 > data.frame(1:2,1:3)
Error in data.frame(1:2, 1:3) :
   arguments imply differing number of rows: 2, 3
 >
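
For reference, a small base-R sketch of what data.frame does today (no data.table needed; the names `ok` and `res` are just for illustration). As far as I know, data.frame recycles a shorter column only when the longer length is an exact multiple of it, and errors otherwise:

```r
# Base R only: data.frame() recycles a shorter column when the longest
# length is an exact multiple of it.
ok <- data.frame(a = 1:2, b = 1:4)   # a is recycled to 1, 2, 1, 2
nrow(ok)

# Lengths 2 and 3: 3 is not a multiple of 2, so data.frame() stops.
# tryCatch() captures the error message instead of halting the script.
res <- tryCatch(data.frame(a = 1:2, b = 1:3),
                error = function(e) conditionMessage(e))
res
```

So the error above is specifically the remainder case, not recycling in general.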

Happy to recycle and change both of the above to a recycle-remainder warning. If 
there's no recycle remainder then there's no warning, right? Repeating 
singletons is a common use case that I wouldn't want a warning about, for 
example. Could you file a request, please?
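
To pin down the rule being proposed, here is a rough sketch in plain R. Both `has_recycle_remainder()` and `recycle_columns()` are hypothetical helpers made up for this illustration; they are not data.table functions and not how data.table would necessarily implement it. The sketch assumes a non-empty list of columns:

```r
# Hypothetical helper (not part of data.table): TRUE when at least one
# column length does not divide the longest length evenly.
has_recycle_remainder <- function(lens) {
  lens <- lens[lens > 0L]               # ignore zero-length columns
  if (length(lens) == 0L) return(FALSE)
  any(max(lens) %% lens != 0L)
}

# Hypothetical wrapper: recycle every column to the longest length,
# warning only when some column is recycled with a remainder.
recycle_columns <- function(cols) {
  lens <- lengths(cols)
  if (has_recycle_remainder(lens)) {
    warning("recycled with remainder; column lengths: ",
            paste(lens, collapse = ", "))
  }
  lapply(cols, rep_len, length.out = max(lens))
}

# Singleton repeated 3 times: exact multiple, so no warning.
quiet <- recycle_columns(list(a = 1, b = 1:3))

# Lengths 2 and 3: 'a' becomes 1, 2, 1 and a warning is raised.
noisy <- suppressWarnings(recycle_columns(list(a = 1:2, b = 1:3)))
```

Under this rule, lengths 1 and 3 or 2 and 4 stay silent, while 2 and 3 warn instead of erroring.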
>
>
> On Wed, Aug 7, 2013 at 10:30 PM, Ricardo Saporta 
> <saporta at scarletmail.rutgers.edu> wrote:
>
>     Hey Frank,
>
>     Thanks for pointing out that SO link, I had missed it.
>
>     All,
>
>     I'm curious as to which use cases this functionality would be
>     used for.
>
>     thanks,
>     Rick
>
>
>
>     On Wed, Aug 7, 2013 at 8:14 PM, Frank Erickson
>     <FErickson at psu.edu> wrote:
>
>         Hi Rick,
>
>         I guess it's intentional: Matthew saw this SO question (since
>         he edited one of the answers):
>         http://stackoverflow.com/questions/9547518/creating-a-data-frame-where-a-column-is-a-list
>
>         Some musings: Of course, to reproduce as.data.frame-like
>         behavior, you can un-nest the list, so both functions treat it
>         the same way.
>
>             Z <- unlist(Y, recursive = FALSE)
>
>             identical(as.data.table(Z), as.data.table(as.data.frame(Z)))
>             # TRUE
>             # or, equivalently (?)
>             identical(do.call(data.table, Z), data.table(do.call(data.frame, Z)))
>             # TRUE
>
>
>         On the other hand, going back the other direction (getting
>         data.table-like behavior when data.frame's is the default) is
>         more awkward, as seen in that SO question (where they mention
>         protecting each sublist with the I() function). Besides, I'm
>         with @flodel, who asked the SO question, in expecting
>         data.table's behavior: one top-level item in the list mapping
>         to one column in the result...
>
>         --Frank
>
>         On Wed, Aug 7, 2013 at 4:56 PM, Ricardo Saporta
>         <saporta at scarletmail.rutgers.edu> wrote:
>
>             Hi all,
>
>             Note the following discrepancy in structure between
>             as.data.frame & as.data.table when called on a nested list.
>             as.data.frame converts the sublist into individual columns
>             whereas as.data.table stacks them into a single column and
>             creates additional rows.
>
>             Is this intentional?
>             -Rick
>
>
>             as.data.frame(X)
>             #        start       type  end data.editDist data.second
>             # 1 start_node is_similar end_node             1  HelloWorld
>
>             as.data.table(X)
>             #         start       type    end       data
>             # 1: start_node is_similar end_node          1
>             # 2: start_node is_similar end_node HelloWorld
>
>
>
>
>             ### Copy+Paste'able Below ###
>
>             # Example 1:
>             X <- structure(list(start = "start_node", type = "is_similar",
>                                 end = "end_node",
>                                 data = structure(list(editDist = 1,
>                                                       second = "HelloWorld"),
>                                                  .Names = c("editDist", "second"))),
>                            .Names = c("start", "type", "end", "data"))
>
>             as.data.frame(X)
>             as.data.table(X)
>
>             as.data.table(as.data.frame(X))
>
>
>             # Example 2, with more elements:
>             Y <- structure(list(start = c("start_node", "start_node"),
>                                 type = c("is_similar", "is_similar"),
>                                 end = c("end_node", "end_node"),
>                                 data = structure(list(editDist = c(1, 1),
>                                                       second = c("HelloWorld", "HelloWorld")),
>                                                  .Names = c("editDist", "second"))),
>                            .Names = c("start", "type", "end", "data"))
>
>             as.data.frame(Y)
>             as.data.table(Y)
>
>
>             _______________________________________________
>             datatable-help mailing list
>             datatable-help at lists.r-forge.r-project.org
>             <mailto:datatable-help at lists.r-forge.r-project.org>
>             https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
>
>
>
>
