[datatable-help] rbindlist

Eduard Antonyan eduard.antonyan at gmail.com
Tue Dec 3 18:24:08 CET 2013


One small difference from what you wrote, though: rbindlist (and therefore
rbind) now coerces column classes to the most general one.
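
For reference, a minimal sketch of the behaviour discussed in this thread,
assuming a data.table release in which rbindlist itself accepts the
use.names and fill arguments. The variable names (by_name, filled, mixed)
are only illustrative, and the inputs mirror the examples quoted below.

```r
library(data.table)

# Columns are matched by name instead of by position.
by_name <- rbindlist(list(data.table(a = 1, b = 2),
                          data.table(b = 4, a = 3)),
                     use.names = TRUE)

# Columns missing from an item are filled with NA.
filled <- rbindlist(list(data.table(a = 1, b = 2),
                         data.table(c = 3),
                         data.table(d = "foo")),
                    use.names = TRUE, fill = TRUE)

# Integer and double columns with the same name are combined by
# coercing to the more general type (assumption: recent data.table).
mixed <- rbindlist(list(data.table(a = 1L, b = 2),
                        data.table(a = 10)),
                   use.names = TRUE, fill = TRUE)

print(by_name)
print(filled)
print(class(mixed$a))
```

Passing use.names = TRUE explicitly keeps the call independent of the
default, which has differed between releases.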


On Tue, Dec 3, 2013 at 11:22 AM, Eduard Antonyan
<eduard.antonyan at gmail.com> wrote:

> I took a cursory look at your code. The new rbind does everything you
> want (check the use.names and fill arguments), and you may want to take
> a look at its code.
>
>
> On Tue, Dec 3, 2013 at 11:05 AM, Alexandre Sieira <
> alexandre.sieira at gmail.com> wrote:
>
>> To whom it may concern, I wrote a (rather bulky) wrapper around
>> rbindlist that:
>>
>> - checks that the classes of columns with the same name match;
>> - fills in any missing columns with NAs of the appropriate type;
>> - reorders columns for consistency;
>> - calls rbindlist on the results of this preprocessing.
>>
>>  The code is here: https://gist.github.com/asieira/7772953
>>
>> The results would be as follows:
>>
>> > smartrbindlist(list(data.table(a=1, b=2), data.table(b=4, a=3)))
>>    a b
>> 1: 1 2
>> 2: 3 4
>>
>> > smartrbindlist(list(data.table(a=1, b=2), list(c=3),
>> data.table(d="foo")))
>>     a  b  c   d
>> 1:  1  2 NA  NA
>> 2: NA NA  3  NA
>> 3: NA NA NA foo
>>
>> > smartrbindlist(list(data.table(a=1L, b=2), list(a=10)))
>> Error in smartrbindlist(list(data.table(a = 1L, b = 2), list(a = 10)))
>>   smartrbindlist: column a has different classes in entry 2 [numeric] and
>> its predecessors [integer]
>>
>> Hope this helps anyone else out there.
>>
>> --
>> Alexandre Sieira
>> CISA, CISSP, ISO 27001 Lead Auditor
>>
>> "The truth is rarely pure and never simple."
>> Oscar Wilde, The Importance of Being Earnest, 1895, Act I
>>
>> On 3 December 2013 at 14:46:08, G See (gsee000 at gmail.com)
>> wrote:
>>
>> I agree. Here is a related thread:
>> http://thread.gmane.org/gmane.comp.lang.r.datatable/2231
>>
>> Garrett
>>
>>
>> On Tue, Dec 3, 2013 at 8:26 AM, Alexandre Sieira
>> <alexandre.sieira at gmail.com> wrote:
>> > I have come across some behavior in rbindlist that looks unexpected
>> > to me:
>> >
>> >> rbindlist(list(data.table(a=1, b=2), data.table(b=4, a=3)))
>> >    a b
>> > 1: 1 2
>> > 2: 4 3
>> >
>> > So it appears to assume (without checking) that all objects have not
>> > only the same column names but also the same column order. So a value
>> > assigned to column ‘a’ in the second object was used for column ‘b’ in
>> > the end result (and vice versa).
>> >
>> > I know the documentation says rbindlist uses the column types from the
>> > first entry of the list, but I didn’t see any mention of column order
>> > or names anywhere.
>> >
>> > I suggest that column names be matched, even if they are not in the
>> > same order. Perhaps a ‘use.names’ parameter could be used to request
>> > this behavior, to avoid breaking backwards compatibility.
>> >
>> > Or, at the very least, I suggest that the documentation of rbindlist
>> > be updated to explicitly mention that the columns will be considered
>> > by position only, and that callers need to ensure the column orders of
>> > all objects match exactly, and that rbindlist issue a warning when the
>> > column names don’t match.
>> >
>> > --
>> > Alexandre Sieira
>> > CISA, CISSP, ISO 27001 Lead Auditor
>> >
>> > "The truth is rarely pure and never simple."
>> > Oscar Wilde, The Importance of Being Earnest, 1895, Act I
>> >
>> > _______________________________________________
>> > datatable-help mailing list
>> > datatable-help at lists.r-forge.r-project.org
>> > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
>>
>>
>>
>
>