[datatable-help] rbindlist

Alexandre Sieira alexandre.sieira at gmail.com
Tue Dec 3 18:48:11 CET 2013


Thanks for pointing this out, Eduard. 

You are absolutely right. I just looked at the SVN repository HEAD and saw that a new parameter called ‘fill’ was added to .rbind.data.table, which also covers another feature I had added to my function (filling in missing columns with NAs). Very nice! Looking forward to the new release. :)

-- 
Alexandre Sieira
CISA, CISSP, ISO 27001 Lead Auditor

"The truth is rarely pure and never simple."
Oscar Wilde, The Importance of Being Earnest, 1895, Act I

On 3 December 2013 at 15:22:48, Eduard Antonyan (eduard.antonyan at gmail.com) wrote:

I took a cursory look at your code - the new rbind does everything you want (check the use.names and fill arguments), and you may want to take a look at its code.
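
For illustration, a minimal sketch of what those two arguments do, assuming an installed data.table version whose rbind method for data.tables accepts use.names and fill. The tables x, y and z are made-up examples, not taken from the gist:

```r
library(data.table)

# Made-up example tables: y has the same columns as x in a different order,
# z has a column that neither x nor y has.
x <- data.table(a = 1, b = 2)
y <- data.table(b = 4, a = 3)
z <- data.table(d = "foo")

# use.names = TRUE matches columns by name instead of by position.
by_name <- rbind(x, y, use.names = TRUE)
print(by_name)

# fill = TRUE pads columns that are missing from an input with NA.
filled <- rbind(x, y, z, use.names = TRUE, fill = TRUE)
print(filled)
```

In recent releases rbindlist accepts the same two arguments, so the list-based form rbindlist(list(x, y, z), use.names = TRUE, fill = TRUE) should behave the same way.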


On Tue, Dec 3, 2013 at 11:05 AM, Alexandre Sieira <alexandre.sieira at gmail.com> wrote:
To whom it may concern, I wrote a (rather bulky) wrapper around rbindlist that:

- checks that the classes of columns with the same name match;
- fills in any missing columns with NAs of the appropriate type;
- reorders columns for consistency;
- calls rbindlist on the results of this preprocessing.

The code is here: https://gist.github.com/asieira/7772953

The results would be as follows:

> smartrbindlist(list(data.table(a=1, b=2), data.table(b=4, a=3)))
   a b
1: 1 2
2: 3 4

> smartrbindlist(list(data.table(a=1, b=2), list(c=3), data.table(d="foo")))
    a  b  c   d
1:  1  2 NA  NA
2: NA NA  3  NA
3: NA NA NA foo

> smartrbindlist(list(data.table(a=1L, b=2), list(a=10)))
Error in smartrbindlist(list(data.table(a = 1L, b = 2), list(a = 10)))
  smartrbindlist: column a has different classes in entry 2 [numeric] and its predecessors [integer]
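
The four preprocessing steps listed above can be sketched roughly as follows. This is not the code from the gist: smart_rbind_sketch is a hypothetical name, and the sketch assumes that comparing class() of each column is a good enough type check and that every entry can be converted with as.data.table.

```r
library(data.table)

# Hypothetical helper illustrating the steps; NOT the gist's smartrbindlist.
smart_rbind_sketch <- function(l) {
  l <- lapply(l, as.data.table)
  all_cols <- unique(unlist(lapply(l, names)))

  # Step 1: check that columns with the same name have the same class.
  # A zero-length copy of the first column seen under each name is kept
  # as a type template.
  template <- list()
  for (i in seq_along(l)) {
    for (nm in names(l[[i]])) {
      col <- l[[i]][[nm]]
      if (is.null(template[[nm]])) {
        template[[nm]] <- col[0L]
      } else if (!identical(class(template[[nm]]), class(col))) {
        stop(sprintf(
          "column %s has different classes in entry %d [%s] and its predecessors [%s]",
          nm, i, class(col)[1L], class(template[[nm]])[1L]
        ))
      }
    }
  }

  filled <- lapply(l, function(dt) {
    dt <- copy(dt)  # avoid modifying the caller's tables by reference
    # Step 2: add missing columns as NA of the template's type.
    # Indexing a zero-length vector with NA yields an NA of that type.
    for (nm in setdiff(all_cols, names(dt))) {
      set(dt, j = nm, value = template[[nm]][rep(NA_integer_, nrow(dt))])
    }
    # Step 3: put the columns in one consistent order.
    setcolorder(dt, all_cols)
    dt
  })

  # Step 4: every entry now has identical names and order, so binding
  # by position is safe.
  rbindlist(filled)
}
```

Because each entry has the same column names in the same order by the time rbindlist is called, the sketch does not depend on whether rbindlist itself matches by name.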

Hope this helps anyone else out there.

On 3 December 2013 at 14:46:08, G See (gsee000 at gmail.com) wrote:

I agree. Here is a related thread:
http://thread.gmane.org/gmane.comp.lang.r.datatable/2231

Garrett


On Tue, Dec 3, 2013 at 8:26 AM, Alexandre Sieira
<alexandre.sieira at gmail.com> wrote:
> I have come across some behavior in rbindlist that looks unexpected to me:
>
>> rbindlist(list(data.table(a=1, b=2), data.table(b=4, a=3)))
>    a b
> 1: 1 2
> 2: 4 3
>
> So it appears to assume (without checking) that all objects have not only
> the same column names but also the same column order. So a value assigned
> to column ‘a’ in the second object was used for column ‘b’ in the end result
> (and vice-versa).
>
> I know the documentation says rbindlist uses the column types from the first
> entry of the list, but I didn’t see any mention of column order or names
> anywhere.
>
> I suggest that column names are matched, even if they are not in the same
> order. Perhaps a ‘use.names’ parameter could be used to ask for this
> behavior to avoid breaking backwards compatibility.
>
> Or, at the very least, I suggest the documentation of rbindlist be updated to
> explicitly mention that the columns will be considered by position only, and
> that callers need to ensure the column orders of all objects match exactly,
> and that rbindlist issue a warning when the column names don’t match.
>
> _______________________________________________
> datatable-help mailing list
> datatable-help at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
