[datatable-help] Change in return type of 'by' when DT (or DT[i]) is empty

Steve Lianoglou mailinglist.honeypot at gmail.com
Wed Apr 18 15:05:47 CEST 2012


Hi,

On Wed, Apr 18, 2012 at 6:26 AM, Chris Neff <caneff at gmail.com> wrote:
> I agree that an empty data.table is the result I would have
> expected.
[snip]

Completely agree w/ Chris' synopsis (below) and +1 on making the change.

-steve

>  If I do
>
> DT[, sum(v), by=x]
>
> I always expect a two-column data.table with columns "x" and "V1"
> (which is sum(v)). In SQL this would read
>
> SELECT
>  x,
>  SUM(v)
> FROM DT
> GROUP BY x
>
>
> So with DT[y<1,  sum(v), by=x] we have
>
> SELECT
>  x,
>  SUM(v)
> FROM DT
> WHERE y < 1
> GROUP BY x
>
> Now in any SQL dialect I work with, if WHERE y < 1 didn't return any
> rows, then the result of this would be a 0 row selection.  I believe
> that is what it should be with data.table as well: a 0-row data.table
> with columns "x" and "V1".
>
> This is a valid bug IMO, and should definitely be fixed.
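
For anyone who wants to check the SQL side of this, here is a minimal sketch
using SQLite through Python's sqlite3 module. SQLite, the toy rows, and the
helper name grouped_sum are my own assumptions for illustration, not anything
from the bug report. The "AS V1" alias is only there to mirror data.table's
default column name. It runs the same GROUP BY query with a WHERE that matches
every row and with one that matches none.

```python
import sqlite3

# Hypothetical toy table standing in for DT; columns x, v, y follow the thread.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DT (x TEXT, v INTEGER, y INTEGER)")
conn.executemany(
    "INSERT INTO DT VALUES (?, ?, ?)",
    [("a", 1, 5), ("a", 2, 6), ("b", 3, 7)],
)

def grouped_sum(where_clause):
    """Run the thread's GROUP BY query with the given WHERE condition.

    Returns the result column names and the list of result rows.
    """
    cur = conn.execute(
        f"SELECT x, SUM(v) AS V1 FROM DT WHERE {where_clause} GROUP BY x"
    )
    columns = [d[0] for d in cur.description]
    return columns, cur.fetchall()

cols_all, rows_all = grouped_sum("y >= 1")   # every row matches
cols_none, rows_none = grouped_sum("y < 1")  # no row matches

print(cols_all, rows_all)
print(cols_none, rows_none)
```

With no matching rows the query still reports the two columns but returns an
empty list of rows, which is the shape Chris describes for the data.table
result.
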
>
> On Wed, Apr 18, 2012 at 3:32 AM, Matthew Dowle <mdowle at mdowle.plus.com> wrote:
>> Dear datatablers,
>>
>> This seems like a valid bug report. Any reasons why we shouldn't make
>> this change?
>>
>> https://r-forge.r-project.org/tracker/?func=detail&atid=975&aid=1945&group_id=240
>>
>> Matthew
>>
>>
>>
>> _______________________________________________
>> datatable-help mailing list
>> datatable-help at lists.r-forge.r-project.org
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help



-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

