[datatable-help] Change in return type of 'by' when DT (or DT[i]) is empty

Chris Neff caneff at gmail.com
Wed Apr 18 12:26:49 CEST 2012


I agree: an empty data.table is the result I would have expected.
If I do

DT[, sum(v), by=x]

I always expect a two-column data.table with columns "x" and "V1"
(where V1 is sum(v)). In SQL this would read

SELECT
  x,
  SUM(v)
FROM DT
GROUP BY x


So with DT[y<1, sum(v), by=x] we have

SELECT
  x,
  SUM(v)
FROM DT
WHERE y < 1
GROUP BY x

Now in any SQL dialect I work with, if WHERE y < 1 matched no rows,
then the result of this query would be a 0-row result set.  I believe
data.table should behave the same way: return a 0-row data.table
with columns "x" and "V1".
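
To make the SQL side of this concrete, here is a minimal sketch using
SQLite through Python's built-in sqlite3 module. SQLite is just one
convenient dialect to test with; the sample rows and the "V1" alias are
made up for illustration and are not taken from the bug report.

```python
import sqlite3

# Hypothetical stand-in for DT with columns x, y, v (sample values are invented).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DT (x TEXT, y INTEGER, v INTEGER)")
conn.executemany(
    "INSERT INTO DT VALUES (?, ?, ?)",
    [("a", 1, 10), ("a", 2, 20), ("b", 3, 30)],
)

# No row satisfies y < 1, so the WHERE clause filters everything out
# before grouping happens.
cur = conn.execute(
    "SELECT x, SUM(v) AS V1 FROM DT WHERE y < 1 GROUP BY x"
)
columns = [d[0] for d in cur.description]  # column names of the result set
rows = cur.fetchall()                      # the (empty) list of groups

print(columns)
print(rows)
```

The result still describes the two columns but contains no rows, which
is the shape I would expect back from data.table too. Note that the
GROUP BY matters here: the same query without it returns a single row
holding NULL rather than zero rows.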

This is a valid bug IMO, and should definitely be fixed.

On Wed, Apr 18, 2012 at 3:32 AM, Matthew Dowle <mdowle at mdowle.plus.com> wrote:
> Dear datatablers,
>
> This seems like a valid bug report. Any reasons why we shouldn't make
> this change?
>
> https://r-forge.r-project.org/tracker/?func=detail&atid=975&aid=1945&group_id=240
>
> Matthew
>
>
>
> _______________________________________________
> datatable-help mailing list
> datatable-help at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help

