<div dir="ltr">Hello,<div><br></div><div>I've encountered what looks like a bug while sorting by a POSIXct column and a logical column, which may or may not be related to the following bug:</div><div><br></div><div><a href="https://r-forge.r-project.org/tracker/index.php?func=detail&aid=2552&group_id=240&atid=975">https://r-forge.r-project.org/tracker/index.php?func=detail&aid=2552&group_id=240&atid=975</a><br>
</div><div><br></div><div>Here are all the details: <a href="http://stackoverflow.com/questions/15077232/data-table-not-summarizing-properly-by-two-columns">http://stackoverflow.com/questions/15077232/data-table-not-summarizing-properly-by-two-columns</a></div>
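<div><br></div><div style>Here is the test case:</div><div style><br></div><div style><div> library(data.table)</div><div> library(lubridate) # provides ymd(), used in the last example</div><div> </div><div> # First some data</div><div> data <- data.table(structure(list(</div><div> month = structure(c(1356998400, 1356998400, 1356998400, </div>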
<div> 1359676800, 1354320000, 1359676800, 1359676800, 1356998400, 1356998400, </div><div> 1354320000, 1354320000, 1354320000, 1359676800, 1359676800, 1359676800, </div><div>
1356998400, 1359676800, 1359676800, 1356998400, 1359676800, 1359676800, </div><div> 1359676800, 1359676800, 1354320000, 1354320000), class = c("POSIXct", </div>
<div> "POSIXt"), tzone = "UTC"), </div><div> portal = c(TRUE, TRUE, FALSE, TRUE, </div><div> TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, </div>
<div> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE</div><div> ), </div><div> satisfaction = c(10L, 10L, 10L, 9L, 10L, 10L, 9L, 10L, 10L, </div><div> 9L, 2L, 8L, 10L, 9L, 10L, 10L, 9L, 10L, 10L, 10L, 9L, 10L, 9L, </div>
<div> 10L, 10L)), </div><div> .Names = c("month", "portal", "satisfaction"), </div><div> row.names = c(NA, -25L), class = "data.frame"))</div>
<div> </div><div> # Summarizing by month, portal with tapply works:</div><div> </div><div> > tapply(data$satisfaction, list(data$month, data$portal), mean)</div><div> FALSE TRUE</div><div> 2012-12-01 8.5 8.000000</div>
<div> 2013-01-01 10.0 10.000000</div><div> 2013-02-01 9.0 9.545455</div><div> </div><div> # Summarizing with 'by' argument of data.table does not:</div><div> </div><div> > data[, mean(satisfaction), by = 'month,portal']</div>
<div> > data[, mean(satisfaction), by = list(month, portal)]</div><div> month portal V1</div><div> 1: 2013-01-01 FALSE 10.000000</div><div> 2: 2013-02-01 TRUE 9.000000</div><div> 3: 2013-01-01 TRUE 10.000000</div>
<div> 4: 2012-12-01 FALSE 8.500000</div><div> 5: 2012-12-01 TRUE 7.333333</div><div> 6: 2013-02-01 TRUE 9.666667</div><div> 7: 2013-02-01 FALSE 9.000000</div><div> 8: 2012-12-01 TRUE 10.000000</div>
<div> </div><div> # Summarizing only this year's data works:</div><div> </div><div> > data[month >= ymd(20130101), mean(satisfaction), by = 'month,portal']</div><div> month portal V1</div><div> 1: 2013-01-01 TRUE 10.000000</div>
<div> 2: 2013-01-01 FALSE 10.000000</div><div> 3: 2013-02-01 TRUE 9.545455</div><div> 4: 2013-02-01 FALSE 9.000000</div></div><div><br clear="all"><div>Yours Sincerely,<br>Victor Kryukov<br></div>
</div></div>