<div>
Hi,
</div><div><br></div><div>Suppose you have a data.table, say:</div><div><br></div><div>require(data.table)</div><div>DT <- data.table(x = 1:5, y = 6:10)</div>
<div><div><br></div><div>Suppose you want to group by "x %/% 2" (= 0, 1, 1, 2, 2) and then calculate the sum of each column within each group. One would do:</div><div><br></div><div>DT[, grp := x %/% 2]</div><div>DT[, list(x.sum=sum(x), y.sum=sum(y)), by = grp] # avoid .SD when there are only a few columns</div><div><br></div><div>Now assume that you have many columns, which makes using `.SD` sensible:</div><div><br></div><div>DT[, lapply(.SD, sum), by = grp]</div><div><div> grp x y</div><div>1: 0 1 6</div><div>2: 1 5 15</div><div>3: 2 9 19</div></div><div><br></div><div>The issue is that if you create the grouping column ad hoc, the column from which it is derived is no longer available to `.SD`. Let me illustrate:</div><div><br></div><div><div>DT <- data.table(x = 1:5, y = 6:10)</div><div></div></div><div>DT[, lapply(.SD, sum), by = (grp=x %/% 2)] # ad hoc creation of grouping column</div><div><div><div> grp y</div><div>1: 0 6</div><div>2: 1 15</div><div>3: 2 19</div></div></div><div><br></div><div>I think it would be nice to have the source column available to `.SD`, which would save us from creating a temporary column, grouping by it, and then deleting it. After all, "technically" "grp" *is* a new column, so "x" should still be available. Any take on this?</div><div><br></div><div>Arun</div><div><br></div></div>