[datatable-help] data.table by versus apply

Damian Betebenner dbetebenner at nciea.org
Sat Feb 26 23:55:44 CET 2011


All,

From a speed perspective, I'm curious what the data.table analog of apply is. I have a problem where, for each row, I want to take either the min or the max of several columns, depending on the value of another column.

For example:

test.dt <- data.table(ID=1:10, SCORE_1=rnorm(10), SCORE_2=rnorm(10), SCORE_3=rnorm(10), MAX_OR_MIN=c(rep("Max", 5), rep("Min", 5)))

For each row, I'd like to get the max of SCORE_1, SCORE_2, and SCORE_3 if the MAX_OR_MIN value is "Max", and the min of SCORE_1, SCORE_2, and SCORE_3 if the MAX_OR_MIN value is "Min".

It isn't too difficult to come up with a "bulky" and slow solution, but I'm wondering if I'm missing a way in which data.table would make such an effort elegant and quick.
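
To make the "bulky" approach concrete, here is a minimal sketch of a row-by-row version. It is an illustration only, not necessarily the fastest or the intended way to do this: score.cols, scores, row.result, the RESULT column, and the set.seed call are names and choices introduced just for this example.

```r
library(data.table)

set.seed(1)  # assumption: seed added only so the sketch is reproducible
test.dt <- data.table(ID = 1:10,
                      SCORE_1 = rnorm(10),
                      SCORE_2 = rnorm(10),
                      SCORE_3 = rnorm(10),
                      MAX_OR_MIN = c(rep("Max", 5), rep("Min", 5)))

# Hypothetical helper name: the columns to take the max or min over.
score.cols <- c("SCORE_1", "SCORE_2", "SCORE_3")

# Copy the score columns into a plain numeric matrix for row indexing.
scores <- as.matrix(test.dt[, score.cols, with = FALSE])

# Row-by-row baseline: one R-level function call per row, which is
# what makes this slow on large tables.
row.result <- sapply(seq_len(nrow(scores)), function(i) {
  if (test.dt$MAX_OR_MIN[i] == "Max") max(scores[i, ]) else min(scores[i, ])
})

# Store the per-row result as a new column (RESULT is an illustrative name).
test.dt[, RESULT := row.result]
```

This gives the result I'm after, but it loops over rows in R, so the cost grows with the number of rows.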

Any help greatly appreciated.

Damian Betebenner
Center for Assessment
PO Box 351
Dover, NH   03821-0351

Phone (office): (603) 516-7900
Phone (cell): (857) 234-2474
Fax: (603) 516-7910

dbetebenner at nciea.org
www.nciea.org




