[datatable-help] How can I apply a function of 2 columns to multiple other columns with a by clause

statquant3 statquant at outlook.com
Wed Oct 14 13:22:46 CEST 2015


Hello, I am looking to
 * update several columns by
 * applying a function f to each of those columns, where f takes that
column AND one other column as arguments.

Without a "by" I can make it work.
How can I do the same with a "by"?

Below is an example (as I may not be very clear):

#data setup
library(data.table)
set.seed(1)
N <- 101
DT <- data.table(x1=rnorm(N),x2=rnorm(N),x3=rnorm(N),x4=rnorm(N),
                 y=letters[sample(5,size=N,replace=TRUE)])

#function to be applied
f <- function(x,y){return( frank(x/y,na.last='keep') )}
#column names
xCols <- paste0('x',1:3)
rCols <- paste0('r',1:3)

#when there is no by it is easy
DT[,(rCols):=lapply(FUN=f,X=.SD,y=DT$x4),.SDcols=xCols]

#with a by it fails (of course: DT$x4 has all N rows, more than any single group)
DT[,(rCols):=lapply(FUN=f,X=.SD,y=DT$x4),.SDcols=xCols,by=.(y)]
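
One direction that might work, offered as a sketch rather than a confirmed answer: since the failure comes from DT$x4 spanning the whole table, x4 could be passed through .SD as well, so that each group sees only its own rows of it. This assumes it is acceptable to list x4 in .SDcols and to loop over the column names in xCols instead of over .SD itself.

```r
library(data.table)

# same setup as above
set.seed(1)
N <- 101
DT <- data.table(x1 = rnorm(N), x2 = rnorm(N), x3 = rnorm(N), x4 = rnorm(N),
                 y = letters[sample(5, size = N, replace = TRUE)])

f <- function(x, y) frank(x / y, na.last = "keep")
xCols <- paste0("x", 1:3)
rCols <- paste0("r", 1:3)

# x4 is added to .SDcols, so .SD[["x4"]] holds only the current group's rows
# and always has the same length as .SD[[col]]
DT[, (rCols) := lapply(xCols, function(col) f(.SD[[col]], .SD[["x4"]])),
   by = y, .SDcols = c(xCols, "x4")]
```

The ranks in r1..r3 are then computed within each level of y rather than over the whole table.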





