[datatable-help] Best way to apply function to set of columns to create new columns where the function requires other columns from data.table

Marc Halperin Halperin at outins.com
Tue Feb 10 19:58:17 CET 2015


I want to add new columns to a data.table that are the weighted averages of existing columns, using a weight variable.  This is a general problem I run into when using .SDcols while also needing another variable from the data.table inside the function passed to lapply.  Unless I include that variable (in this case the weight variable) in .SDcols, I don't have access to it in the lapply function argument.  Is it a bad idea to subset .SD the way I've done it?

library(data.table)
library(Hmisc)

dt <- data.table(a=runif(10), b= runif(10), weight=runif(10))

varnames <- c("a","b")

dt[ , ( paste( "mean", varnames, sep = "_" ) ) := lapply( .SD[ , .SD, .SDcols = -"weight" ], wtd.mean, weight ), .SDcols = c("weight",varnames) ]
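For comparison, here is a minimal sketch of a variant that avoids the nested .SD subset. It rests on the assumption that .SDcols only restricts which columns appear in .SD, while any other column of the data.table (here weight) can still be referenced by name inside j. To keep the sketch free of extra dependencies it swaps Hmisc's wtd.mean for base R's weighted.mean; the seed is arbitrary and only there for reproducibility.

```r
library(data.table)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
dt <- data.table(a = runif(10), b = runif(10), weight = runif(10))

varnames <- c("a", "b")

# Assumption: .SDcols limits .SD to `varnames`, but `weight` is still
# visible in j because it is a column of dt.
# weighted.mean() from base R stands in for Hmisc::wtd.mean() here.
dt[ , ( paste( "mean", varnames, sep = "_" ) ) := lapply( .SD, weighted.mean, w = weight ),
    .SDcols = varnames ]
```

If that assumption holds for the installed data.table version, dt gains the columns mean_a and mean_b, each holding the same weighted mean repeated on every row, and weight never has to be listed in .SDcols.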

Thanks 

-Marc

