[datatable-help] What's your opinion on the feature request: add option mult="random"

Christoph Jäckel christoph.jaeckel at wi.tum.de
Fri Jan 6 00:20:12 CET 2012


Hi all,

I run a Monte Carlo simulation on a data.table, currently with a loop: on
every run, I select the subset of rows that meet certain criteria and then
take one random row from that subset. At the moment I do this in two steps.
Let's say I have funds from two regions ("eu" and "us") and I want to
choose a random fund from "eu" (it could be "us" in the next run and a
different region in the third):

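library(data.table)
# 26 funds: the first 13 belong to "us", the last 13 to "eu"
rawData <- data.table(fundID  = letters,
                      compGeo = rep(c("us", "eu"), each=13))
setkey(rawData, "compGeo")
# Step 1: keyed join that returns all rows with compGeo == "eu"
intDT <- rawData[J("eu"), mult="all"]
# Step 2: draw one row at random from that subset
intDT[sample.int(nrow(intDT), size=1)]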

So my idea is to just give the user the option mult="random", which does
this in one step. What do you think about that feature request?
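
To make the intended behaviour concrete, here is a rough sketch of what I
have in mind, written as a stand-alone helper rather than as a change to
'[.data.table'. The name randomMatch and its argument names are made up for
illustration and are not part of data.table; the sketch assumes the table is
keyed on the column being matched and that an empty result is the right
answer when nothing matches.

```r
library(data.table)

# Hypothetical helper (not part of data.table): return one randomly chosen
# row among those whose first key column equals `keyValue`.
randomMatch <- function(dt, keyValue) {
  # Keyed join; nomatch = 0L drops the NA row that a failed match would give
  matches <- dt[J(keyValue), nomatch = 0L]
  if (nrow(matches) == 0L) return(matches)   # nothing to sample from
  matches[sample.int(nrow(matches), size = 1L)]
}

rawData <- data.table(fundID  = letters,
                      compGeo = rep(c("us", "eu"), each = 13))
setkey(rawData, compGeo)

set.seed(1)                        # only to make the draw reproducible
pick <- randomMatch(rawData, "eu") # one random "eu" fund
```

A built-in option would presumably do the sampling inside the join instead
of materialising the whole subset first, which is the part I would hope to
save on each simulation run.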

With respect to the implementation: I changed a few lines in the function
'[.data.table' and got this to run on my local data.table version, so I
guess I could implement it (as far as I can see, one just needs to change
some R code). However, I haven't done extensive testing, and I'm not an
expert on shared projects and Subversion (I have never done that, actually),
so I guess I would need some help to get started and some confirmation that
I won't break anything ;-)

Christoph