[datatable-help] Passing user defined functions as part of the j argument

djmuseR djmuser at gmail.com
Mon Sep 27 12:04:32 CEST 2010



Damian wrote:
----------------------------------------------------------------------
Within a function I have several functions. Within one of those functions I
call one of the other defined functions using data.table.

The issue is that I can’t find the function that I’m calling as part of the
j expression in data.table.
Here’s a simple example:

require(data.table)
test.data <- data.table(ID=rep(1:2,  each=10), SCORE=rnorm(20))

test.fun <- function(my.data) {

my.mean <- function(x) {
       mean(x, na.rm=TRUE)
}

my.data[,my.mean(SCORE), by=ID]

}

> test.fun(test.data)

Error in eval(expr, envir, enclos) : could not find function "my.mean"
#--------------------------------------------------------------------------------------------

Try this:

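test.fun <- function(my.data) {
   require(data.table)
   # Convert the input to a data.table inside the function
   myDT <- data.table(my.data)
   # Helper defined in the same environment as the data.table call
   my.mean <- function(x) mean(x, na.rm=TRUE)
   myDT[, list(mymean = my.mean(SCORE)), by=ID]
}

> test.fun(test.data)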
     ID     mymean
[1,]  1 -0.1344245
[2,]  2 -0.7677596

----------------------------------------------------------------------------------------------

As for your second question, try this:


outer.function <- function(ss.data) {
   require(data.table)
   test.data <- data.table(ss.data)
   my.mean <- function(x) mean(x, na.rm=TRUE)

   # The summary function is passed in as an argument
   test.fun <- function(my.data, my.function) {
      my.data[, list(mymean = my.function(SCORE)), by=ID]
   }
   test.fun(test.data, my.mean)
}

> outer.function(test.data)
     ID     mymean
[1,]  1 -0.1344245
[2,]  2 -0.7677596


When you call a function, R creates a new environment for that call. Any
function you define inside the 'outer' function is local to the environment
created by the outer function's call, as are any other objects created in the
function body. Similarly, the inner function gets its own environment when it
is called, and so on.
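
To make that concrete, here is a small base-R sketch. The names outer.fun and
inner.fun are made up for illustration and are not part of the original
question.

```r
# Hypothetical names, used only for illustration.
outer.fun <- function() {
  local.value <- 10                          # lives in this call's environment
  inner.fun <- function() local.value * 2    # defined inside, so it can see local.value
  list(env       = environment(),            # environment created by this call
       inner.env = environment(inner.fun),   # enclosing environment of inner.fun
       result    = inner.fun())
}

first  <- outer.fun()
second <- outer.fun()

identical(first$env, first$inner.env)   # TRUE: inner.fun is enclosed by the call's environment
identical(first$env, second$env)        # FALSE: each call gets a fresh environment
first$result                            # 20
```

Note that inner.fun exists only inside each call of outer.fun; it is never
created at the top level.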

Let's learn from Dr. Gentleman's book (2008, p. 53):

"When a function is evaluated, a new environment or frame is created
specifically for that evaluation. The global environment is always recorded
as frame 0 and other frames count up from there. The frame provides bindings
between the formal arguments for the function and the user-supplied values.
It is also where any local variables have their bindings stored.

The parent frame of a function evaluation is the environment from which the
function was invoked or called. It is not necessarily numbered one less than
the frame number of the current evaluation, although that is usually the
case. Symbols in the parent frame have no effect on evaluation of the
current function.

...However, programmers have access to the entire call stack and virtually
all objects and frames that are defined on it."

In other words, when evaluating a function, R searches for objects first in
the local environment of the function, then in its enclosing environment (the
environment in which the function was defined), then in that environment's
enclosure, and so on until it reaches the global environment. If the object
is still not found, R continues along the search path (the attached packages)
until it either finds the object or throws an error.
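
A rough base-R illustration of that lookup order follows. All names
(scale.factor, outer.fun, inner.fun, missing.fun) are hypothetical and chosen
only for this sketch.

```r
# Hypothetical names, for illustration only.
scale.factor <- 100                    # defined at top level

outer.fun <- function(values) {
  offset <- 1                          # local to outer.fun
  inner.fun <- function(v) {
    shifted <- v + offset              # 'offset' is found in the enclosing environment
    mean(shifted) * scale.factor       # 'scale.factor' at top level, 'mean' on the search path
  }
  inner.fun(values)
}

outer.fun(c(1, 2, 3))                  # (2 + 1) * 100 = 300

# An object that exists nowhere along the chain produces an error.
missing.fun <- function() not.defined.anywhere + 1
result <- tryCatch(missing.fun(), error = function(e) conditionMessage(e))
result                                 # an "object ... not found" message
```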

I modified your first function to bring the data into the local environment
and convert it to a data table before calling the inner function. Since the
enclosing environment of the inner function is that of the outer function,
which both defines and calls it, everything needed to evaluate the inner
function is present in the outer function's local environment.
Problem solved.
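
The difference between where a function is defined and where it is called
can be sketched in base R as follows. The names report.x and caller.fun are
hypothetical; the point is only that a free variable is resolved through the
defining environment, while the caller's frame has to be asked for
explicitly.

```r
# Hypothetical names, for illustration only.
x <- "defined outside"

report.x <- function() {
  c(lexical     = x,                                   # resolved via the enclosing environment
    from.caller = get("x", envir = parent.frame()))    # explicit look into the caller's frame
}

caller.fun <- function() {
  x <- "defined in caller"     # local to caller.fun; ordinary lookup in report.x ignores it
  report.x()
}

res <- caller.fun()
res[["lexical"]]       # "defined outside"
res[["from.caller"]]   # "defined in caller"
```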

In the second case, both the data and the two inner functions are in the
local environment of outer.function. The inner function test.fun() can be
evaluated in the environment of outer.function because the objects it needs,
test.data and my.mean, are found there. 

Also observe that I defined the data table within the environment of the
outer function; this allows one to input either data frames or data tables
as arguments to the outer function.
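
As a quick check of that last point, here is a sketch with a made-up helper,
group.means, and a small made-up data set. It uses as.data.table() in place
of the data.table() call shown earlier, which is an assumption on my part
about what reads best rather than a requirement.

```r
library(data.table)

# Hypothetical helper, for illustration only.
group.means <- function(ss.data) {
  dt <- as.data.table(ss.data)    # accepts a data.frame or a data.table
  dt[, list(mymean = mean(SCORE, na.rm = TRUE)), by = ID]
}

df <- data.frame(ID = rep(1:2, each = 3), SCORE = c(1, 2, 3, 4, 5, NA))

from.df <- group.means(df)                  # data.frame input
from.dt <- group.means(as.data.table(df))   # data.table input

all.equal(from.df, from.dt)   # TRUE
from.df$mymean                # 2.0 4.5
```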


HTH,
Dennis