[datatable-help] Environment of eval() execution for "j" appears to vary inexplicably

mdowle at mdowle.plus.com mdowle at mdowle.plus.com
Fri Jul 2 18:33:16 CEST 2010


Glad to hear it was that.
Yes dput is similar to dump, great, all sorted.
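
For the archive, here is a rough sketch of the class difference we were chasing, using base R where possible. An object rebuilt by hand with structure() carries only "data.table" in its class attribute, so it does not inherit from "data.frame". The names A_manual, A_built and f are made up for illustration, and the data.table() call is guarded because it assumes the package is installed.

```r
# Rebuilt by hand from dput()-style output: the class is only "data.table"
A_manual <- structure(list(a = factor(c("A", "C", "D")),
                           Count = c(4L, 8L, 1L)),
                      class = "data.table")
is_df <- inherits(A_manual, "data.frame")
print(is_df)

# If data.table is installed, let the constructor set the class vector itself
if (requireNamespace("data.table", quietly = TRUE)) {
  A_built <- data.table::data.table(a = factor(c("A", "C", "D")),
                                    Count = c(4L, 8L, 1L))
  print(class(A_built))
}

# An extra unused level can be declared through factor()'s levels argument
f <- factor(c("A", "C", "D"), levels = c("A", "C", "D", "(Other)"))
print(levels(f))
```

Checking inherits(x, "data.frame") before posting a structure() dump is a cheap way to spot this kind of mismatch.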
Your top() is very neat. I had to double-take for a second as it looks
strange to have DT repeated again inside the by, but it reads well now. I
can't think of any better way, looks like you nailed it.
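
In case it helps anyone reading later, a top()-style helper could look roughly like the sketch below. It is not Harish's function (that was not posted); top_labels and its arguments x, weight, n and other are names I made up, and it uses base R only.

```r
# Hypothetical helper: keep the n labels with the largest total weight,
# and relabel everything else as "(Other)".
top_labels <- function(x, weight = rep(1, length(x)), n = 2L,
                       other = "(Other)") {
  x <- as.character(x)
  # Total weight per label, largest first
  totals <- sort(vapply(split(weight, x), sum, numeric(1)), decreasing = TRUE)
  keep <- names(totals)[seq_len(min(n, length(totals)))]
  # Labels outside the top n collapse into the 'other' level
  factor(ifelse(x %in% keep, x, other), levels = c(keep, other))
}

a     <- c("A", "C", "D", "C", "A", "B")
count <- c(4L, 8L, 1L, 2L, 3L, 1L)
grp   <- top_labels(a, count, n = 2L)
print(grp)
print(tapply(count, grp, sum))
```

With a data.table one would then presumably group with something like by=list(Category=top_labels(a, Count)), though I have not run that form here.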
Will look at the crash bug.
Matthew


> Thank you Matthew.  The issue was the "class" not having "data.frame".
> The bug fix you had made works perfectly.
>
> I had read online to use dput() to recreate variables for posting online.
> This, I understand, is good at times because there could be other aspects
> of the variables (other than the values) that could be contributing to the
> problem.  (I did not know about dump()).
>
> Having "(Other)" in the levels is a by-product of how I found the bug.  I
> was trying to create a function "top" which would keep the top N labels
> (where top N is measured in some fashion) and mark the others with
> "(Other)".  I had originally used the same function to recreate the
> problem on dummy data and took a dput() dump of the data from somewhere in
> the middle of the processing.  This allowed me to have "simpler" functions
> to paste on here to reproduce the problem without much extraneous code.
> That is why you had the extra level in there.
>
> FYI, my original objective was to use my function top() to group on those
> values; so I would be focusing on the "important" values without being
> overwhelmed with the large amount of "unimportant" data.
>     DT[ , list( Col1, Col2 ),
>         by=top( DT, top_criteria, sort_order, num_items_to_show ) ]
> If there is a better way to do the above, please comment.
>
> Thanks a bunch for all your help!
>
>
> Regards,
> Harish
>
>
> --- On Fri, 7/2/10, mdowle at mdowle.plus.com <mdowle at mdowle.plus.com> wrote:
>
>> From: mdowle at mdowle.plus.com <mdowle at mdowle.plus.com>
>> Subject: Re: [datatable-help] Environment of eval() execution for "j"
>> appears to vary inexplicably
>> To: "Harish" <harishv_99 at yahoo.com>
>> Cc: datatable-help at lists.r-forge.r-project.org
>> Date: Friday, July 2, 2010, 2:15 AM
>> Note the class of A is "data.table" in the structure() rather than
>> c("data.table","data.frame").
>>
>> Now whilst, even so, I don't fully understand why the error is occurring,
>> could you try creating the data.table using data.table() to create the
>> correct structure, rather than calling structure() manually, and see if
>> that fixes it?  Since the structure has changed between 1.4.1 and now,
>> that's where the '1.4.1 knocking around' might be coming in.
>>
>> I've noticed use of structure() in other threads and had assumed you were
>> using dump(..,file="") as I think some posting guidelines say. I personally
>> prefer to see the data.table() call to create the dummy data, as I can
>> read it quicker.
>>
>> Also why do you return a factor constructed using structure()? If you want
>> extra unused levels (e.g. '(Other)'), then factor() has a levels argument
>> for that purpose.
>>
>> Anyway, might not be that, just first thing to try ..
>>
>>
>> > Unfortunately, I still get the error.  I did the following:
>> >
>> > 1) Deleted the "data.table" folder in win-library
>> > 2) Started R with "--vanilla" parameter
>> > 3) Installed the binaries from R-Forge
>> > 4) Restarted R
>> > 5) Ran the test case below
>> >
>> > Would someone else also try the following test case?  That would help
>> > isolate whether it is my configuration that is causing the problem.
>> > Thanks.
>> >
>> > ======== Start My R Session ========
>> >
>> >> search()
>> >  [1] ".GlobalEnv"         "package:data.table" "package:stats"
>> >  [4] "package:graphics"   "package:grDevices"  "package:utils"
>> >  [7] "package:datasets"   "package:methods"    "Autoloads"
>> > [10] "package:base"
>> >> loadedNamespaces()
>> > [1] "base"       "data.table" "graphics"   "grDevices"  "methods"
>> > [6] "stats"      "utils"
>> >> sessionInfo()
>> > R version 2.11.1 (2010-05-31)
>> > i386-pc-mingw32
>> >
>> > locale:
>> > [1] LC_COLLATE=English_United States.1252
>> > [2] LC_CTYPE=English_United States.1252
>> > [3] LC_MONETARY=English_United States.1252
>> > [4] LC_NUMERIC=C
>> > [5] LC_TIME=English_United States.1252
>> >
>> > attached base packages:
>> > [1] stats     graphics  grDevices utils     datasets  methods   base
>> >
>> > other attached packages:
>> > [1] data.table_1.5
>> >>
>> >
>> > ======== End My R Session ========
>> >
>> >
>> > ======== Example code ========
>> >
>> > A <- structure(list(a = structure(1:3, .Label = c("A", "C", "D"),
>> >                                   class = "factor"),
>> >     Count = c(4L, 8L, 1L)), .Names = c("a", "Count"), class = "data.table")
>> >
>> > foo1 <- function(DT) {
>> >    dtRet <- DT[ ,
>> >                 list( Count=sum( Count ) ),
>> >                 by=list( Category=foo2( DT, a ) )
>> >               ]
>> >    invisible()
>> > }
>> >
>> >
>> > foo2 <- function( DT, v ) {
>> >    q <- substitute( v )
>> >
>> >    print( identical( q, substitute( v ) ) )   # TRUE as expected
>> >    print( DT[ 1:2, eval( q ) ] )
>> >    print( DT[ 1:2, eval( substitute( v ) ) ] )
>> >    return( structure(1:3, .Label = c("A", "C", "D", "(Other)"),
>> >                      class = "factor") )
>> > }
>> >
>> > foo1( A ) # Test 1
>> > foo2( A, a ) # Test 2
>> > === End code ===
>> >
>> > I get the following output:
>> >
>> >> foo1( A ) # Test 1
>> > [1] TRUE
>> > [1] A C
>> > Levels: A C D
>> > [1] A C
>> > Levels: A C D
>> > Error: evaluation nested too deeply: infinite recursion /
>> > options(expressions=)?
>> >> foo2( A, a ) # Test 2
>> > [1] TRUE
>> > [1] A C
>> > Levels: A C D
>> > [1] A C
>> > Levels: A C D
>> > [1] A C D
>> > Levels: A C D (Other)
>> >>
>> >
>> > Note the error in the middle...
>> >
>> >
>> > Harish
>> >
>> > --- On Thu, 7/1/10, mdowle at mdowle.plus.com <mdowle at mdowle.plus.com> wrote:
>> >
>> >> From: mdowle at mdowle.plus.com <mdowle at mdowle.plus.com>
>> >> Subject: Re: [datatable-help] Environment of eval() execution for "j"
>> >> appears to vary inexplicably
>> >> To: "Harish" <harishv_99 at yahoo.com>
>> >> Cc: datatable-help at lists.r-forge.r-project.org
>> >> Date: Thursday, July 1, 2010, 2:01 AM
>> >>
>> >> I've seen that before. For me it was because version 1.4.1 of data.table
>> >> was still knocking around.  For me I looked at loadedNamespaces() and it
>> >> listed data.table, but search() did not.
>> >>
>> >> The loop happens because of the particular changes that have happened
>> >> between 1.4.1 and latest 1.5, AND having both versions somehow visible to
>> >> R at the same time, in some way conflicting with each other. Or at least
>> >> that's what it was for me.
>> >>
>> >> To be sure, start R with --vanilla, for me on ubuntu I have to "sudo R
>> >> --vanilla" anyway because of permissions (which I like).
>> >>
>> >> Then install.packages(...) to cleanly install. Then restart R. The error
>> >> should go away?
>> >>
>> >>
>> >> > Matthew,
>> >> >
>> >> > Thanks for the fix.  It almost works...  I tested it on Rev 101 binaries.
>> >> >
>> >> > I get an extra line of output for the test case #1 I mentioned...
>> >> >
>> >> >> foo1( A )  # Test 1
>> >> > [1] TRUE
>> >> > [1] A C
>> >> > Levels: A C D
>> >> > [1] A C
>> >> > Levels: A C D
>> >> > Error: evaluation nested too deeply: infinite recursion /
>> >> > options(expressions=)?
>> >> >>
>> >> >
>> >> > Please note that I get an error at the end.  Though, the output seems to
>> >> > be right.
>> >> >
>> >> >
>> >> > Regards,
>> >> > Harish
>> >> >
>> >> >
>> >> > --- On Tue, 6/29/10, Matthew Dowle <mdowle at mdowle.plus.com> wrote:
>> >> >
>> >> >> From: Matthew Dowle <mdowle at mdowle.plus.com>
>> >> >> Subject: Re: [datatable-help] Environment of eval() execution for "j"
>> >> >> appears to vary inexplicably
>> >> >> To: "Harish" <harishv_99 at yahoo.com>
>> >> >> Cc: datatable-help at lists.r-forge.r-project.org
>> >> >> Date: Tuesday, June 29, 2010, 1:43 PM
>> >> >> Yes, that was reproducible, thanks.
>> >> >>
>> >> >> The last commit 101 fixes this one too, I think. Please confirm.
>> >> >>
>> >> >> A = data.table(a=c("A","C","D"), Count=c(4L,8L,1L))
>> >> >>
>> >> >> > foo1(A)
>> >> >> [1] TRUE
>> >> >> [1] A C
>> >> >> Levels: A C D
>> >> >> [1] A C
>> >> >> Levels: A C D
>> >> >>
>> >> >> > foo2(A,a)
>> >> >> [1] TRUE
>> >> >> [1] A C
>> >> >> Levels: A C D
>> >> >> [1] A C
>> >> >> Levels: A C D
>> >> >> [1] A C D
>> >> >> Levels: A C D (Other)
>> >> >> >
>> >> >>
>> >> >> Matthew
>> >> >>
>> >> >>
>> >> >> On Sat, 2010-06-26 at 00:28 -0700, Harish wrote:
>> >> >> > I am running into a peculiar issue which seems to be related to the
>> >> >> > environment in which the eval() is executed when passed as the "j".
>> >> >> > The environment of execution of the eval() seems to vary depending on
>> >> >> > whether I pass in a variable (of class "name") or an equivalent
>> >> >> > expression is typed inside the eval.
>> >> >> >
>> >> >> > === Example code ===
>> >> >> >
>> >> >> > A <- structure(list(a = structure(1:3, .Label = c("A", "C", "D"),
>> >> >> >                                   class = "factor"),
>> >> >> >     Count = c(4L, 8L, 1L)), .Names = c("a", "Count"),
>> >> >> >     class = "data.table")
>> >> >> >
>> >> >> > foo1 <- function(DT) {
>> >> >> >    dtRet <- DT[ ,
>> >> >> >                 list( Count=sum( Count ) ),
>> >> >> >                 by=list( Category=foo2( DT, a ) )
>> >> >> >               ]
>> >> >> >    invisible()
>> >> >> > }
>> >> >> >
>> >> >> >
>> >> >> > foo2 <- function( DT, v ) {
>> >> >> >    q <- substitute( v )
>> >> >> >
>> >> >> >    print( identical( q, substitute( v ) ) )   # TRUE as expected
>> >> >> >    print( DT[ 1:2, eval( q ) ] )
>> >> >> >    print( DT[ 1:2, eval( substitute( v ) ) ] )
>> >> >> >    return( structure(1:3, .Label = c("A", "C", "D", "(Other)"),
>> >> >> >                      class = "factor") )
>> >> >> > }
>> >> >> >
>> >> >> > foo1( A ) # Test 1
>> >> >> > foo2( A, a ) # Test 2
>> >> >> > === End code ===
>> >> >> >
>> >> >> > In Test 1, when I run foo1(), I am essentially executing
>> >> >> >    foo2( A, a ) from within the code of the data table.
>> >> >> >
>> >> >> > I get:
>> >> >> > [1] TRUE
>> >> >> > [1] A C
>> >> >> > Levels: A C D
>> >> >> > [1] A C D
>> >> >> > Levels: A C D
>> >> >> >
>> >> >> > Issue #1 ==> The third print in foo2() is actually returning 3 items
>> >> >> > when I am requesting only the first 2 items.  (Also, in my more complex
>> >> >> > program, it seemed to return the data in alphabetical order or the
>> >> >> > order of the factor levels rather than in the order of the data in the
>> >> >> > table.  However, I am not able to reproduce this in a simpler example.
>> >> >> > I am hoping that this behavior will also be rectified with any bug
>> >> >> > fixes you make.)
>> >> >> >
>> >> >> > In Test 2, I run foo2() directly in .GlobalEnv, but I am passing in the
>> >> >> > same data that foo1() would have passed it in Test 1.
>> >> >> >
>> >> >> > I get:
>> >> >> > [1] TRUE
>> >> >> > [1] A C
>> >> >> > Levels: A C D
>> >> >> > Error in eval(expr, envir, enclos) : object 'a' not found
>> >> >> >
>> >> >> > Issue #2 ==> It looks like if I have an expression inside eval(), it is
>> >> >> > executed in a different environment from the prior print statement,
>> >> >> > where I have an eval() with just a single variable.  Technically, I
>> >> >> > would expect both to be equivalent.
>> >> >> >
>> >> >> >
>> >> >> > I hope I clearly explained what my issues are.
>> >> >> >
>> >> >> >
>> >> >> > Regards,
>> >> >> > Harish
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > _______________________________________________
>> >> >> > datatable-help mailing list
>> >> >> > datatable-help at lists.r-forge.r-project.org
>> >> >> > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help



