[datatable-help] [SOLVED] Something strange with scoping(?) when data.table is used inside a package I'm developing.

Matthew Dowle mdowle at mdowle.plus.com
Fri Jan 7 09:23:17 CET 2011


Tried again with fresh eyes and this is fixed now. Included in v1.5.1,
sent to CRAN last night.
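
For readers following the thread, here is a rough sketch of the kind of
check discussed in the quoted message below: look in the caller's namespace
for an Imports entry, then fall back to .Depends in the attached package
environment. The helper name is_pkg_aware() is made up for illustration,
and treating non-namespace callers as aware is an assumption. This is not
the code that shipped in v1.5.1.

```r
# Hypothetical helper, NOT data.table's actual cedta(): decide whether the
# code running in `env` belongs to a package that declares a dependency on
# `pkg`, via either Imports or Depends.
is_pkg_aware <- function(env, pkg = "data.table") {
  te <- topenv(env)  # enclosing namespace, package env, or the global env

  # Assumption: code outside any namespace (e.g. at the prompt) is aware.
  if (!isNamespace(te)) return(TRUE)

  # Imports are recorded in the namespace itself.
  if (pkg %in% names(getNamespaceImports(te))) return(TRUE)

  # Depends, if recorded at all, lives in the attached package environment,
  # so map the namespace name to "package:<name>" on the search path.
  pkg_env_name <- paste0("package:", getNamespaceName(te))
  if (pkg_env_name %in% search()) {
    pkg_env <- as.environment(pkg_env_name)
    if (exists(".Depends", envir = pkg_env, inherits = FALSE)) {
      return(pkg %in% get(".Depends", envir = pkg_env, inherits = FALSE))
    }
  }
  FALSE
}
```

Inside a method one might call is_pkg_aware(parent.frame()) to inspect the
caller. The exists() guard matters because .Depends is only present when
the package was attached and has a Depends field.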

On Thu, 2010-11-04 at 14:32 +0000, Matthew Dowle wrote:
> When cedta (Calling Environment Data.Table Aware) runs it finds the
> namespace calling data.table no problem. getNamespaceImports() on that
> namespace will then return "data.table" if that package Imports it,
> distinguishing it as a data.table aware package. So far so good. For
> Depends though, .Depends is (I think) not in the namespace environment but
> the corresponding package environment for that namespace. The nut is going
> from namespace to package environment.
> 
> I tried a more robust version of the following but it doesn't seem very
> neat, and it didn't work. Might be just a case of trying again with fresh
> eyes :
> 
>   name = getNamespaceName(te)
>   pkg = as.environment(paste("package:",name,sep=""))
>   "data.table" %in% get(".Depends",envir=pkg)
> 
> Also R changed in this area fairly recently (name change from .required to
> .Depends).
> 
> 
> 
> > Thanks for the reply ... I actually moved it over to the Imports
> > anyway .. but still, it's probably a good thing to fix at some point
> > :-)
> >
> > It's kind of weird though ... do you know what "the nut" of the problem
> > is?
> > -steve
> >
> > On Thu, Nov 4, 2010 at 9:36 AM, Matthew Dowle <mdowle at mdowle.plus.com>
> > wrote:
> >>
> >> Yes that was an off list fix for importing data.table (Imports field in
> >> DESCRIPTION). There is still likely a problem with the Depends field of
> >> DESCRIPTION, though. If you need to Depend rather than Import then let
> >> me know and I'll try again to fix it.
> >>
> >> Note also that data.table queries don't work from a browser prompt (bug
> >> #1131).
> >>
> >> Matthew
> >>
> >>
> >>> I just `svn up`'ed my data.table package to see if I can dig into the
> >>> code and I noticed that this issue was previously reported and fixed
> >>> in 1.5.1.
> >>>
> >>> Seems like I stumbled on the same issue ... sorry, I didn't see this
> >>> in the mailing list.
> >>>
> >>> Thanks,
> >>> -steve
> >>>
> >>> On Tue, Nov 2, 2010 at 6:20 PM, Steve Lianoglou
> >>> <mailinglist.honeypot at gmail.com> wrote:
> >>>> More info on my problem:
> >>>>
> >>>> I'm loading my package, and passing an object to a function defined
> >>>> (and exported) in my library that does the fast dt[, SOMETHING,
> >>>> by='entrez.id']. I call this function, and it bombs when I try to
> >>>> access "things" (columns) of the data.table in my SOMETHING
> >>>> expression.
> >>>>
> >>>> Given the same object, I then execute the lines of the function that
> >>>> is failing one by one (I don't call the function), and they execute
> >>>> normally.
> >>>>
> >>>> I wonder if this is relevant:
> >>>>
> >>>> (I). After the function bombs, this is the result of traceback() --
> >>>> is it normal for the deepest call to actually be to `[.data.frame`?
> >>>>
> >>>> R> traceback()
> >>>> 4: `[.data.frame`(x, i, j)
> >>>> 3: `[.data.table`(dt, , list(seqnames = seqnames[1], strand = strand[1],
> >>>>       start = min(start), end = max(end)), by = "entrez.id")
> >>>> 2: dt[, list(seqnames = seqnames[1], strand = strand[1], start = min(start),
> >>>>       end = max(end)), by = "entrez.id"]
> >>>> 1: annotatedTxBounds(annotated)
> >>>>
> >>>> (II) Bioconductor packages use the S4 system. I've defined some
> >>>> conversion functions on objects in my package so that I can convert
> >>>> them to data.table's in the "expected way", a la:
> >>>>
> >>>> R> my.dt <- as(MyOwnClass, 'data.table')
> >>>>
> >>>> In order for me to do that, I've had to S4-ize the S3 data.table
> >>>> class, like so:
> >>>>
> >>>> setOldClass("data.table")
> >>>> (I also had `setOldClass(c('data.table', 'data.frame'))`, both have
> >>>> the same error)
> >>>>
> >>>> So ... it has something to do with my function as it's run from within
> >>>> my package's environment -- but I don't know what to do about it.
> >>>>
> >>>> I tried adding `import(data.table)` into the NAMESPACE file as well,
> >>>> but no dice.
> >>>>
> >>>> Thanks,
> >>>> -steve
> >>>>
> >>>> On Tue, Nov 2, 2010 at 3:14 PM, Steve Lianoglou
> >>>> <mailinglist.honeypot at gmail.com> wrote:
> >>>>> Hi,
> >>>>>
> >>>>> Sorry for what is about to be a vaguely described problem I'm having,
> >>>>> but here goes.
> >>>>>
> >>>>> I'm developing some R/bioconductor packages and have been using
> >>>>> data.table for a few things in them quite happily. Although my
> >>>>> packages are written to be properly installed, as I develop with
> >>>>> them,
> >>>>> I just source their "R" directories to make my life easier so I can
> >>>>> easily modify them and update my R environment when I find something
> >>>>> wrong (w/o having to restart R, then call library(MyPackage), etc ..)
> >>>>>
> >>>>> So, I just went through my package to make sure my NAMESPACE stuffs
> >>>>> are kosher -- that I export the classes, methods, and functions I
> >>>>> need
> >>>>> to export.
> >>>>>
> >>>>> In my DESCRIPTION file, data.table is listed in the "Depends"
> >>>>> section.
> >>>>>
> >>>>> I'm only mentioning this because now that I've successfully done all
> >>>>> that, I installed my package and am now using it by calling
> >>>>> library(MyPackage). Now there is something strange happening with my
> >>>>> data.table stuff.
> >>>>>
> >>>>> The column names of my data.table are no longer recognized in my j
> >>>>> functions. For instance:
> >>>>>
> >>>>> R> library(data.table)
> >>>>> R> df <- structure(list(seqnames = c("chr22", "chr22", "chr22",
> >>>>> "chr22",
> >>>>> "chr22", "chr22", "chr22", "chr22", "chr22", "chr22"), start =
> >>>>> c(22639026L,
> >>>>> 22639103L, 22639574L, 22643475L, 22643596L, 28059152L, 15897460L,
> >>>>> 15905763L, 15908214L, 15917963L), end = c(22639102L, 22639210L,
> >>>>> 22639749L, 22643595L, 22644748L, 28059247L, 15898234L, 15905890L,
> >>>>> 15908316L, 15919682L), width = c(77L, 108L, 176L, 121L, 1153L,
> >>>>> 96L, 775L, 128L, 103L, 1720L), strand = structure(c(1L, 1L, 1L,
> >>>>> 1L, 1L, 2L, 1L, 1L, 1L, 1L), .Label = c("+", "-", "*"), class =
> >>>>> "factor"),
> >>>>>    exon.anno = structure(c(5L, 1L, 1L, 1L, 4L, 3L, 3L, 3L, 3L,
> >>>>>    3L), .Label = c("cds", "overlap", "utr", "utr3", "utr5"), class =
> >>>>> "factor"),
> >>>>>    symbol = c("DDTL", "DDTL", "DDTL", "DDTL", "DDTL", "SNORD125",
> >>>>>    "CECR7", "CECR7", "CECR7", "CECR7"), entrez.id = c("100037417",
> >>>>>    "100037417", "100037417", "100037417", "100037417", "100113380",
> >>>>>    "100130418", "100130418", "100130418", "100130418")), .Names =
> >>>>> c("seqnames",
> >>>>> "start", "end", "width", "strand", "exon.anno", "symbol", "entrez.id"
> >>>>> ), row.names = c(NA, -10L), class = "data.frame")
> >>>>> R> dt <- data.table(df, key='entrez.id')
> >>>>>
> >>>>> Now, something like this should work (and in fact does when I have a
> >>>>> clean environment like you would by just starting R and pasting the
> >>>>> above code):
> >>>>>
> >>>>> R> bounds <- dt[, list(start=min(start), end=max(end)),
> >>>>> by='entrez.id']
> >>>>>
> >>>>> But when the "bowels" of my code in my package are running this (only
> >>>>> when it's attached with library(MyPackage)), I'm now getting this
> >>>>> error:
> >>>>>   Error in min(start) : invalid 'type' (closure) of argument
> >>>>>
> >>>>> If I try to use the .SD object in the same place, I get another
> >>>>> error:
> >>>>>
> >>>>> R> bounds2 <- dt[, {
> >>>>>  .sd <- .SD[1]
> >>>>>  .sd$start <- min(start)
> >>>>>  .sd$end <- max(end)
> >>>>>  .sd
> >>>>> }, by='entrez.id']
> >>>>>
> >>>>> (The code here is simplified, but assume I need to use .SD -- I want
> >>>>> to get the rest of the columns in the dt data.table w/o referencing
> >>>>> them explicitly)
> >>>>>
> >>>>> The error when the code is run from within my package is:
> >>>>>
> >>>>>   Error in `[.data.frame`(x, i, j) : object '.SD' not found
> >>>>>
> >>>>> Even though it works in a "clean" R environment.
> >>>>>
> >>>>> Can anyone take a stab at why this might be happening? I'm at a bit
> >>>>> of
> >>>>> a loss.
> >>>>>
> >>>>> For what it's worth, this is the sessionInfo of my R environment when
> >>>>> my package is installed (my package is called GenomicFeaturesX). Most
> >>>>> of the packages in "other attached packages" are from Bioconductor.
> >>>>>
> >>>>> R version 2.12.0 (2010-10-15)
> >>>>> Platform: x86_64-unknown-linux-gnu (64-bit)
> >>>>>
> >>>>> locale:
> >>>>>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
> >>>>>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
> >>>>>  [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
> >>>>>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
> >>>>>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> >>>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
> >>>>>
> >>>>> attached base packages:
> >>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
> >>>>>
> >>>>> other attached packages:
> >>>>>  [1] BSgenome.Hsapiens.UCSC.hg18_1.3.16 BSgenome_1.18.0
> >>>>>  [3] Biostrings_2.18.0                  doMC_1.2.1
> >>>>>  [5] multicore_0.1-3                    foreach_1.3.0
> >>>>>  [7] codetools_0.2-2                    iterators_1.0.3
> >>>>>  [9] GenomicFeaturesX_0.2               data.table_1.5
> >>>>> [11] GenomicFeatures_1.2.0              GenomicRanges_1.2.1
> >>>>> [13] IRanges_1.8.2
> >>>>>
> >>>>> loaded via a namespace (and not attached):
> >>>>>  [1] annotate_1.28.0      AnnotationDbi_1.12.0 Biobase_2.10.0
> >>>>>  [4] biomaRt_2.6.0        DBI_0.2-5            RCurl_1.4-3
> >>>>>  [7] RSQLite_0.9-2        rtracklayer_1.10.2   tools_2.12.0
> >>>>> [10] XML_3.2-0            xtable_1.5-6
> >>>>>
> >>>>> Thanks,
> >>>>> -steve
> >>>>>
> >>>>> --
> >>>>> Steve Lianoglou
> >>>>> Graduate Student: Computational Systems Biology
> >>>>>  | Memorial Sloan-Kettering Cancer Center
> >>>>>  | Weill Medical College of Cornell University
> >>>>> Contact Info: http://cbio.mskcc.org/~lianos/contact
> >>>>>
> >>>
> >>
> >>
> >>
> >
> >
> >
> >
> 
> 
> _______________________________________________
> datatable-help mailing list
> datatable-help at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
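
A note on the symptom in the original report: the traceback ends in
`[.data.frame`, and the two errors (min(start) receiving a closure, .SD not
found) are consistent with the j expression being evaluated in the calling
environment rather than with the table's columns in scope. The base-R
sketch below illustrates only that scoping difference. The toy data frame
and variable names are invented for the example, and it does not use
data.table itself.

```r
# Toy data: a column named `start`, which is also the name of a function
# in the attached stats package.
df <- data.frame(id = c("a", "a", "b"), start = c(5L, 3L, 9L))

# Evaluated in the calling environment, `start` resolves to stats::start
# (a closure), so min() fails in the same way as in the report.
msg <- tryCatch(min(start), error = function(e) conditionMessage(e))
print(msg)

# Evaluated with the columns in scope, the same expression finds the column.
scoped_min <- with(df, min(start))
print(scoped_min)
```

This would also explain why running the function body line by line at the
prompt worked: there the call was made from the global environment rather
than from inside the package's namespace.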



