[datatable-help] := unclarity and possible bug?

Chris Neff caneff at gmail.com
Thu Aug 4 18:44:44 CEST 2011


The package absolutely installed. I did have 00Lock issues and deleted
it to make it work. Yes I made sure to do require(data.table). When I
do sudo R instead of just R I get:

package "methods" in options("defaultPackages") was not found

Same if I do sudo R CMD INSTALL.  But R CMD INSTALL without sudo
didn't ask for a different location or anything, and seemed to install
just fine.  Is there something I can do to verify that the version
loaded by library(data.table) is the same as the tar file I used to
install? .libPaths() tells me it is loading in the right order, and
the package installed to the first place .libPaths() shows.
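
One way to check this from inside R is sketched below. It only uses base/utils functions (installed.packages(), system.file(), packageDescription()); the helper name which_installed is hypothetical and not part of data.table. It lists every copy of a package found in .libPaths(), plus the path and version of the copy R would actually pick up, so a stale copy in another library should show up.

```r
# Hypothetical helper (not part of data.table): report which copy of a
# package R resolves to, and which versions sit in each library.
which_installed <- function(pkg) {
  # installed.packages() returns one row per installed copy, so the same
  # package can appear several times if it lives in several libraries.
  ip <- installed.packages(lib.loc = .libPaths())
  hits <- ip[rownames(ip) == pkg, c("Package", "LibPath", "Version"),
             drop = FALSE]

  # system.file() returns "" when the package cannot be found at all.
  path <- system.file(package = pkg)

  list(
    loaded_from = path,
    version     = if (nzchar(path)) packageDescription(pkg)$Version
                  else NA_character_,
    all_copies  = hits
  )
}

# Usage: compare the reported version with the one in the tar file name,
# e.g. data.table_1.6.3.tar.gz.
info <- which_installed("data.table")
print(info)
```

If all_copies has more than one row, the versions can be compared row by row to see whether an older copy is shadowing the one just installed.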

For what it's worth, the R-Forge build seems to be failing the same
way: https://r-forge.r-project.org/R/?group_id=240&log=check_x86_64_linux&pkg=data.table&flavor=devel

I took the tar from CRAN and installed it. Now DT[,z:=5] works.
However, that still means I'm not running the latest developer build.
How can I make sure I install something that is the latest developer
build?
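
A rough sketch of how such a check might look, assuming the repository's package index is reachable and up to date (which, per this thread, R-Forge's may not be). The helper names repo_version and is_behind are hypothetical; available.packages(), contrib.url() and compareVersion() are standard utils functions.

```r
# Hypothetical helper: look up the version a repository currently offers
# as a source package. Needs network access, so it is only defined here
# and not called.
repo_version <- function(pkg, repos) {
  ap <- available.packages(contriburl = contrib.url(repos, type = "source"))
  if (pkg %in% rownames(ap)) ap[pkg, "Version"] else NA_character_
}

# Hypothetical helper: TRUE when the installed version is older than the
# available one. compareVersion() returns -1, 0 or 1.
is_behind <- function(installed, available) {
  compareVersion(installed, available) < 0
}

# Usage with the two version numbers mentioned in this thread:
is_behind("1.6.3", "1.6.4")   # installed 1.6.3 vs. expected 1.6.4
is_behind("1.6.4", "1.6.4")   # already current
```

In a live session the first argument could come from packageDescription("data.table")$Version and the second from repo_version("data.table", "http://R-Forge.R-project.org").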

On 4 August 2011 11:15, Matthew Dowle <mdowle at mdowle.plus.com> wrote:
> Relief.  Yes that's definitely not the latest version. That error is fixed
> (otherwise it wouldn't be accepted to CRAN).  The source tar.gz is on CRAN
> now, that should work.
>
> I don't know why type="source" from R-Forge is so behind. I thought it was
> bang up to date that way. If you could ask Stefan on R-Forge I'd be most
> grateful. Or maybe someone else on the list knows. Might the package
> have gotten locked in your install? Read up about 00Lock. Absolutely sure
> no error messages on install? Did you require(data.table) in the "sudo R"
> after the install just to tickle it? Under some conditions it silently
> rolls back to the previous version (I never worked out why). If other R
> processes are using the package (even zombies) it might get confused, so
> kill all R sessions and use a sudo R --vanilla to install. Maybe there is
> an 'svn up' way to get the latest, but I never thought that was necessary.
> You did "sudo R" to install, right? Otherwise it asks if you want to
> install somewhere else, and you don't want to do that.
>
>
> "Chris Neff" <caneff at gmail.com> wrote in message
> news:CAAuY0RXA1-9pzL3SR=uF5iJw4MnU_nF40Lw-P1VpiN6Yn=cuDA at mail.gmail.com...
>> test.data.table()
> Running
> /home/caneff/R/x86_64-pc-linux-gnu-library/2.12/data.table/tests/tests.R
> Loading required package: ggplot2
> Loading required package: reshape
>
> Attaching package: 'reshape'
>
> The following object(s) are masked from 'package:plyr':
>
>    rename, round_any
>
> Loading required package: grid
> Loading required package: proto
> Test 304 Error in try(x, TRUE) : could not find function "haskey"
> Error in eval(expr, envir, enclos) : 1 errors in test.data.table()
>
> And the banner says 1.6.3.  I got it from the link after "Download:"
> here: https://r-forge.r-project.org/R/?group_id=240
>
> Should I be clicking somewhere else for 1.6.4? Why won't
> install.packages with type="source" get 1.6.4 for me either?
>
> On 4 August 2011 10:16, Matthew Dowle <mdowle at mdowle.plus.com> wrote:
>> does test.data.table() work? How many tests does it run? Does the startup
>> banner state 1.6.4 ?
>>
>> "Chris Neff" <caneff at gmail.com> wrote in message
>> news:CAAuY0RX3er-gSB52am1AZo4Ws9=fbXzqigaJnmfKqPxF_mn-8g at mail.gmail.com...
>> I thought this might be the problem (from the R-forge download page):
>> "In order to successfully install the packages provided on R-Forge,
>> you have to switch to the most recent version of R or, alternatively,
>> install from the package sources (.tar.gz) in older versions of R"
>>
>> I'm running 2.12.1 because it is built with internal company compilers
>> for extra internal support. I tried just downloading the
>> data.table_1.6.3.tar.gz file from r-forge and doing R CMD INSTALL, but
>> still DT[,z:=5] doesn't work.
>>
>>
>>
>> On 4 August 2011 10:06, Chris Neff <caneff at gmail.com> wrote:
>>> cacheOK=FALSE didn't fix it. Running on Ubuntu (so should I do
>>> anything in that paragraph?). Session info:
>>>
>>>> sessionInfo()
>>> R version 2.12.1 (2010-12-16)
>>> Platform: x86_64-pc-linux-gnu (64-bit)
>>>
>>> locale:
>>> [1] LC_CTYPE=en_US.utf8 LC_NUMERIC=C
>>> [3] LC_TIME=C LC_COLLATE=C
>>> [5] LC_MONETARY=C LC_MESSAGES=en_US.utf8
>>> [7] LC_PAPER=en_US.utf8 LC_NAME=C
>>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>>> [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
>>>
>>> attached base packages:
>>> [1] stats graphics grDevices utils datasets methods base
>>>
>>> other attached packages:
>>> [1] data.table_1.6.3
>>>
>>>
>>> On 4 August 2011 09:56, Matthew Dowle <mdowle at mdowle.plus.com> wrote:
>>>>
>>>> Try cacheOK=FALSE (passed on to download.file() via ...)
>>>>
>>>> Or, sessionInfo() please. Is it Windows (dll not being refreshed)?
>>>> Reboot, clear out, R --vanilla install, clear out browser cache manually.
>>>> Failing all that, download file manually and install from file.
>>>>
>>>> The rnorm(10) is already a vector as long as the table itself => invokes
>>>> "replace" column. Since you the user already created it, it is plonked
>>>> right into the column, bang.
>>>> The other case is recycling or subassign, and that preserves the
>>>> column's type (for speed, unlike data.frame).
>>>> So, intended behaviour, just not what you expected.
>>>>
>>>>
>>>> "Chris Neff" <caneff at gmail.com> wrote in message
>>>> news:CAAuY0RWYJ8+wEbVG8XvYYqhSNFPV6XxgdQo8smRy+DRWGj5sfg at mail.gmail.com...
>>>> I've run the following 3 different times in new sessions:
>>>>
>>>> install.packages("data.table",
>>>> repos="http://R-Forge.R-project.org",type="source")
>>>>
>>>> and still DT[,z:=5] does nothing. Is there something I can check to make
>>>> sure that the latest version is loaded?
>>>>
>>>>
>>>> As for the coercion stuff, it feels somewhat inconsistent
>>>> right now. For instance:
>>>>
>>>>> DT <- data.table(x=1:10, y=1:10)
>>>>
>>>>> DT$y <- TRUE
>>>>
>>>>> sapply(DT, class)
>>>>
>>>> x y
>>>> "integer" "integer"
>>>>
>>>>> DT$y <- rnorm(10)
>>>>> sapply(DT, class)
>>>> x y
>>>> "integer" "numeric"
>>>>
>>>> So in the first case y silently coerces the logical to an integer
>>>> without warning, but in the second case y happily turns into a numeric
>>>> when need be. Why the difference?
>>>>
>>>> When I do something like DT$y <- foo, I expect that y should turn into
>>>> foo regardless of what y was before. If there is some reason why DT[,
>>>> y:=foo] should be different than DT$y <- foo, that is a secondary
>>>> matter, but I get mightily confused when DT$y <- foo doesn't behave
>>>> like data.frame.
>>>>
>>>> On 4 August 2011 08:50, Matthew Dowle <mdowle at mdowle.plus.com> wrote:
>>>>> Still doesn't seem to be latest version: DT[,z:=5] should add column
>>>>> (and that's tested).
>>>>> Otherwise correct and intended behaviour (although an informative
>>>>> warning needs adding when 5 gets coerced to type of column (i.e.
>>>>> logical) - thanks for spotting). Remember as.logical(5) is TRUE
>>>>> without warning. So, try creating column with NA_integer_ or NA_real_
>>>>> instead. Once the column type is set, that's it. Columns aren't
>>>>> coerced to match type of RHS, unlike data.frame [which if you think
>>>>> about it is a big hit if the data is large].
>>>>>
>>>>> "Chris Neff" <caneff at gmail.com> wrote in message
>>>>> news:CAAuY0RXT7q+cm91PJ8KGkMwDApwFxM_EALb-Yu=P6ndp+LEfXg at mail.gmail.com...
>>>>> Ignore this second one, restarting and refreshing my data.table
>>>>> install now gives the proper error message when I try that. Sorry I'm
>>>>> not used to being on the bleeding edge of these things and I forget to
>>>>> update. However the first question is still mainly relevant:
>>>>>
>>>>>> DT <- data.table(x=1:10, y=rep(1:2,5))
>>>>>> DT[,z:=5]
>>>>> x y
>>>>> [1,] 1 1
>>>>> [2,] 2 2
>>>>> [3,] 3 1
>>>>> [4,] 4 2
>>>>> [5,] 5 1
>>>>> [6,] 6 2
>>>>> [7,] 7 1
>>>>> [8,] 8 2
>>>>> [9,] 9 1
>>>>> [10,] 10 2
>>>>>> DT[1:nrow(DT),z:=5]
>>>>> Error in `[.data.table`(DT, 1:nrow(DT), `:=`(z, 5)) :
>>>>> Attempt to add new column(s) and set subset of rows at the same
>>>>> time. Create the new column(s) first, and then you'll be able to
>>>>> assign to a subset. If i is set to 1:nrow(x) then please remove that
>>>>> (no need, it's faster without).
>>>>>> DT$z <- NA
>>>>>> DT[, z:=5]
>>>>> x y z
>>>>> [1,] 1 1 TRUE
>>>>> [2,] 2 2 TRUE
>>>>> [3,] 3 1 TRUE
>>>>> [4,] 4 2 TRUE
>>>>> [5,] 5 1 TRUE
>>>>> [6,] 6 2 TRUE
>>>>> [7,] 7 1 TRUE
>>>>> [8,] 8 2 TRUE
>>>>> [9,] 9 1 TRUE
>>>>> [10,] 10 2 TRUE
>>>>>
>>>>>
>>>>>
>>>>> The return on DT[,z:=5] when I haven't initialized DT$z yet is
>>>>> different, but still more uninformative than it is when I do
>>>>> DT[1:nrow(DT), z:=5]. And the DT$z <- NA issue is still there.
>>>>>
>>>>> Thanks!
>>>>>
>>>>>
>>>>> On 4 August 2011 08:18, Chris Neff <caneff at gmail.com> wrote:
>>>>>> A second question while I'm playing with it. It seems from the FRs
>>>>>> that it doesn't support multiple := in one select, but:
>>>>>>
>>>>>> DT <- data.table(x=1:10, y=rep(1:2,5))
>>>>>> DT$a = 0
>>>>>> DT$z = 0
>>>>>>
>>>>>> DT[, list(a := y/sum(y), z := 5)]
>>>>>>
>>>>>> works just fine for me. An error gets thrown but afterwards the
>>>>>> columns are modified as intended. Why the error?
>>>>>>
>>>>>>> DT[,list(z:=5,a:=y/sum(y))]
>>>>>> z
>>>>>> [1] 5
>>>>>> [1] TRUE
>>>>>> a
>>>>>> y/sum(y)
>>>>>> [1] TRUE
>>>>>> Error in data.table(`:=`(z, 5), `:=`(a, y/sum(y))) :
>>>>>> column or argument 1 is NULL
>>>>>>> DT
>>>>>> x y z a
>>>>>> [1,] 1 1 5 0.06666667
>>>>>> [2,] 2 2 5 0.13333333
>>>>>> [3,] 3 1 5 0.06666667
>>>>>> [4,] 4 2 5 0.13333333
>>>>>> [5,] 5 1 5 0.06666667
>>>>>> [6,] 6 2 5 0.13333333
>>>>>> [7,] 7 1 5 0.06666667
>>>>>> [8,] 8 2 5 0.13333333
>>>>>> [9,] 9 1 5 0.06666667
>>>>>> [10,] 10 2 5 0.13333333
>>>>>>
>>>>>> -Chris
>>>>>>
>>>>>> On 4 August 2011 08:12, Chris Neff <caneff at gmail.com> wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> If I do:
>>>>>>>
>>>>>>> DT <- data.table(x=1:10, y=rep(1:2,5))
>>>>>>>
>>>>>>> Then try the following
>>>>>>>
>>>>>>> DT[, z:=5]
>>>>>>>
>>>>>>> I get:
>>>>>>>
>>>>>>>> DT[, z:=5]
>>>>>>> z
>>>>>>> [1] 5
>>>>>>> [1] TRUE
>>>>>>> NULL
>>>>>>>
>>>>>>> and if I were to do DT <- DT[, z:=5], then DT gets set to NULL.
>>>>>>> Alternatively if I do
>>>>>>>
>>>>>>> DT[1:10, z:=5]
>>>>>>>
>>>>>>> I get
>>>>>>>
>>>>>>>> DT=DT[1:nrow(DT),z:=5]
>>>>>>> z
>>>>>>> [1] 5
>>>>>>> [1] 1 2 3 4 5 6 7 8 9 10
>>>>>>> Error in `:=`(z, 5) :
>>>>>>> Attempt to add new column(s) and set subset of rows at the same
>>>>>>> time. Create the new column(s) first, and then you'll be able to
>>>>>>> assign to a subset. If i is set to 1:nrow(x) then please remove that
>>>>>>> (no need, it's faster without).
>>>>>>>
>>>>>>>
>>>>>>> Which is more informative. So I do as it instructs:
>>>>>>>
>>>>>>> DT$z <- NA
>>>>>>>
>>>>>>> DT[, z:=5]
>>>>>>>
>>>>>>> And as output I get:
>>>>>>>
>>>>>>>> DT
>>>>>>> x y z
>>>>>>> [1,] 1 1 TRUE
>>>>>>> [2,] 2 2 TRUE
>>>>>>> [3,] 3 1 TRUE
>>>>>>> [4,] 4 2 TRUE
>>>>>>> [5,] 5 1 TRUE
>>>>>>> [6,] 6 2 TRUE
>>>>>>> [7,] 7 1 TRUE
>>>>>>> [8,] 8 2 TRUE
>>>>>>> [9,] 9 1 TRUE
>>>>>>> [10,] 10 2 TRUE
>>>>>>>
>>>>>>>
>>>>>>> Why isn't z 5 as assigned? I think it is because I assigned it as
>>>>>>> NA, and data.table didn't know to change it to integer (although why
>>>>>>> it changed it to logical is another puzzle). If I instead do
>>>>>>>
>>>>>>> DT$z <- 0
>>>>>>>
>>>>>>> DT[, z:=5]
>>>>>>>
>>>>>>> It works fine.
>>>>>>>
>>>>>>> So my two points are:
>>>>>>>
>>>>>>> A) Doing DT[,z:=5] should be as informative as doing DT[1:nrow(DT),
>>>>>>> z:=5] with the error message.
>>>>>>>
>>>>>>> B) What went wrong with the NA assignment I did?
>>>>>>>
>>>>>>> Thanks!
>>>>>>> Chris
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> datatable-help mailing list
>>>>> datatable-help at lists.r-forge.r-project.org
>>>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help

