[datatable-help] := unclarity and possible bug?
Chris Neff
caneff at gmail.com
Thu Aug 4 14:39:48 CEST 2011
Ignore this second one: after restarting and refreshing my data.table
install, I now get the proper error message when I try that. Sorry, I'm
not used to being on the bleeding edge of these things and I forget to
update. However, the first question is still mostly relevant:
> DT <- data.table(x=1:10, y=rep(1:2,5))
> DT[,z:=5]
x y
[1,] 1 1
[2,] 2 2
[3,] 3 1
[4,] 4 2
[5,] 5 1
[6,] 6 2
[7,] 7 1
[8,] 8 2
[9,] 9 1
[10,] 10 2
> DT[1:nrow(DT),z:=5]
Error in `[.data.table`(DT, 1:nrow(DT), `:=`(z, 5)) :
Attempt to add new column(s) and set subset of rows at the same
time. Create the new column(s) first, and then you'll be able to
assign to a subset. If i is set to 1:nrow(x) then please remove that
(no need, it's faster without).
> DT$z <- NA
> DT[, z:=5]
x y z
[1,] 1 1 TRUE
[2,] 2 2 TRUE
[3,] 3 1 TRUE
[4,] 4 2 TRUE
[5,] 5 1 TRUE
[6,] 6 2 TRUE
[7,] 7 1 TRUE
[8,] 8 2 TRUE
[9,] 9 1 TRUE
[10,] 10 2 TRUE
The return on DT[,z:=5] when I haven't initialized DT$z yet is
different, but still less informative than the error I get from
DT[1:nrow(DT), z:=5]. And the DT$z <- NA issue is still there.
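For what it's worth, here is my guess at the DT$z <- NA part, checked
only in base R: a bare NA is a logical constant, so the column probably
starts out logical, and the 5 then gets coerced to fit it. That := keeps
the existing column type instead of changing it is an assumption on my
part, not something I have confirmed. The variable names are just for
illustration:

```r
# A bare NA is a logical constant in base R, so DT$z <- NA
# would create a logical column.
na_plain <- NA
class(na_plain)    # "logical"

# Typed NA constants exist for the other atomic types.
na_double <- NA_real_
class(na_double)   # "numeric"
na_int <- NA_integer_
class(na_int)      # "integer"

# Coercing 5 to logical gives TRUE, which matches the column of
# TRUE values in the output above.
coerced <- as.logical(5)
coerced            # TRUE
```

If that guess is right, initializing with DT$z <- NA_real_ instead of
NA might avoid the problem, but I have not tried it.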
Thanks!
On 4 August 2011 08:18, Chris Neff <caneff at gmail.com> wrote:
> A second question while I'm playing with it. It seems from the FRs
> that it doesn't support multiple := in one select, but:
>
> DT <- data.table(x=1:10, y=rep(1:2,5))
> DT$a = 0
> DT$z = 0
>
> DT[, list(a := y/sum(y), z := 5)]
>
> works just fine for me. An error gets thrown but afterwards the
> columns are modified as intended. Why the error?
>
>> DT[,list(z:=5,a:=y/sum(y))]
> z
> [1] 5
> [1] TRUE
> a
> y/sum(y)
> [1] TRUE
> Error in data.table(`:=`(z, 5), `:=`(a, y/sum(y))) :
> column or argument 1 is NULL
>> DT
> x y z a
> [1,] 1 1 5 0.06666667
> [2,] 2 2 5 0.13333333
> [3,] 3 1 5 0.06666667
> [4,] 4 2 5 0.13333333
> [5,] 5 1 5 0.06666667
> [6,] 6 2 5 0.13333333
> [7,] 7 1 5 0.06666667
> [8,] 8 2 5 0.13333333
> [9,] 9 1 5 0.06666667
> [10,] 10 2 5 0.13333333
>
> -Chris
>
> On 4 August 2011 08:12, Chris Neff <caneff at gmail.com> wrote:
>> Hi all,
>>
>> If I do:
>>
>> DT <- data.table(x=1:10, y=rep(1:2,5))
>>
>> Then try the following
>>
>> DT[, z:=5]
>>
>> I get:
>>
>>> DT[, z:=5]
>> z
>> [1] 5
>> [1] TRUE
>> NULL
>>
>> and if I were to do DT <- DT[, z:=5], then DT gets set to NULL.
>> Alternatively if I do
>>
>> DT[1:10, z:=5]
>>
>> I get
>>
>>> DT=DT[1:nrow(DT),z:=5]
>> z
>> [1] 5
>> [1] 1 2 3 4 5 6 7 8 9 10
>> Error in `:=`(z, 5) :
>> Attempt to add new column(s) and set subset of rows at the same
>> time. Create the new column(s) first, and then you'll be able to
>> assign to a subset. If i is set to 1:nrow(x) then please remove that
>> (no need, it's faster without).
>>
>>
>> Which is more informative. So I do as it instructs:
>>
>> DT$z <- NA
>>
>> DT[, z:=5]
>>
>> And as output I get:
>>
>>> DT
>> x y z
>> [1,] 1 1 TRUE
>> [2,] 2 2 TRUE
>> [3,] 3 1 TRUE
>> [4,] 4 2 TRUE
>> [5,] 5 1 TRUE
>> [6,] 6 2 TRUE
>> [7,] 7 1 TRUE
>> [8,] 8 2 TRUE
>> [9,] 9 1 TRUE
>> [10,] 10 2 TRUE
>>
>>
>> Why isn't z 5 as assigned? I think it is because I assigned it as
>> NA, and data.table didn't know to change it to integer (although why
>> it changed it to logical is another puzzle). If I instead do
>>
>> DT$z <- 0
>>
>> DT[, z:=5]
>>
>> It works fine.
>>
>> So my two points are:
>>
>> A) Doing DT[,z:=5] should be as informative as doing DT[1:nrow(DT),
>> z:=5] with the error message.
>>
>> B) What went wrong with the NA assignment I did?
>>
>> Thanks!
>> Chris
>>
>