[datatable-help] setnames changes names of other data.table

Matt Dowle mdowle at mdowle.plus.com
Sat Jan 11 15:53:00 CET 2014


On 11/01/14 14:31, Arunkumar Srinivasan wrote:
>
> Thanks for reporting. That's expected behaviour. Use an explicit copy().
>
> In short, when you do DT1 <- DT2, there's *no copy* being made.
> They still reference/point to the same location (try doing
> tracemem(DT1) and tracemem(DT2)).
>
Just to be clear, that's no different to base. DF1 <- DF2 makes no copy
in base either. In fact, x <- y never makes a copy in R, regardless of
what x and y are.
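
A minimal sketch of that point, assuming data.table is installed (it is
used here only for its address() helper; the object names are made up
for illustration). Right after a plain assignment, both symbols report
the same memory address:

```r
library(data.table)  # assumption: loaded only for address()

# A plain vector: y is just a second name for the same object.
x <- 1:10
y <- x

# A data.frame behaves the same way: no copy on assignment.
DF2 <- data.frame(a = letters[1:4], b = 1:4)
DF1 <- DF2

vec_shared <- address(x) == address(y)
df_shared  <- address(DF1) == address(DF2)
print(c(vector = vec_shared, data.frame = df_shared))
```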

The phrase "copy-on-write" is terribly named because it might imply that
DF1 <- DF2 copies. I think the term should be "copy-on-subassign" because
that's really what R does. Only at the point of changing a sub-element
of an object does <- copy (if another symbol is pointing to that same
object). It is switching from <- to set* and := that does things by
reference, not switching from data.frame to data.table. Subassigning to
a data.table using <- will still copy the entire data.table, just like
base. Only set* and := can modify by reference. In fact, set* can be
used on data.frame too, and on other objects; e.g. setattr is often
useful on non-data.tables, and therefore copy() is useful on
non-data.tables too. Hope that clarifies.
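
To make the distinction concrete, here is a small sketch, assuming a
current data.table release. The table names and the "note" attribute are
hypothetical and only for illustration. It shows setnames changing a
shared data.table by reference, copy() breaking the link, and setattr
plus copy() behaving the same way on a plain data.frame:

```r
library(data.table)

DT2 <- data.table(a = letters[1:4], b = 1:4)
DT1 <- DT2                                  # two names, one object

# set* works by reference, so the change is visible through DT2 as well.
setnames(DT1, c("a", "b"), c("aa", "bb"))
names_shared <- names(DT2)

# An explicit copy() gives an independent object.
DT3 <- copy(DT2)
setnames(DT3, c("aa", "bb"), c("x", "y"))
names_after_copy <- names(DT2)              # DT2 is not affected by DT3

# setattr() and copy() are not limited to data.tables.
DF2 <- data.frame(a = 1:2)
DF1 <- DF2
setattr(DF1, "note", "changed")             # by reference: DF2 sees it too
DF3 <- copy(DF2)
setattr(DF3, "note", "other")               # only the copy changes
note_on_DF2 <- attr(DF2, "note")

print(names_shared)
print(names_after_copy)
print(note_on_DF2)
```

The copy() call is what decides whether two symbols stay linked. The
class of the object does not matter.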
>
> So when you change the names of one DT by reference, the other one
> will get changed as well - they're both pointing to the same location.
>
> To overcome this, when you want to duplicate a DT, explicitly use
> copy(). That is, DT1 <- copy(DT2). Now if you setnames(DT1, c("x",
> "y")), then DT2's names won't get changed.
>
> I think there's a FR somewhere on documenting this... Thanks again for 
> reporting (with nice example).
>
> Arun
> ------------------------------------------------------------------------
> From: Holger Kirsten <hkirsten at imise.uni-leipzig.de>
> Reply: Holger Kirsten <hkirsten at imise.uni-leipzig.de>
> Date: January 11, 2014 at 3:19:02 PM
> To: datatable-help at lists.r-forge.r-project.org
> Subject: [datatable-help] setnames changes names of other data.table
>> In a debugging session, I found that setnames changed the names of an
>> identical data.table even though it has a different name:
>>
>> > ############### using setnames()
>> > require(data.table)
>> > mytab = data.table(a = letters[1:4], b = 1:4 )
>> > str(mytab)
>> Classes 'data.table' and 'data.frame':    4 obs. of  2 variables:
>>  $ a: chr  "a" "b" "c" "d"
>>  $ b: int  1 2 3 4
>>  - attr(*, ".internal.selfref")=<externalptr>
>> > mytab
>>    a b
>> 1: a 1
>> 2: b 2
>> 3: c 3
>> 4: d 4
>> >
>> > othertab = mytab
>> > othertab
>>    a b
>> 1: a 1
>> 2: b 2
>> 3: c 3
>> 4: d 4
>> > setnames(othertab, c("a", "b"), c("aa","bb"))
>> > othertab
>>    aa bb
>> 1:  a  1
>> 2:  b  2
>> 3:  c  3
>> 4:  d  4
>> > mytab ## names have unexpectedly changed too
>>    aa bb
>> 1:  a  1
>> 2:  b  2
>> 3:  c  3
>> 4:  d  4
>> >
>> > ############### using names()
>> > mytab = data.table(a = letters[1:4], b = 1:4 )
>> > str(mytab)
>> Classes 'data.table' and 'data.frame':    4 obs. of  2 variables:
>>  $ a: chr  "a" "b" "c" "d"
>>  $ b: int  1 2 3 4
>>  - attr(*, ".internal.selfref")=<externalptr>
>> > mytab
>>    a b
>> 1: a 1
>> 2: b 2
>> 3: c 3
>> 4: d 4
>> >
>> > othertab = mytab
>> > othertab
>>    a b
>> 1: a 1
>> 2: b 2
>> 3: c 3
>> 4: d 4
>> > names(othertab) = c("aa","bb")
>> Warning message:
>> In `names<-.data.table`(`*tmp*`, value = c("aa", "bb")) :
>>   The names(x)<-value syntax copies the whole table. This is due to 
>> <- in R itself. Please change to setnames(x,old,new) which does not 
>> copy and is faster. See help('setnames'). You can safely ignore this 
>> warning if it is inconvenient to change right now. Setting 
>> options(warn=2) turns this warning into an error, so you can then use 
>> traceback() to find and change your names<- calls.
>> > othertab
>>    aa bb
>> 1:  a  1
>> 2:  b  2
>> 3:  c  3
>> 4:  d  4
>> > mytab ## names unchanged as expected
>>    a b
>> 1: a 1
>> 2: b 2
>> 3: c 3
>> 4: d 4
>> >
>> > sessionInfo()
>> R version 3.0.1 (2013-05-16)
>> Platform: x86_64-w64-mingw32/x64 (64-bit)
>>
>> locale:
>> [1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 
>> LC_MONETARY=German_Germany.1252 LC_NUMERIC=C LC_TIME=German_Germany.1252
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods base
>>
>> other attached packages:
>> [1] data.table_1.8.10
>>
>> loaded via a namespace (and not attached):
>> [1] tools_3.0.1
>> _______________________________________________
>> datatable-help mailing list
>> datatable-help at lists.r-forge.r-project.org
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help