[datatable-help] data.table BUG : data.table assignment

Ken Williams Ken.Williams at windlogics.com
Thu Oct 4 17:34:10 CEST 2012


One thing that would be super-awesome is a lazy copy-on-write mechanism.  By which I mean table2 would initially point to the same memory location as table1, but as soon as it's modified, the portions being modified would be copied to a new location.

It's a pretty rare use case to make a copy of a data structure without intending to modify it.  The only instance I can really think of is passing arguments to a function, which IIUC is already copy-on-write in R?
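
To make the function-argument case concrete, here is a minimal base-R sketch (the names df and add_col are made up for illustration). It only shows the visible effect, namely that modifying an argument inside a function leaves the caller's object untouched. It does not show when R actually performs the copy internally.

```r
# A plain data.frame, no data.table involved.
df <- data.frame(id = 1:3, x = c(1, 2, 3))

# Hypothetical helper: adds a column to its argument and returns the result.
add_col <- function(d) {
  d$y <- sum(d$x)   # modifies the function's local version of d only
  d
}

res <- add_col(df)

"y" %in% names(df)    # FALSE: the caller's object is unchanged
"y" %in% names(res)   # TRUE: only the returned object has the new column
```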

Also, I do think Natus' example is a little different from your example on SO.  In his, a new column is being added, but in yours, an existing column is being modified.

Is there a doc reference showing which circumstances make `<-` assign by reference, and which make a deep copy?  For example, if I do `table2 <- table1[x>1, list(id)]`, it seems to do a deep copy:

> table1<-data.table(id=c(1,2,3),x=c(1,2,3))
> table2 <- table1[x>1, list(id)]
> table2[, id := 3:4]
> table1
   id x
1:  1 1
2:  2 2
3:  3 3
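
One way to check this without modifying anything might be to compare memory addresses. The sketch below assumes data.table's address() function is available in the installed version. The names same_obj, subset_dt and copied_dt are made up for illustration.

```r
library(data.table)

table1 <- data.table(id = c(1, 2, 3), x = c(1, 2, 3))

same_obj  <- table1                    # plain `<-`: a second name for the same table
subset_dt <- table1[x > 1, list(id)]   # subsetting builds a new table
copied_dt <- copy(table1)              # explicit copy

# Compare where each name points in memory.
identical(address(table1), address(same_obj))    # TRUE
identical(address(table1), address(subset_dt))   # FALSE
identical(address(table1), address(copied_dt))   # FALSE
```

If the addresses match, a later := on either name should show up under both, as in Natus' example.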

Sorry if this has already been hashed out a million times, I'm pretty new to data.table.

-Ken

From: datatable-help-bounces at lists.r-forge.r-project.org [mailto:datatable-help-bounces at lists.r-forge.r-project.org] On Behalf Of Christoph Jäckel
Sent: Thursday, October 04, 2012 7:07 AM
To: natus
Cc: datatable-help at lists.r-forge.r-project.org
Subject: Re: [datatable-help] data.table BUG : data.table assignment

This is actually intended behaviour and I had the problem once as well. Here is my question and the solution to it:

http://stackoverflow.com/questions/8030452/pass-by-reference-the-operator-in-the-data-table-package

In a nutshell: Use copy() if you don't want table2 to have y as well.
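
A minimal sketch of the difference, reusing the table from the original example (the name alias is made up here to stand for the plain-assignment case):

```r
library(data.table)

table1 <- data.table(id = c(1, 2, 3), x = c(1, 2, 3))

alias  <- table1         # plain assignment: no copy is made
table2 <- copy(table1)   # explicit copy: independent of table1

table1[, y := sum(x)]    # add column y by reference

names(alias)    # includes "y", because alias and table1 are the same table
names(table2)   # "id" "x": the copy is unaffected
```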

I hope this helps,

Christoph

On Thu, Oct 4, 2012 at 1:56 PM, natus <niparisco at gmail.com> wrote:
Hello,

see this example :

require(data.table)

table1<-data.table(id=c(1,2,3),x=c(1,2,3))
table2<-table1
table1[,y:=sum(x)]
table1
table2

The problem? Both table1 and table2 have the variable 'y', but only
table1 should.

Thx


