[datatable-help] whole data.table copied "warning"

Arunkumar Srinivasan aragorn168b at gmail.com
Sun Jan 25 19:06:07 CET 2015


The over-allocation of column pointers is (silently) lost when we assign
using `<-` in R. That's why we insist on using `:=`, and it is also why we
use an external pointer to detect whether the over-allocation has been lost.
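
To make the over-allocation visible, here is a minimal sketch, assuming a reasonably recent data.table release. The table `DT` and its columns are made up for illustration. `length()` reports the column slots in use, `truelength()` the slots allocated, and `address()` lets us check that `:=` did not have to copy the table.

```r
library(data.table)

# Hypothetical toy table, for illustration only.
DT <- data.table(a = 1:3)

used        <- length(DT)      # column pointer slots currently in use
allocated   <- truelength(DT)  # slots allocated, including the spare ones
addr_before <- address(DT)     # memory address of the table itself

DT[, b := 4:6]                 # the new column goes into a spare slot

addr_after <- address(DT)

print(c(used = used, allocated = allocated))
print(identical(addr_before, addr_after))
```

As long as spare slots are available, the address of the table should stay the same across the `:=` call.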

Once the over-allocation is gone, `:=` can't add the column in place
directly. So we have to take a shallow copy and over-allocate once again.
The message should say that the data was "shallow" copied, so it shouldn't
affect performance (unless you do this repeatedly).
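
A rough sketch of why a shallow copy is cheap, assuming a current data.table release where the re-allocation helper is exported as `setalloccol()` (older releases call it `alloc.col()`). The table and the value 2048 are arbitrary choices for illustration. Only the vector of column pointers is re-allocated; the column data itself stays where it is.

```r
library(data.table)

# Hypothetical toy table, for illustration only.
DT <- data.table(a = 1:3, b = 4:6)

col_before <- address(DT$a)         # address of the data in column a

# Ask for 2048 spare column slots. This re-allocates the vector of
# column pointers (a shallow copy); the columns are not duplicated.
invisible(setalloccol(DT, 2048L))

col_after <- address(DT$a)
spare     <- truelength(DT) - length(DT)

DT[, c := 7L]                       # := can use a spare slot again

print(identical(col_before, col_after))
print(spare)
```

Because the columns are shared rather than duplicated, the cost depends on the number of columns and not on the number of rows.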

But it's best to avoid the message altogether by using data.table the
idiomatic way.
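
For reference, a small sketch of what the idiomatic way looks like. The table and column names are invented for the example. Columns are added and updated by reference with `:=` or `set()`, never with `DT$new <- value`.

```r
library(data.table)

# Hypothetical toy table, for illustration only.
DT <- data.table(id = 1:3)

DT[, flag := TRUE]                               # add a column by reference
set(DT, j = "score", value = c(0.5, 1.5, 2.5))   # the same thing via set()
DT[, score := score * 2]                         # update a column in place

# The form to avoid, shown for comparison only:
# DT$flag <- TRUE

print(names(DT))
```

`set()` is mostly useful inside loops, where it avoids the overhead of `[.data.table`.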

I'm not sure if the R version matters here or not.

HTH
Arun.


On Sun, Jan 25, 2015 at 12:33 PM, statquant3 <statquant at outlook.com> wrote:

> I am experiencing a problem with data.table and can't find out what the
> problem is.
> I am getting the warning "this data table had to be copied over".
> First, I cannot reproduce the bug on a small example, so I realize this post is
> mostly a bottle thrown into the sea...
>
> Here is what I do:
>
> 1.
> I have several tables on a distant kdb server that I retrieve through a
> proprietary package.
> Those tables are copied and some data.frames are created. (I think it has
> nothing to do with it, but you never know.)
> Because I have several tables on that server, I created a function wrapping
> the "fetching" of a table: a data.frame is created and
> I call setDT on it to make it a data.table.
> 2.
> I call this function within an sapply call and end up with a named list
> of
> data.tables.
> 3.
> I modify some of those tables in an R function, once again using lapply, so I
> end up with a modified list of data.tables.
> 4.
> I use attach() to be able to work on each data.table by name
>
> **5**
> Later, when I try to add a column with ":=", I get a warning saying
> that the whole table had to be copied.
> Am I doing something obviously wrong here?
>
>
> pseudo code:
>
> fetchingData = function(tableName, connectionToServer){
>    DF = connectAndFetchTable(connectionToServer, tableName)
>    DT = setDT(DF)
>    return(DT)
> }
>
> modifyDataTable = function(DT){
>    if('thisColName' %in% colnames(DT)){
>       DT[,thisColName:=someTransformation(thisColName)]
>    }
>    ...
> }
>
> In the main code:
>
> myDataTableList = sapply(c('tableA','tableB','tableC','tableD'),
> FUN=fetchingData, connectionToServer=myCon)
> myDataTableList = lapply(myDataTableList, FUN=modifyDataTable)
> attach(myDataTableList)
>
> tableB[,newColumn:=1L]
> *** getting the warning here***
>
> Note: I am using R 3.0.2.
>
> Finally, let me say that I have been using data.table for a long time now, so
> I know about := and the set function (which are usually forgotten, triggering
> this warning, e.g. when doing DT$new = stuff).
>
> If any of you can figure it out I'd be grateful (sorry I can't reproduce it
> properly).
>