[datatable-help] Memory usage of data.table chaining
Arunkumar Srinivasan
aragorn168b at gmail.com
Fri Feb 20 23:36:30 CET 2015
Hi Mick,
Hope it went great!
Yes, this isn’t particularly memory efficient: you materialise the first subset, only to subset it again with your second condition. A single query within `[…]` is also much easier to optimise than a chain of expressions.
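A minimal sketch of the difference, assuming a small made-up `trade.dt` (the column names follow the thread; the values and the name `intermediate.dt` are invented for illustration). The chain is split in two here only to make the temporary table visible:

```r
library(data.table)

# Hypothetical stand-in for trade.dt; values are invented for illustration.
trade.dt <- data.table(
  transactionID   = 1:6,
  typeID          = c(34L, 35L, 34L, 36L, 35L, 34L),
  transactionType = c("buy", "sell", "sell", "buy", "buy", "sell"),
  quantity        = c(10L, 20L, 5L, 7L, 3L, 12L)
)
showID <- c(34L, 35L)
side   <- "sell"

# Chained form: the first `[` returns a complete temporary data.table,
# which the second `[` then filters again.
intermediate.dt <- trade.dt[typeID %in% showID]
chained.dt      <- intermediate.dt[transactionType == side]

# Single-call form: both conditions are evaluated in one query,
# so no intermediate table is built.
combined.dt <- trade.dt[typeID %in% showID & transactionType == side]

# Size of the temporary table created by the chain.
nrow(intermediate.dt)

# Both forms should select the same trades.
setequal(chained.dt$transactionID, combined.dt$transactionID)
```

On a large table the temporary can be almost as big as the original when the first condition is not very selective, which is where the extra memory goes.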
What’s the rationale for doing it this way? To take advantage of automatic indexing? It’d be great to have auto indexing optimised for complex expressions like `typeID %in% showID & transactionType == side`, but until then, setting a key and subsetting on it would be the best way.
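Roughly what I mean by setting a key and subsetting, as a sketch on made-up data (the values below are invented; only the column names come from the thread). Note that `setkey()` reorders `trade.dt` by reference:

```r
library(data.table)

# Hypothetical stand-in for trade.dt; values are invented for illustration.
trade.dt <- data.table(
  transactionID   = 1:6,
  typeID          = c(34L, 35L, 34L, 36L, 35L, 34L),
  transactionType = c("buy", "sell", "sell", "buy", "buy", "sell"),
  quantity        = c(10L, 20L, 5L, 7L, 3L, 12L)
)
showID <- c(34L, 35L)
side   <- "sell"

# Sort the table once by both columns (done by reference, no copy).
setkey(trade.dt, typeID, transactionType)

# CJ() builds every (typeID, transactionType) combination of interest;
# joining on the key finds the matching rows by binary search.
# nomatch = 0L drops combinations that have no trades.
keyed.dt <- trade.dt[CJ(showID, side), nomatch = 0L]

# The plain logical-AND query, for comparison.
scan.dt <- trade.dt[typeID %in% showID & transactionType == side]

# Both should select the same trades.
setequal(keyed.dt$transactionID, scan.dt$transactionID)
```

The keyed join goes straight from `trade.dt` to the final rows in one step, so there is no intermediate subset to hold in memory.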
HTH,
Arun
On 20 Feb 2015 at 16:42:51, Mick Cooney (mickcooney at gmail.com) wrote:
I gave a talk about data.table last night to Dublin R and got a very
interesting question at the end of it that I hadn't thought of before.
I was showing how you can chain operations together in nice, concise
one-liners; the specific example I gave was:
    show.dt <- trade.dt[typeID %in% showID
                       ][transactionType == side
                       ][, list(transactionID, transactTime,
                                transactionType,
                                typeID, typeName, quantity, price)]
    print(tail(show.dt, n = count))
This code is written for the game Eve Online and is used to show the
last n trades on one side of a trade that my character had done, and
I used it as an example of operation chaining.
I was asked at the end of the talk if chaining the typeID and
transactionType conditions was any different to using a logical AND,
and my response was that I wasn't sure, but I figured it might be, as
doing the logical AND would invoke a vector scan.
He then asked about memory use: in the above example, are all the
intermediate copies of the table kept in memory during the invocation,
in effect mushrooming the amount of memory required?
If that were the case, I could imagine that for large tables it might
be worth going with the logical AND to avoid making the multiple
copies?
--
Mick Cooney
mickcooney at gmail.com