<div>FYI, I remember we raised the idea a while ago of data.table leveraging the int64 package. Now we seem to have a better alternative? If we are indeed considering supporting the int64 package, maybe you should take a look at this (though there is a 90% chance you already did :p). <br>
</div><div><br></div><div>[Copy from R-packages Digest, Vol 101, Issue 3]<br></div><div><br></div><div>
<span style>Package 'bit64' provides fast serializable S3 atomic 64bit (signed)</span><br style><span style>integers that can be used in vectors, matrices, arrays and data.frames.</span><br style><span style>Methods are available for coercion from and to logicals, integers,</span><br style>
<span style>doubles, characters as well as many elementwise and summary functions.</span><br style><br style><span style>Package 'bit64' has the following advantages over package 'int64' (which</span><br style>
<span style>was sponsored by Google):</span><br style><span style>- true atomic vectors usable with length, dim, names etc.</span><br style><span style>- only S3, not S4 class system used to dispatch methods</span><br style>
<span style>- less RAM consumption by factor 7 (under 64 bit OS)</span><br style><span style>- faster operations by factor 4 to 2000 (under 64 bit OS)</span><br style><span style>- no slow-down of R's garbage collection (as caused by the pure</span><br style>
<span style>existence of 'int64' objects)</span><br style><span style>- pure GPL, no copyrights from transnational commercial company</span>
</div><div><span style><br></span></div><div>
<span style>While the advantage of the atomic S3 design over the complicated S4</span><br style><span style>object design is obvious, it is less obvious that an external package is</span><br style><span style>the best way to enrich R with 64bit integers. An external package will</span><br style>
<span style>not give us literals such as 1LL or directly allow us to address larger</span><br style><span style>vectors than possible with base R. But it allows us to properly address</span><br style><span style>larger vectors in other packages such as 'ff' or 'bigmemory' and it</span><br style>
<span style>allows us to properly work with large surrogate keys from external</span><br style><span style>databases. An external package realizing just one data type also makes a</span><br style><span style>perfect test bed to play with innovative performance enhancements.</span><br style>
<span style>Performance tuned sorting and hashing are planned for the next release,</span><br style><span style>which will give us fast versions of sort, order, merge, duplicated,</span><br style><span style>unique, and table - for 64bit integers.</span>
<span style><br></span></div><div><span style><br></span></div><div>
<span style>For those who still hope that R's 'integer' will be 64bit some day, here</span><br style><span style>is my key learning: migrating R's 'integer' from 32 to 64 bit would be</span><br style>
<span style>RAM expensive. It would most likely also require migrating R's 'double'</span><br style><span style>from 64 to 128 bit - in order to again have a data type to which we can</span><br style><span style>losslessly coerce. The assumption that 'integer' is a proper subset of</span><br style>
<span style>'double' is scattered over R's semantics. We all expect that binary and</span><br style><span style>n-ary functions such as '+' and 'c' do return 'double' and do not</span><br style>
<span style>destroy information. With solely extending 64bit integers but not 128bit</span><br style><span style>doubles, we have semantic changes potentially disappointing such</span><br style><span style>expectations: integer64+double returns integer64 and does kill decimals.</span><br style>
<span style>I did my best to make operations involving integer64 consistent and</span><br style><span style>numerically stable - please consult the documentation at ?bit64 for details.</span><br style><br style><span style>Since this package is 'at risk' to create a lot of dependencies from</span><br style>
<span style>other packages, I'd appreciate serious beta-testing and also</span><br style><span style>code-review, ideally from the R-Core team. Please check the</span><br style><span style>'Limitations' sections at the help page and the numerics involving "long</span><br style>
<span style>double" in C. If the conclusion is that this should be better done in</span><br style><span style>Base R - I happily donate the code and drop this package. If we have to</span><br style><span style>go with an external package for 64bit integers, it would be great if</span><br style>
<span style>this work could convince the Rcpp team including Romain about the</span><br style><span style>advantages of this approach. Shouldn't we join forces here?</span><br style><br style><span style>Best regards</span><br style>
<br style><span style>Jens Oehlschlägel</span><br style><span style>Munich, 21.2.2012</span>
</div>
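<div>To make the point about 'integer' being a proper subset of 'double' concrete, here is a rough sketch (mine, not part of the announcement). The base R lines should run anywhere. The bit64 lines only run if the package is installed, and they assume the documented as.integer64() coercion from character; the variable names are just for illustration.</div>
<pre>
```r
# Base R: 'integer' is 32-bit, so every integer value fits exactly in a 'double'.
int_max <- .Machine$integer.max
stopifnot(as.double(int_max) == int_max)

# A double has a 53-bit significand, so 2^53 + 1 cannot be represented;
# the sum rounds back to 2^53 and the comparison comes out TRUE.
collides <- (2^53 + 1) == 2^53
print(collides)

# With bit64 (skipped if not installed), the same value can be held exactly.
# Mixing it with a double is where the announcement says decimals are lost.
if (requireNamespace("bit64", quietly = TRUE)) {
  library(bit64)
  big <- as.integer64("9007199254740993")  # 2^53 + 1, passed as a string
  print(big)
  mixed <- big + 0.5
  print(class(mixed))                      # class of integer64 + double
}
```
</pre>
<div>So 64bit integers are not a subset of 64bit doubles, which is why a lossless common type for integer64 and double would need something wider than today's 'double'.</div>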