[datatable-help] Non-integer key?

Leon Baum leonbaum2 at gmail.com
Mon Aug 15 03:44:22 CEST 2011


Hello,

I understand the performance reasons for requiring the key to be of
integer type, but I often would like to use data.table's joining and
grouping features with millisecond POSIXct timestamps.

One workaround is to convert the timestamps to an integer representing
the milliseconds since the first timestamp, but for time ranges longer
than a few weeks (roughly 24.8 days), the number of milliseconds exceeds
what can be stored in a signed 32-bit integer.
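
As a rough check of that limit, here is a minimal sketch in Python (chosen only for the arithmetic; nothing here is data.table code). It assumes a signed 32-bit integer whose largest value is 2^31 - 1, as with R's integer type; the function name is hypothetical.

```python
# Largest value of a signed 32-bit integer (assumed to match R's integer type).
INT32_MAX = 2**31 - 1

# Milliseconds in one day: 24 h * 60 min * 60 s * 1000 ms.
MS_PER_DAY = 24 * 60 * 60 * 1000


def max_days_in_int32_ms():
    """Longest span, in days, whose millisecond offset still fits in 32 bits."""
    return INT32_MAX / MS_PER_DAY


if __name__ == "__main__":
    # Prints a value just under 25 days.
    print(round(max_days_in_int32_ms(), 2))
```

So any data set spanning more than about three and a half weeks overflows the offset.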

I see two possible solutions:

1. Store the integer milliseconds as numeric type (a 64-bit double),
which represents integers exactly up to 2^53, far beyond the 32-bit
range. This solution would simply require an option for data.table to
assume that a numeric type is actually storing integer values.

2. Add a second, slower, sorting algorithm to data.table that could
handle non-integer values. This solution would allow POSIXct types to be
used directly by data.table.
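
To illustrate the premise behind option 1, the sketch below (again plain Python, not data.table code) checks which whole numbers survive a round trip through a 64-bit double. It assumes IEEE 754 doubles, which is what R's numeric type and Python's float use; the helper name is hypothetical.

```python
# Every integer up to 2**53 in magnitude is exactly representable as an
# IEEE 754 double; above that, gaps appear between representable values.
MAX_EXACT = 2**53

# Milliseconds in a 365.25-day year, used as a realistic offset size.
MS_PER_YEAR = 36525 * 24 * 60 * 60 * 10


def is_exact_in_double(n):
    """True if the integer n survives conversion to a double and back."""
    return int(float(n)) == n


if __name__ == "__main__":
    print(is_exact_in_double(100 * MS_PER_YEAR))  # a century of milliseconds
    print(is_exact_in_double(MAX_EXACT))          # the last guaranteed value
    print(is_exact_in_double(MAX_EXACT + 1))      # first integer that is lost
```

Millisecond offsets therefore stay exact in a double for any practical time range, although not for the full 64-bit integer range.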

Is either of these solutions possible and reasonable? Or is there some
workaround I'm missing that would let me use larger-than-32-bit integers
as keys?

-Leon

