[datatable-help] Non-integer key?

Matthew Dowle mdowle at mdowle.plus.com
Mon Aug 15 09:20:08 CEST 2011


We'd like to do things like this and have talked about it. There isn't
anything fundamentally stopping it, just time/priority really.

Another option is to use two columns: date and time. The time can then
be a 31-bit integer count of milliseconds from midnight, or (less
traditionally) an HHMMSSmmm integer. The advantage of two columns is that
it makes it easy to use roll=TRUE without rolling across days. There might
be more advantages of HHMMSSmmm time vs epoch time than is generally realised.
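
To make the two time encodings concrete, here is a minimal sketch in Python
(used only for the arithmetic; it is not data.table code). All function
names are hypothetical, and roll_within_day is just an analogue of a
rolling lookup keyed on (date, time), not data.table's roll=TRUE itself.

```python
from bisect import bisect_right

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def to_ms_from_midnight(h, m, s, ms):
    """Encode a time of day as milliseconds since midnight."""
    return ((h * 60 + m) * 60 + s) * 1000 + ms


def to_hhmmssmmm(h, m, s, ms):
    """Encode a time of day as a human-readable HHMMSSmmm integer."""
    return ((h * 100 + m) * 100 + s) * 1000 + ms


def from_hhmmssmmm(x):
    """Decode an HHMMSSmmm integer back to (h, m, s, ms)."""
    x, ms = divmod(x, 1000)
    x, s = divmod(x, 100)
    h, m = divmod(x, 100)
    return h, m, s, ms


def roll_within_day(keys, values, date, time):
    """Last value at or before (date, time), but only on the same date.

    keys is a sorted list of (date, time) tuples; values is aligned to it.
    Returns None when the only earlier rows belong to a previous day.
    """
    i = bisect_right(keys, (date, time)) - 1
    if i >= 0 and keys[i][0] == date:
        return values[i]
    return None


# Both encodings stay within the signed 32-bit integer range.
assert to_ms_from_midnight(23, 59, 59, 999) <= INT32_MAX
assert to_hhmmssmmm(23, 59, 59, 999) <= INT32_MAX

# Illustrative data: dates as YYYYMMDD integers, times as HHMMSSmmm.
keys = [(20110814, 170000000), (20110815, 93000000)]
values = ["a", "b"]
```

Both encodings sort in time-of-day order, so either can serve as an integer
key. With the illustrative data above, a lookup at 09:20:08.000 on
2011-08-15 finds nothing instead of rolling back to the previous day's row.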

> 1. Store the integer milliseconds as numeric type, which allows
> 64-bit integers to be stored exactly. This solution would
> simply require an option for data.table to assume that a
> numeric type is actually storing integer values. 

That's an interesting idea: storing a 64-bit int in what R thinks is a
double vector. There could be a cast in sortedmatch.c, and month, week,
hour, minute and ms methods could be defined for it at R level for
use by 'by'. I wonder what Tom thinks about this.
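
There are two ways to read "a 64-bit int in a double", and a small sketch
may help tell them apart. It is in Python purely for illustration; the
helper names are hypothetical, the timestamp is an example value, and
nothing here is taken from sortedmatch.c. One reading stores the integer
*value* as a double, which is exact only up to 2^53. The other reinterprets
the same 8 bytes, which is what a cast at C level would amount to.

```python
import struct


def int64_bits_as_double(x):
    """Reinterpret the 8 bytes of a signed 64-bit integer as a double."""
    return struct.unpack("<d", struct.pack("<q", x))[0]


def double_bits_as_int64(d):
    """Reinterpret the 8 bytes of a double as a signed 64-bit integer."""
    return struct.unpack("<q", struct.pack("<d", d))[0]


epoch_ms = 1_313_392_808_000  # example: milliseconds since 1970-01-01 UTC

# Reading 1: keep the integer value in a double. Exact here, because
# epoch milliseconds are far below 2**53.
as_value = float(epoch_ms)
value_is_exact = int(as_value) == epoch_ms

# Above 2**53 consecutive integers are no longer distinguishable.
collides_above_2_53 = float(2**53) == float(2**53 + 1)

# Reading 2: keep the integer's bit pattern inside the double's storage.
# The double itself looks meaningless, but the bits round-trip.
as_bits = int64_bits_as_double(epoch_ms)
round_trip = double_bits_as_int64(as_bits)
```

Under the first reading ordinary numeric comparison still orders the
timestamps correctly. Under the second, any code that touches the column
has to cast back to an integer before comparing or printing.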

> 2. Add a second, slower, sorting algorithm to data.table that
> could handle non-integer values. This solution would allow
> POSIXct types to be used directly by data.table. 

Yes, that's possible and reasonable, too. That would be a great
contribution to add to data.table. R-Forge makes it
easy to join and contribute, one click for you to request, one click for
me to accept. Every commit is emailed round the developers at the time,
so as long as test.data.table() passes, then feel free to enhance.

Matthew




On Sun, 2011-08-14 at 22:33 -0400, Leon Baum wrote:
> Chris Neff <caneff at gmail.com> writes:
> 
> > Have you seen the ITime class (also IDate and IDateTime)? Meant for
> > this purpose.
> >
> 
> Yes, I looked at it briefly, but it appears ITime can only handle
> whole-second precision.
> 
> Perhaps it would be possible to modify IDate/ITime to store time as whole
> milliseconds, but in the numeric type. This change would allow it to
> store any reasonable date/time in milliseconds, but data.table could
> still use radix sorting. Thoughts?
> 
> -Leon



