[datatable-help] sorting on floating point column

Arunkumar Srinivasan aragorn168b at gmail.com
Tue Apr 30 16:16:03 CEST 2013


Matthew, 
I see. I hadn't thought about tolerance. That said,

dt[with(dt, order(y)), ]

seems to do the task right (just as it would for a data.frame). I'm glad I don't have to convert to a data.frame to do the ordering. I am not keying by this column. Unless one needs this column as a key, I don't think a tolerance option is essential, although having one would certainly be nice.
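
To make the tolerance point concrete, here is a minimal base-R sketch. It is not data.table's internal code; the vector y simply reuses three values from my output quoted below, and the names tol and close_to_previous are only for illustration.

```r
# Tolerance quoted in Matthew's reply: about 1.49e-08
tol <- sqrt(.Machine$double.eps)

# Three of the values that came out in the "wrong" order after setkey()
y <- c(2.142142e-08, 1.391391e-08, 9.809810e-09)

# Each value differs from its neighbour by less than tol, so a
# tolerance-based comparison would treat them as equal (ties)
close_to_previous <- abs(diff(y)) < tol
print(close_to_previous)

# Base order() compares the doubles exactly, so it still
# returns them in strictly increasing order
sorted_exact <- y[order(y)]
print(sorted_exact)
```

So within the tolerance these values count as ties, which would explain why the keyed result looks unsorted while order() does not.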

Arun


On Tuesday, April 30, 2013 at 4:09 PM, Matthew Dowle wrote:

>  
> Hi,
> data.table sorts double within machine tolerance :
> > sqrt(.Machine$double.eps)
> [1] 1.490116e-08
> > 
>  
> i.e. numbers closer than this are considered equal.
>  
> Otherwise we wouldn't be able to do things like DT[.(3.14)].
>  
> I had a quick look: see the arguments of data.table:::ordernumtol, which takes "tol", but there is no option provided (yet) to change it. Do we need one?
>  
> The examples section of one of the help pages has an example which generates a series of numbers very close together using pi. Note that your numbers are both close together and very close to 0.
>  
> Matthew
>  
> On 30.04.2013 14:52, Arunkumar Srinivasan wrote:
> > Hi there,
> > I just saw something strange when I was sorting a column of p-values. I checked the data.table bug tracker for words "sort" and "floating point" and there were no hits for this case. There's a bug for "integer 64" sort on a column though.
> > So, here's a reproducible example. I'd be glad to file a bug if it is one, or to be corrected if I am doing something wrong.
> > set.seed(45)
> > dt <- data.table(x=sample(50), y= sample(c(seq(0, 1, length.out=1000), 7000000:7000100), 50)/1e7)
> > head(dt)
> >     x            y
> > 1: 32 5.395395e-08
> > 2: 16 6.956957e-08
> > 3: 12 2.142142e-08
> > 4: 18 5.855856e-08
> > 5: 17 6.216216e-08
> > 6: 14 5.025025e-08
> > setkey(dt, "y") # sort by column y
> > head(dt, 10)
> >      x            y
> >  1: 47 1.401401e-09
> >  2: 12 2.142142e-08
> >  3: 24 1.391391e-08
> >  4: 43 9.809810e-09 <~~~ obviously false
> >  5:  1 2.932933e-08
> >  6: 48 2.562563e-08
> >  7: 49 1.891892e-08
> >  8: 40 2.182182e-08
> >  9:  9 7.307307e-09 <~~~ obviously false
> > 10: 45 2.482482e-08
> > 
> > Best,
> > Arun

