[datatable-help] sorting on floating point column
Matthew Dowle
mdowle at mdowle.plus.com
Tue Apr 30 16:13:09 CEST 2013
Or, perhaps the tolerance should be a function of the range of the
column. [The range would be quick to calculate with a single C for
loop.]
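
A rough sketch of that idea in plain R, just to make it concrete. The
names range_tol and rel are hypothetical, nothing like this exists in
data.table, and the values in p are made up:

```r
# Hypothetical helper (not a data.table function): derive the sort tolerance
# from the spread of the column rather than using one fixed absolute value.
range_tol <- function(x, rel = sqrt(.Machine$double.eps)) {
  r <- range(x, na.rm = TRUE)   # min and max of the column
  rel * (r[2] - r[1])           # tolerance scales with the column's range
}

p <- c(1.4e-09, 9.8e-09, 2.1e-08, 6.9e-08)   # made-up values close to 0
range_tol(p)                                 # much smaller than the fixed tolerance
range_tol(p) < sqrt(.Machine$double.eps)
```

One caveat with this approach: a few large values widen the range and
therefore the tolerance. If the sampled y column in the example below
includes any of the values near 0.7, its range is about 0.7 and the
scaled tolerance would still be around 1e-08.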
On 30.04.2013 15:09, Matthew Dowle wrote:
> Hi,
>
> data.table sorts double columns within machine tolerance:
>
> > sqrt(.Machine$double.eps)
> [1] 1.490116e-08
>
> i.e. numbers closer than this are considered equal.
>
> Otherwise we wouldn't be able to do things like DT[.(3.14)].
>
> I had a quick look; see the arguments of data.table:::ordernumtol,
> which takes "tol", but there is no option provided (yet) to change
> this. Do we need one?
>
> In the examples section of one of the help pages there is an example
> which generates a series of numbers very close together using pi.
> Note that your numbers are both close together and very close to 0.
>
> Matthew
>
> On 30.04.2013 14:52, Arunkumar Srinivasan wrote:
>
>> Hi there,
>> I just saw something strange when I was sorting a column of
>> p-values. I checked the data.table bug tracker for the words "sort"
>> and "floating point" and there were no hits for this case. There's a
>> bug for "integer 64" sort on a column though.
>> So, here's a reproducible example. I'd be glad to file a bug if it
>> is one, or to be corrected if I am doing something wrong.
>>
>> library(data.table)
>> set.seed(45)
>> dt <- data.table(x = sample(50),
>>                  y = sample(c(seq(0, 1, length.out = 1000),
>>                               7000000:7000100), 50) / 1e7)
>> head(dt)
>>     x            y
>> 1: 32 5.395395e-08
>> 2: 16 6.956957e-08
>> 3: 12 2.142142e-08
>> 4: 18 5.855856e-08
>> 5: 17 6.216216e-08
>> 6: 14 5.025025e-08
>> setkey(dt, "y") # sort by column y
>> head(dt, 10)
>>      x            y
>>  1: 47 1.401401e-09
>>  2: 12 2.142142e-08
>>  3: 24 1.391391e-08
>>  4: 43 9.809810e-09 <~~~ obviously false
>>  5:  1 2.932933e-08
>>  6: 48 2.562563e-08
>>  7: 49 1.891892e-08
>>  8: 40 2.182182e-08
>>  9:  9 7.307307e-09 <~~~ obviously false
>> 10: 45 2.482482e-08
>>
>> Best,
>> Arun
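
For reference, the ten y values printed after setkey() above are
consistent with a tolerance-based comparison: every adjacent pair is
either increasing or closer together than sqrt(.Machine$double.eps), so
it would count as a tie. A small base R check of just those printed
values (it does not reproduce data.table's internal ordering code, and
the variable names are only illustrative):

```r
tol <- sqrt(.Machine$double.eps)   # 1.490116e-08, the tolerance quoted above

# y values copied from the head(dt, 10) output after setkey()
y <- c(1.401401e-09, 2.142142e-08, 1.391391e-08, 9.809810e-09, 2.932933e-08,
       2.562563e-08, 1.891892e-08, 2.182182e-08, 7.307307e-09, 2.482482e-08)

d <- diff(y)                       # differences between neighbouring rows

is.unsorted(y)                     # TRUE: not sorted under exact comparison
all(d > 0 | abs(d) < tol)          # TRUE: each step is increasing or a tie
```

Note that equality within a tolerance is not transitive: rows 2, 3 and 4
are each within tol of their neighbour, which is how a run of "ties" can
drift downwards.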