[datatable-help] sorting on a floating point column

frederik at ofb.net frederik at ofb.net
Thu Jan 28 00:03:16 CET 2016


data.table 1.9.6

What's surprising is that sorting a list of floats wouldn't do the
obvious thing, and sort them exactly. Is it surprising that this would
be surprising?

Why do you want a minimal test case, when setNumericRounding explains
that the behavior I reported is intentional?

I now see that this is also documented on the data.table::order help
page. So I guess it is already "documented visibly".

And setNumericRounding explains that it is slightly faster to ignore
the last two bytes of each double, since the radix sort then needs
fewer passes.
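
To make the mechanism concrete, here is a minimal sketch in Python (not
data.table's actual code) of how dropping the low bytes of a double can
leave a sorted result non-monotonic. The name sort_key and its
ignored_bytes parameter are made up for illustration, and the sketch
simply truncates the low bytes, whereas the real implementation may
round them instead.

```python
import struct


def sort_key(x: float, ignored_bytes: int = 2) -> int:
    """Hypothetical helper: map a double to an integer radix-sort key,
    discarding its lowest `ignored_bytes` bytes."""
    # Reinterpret the 8 bytes of the IEEE 754 double as an unsigned integer.
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    # Flip bits so that unsigned integer order matches numeric order:
    # negative numbers get every bit flipped, others only the sign bit.
    if bits >> 63:
        bits ^= 0xFFFFFFFFFFFFFFFF
    else:
        bits ^= 0x8000000000000000
    # Dropping the low bytes leaves fewer bytes for a byte-wise radix sort
    # to process (6 instead of 8 when ignored_bytes is 2).
    return bits >> (8 * ignored_bytes)


a = 1.0
b = 1.0 + 2.0 ** -50  # differs from a only in the low mantissa bits

values = [b, a]
exact = sorted(values)                 # plain comparison
approx = sorted(values, key=sort_key)  # a and b tie, input order is kept

print(exact == [a, b])    # True
print(approx == [b, a])   # True: the larger value comes first
```

With ignored_bytes set to 0 the keys of a and b differ again and the
sort is exact, at the cost of processing all eight bytes.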

I wanted to share my experience that this behavior is confusing. Thank
you at least for pointing me to your documentation.

Frederick

On Wed, Jan 27, 2016 at 10:13:44PM +0100, Arunkumar Srinivasan wrote:
> > This is following up on a thread from a couple years ago:
> > http://lists.r-forge.r-project.org/pipermail/datatable-help/2013-May/001689.html
> 
> Things have changed A LOT! I suggest you keep up-to-date by reading the README about bug fixes and features from the github project page: https://github.com/Rdatatable/data.table
> 
> > I ran into this problem myself, it took a bit of time to debug because it is so surprising.
> 
> What’s surprising? Reproducible example please. data.table package version, R version as well please.
> Without that my best guess is for you to look at `?setNumericRounding`.
> 
> -- 
> Arun
> 
> On 27 January 2016 at 21:40:23, frederik at ofb.net (frederik at ofb.net) wrote:
> 
> This is following up on a thread from a couple years ago:  
> 
> http://lists.r-forge.r-project.org/pipermail/datatable-help/2013-May/001689.html  
> 
> I ran into this problem myself, it took a bit of time to debug because  
> it is so surprising.  
> 
> In my case, I was using order() to sort a list of floats.  
> 
> I expected the result to be monotonic but it wasn't!  
> 
> Then I found out that the problem was due to 'order' being part of the  
> data.table library. By using base::order, I was able to get correct  
> behavior.  
> 
> I don't understand why improperly ordering floating point data helps  
> the data.table library accomplish anything, whether it is looking up  
> keys or what.  
> 
> Also, it must be much slower to compare floats with a tolerance, than  
> to just compare them. I seem to recall that floats were designed so  
> that normal comparison is quite fast.  
> 
> Please fix this bug, or at least document it more visibly.  
> 
> Thank you,  
> 
> Frederick Eaton  
> _______________________________________________  
> datatable-help mailing list  
> datatable-help at lists.r-forge.r-project.org  
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help  

