[datatable-help] My real issue with numeric keys: two numeric keys don't seem to unique correctly.

Matthew Dowle mdowle at mdowle.plus.com
Tue May 15 18:55:13 CEST 2012


Interesting, thanks. Yes, please file a bug report.

> Sorry for the last email; I realised I wasn't 100% sure what my issue
> was. Here it is:
>
>
>> dt=data.table(x=0.0,y=c(0,.1,0,.2,0))
>> setkeyv(dt,c('x','y'))
>
>
> After doing this, y is not sorted. Note that dt contains the row (0, 0)
> three times. I guess this comes from the following issue:
>
>
>> dt
>      x   y
> [1,] 0 0.0
> [2,] 0 0.1
> [3,] 0 0.0
> [4,] 0 0.2
> [5,] 0 0.0
>> unique(dt)
>      x   y
> [1,] 0 0.0
> [2,] 0 0.1
> [3,] 0 0.0
> [4,] 0 0.2
> [5,] 0 0.0
>
>
> unique() does not detect the duplicated rows! This also means that
>
>> dt[,list(count=.N),by=c("x","y")]
>
> does not group the way it should.
>
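For reference, base R gives the expected answer on the same data. This is only a cross-check with a plain data.frame, not a data.table fix; the names df, n and counts are made up for illustration.

```r
# Same data as above, held in a plain data.frame.
df <- data.frame(x = 0.0, y = c(0, .1, 0, .2, 0))

# Row-wise duplicates are dropped: rows 1, 2 and 4 remain.
unique(df)

# Group sizes per (x, y) pair; the helper column n just counts rows.
counts <- aggregate(n ~ x + y, data = cbind(df, n = 1), FUN = sum)
counts
```

The three (0, 0) rows collapse into one group of size 3, and the other two rows form groups of size 1 each.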
> This seems to result from faulty logic in data.table:::fastorder. It
> sorts the last column, y, correctly, but when it then uses that result
> to sort the x column, it returns the identity ordering, which clearly
> doesn't make sense here.
>
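To make the expected behaviour concrete, here is a minimal base R sketch of a two-pass ordering of the kind described above: order by the last key first, then re-order by the earlier key with a stable sort. It is an illustration only and not the actual code of data.table:::fastorder; the variable names o and p are assumptions.

```r
x <- rep(0, 5)
y <- c(0, .1, 0, .2, 0)

# Pass 1: ordering by the last key, y.
o <- order(y)

# Pass 2: order x as already arranged by pass 1. All x values are tied,
# so this pass is the identity permutation and the y order is kept.
p <- order(x[o])
o <- o[p]

o                            # 1 3 5 2 4
identical(o, order(x, y))    # TRUE
```

In this sketch the second pass does yield the identity, which is correct only because it is applied on top of the pass-1 order rather than to the original row positions.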
> The final tidbit is that it seems to be caused by having two numeric key
> columns together. If you change x to character:
>
>> dt$x=as.character(dt$x)
>> unique(dt)
>      x   y
> [1,] 0 0.0
> [2,] 0 0.1
> [3,] 0 0.2
>
>
> then everything works as it should. Shall I file a bug report?
>
> -Chris
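
A bug report is easier to act on with a self-contained script. The following is a minimal sketch, assuming the data.table package is installed; the stopifnot() checks encode the expected behaviour, so they would be expected to fail on a version showing the behaviour reported above and to pass once it is fixed. The name res is made up for illustration.

```r
library(data.table)

dt <- data.table(x = 0.0, y = c(0, .1, 0, .2, 0))
setkeyv(dt, c("x", "y"))

# After keying on (x, y) with x constant, y should be non-decreasing.
stopifnot(!is.unsorted(dt$y))

# The row (0, 0) occurs three times, so three distinct rows are expected.
stopifnot(nrow(unique(dt)) == 3L)

# Expected group sizes: 3 for y = 0, 1 each for y = 0.1 and y = 0.2.
res <- dt[, list(count = .N), by = c("x", "y")]
stopifnot(identical(res$count, c(3L, 1L, 1L)))
```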