[datatable-help] 'by' on a numeric column produces inconsistent output

Arunkumar Srinivasan aragorn168b at gmail.com
Thu Dec 19 09:02:02 CET 2013


Ah, that explains it as well. So a copy is not being sent to fastorder, but that only happens the first time… I'll write again if there are more questions.   

Thanks again Kevin.


Arun


On Thursday, December 19, 2013 at 8:55 AM, Kevin Ushey wrote:

> Hmm, I am seeing that after the data.table:::fastorder call, the dt
> itself is modified. Notice that 'by' is rearranged without modifying
> 'y'.
>  
> > dt
>              y  by
> 1:  0.01464054 0.7
> 2:  0.87328871 0.4
> 3: -1.02794620 0.4
> > (o__ <- data.table:::fastorder(byval)) # 2,3,1
> [1] 2 3 1
> > dt
>              y  by
> 1:  0.01464054 0.4
> 2:  0.87328871 0.4
> 3: -1.02794620 0.7
>  
> On Wed, Dec 18, 2013 at 11:44 PM, Arunkumar Srinivasan
> <aragorn168b at gmail.com> wrote:
> > Aha, the issue seems to be with 'uniqlist', not sure why it gives
> >  
> > (f__ = data.table:::uniqlist(byval, order=o__)) # 1,3
> >  
> > 1,2,3 for you and 1,3 consistently for me. I'll revert this back to
> > `duplist` for now. Not sure how to solve this though. I've tried it so far
> > on 3 machines:
> >  
> > 1) OS X 10.8.5 + llvm (gcc)
> > 2) OS X Mavericks + Clang
> > 3) Debian Wheezy + gcc
> >  
> > All of them give consistent output. Man this is such a drag.
> >  
> > Arun
> >  
> > On Thursday, December 19, 2013 at 8:37 AM, Kevin Ushey wrote:
> >  
> > Hi Arun,
> >  
> > Here's the output on my machine -- other information missing from
> > before; it's with OSX Mavericks, with R and data.table compiled with
> > Apple clang.
> >  
> > ---
> >  
> > library(data.table, lib="/Users/kevinushey/Library/R/3.1/library")
> > set.seed(32)
> > n <- 3
> > dt <- data.table(
> >  
> > + y=rnorm(n),
> > + by=round( rnorm(n), 1)
> > + )
> >  
> > ## run one
> >  
> > byval <- list(by=dt$by)
> > (o__ <- data.table:::fastorder(byval)) # 2,3,1
> >  
> > [1] 2 3 1
> >  
> > (f__ = data.table:::uniqlist(byval, order=o__)) # 1,3
> >  
> > [1] 1 2 3
> >  
> > (len__ = data.table:::uniqlengths(f__, nrow(dt))) # 2,1
> >  
> > [1] 1 1 1
> >  
> > (firstofeachgroup = o__[f__]) # 2,1
> >  
> > [1] 2 3 1
> >  
> > (origorder = data.table:::iradixorder(firstofeachgroup)) # 2,1
> >  
> > [1] 3 1 2
> >  
> > (f__ = f__[origorder]) # 3,1
> >  
> > [1] 3 1 2
> >  
> > (len__ = len__[origorder]) # 2,1
> >  
> > [1] 1 1 1
> >  
> > ## run two
> >  
> > (o__ <- data.table:::fastorder(byval)) # 2,3,1
> >  
> > [1] 1 2 3
> >  
> > (f__ = data.table:::uniqlist(byval, order=o__)) # 1,3
> >  
> > [1] 1 3
> >  
> > (len__ = data.table:::uniqlengths(f__, nrow(dt))) # 2,1
> >  
> > [1] 2 1
> >  
> > (firstofeachgroup = o__[f__]) # 2,1
> >  
> > [1] 1 3
> >  
> > (origorder = data.table:::iradixorder(firstofeachgroup)) # 2,1
> >  
> > [1] 1 2
> >  
> > (f__ = f__[origorder]) # 3,1
> >  
> > [1] 1 3
> >  
> > (len__ = len__[origorder]) # 2,1
> >  
> > [1] 2 1
> >  
> > On Wed, Dec 18, 2013 at 11:22 PM, Arunkumar Srinivasan
> > <aragorn168b at gmail.com> wrote:
> >  
> > Not sure how to debug without being able to reproduce. Tried on Mac OS X
> > 10.8.5 and Debian GNU/Linux 7 (wheezy). I don't have access to a Windows
> > machine. It consistently gives me this:
> >  
> > dt[,
> >  
> > + list(max=max(y, na.rm=TRUE)),
> > + by=list(by)
> > + ]
> > by max
> > 1: 0.7 0.01464054
> > 2: 0.4 0.87328871
> >  
> >  
> > dt[,
> >  
> > + list(max=max(y, na.rm=TRUE)),
> > + by=list(by)
> > + ]
> > by max
> > 1: 0.7 0.01464054
> > 2: 0.4 0.87328871
> >  
> > Can either of you provide me with the output of these steps in cases where
> > there's an error? I've commented the output I get for each step.
> >  
> > byval <- list(by=dt$by)
> > o__ <- data.table:::fastorder(byval) # 2,3,1
> > f__ = data.table:::uniqlist(byval, order=o__) # 1,3
> > len__ = data.table:::uniqlengths(f__, nrow(dt)) # 2,1
> > firstofeachgroup = o__[f__] # 2,1
> > origorder = data.table:::iradixorder(firstofeachgroup) # 2,1
> > f__ = f__[origorder] # 3,1
> > len__ = len__[origorder] # 2,1
> >  
> >  
> > Arun
> >  
> > <...snip...>  
