[datatable-help] Slow execution: Extracting last value in each group
Arunkumar Srinivasan
aragorn168b at gmail.com
Fri Aug 16 15:47:07 CEST 2013
Frank,
Great, thank you. So basically it's the call into C (`Cdogroups`) that's taking the time. Perhaps it comes down to the version of C? I still have trouble using gdb with R, so I can't help much with debugging there. Hopefully someone else can lend a hand.
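For anyone who wants to compare timings on their own machine, here is a minimal sketch. The original `dt1` was not posted in this message, so the table below is a hypothetical stand-in (made-up sizes and a made-up `value` column). Absolute timings will depend on platform and data.table version, so treat it only as a way to reproduce the comparison, not as a reference result.

```r
library(data.table)

# Hypothetical stand-in for dt1: the real data is not shown in this thread.
set.seed(1)
n_groups <- 2000L
dt1 <- data.table(
  Date  = rep(as.Date("2013-01-01") + seq_len(n_groups) - 1L, each = 5L),
  value = rnorm(n_groups * 5L)
)

# Row-index approach: collect the last row number of each group, then subset once.
t_I <- system.time(a <- dt1[dt1[, .I[.N], by = Date]$V1])

# .SD approach: take the last row of each group's subset.
t_SD <- system.time(d <- dt1[, .SD[.N], by = Date])

# Report elapsed seconds for both approaches (values vary by machine and version).
cat("elapsed .I[.N]: ", t_I[["elapsed"]], "\n")
cat("elapsed .SD[.N]:", t_SD[["elapsed"]], "\n")

# Compare as data frames, as in the quoted message, so that
# data.table-specific attributes do not affect the result.
same <- isTRUE(all.equal(as.data.frame(a), as.data.frame(d)))
```

If the gap only shows up on some platforms, running the same script on each machine and posting both elapsed times along with sessionInfo() would make the reports easier to compare.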
Arun
On Friday, August 16, 2013 at 3:43 PM, Frank Erickson wrote:
> Hi Arun,
>
> Yup, Windows (see below).
>
> I tried debugonce, but didn't really know what I was looking for. Every step was instantaneous except this one:
>
> debug: ans = .Call(Cdogroups, x, xcols, groups, grpcols, jiscols, grporder,
> o__, f__, len__, jsub, SDenv, cols, newnames, verbose)
>
>
> --Frank
>
> sessionInfo()
>
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] rbenchmark_1.0.0 data.table_1.8.8
>
> loaded via a namespace (and not attached):
> [1] tools_3.0.1
>
>
>
>
> On Fri, Aug 16, 2013 at 5:37 AM, Arunkumar Srinivasan <aragorn168b at gmail.com> wrote:
> > Frank,
> > Is it a Windows machine as well?
> > And could you try to use `debugonce` to find out the line(s) where it's slow?
> >
> > Arun
> >
> >
> > On Friday, August 16, 2013 at 12:34 PM, Frank Erickson wrote:
> >
> > > I get timings similar to Arun's, with the data.table `.SD` call being a lot slower than the others. If data.table is not optimized for that `.SD` expression, perhaps that is okay because, as Arun pointed out, there are alternatives. I can't guess why it would perform differently on different hardware, though.
> > >
> > > # alternatives:
> > > a <- dt1[dt1[, .I[.N], by='Date']$V1]
> > > b <- dt1[J(unique(Date)),,mult='last'] # a little slower
> > > d <- dt1[, .SD[.N], by='Date'] # 600x slower; it would take ages to benchmark
> > > identical(a,b) # true
> > > identical(a,d) # false
> > > identical(as.data.frame(d),as.data.frame(a)) # true
> > >
> > > --Frank
> > >
> >
>