[datatable-help] Grouping with sort

Matthew Dowle mdowle at mdowle.plus.com
Sat May 7 00:37:18 CEST 2011


Steve H,
How much is 'much better' and 'much longer' please? And on how many
rows/GB? What is the bigger picture, and why are you concatenating
strings together and using paste() at all?
Guess 1: you can include the x column in your key; e.g. setkey(grp,x),
then there would be no need to sort(x) again.
Guess 2: sorting character vectors can be slow. Hence we don't allow
character columns in keys (yet); data.table converts character to factor.
But, ideally, more information at a higher level would help us to help.
Matthew
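
A minimal sketch of Guess 1, for illustration only. The table name DT, the
column names grp and x, and the sample data are all assumptions (the
original post does not show its data), and x is kept as integer so the
character-key restriction mentioned above does not come into play. The idea
is that once x is part of the key, rows are already ordered within each
group, so paste() can be applied without calling sort() per group:

```r
library(data.table)

# Hypothetical data: 'grp' is the grouping column, 'x' holds the values
# that should end up sorted inside each concatenated string.
set.seed(1)
DT <- data.table(grp = rep(3:1, times = 4),
                 x   = sample(100L, 12))

# Approach from the question: sort x separately inside every group.
slow <- DT[, list(joined = paste(sort(x), collapse = ",")), keyby = grp]

# Guess 1: include x in the key. setkey() orders the table by grp, then x,
# so x is already in ascending order within each group.
setkey(DT, grp, x)
fast <- DT[, list(joined = paste(x, collapse = ",")), by = grp]

# Both approaches should produce the same string for each group.
stopifnot(identical(slow$joined, fast$joined))
```

Whether this is actually faster depends on the number of rows and groups,
so it is worth timing both versions with system.time() on the real data.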


On Fri, 2011-05-06 at 12:16 -0700, Steve Harman wrote:
> Connected to this, RMySQL performs much better
> (using GROUP BY and functions such as GROUP_CONCAT, which allows you to
> order and use a separator too).
> 
> So, I would recommend using them if you want grouping with sorting.
> 
> On May 6, 2:36 pm, Steve Harman <stvhar... at gmail.com> wrote:
> > Hello !
> > When grouping using data.table, the mean and sum functions applied within
> > groups work well, but if we use the sort(x) function it takes much longer.
> >
> > I would like to first do sort(x) and then put it inside paste, such as
> > paste(sort(x),collapse=",")
> > I was wondering if there is any more efficient or effective way of
> > doing this?
> >
> > thanks in advance,
> >
> > Steve
> > _______________________________________________
> > datatable-help mailing list
> > datatable-h... at lists.r-forge.r-project.org
> > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatabl...
> _______________________________________________
> datatable-help mailing list
> datatable-help at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
