[datatable-help] data.table and reshape

Matthew Dowle mdowle at mdowle.plus.com
Tue Aug 2 01:46:13 CEST 2011


Hi Zach,

I can't think of a better way.

The reshape is done on the aggregated data, which is typically an order
of magnitude smaller than the original data, so speed isn't much of an
issue.

That said, there is a feature request (FR) on it :

https://r-forge.r-project.org/tracker/index.php?func=detail&aid=1055&group_id=240&atid=978

and that mentions :

http://r.789695.n4.nabble.com/reshape-to-wide-format-takes-extremely-long-tp2487153p2487153.html

which mentions cast(). Also, try searching datatable-help for "reshape"
- there are one or two past threads on it.

I tend to keep everything in long format (like a database) and just
reshape (although I don't actually use reshape() itself) at the end for
output or presentation purposes. Basically just like you did.
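
A minimal sketch of that long-then-wide workflow, assuming a recent
data.table release that provides dcast() (it postdates this message, so
treat it as an assumption rather than something available at the time).
The toy data mirrors Zach's quoted example; the names long and wide are
purely illustrative.

```r
library(data.table)

set.seed(1234)
# Long format: one row per observation, as in the quoted example
DT <- data.table(x = rep(c("a", "b", "c"), each = 4),
                 y = rep(c("AA", "BB"), times = 6),
                 v = sample(1:100, 12))

# Aggregate while the data is still long
long <- DT[, list(SUM = sum(v)), by = c("x", "y")]

# Reshape to wide only at the end, for output or presentation
# (dcast() is assumed to come from a current data.table version)
wide <- dcast(long, x ~ y, value.var = "SUM")
print(wide)
```

The wide table has one row per x and one column per level of y, so the
reshape only ever touches the small aggregated result.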

Btw, rather than :
> out <- DT[,sum(v),by='x,y']
> names(out)[3] <- 'SUM'

it's more robust to do that in one step :

> out <- DT[,list(SUM=sum(v)),by='x,y']
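
A small sketch of why the one-step form is more robust, on assumed toy
data (the names out1 and out2 are illustrative): the two-step version
renames by position, so it silently depends on the aggregate being the
third column, while naming inside list() does not.

```r
library(data.table)

DT <- data.table(x = rep(c("a", "b"), each = 4),
                 y = rep(c("AA", "BB"), times = 4),
                 v = 1:8)

# Two steps: aggregate, then rename column 3 by position
out1 <- DT[, sum(v), by = c("x", "y")]
names(out1)[3] <- "SUM"

# One step: name the aggregate where it is computed
out2 <- DT[, list(SUM = sum(v)), by = c("x", "y")]

# Both give the same columns and values for this data
print(all.equal(out1, out2))
```

If another grouping column were added to by, the positional rename would
hit the wrong column, whereas the list(SUM=...) form would not change.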

Matthew


On Mon, 2011-08-01 at 14:02 -0400, Zachary Mayer wrote:
> Hello,
> 
> 
> I was wondering what is the best way to reshape a data table. Is it
> appropriate to use the 'reshape' function in base R, or is there a
> better way?
> 
> 
> Here is an example:
> library(data.table)
> set.seed(1234)
> DT <- data.table(x=rep(c("a","b","c"),each=4), y=c("AA","BB"), v=sample(1:100,12))
> out <- DT[,sum(v),by='x,y']
> names(out)[3] <- 'SUM'
> out <- reshape(out,direction='wide',idvar='x', timevar='y')
> Thank you.
> 
> 
> -Zach



