[datatable-help] data.table and reshape
Matthew Dowle
mdowle at mdowle.plus.com
Thu Aug 4 04:47:12 CEST 2011
Hi,
I don't know reshape/dcast/melt well, so thanks Dennis. I've linked this
thread to the feature request (FR) for it, since this area seems to be
coming up quite a bit.
https://r-forge.r-project.org/tracker/index.php?func=detail&aid=1055&group_id=240&atid=978
Matthew
On Mon, 2011-08-01 at 18:14 -0700, Dennis Murphy wrote:
> Hi:
>
> An alternative to the reshape() function is the reshape package (or
> the enhanced reshape2 package). Since data tables also inherit from
> the data.frame class, the reshape package plays nicely with them.
> Here's how it would look using the cast() function in the reshape
> package:
>
> library(reshape)
> cast(out, x ~ y, value = 'SUM')
>   x  AA  BB
> 1 a  72 123
> 2 b  84 119
> 3 c 162  96
>
> The variables you want in the rows (the 'id' variables) go on the
> LHS of the formula, the 'timevar' variable goes on the RHS, and the
> value variable is the 'dependent' variable, for lack of a better
> term.
>
> The dcast() function in the reshape2 package is preferred because it
> has a few extra options that come in handy on occasion - e.g., a means
> of optionally setting a value when a cell in the reshaped data frame
> is empty rather than filling it with NA. The code in this case is
> almost identical:
>
> > dcast(out, x ~ y, value_var = 'SUM')
>   x  AA  BB
> 1 a  72 123
> 2 b  84 119
> 3 c 162  96
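
A minimal sketch of the fill option mentioned above. The data frame
'out' is a stand-in rebuilt from the printed result (the original object
is not shown in this thread), and 'out2' is a made-up variant with one
row removed so that a cell ends up empty. It assumes a current reshape2
release, where the argument is spelled value.var rather than the
value_var used in this thread.

```r
library(reshape2)

# Stand-in for 'out', reconstructed from the printed table (an assumption)
out <- data.frame(
  x   = factor(rep(c("a", "b", "c"), each = 2)),
  y   = factor(rep(c("AA", "BB"), times = 3)),
  SUM = c(72L, 123L, 84L, 119L, 162L, 96L)
)

# Hypothetical variant: drop the (c, BB) row so one wide cell has no data
out2 <- out[-6, ]

# Default behaviour: the empty cell is filled with NA
wide_na <- dcast(out2, x ~ y, value.var = "SUM")

# With fill = 0 the empty cell is set to 0 instead
wide_zero <- dcast(out2, x ~ y, value.var = "SUM", fill = 0)

print(wide_na)
print(wide_zero)
```

Whether 0 is a sensible fill value depends on what an absent combination
means in your data; NA is the safer default when that is unclear.
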
>
> There are some differences in the output of the two functions, though:
>
> > str(dcast(out, x ~ y, value = 'SUM'))
> Using SUM as value column: use value_var to override.
> 'data.frame': 3 obs. of 3 variables:
> $ x : Factor w/ 3 levels "a","b","c": 1 2 3
> $ AA: int 72 84 162
> $ BB: int 123 119 96
> > str(reshape(out, direction='wide', idvar='x', timevar='y'))
> Classes ‘data.table’ and 'data.frame': 3 obs. of 3 variables:
> $ x : Factor w/ 3 levels "a","b","c": 1 2 3
> $ SUM.AA: int 72 84 162
> $ SUM.BB: int 123 119 96
> - attr(*, "reshapeWide")=List of 5
> ..$ v.names: NULL
> ..$ timevar: chr "y"
> ..$ idvar : chr "x"
> ..$ times : Factor w/ 2 levels "AA","BB": 1 2
> ..$ varying: chr [1, 1:2] "SUM.AA" "SUM.BB"
> > str(as.data.table( dcast(out, x ~ y, value = 'SUM')))
> Using SUM as value column: use value_var to override.
> Classes ‘data.table’ and 'data.frame': 3 obs. of 3 variables:
> $ x : Factor w/ 3 levels "a","b","c": 1 2 3
> $ AA: int 72 84 162
> $ BB: int 123 119 96
>
> As you can see, the last call retains both classes but does not create
> the attributes that the reshape() function does. You can decide which
> best suits your purposes.
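
If the extra attribute is the only thing standing in the way of using
reshape(), one option is to drop it afterwards. The sketch below uses
base R only, on a plain data.frame stand-in for 'out' rebuilt from the
printed result (an assumption, since the original object is not shown).

```r
# Stand-in for 'out', reconstructed from the printed table (an assumption)
out <- data.frame(
  x   = factor(rep(c("a", "b", "c"), each = 2)),
  y   = factor(rep(c("AA", "BB"), times = 3)),
  SUM = c(72L, 123L, 84L, 119L, 162L, 96L)
)

# Base R reshape to wide format; value columns get a "SUM." prefix
wide <- reshape(out, direction = "wide", idvar = "x", timevar = "y")

# reshape() records how the data was reshaped in a "reshapeWide" attribute
had_attr <- !is.null(attr(wide, "reshapeWide"))

# Drop the bookkeeping attribute and tidy the column and row names
attr(wide, "reshapeWide") <- NULL
names(wide) <- sub("^SUM\\.", "", names(wide))
rownames(wide) <- NULL

str(wide)
```

Note that removing the attribute means reshape() can no longer reverse
the operation automatically from the wide object.
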
>
> HTH,
> Dennis