[datatable-help] Use of data.table with doMC/foreach
Matthew Dowle
mdowle at mdowle.plus.com
Fri Dec 31 15:22:46 CET 2010
Tom might be your man on that.
What I'd say is that if 7.5 sec weren't acceptable for a single grouping
then we might be a bit stuck, but in your case a parallel wrapper over
the 500 groupings should work fine. I'd need a toy example, please, to go
further with this one and see why there's no apparent speedup.
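
For illustration only, a toy example of the kind requested might look
like the sketch below. It is not Damian's actual code: the table DT, the
columns g1, g2 and x, and the groupings list are all made up, and it
assumes a current data.table (for .N and a character vector in by) plus
the fork-based doMC backend, which is only available on Unix-alikes.

```r
library(data.table)
library(foreach)
library(doMC)  # fork-based backend; not available on Windows

registerDoMC(cores = 2)  # assumed core count, adjust to the machine

# Hypothetical toy data standing in for the 3.5 million row table
set.seed(1)
n <- 1e5
DT <- data.table(
  g1 = sample(letters[1:5], n, replace = TRUE),
  g2 = sample(1:10, n, replace = TRUE),
  x  = rnorm(n)
)

# Hypothetical stand-in for the 500 groupings: each element names the
# column(s) to group by
groupings <- list("g1", "g2", c("g1", "g2"))

# Same j expression for every grouping, one grouping per parallel task
results <- foreach(by_cols = groupings) %dopar% {
  DT[, list(med = median(x), n = .N), by = by_cols]
}
```

Wrapping the foreach call in system.time(), once with %do% and once with
%dopar%, would give a direct serial versus parallel comparison.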
Matthew
On Wed, 2010-12-29 at 11:25 -0600, Damian Betebenner wrote:
> All,
>
> Does anyone have experience using data.table in parallel with
> doMC/foreach?
>
> I have a data.table with approximately 3.5 million rows and am
> calculating different summaries (e.g., medians and counts) on some of
> the variables across approximately 500 distinct groupings (same j
> variable with 500 different by groups). My thought was to run the
> different analyses in parallel on my multi-core machine and get a good
> performance boost, but thus far it doesn’t work any faster.
>
> The data.table performance is great for each by grouping (about 7.5
> seconds), but doing this 500 times takes a while.
>
> Does anyone have experience along these lines?
>
> Best,
>
> Damian
>
> _______________________________________________
> datatable-help mailing list
> datatable-help at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help