[datatable-help] Use of data.table with doMC/foreach

Matthew Dowle mdowle at mdowle.plus.com
Fri Dec 31 15:22:46 CET 2010


Tom might be your man on that.
What I'd say is that if 7.5 sec weren't OK for a single grouping then we
might be a bit stuck, but in your case a parallel wrapper over the 500
groupings should work. I'd need a toy example, please, to go further and
see why there's no apparent speed-up.
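
For illustration only, a toy example of the kind asked for here might be
shaped like the sketch below. It is not the original poster's code: the
table DT, the value column x, the grouping columns g1..g5, the row count
and the core count are all made-up assumptions, and doMC only runs on
Unix-alikes. It computes the same grouped summaries serially and through
foreach/%dopar% and times both, which is the comparison that would show
whether there is any speed-up.

```r
library(data.table)
library(foreach)
library(doMC)                      # fork-based backend, Unix-alikes only

registerDoMC(cores = 2)            # assumed core count; adjust to the machine

set.seed(1)
n <- 1e5                           # far fewer rows than the 3.5 million in the thread
DT <- data.table(x = rnorm(n))

# Hypothetical grouping columns standing in for the ~500 real groupings
by_cols <- paste0("g", 1:5)
for (col in by_cols) DT[, (col) := sample(20L, n, replace = TRUE)]

# Serial baseline: one grouped summary (median and count) per grouping column
t_serial <- system.time(
  res_serial <- lapply(by_cols, function(col) {
    DT[, list(med = median(x), n = .N), by = col]
  })
)
names(res_serial) <- by_cols

# Same work, one grouping column per foreach task
t_par <- system.time(
  res_par <- foreach(col = by_cols) %dopar% {
    DT[, list(med = median(x), n = .N), by = col]
  }
)
names(res_par) <- by_cols

print(rbind(serial = t_serial, parallel = t_par))
```

With tasks this small the forking and result-collection overhead can
easily outweigh the gain, so a realistic comparison would need row and
group counts closer to the real data.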
Matthew


On Wed, 2010-12-29 at 11:25 -0600, Damian Betebenner wrote:
> All,
> 
> Does anyone have experience using data.table in parallel with
> doMC/foreach?
> 
> I have a data.table with approximately 3.5 million rows and am
> calculating different summaries (e.g., medians and counts) on some of
> the variables across approximately 500 distinct groupings (same j
> variable with 500 different by groups). My thought was to run the
> different analyses in parallel on my multi-core machine and get a good
> performance boost, but thus far it doesn’t work any faster.
> 
> The data.table performance is great for each by grouping (about 7.5
> seconds), but doing this 500 times takes a while.
> 
> Does anyone have experience along these lines?
> 
> Best,
> Damian