[datatable-help] Fwd: Data table hanging on memory allocation failure

Steve Lianoglou lianoglou.steve at gene.com
Fri Aug 2 18:44:02 CEST 2013


Hi Paul,

Is this error always reproducible after the same call? You mentioned
you've done 5 (or so?) large data-manipulation calls on the data.table
before the one that is the straw that breaks the camel's back -- if
you start with that last call first, does it still stall on gc()? And
if that last call is changed to do something else (same calling order
as you have now), does it still hang?

Just taking random guesses here ...
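
If it helps isolate things, something along these lines would let you
run each manipulation in turn and see whether the allocation error
propagates cleanly instead of hanging (untested sketch; the step name
is a placeholder and the call inside the function is just your example
from below):

library(data.table)

# Run one manipulation, report a failure instead of dying, and show
# what gc() says the step left behind.
run_step <- function(name, f) {
  ok <- tryCatch({
    f()
    TRUE
  }, error = function(e) {
    message(name, " failed: ", conditionMessage(e))
    FALSE
  })
  print(gc())  # memory actually in use after this step
  invisible(ok)
}

run_step("pt", function() dt[, pt := as.integer(p), by = list(sk, ik, pk)])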

Is there any way for you to test whether you get the same behavior on
a *nix machine? I'm guessing it's probably a tall order to find spare
hardware lying around with specs that match the machine you're
reporting the error on, but it might be worth a try.

We've had some mysterious memory issues in the past which Matthew has
done a good job of smoking out. Sorry: no real answers for you yet.

-steve


On Fri, Aug 2, 2013 at 7:43 AM, Paul Harding <p.harding at paniscus.com> wrote:
> Hi, I've got a big data table and I'm having memory allocation issues. This
> isn't about the memory issue per se, rather it's about how it gets handled.
>
> The table has 2M+ rows and is about 15GB in size. Whilst manipulating
> the table memory usage grows quite fast, and I'm having to garbage
> collect manually after each manipulation. Even so, it's possible to
> reach a point (there are a lot of other developers using this server
> for all sorts of things) where, even though there is 28GB of memory
> free, I can't allocate a needed 944MB contiguous chunk.
>
> I get the usual error message, and it would be convenient if
> data.table exited at that point (then I wouldn't lose my previous
> work), but it just hangs:
>
> 02-06:30:38.8> dt[,pt:=as.integer(p),by=list(sk, ik, pk)]; gc()
> Error: cannot allocate vector of size 944.8 Mb
>
> And the world holds its breath ... and the world starts turning blue ...
> I've left it like this for hours; nothing further happens.
>
> Windows Server 2008 R2 Enterprise SP1 // Intel Xeon CPU E7-4830 @
> 2.13GHz, 4 processors // 128GB memory installed, 28.7GB available,
> R session 65GB
> R 3.0.0, data.table 1.8.9 rev 874
> RStudio 0.97
>
> Incidentally, after finishing a table manipulation and garbage
> collecting, the R session's memory usage drops to 33GB. This is
> consistent behaviour: there were 5 similar calls prior to this one
> that executed successfully, with the same behaviour (garbage
> collected after each). It's almost as if a copy were being made. But
> that's for info, not to shoot off at a tangent (I'll try to do some
> investigation and maybe ask for help around the temporary
> memory-growth issue later).
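>
> (A rough way I could test the copy theory, assuming the CRAN Windows
> build of R has memory profiling enabled:
>
> tracemem(dt)   # prints the table's address and starts tracing
> dt[, pt := as.integer(p), by = list(sk, ik, pk)]
> # := is supposed to modify dt by reference, so any tracemem[...]
> # output from the line above would point at a hidden copy
> untracemem(dt)
>
> )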
>
> I would be really happy if data.table exited on this error, or if I
> had that option, even if it's doing something very clever (waiting
> for memory availability?), because it doesn't seem to succeed.
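>
> Roughly the behaviour I'm after (sketch only; the checkpoint path is
> a placeholder, and this only helps if the error actually propagates
> rather than hanging):
>
> res <- tryCatch(
>   dt[, pt := as.integer(p), by = list(sk, ik, pk)],
>   error = function(e) {
>     saveRDS(dt, "dt-checkpoint.rds")  # keep the work done so far
>     stop(e)
>   }
> )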
>
> Regards
> Paul
>
>
> _______________________________________________
> datatable-help mailing list
> datatable-help at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help



-- 
Steve Lianoglou
Computational Biologist
Bioinformatics and Computational Biology
Genentech

