<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">Hi,<br>
<br>
Interesting. To home in on this, my first quick thoughts are:<br>
1. Try in plain R at the prompt rather than RStudio, just to
isolate that for now.<br>
2. Assign the result: dummy&lt;-dt[,pt:=as.integer(p),by=list(sk,
ik, pk)]; gc(). That shouldn't make a difference, but I'm aware
that printing at the prompt (even just the head and tail) makes
an internal copy of the whole object (to be fixed; in the
meantime a manual print(dt) avoids that copy). If it's a
script that's being run then maybe printing comes into it.<br>
3. Is it after the last group has been processed, or during
grouping? To establish this try printing the value of .GRP inside
j; i.e., dt[,pt:={print(.GRP);as.integer(p)},by=list(sk, ik, pk)].
This will give me a clue where it might be.<br>
4. p is definitely a column of the table dt at that point? If p
is actually in calling scope it might be doing the wrong thing
(over and over again).<br>
5. Does it work with a much smaller subset of dt, say 10 rows?
Often this reveals that an incorrect (much larger) result is being
computed. Maybe related to allow.cartesian.<br>
6. Set options(datatable.verbose=TRUE), run again from scratch in
a new session and send us the output. There might be a lot of it,
but we might get lucky, or it might give further clues.<br>
7. Otherwise, something reproducible would be great if possible.
In cases like this it doesn't have to reproduce the memory
allocation problem, it just has to be pasteable into a fresh R
session and complete on small data. Then I can stress test it
myself and see if I can see where the leak or corruption is
happening.<br>
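<br>
A rough sketch of points 2 to 6 on toy data, in case it helps as a
starting point for point 7. Only the column names (sk, ik, pk, p) and
the grouped := call come from the original post; the toy values, the
size n and the name small are made up for illustration, and = is used
for assignment just to keep the snippet easy to paste.<br>
<pre>
library(data.table)

# Toy stand-in for the real table (hypothetical values; only the column
# names sk, ik, pk and p are taken from the original post).
set.seed(1)
n = 1000L
dt = data.table(sk = sample(5L, n, replace = TRUE),
                ik = sample(5L, n, replace = TRUE),
                pk = sample(2L, n, replace = TRUE),
                p  = runif(n) * 100)

# Point 4: confirm p really is a column of dt and not only a variable
# in calling scope.
stopifnot("p" %in% names(dt))

# Point 5: work on a 10-row subset first (dt[1:10] is a copy, so dt
# itself is left untouched).
small = dt[1:10]

# Point 6: switch on verbose output for the grouped assignment.
options(datatable.verbose = TRUE)

# Points 2 and 3: assign the result so nothing is auto-printed at the
# prompt, and print .GRP inside j to see how far grouping gets.
dummy = small[, pt := { print(.GRP); as.integer(p) }, by = list(sk, ik, pk)]

options(datatable.verbose = FALSE)
invisible(gc())
</pre>
If that completes on the subset, swapping small for the full table in
the same call should show whether the problem appears during grouping
or only after the last group.<br>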
<br>
Matthew<br>
<br>
On 02/08/13 15:43, Paul Harding wrote:<br>
</div>
<blockquote
cite="mid:CAMSrYkd7C0AJmzdb28CUmoR_DL=psXAwEySOiXJnLDiV82GUoA@mail.gmail.com"
type="cite">
<div dir="ltr">Hi, I've got a big data table and I'm having memory
allocation issues. This isn't about the memory issue per se,
rather it's about how it gets handled.<br>
<div class="gmail_quote">
<div dir="ltr">
<div><br>
</div>
<div>The table has 2M+ rows and is about 15G in size. Whilst
manipulating the table memory usage grows quite fast, and
I'm having to manually garbage collect after each
manipulation. Even so it's possible to reach a point
(there are a lot of other developers using this server for
all sorts of things) where even though there is 28GB
memory free I can't allocate a needed 944MB contiguous
chunk.</div>
<div><br>
</div>
<div>I get the usual error message and it would be
convenient if data table exited at that point (then I
wouldn't lose my previous work), but it just hangs:</div>
<div><br>
</div>
<div>
<div>02-06:30:38.8> dt[,pt:=as.integer(p),by=list(sk,
ik, pk)]; gc()</div>
<div>Error: cannot allocate vector of size 944.8 Mb</div>
</div>
<div><br>
</div>
<div>And the world holds its breath ... and the world starts
turning blue ... I've left it like this for hours, nothing
further happens.</div>
<div><br>
</div>
<div>Windows Server 2008 R2 Enterprise SP1 // Intel Xeon CPU
E7-4830 @ 2.13GHz 4 processors // 128GB memory installed,
28.7GB available, R session 65GB</div>
<div>R 3.0.0 data.table 1.8.9 rev 874</div>
<div>
RStudio 0.97</div>
<div><br>
</div>
<div>Incidentally, after finishing a table manipulation and
garbage collecting the R session memory usage drops to
33GB. This is consistent behaviour, there were 5 similar
calls prior to this one that executed successfully, with
the same behaviour (garbage collected after each). Almost
as if there were a copy being made. But that's for info,
not shooting off at a tangent (I'll try and do some
investigation and maybe ask for help around the temporary
memory growth issue later).</div>
<div><br>
</div>
<div>I would be really happy if data table exited on this
error or if I had that option, even if it's doing
something very clever (waiting for memory availability?)
because it doesn't seem to succeed.</div>
<div><br>
</div>
<div>Regards</div>
<span class="HOEnZb"><font color="#888888">
<div>Paul</div>
</font></span></div>
</div>
<br>
</div>
<br>
</blockquote>
<br>
</body>
</html>