[datatable-help] segfault with "large" number of rows

Arunkumar Srinivasan aragorn168b at gmail.com
Wed Jan 29 01:41:29 CET 2014


Hi Guenter,
CC: data.table list,

I filed this as bug #5305, and we have now fixed it in commit 1100
(v1.8.11). Thank you very much once again for reporting!
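
For anyone else hitting this, here is a minimal sketch of how one might
check for the fix. It assumes the fix is present in any installed version
numbered 1.8.11 or later, and it uses a deliberately scaled-down variant of
the example quoted below. The names n_groups and n_obs are mine (renamed
from N and T so that T does not mask TRUE). At this size it only checks
that grouped := gives the expected result; it does not reproduce the crash,
which needs the full-size example.

```r
library(data.table)

# Assumption: the fix is in v1.8.11 or later, as stated above.
if (packageVersion("data.table") < "1.8.11") {
  warning("data.table is older than 1.8.11; the grouped := segfault may still occur")
}

# Scaled-down variant of the reported example (hypothetical small sizes).
n_groups <- 30L     # No. of groups
n_obs    <- 1000L   # No. of observations per group

DT <- data.table(group = rep(1:n_groups, each = n_obs), x = 1)
setkey(DT, group)

# Same operation as in the report: add a column by reference, by group.
DT[, sum_x := sum(x), by = group]

# Every row's sum_x should equal the number of observations in its group.
stopifnot(all(DT$sum_x == n_obs))
print(head(DT))
```

To test the original case, set n_groups to 3000 and n_obs to 100000, memory
permitting.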


On Wed, Jan 22, 2014 at 9:52 PM, "Günter J. Hitsch"
<guenter.hitsch at mac.com> wrote:

>
> I’ve been using data.table for several months.  It’s a great package—thank
> you for developing it!
>
> Here’s my question:  I’ve run into a problem when I use “large” data
> tables with many millions of rows.  In particular, for such large data
> tables I get segmentation faults when I create columns by groups.  Example:
>
> N = 2500      # No. of groups
> T = 100000    # No. of observations per group
>
> DT = data.table(group = rep(1:N, each = T), x = 1)
> setkey(DT, group)
>
> DT[, sum_x := sum(x), by = group]
> print(head(DT))
>
> This runs fine.  But when I increase the number of groups, say from 2500
> to 3000, I get a segfault:
>
> N = 3000      # No. of groups
> T = 100000    # No. of observations per group
>
> ...
>
>  *** caught segfault ***
> address 0x159069140, cause 'memory not mapped'
>
> Traceback:
>  1: `[.data.table`(DT, , `:=`(sum_x, sum(x)), by = group)
>  2: DT[, `:=`(sum_x, sum(x)), by = group]
>  3: eval(expr, envir, enclos)
>  4: eval(ei, envir)
>  5: withVisible(eval(ei, envir))
>
>
> I can reproduce this problem on:
>
> (1) OS X 10.9, R 3.0.2, data.table 1.8.10
> (2) Ubuntu 13.10, R 3.0.1, data.table 1.8.10
>
> And of course the amount of RAM in my machines is not the issue.
>
> Thanks in advance for your help with this!
>
> Günter