[datatable-help] Memory issue

Matthew Dowle mdowle at mdowle.plus.com
Tue Oct 16 01:28:10 CEST 2012


Welcome.

Nothing springs to mind, I'm afraid. Stabs in the dark ...  POSIXlt is 40
bytes per date, and since you mention the key column is isotime, could you
check for a pesky POSIXlt anywhere? Or perhaps one of the tables is being
recycled in a list column?
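
A rough sketch of that POSIXlt check, in base R only. The helper name
find_posixlt and the toy list are made up for illustration; the real
DT1, DT2 and DT3 would be passed in their place.

```r
# Hypothetical helper: return the names of any columns that inherit from POSIXlt.
# Works on anything list-like (a plain list, data.frame or data.table).
find_posixlt <- function(x) {
  is_lt <- vapply(x, function(col) inherits(col, "POSIXlt"), logical(1))
  names(x)[is_lt]
}

# Toy stand-in for a table's columns (assumption: the real data is not available).
ct <- as.POSIXct("2012-10-16 01:28:10", tz = "UTC") + 0:2
cols <- list(id = 1:3, ct = ct, lt = as.POSIXlt(ct, tz = "UTC"))

print(find_posixlt(cols))

# Compare the in-memory footprint of the two date-time representations.
print(object.size(cols$ct))
print(object.size(cols$lt))
```

If this returns any column names for the real tables, converting those
columns with as.POSIXct() before the setkey would be the first thing to try.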

I'll need the output of str() on DT1, DT2 and DT3 please, before and after
the setkey.  And the exact command you're using to write to disk: is it
text or binary?  If you load each file back into another R session, can
you spot any differences in dimensions or type between them?
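
Something along these lines would gather all of that in one go. It is only
a sketch: the toy DT3 below is simulated (so it may well not show the
problem), the column names other than isotime are invented, and save() to a
binary .RData file is an assumption about how the tables are being written.

```r
library(data.table)

# Toy stand-in for DT3 (assumption: the real table cannot be shared).
set.seed(1)
n <- 10000L
DT3 <- data.table(
  isotime = as.POSIXct("2012-10-15", tz = "UTC") + sample(n),
  name    = sample(letters, n, replace = TRUE),
  value   = rnorm(n)
)

# Structure and saved size before setting the key.
str(DT3)
f_before <- tempfile(fileext = ".RData")
save(DT3, file = f_before)

# Structure and saved size after setting the key.
setkey(DT3, isotime)
str(DT3)
f_after <- tempfile(fileext = ".RData")
save(DT3, file = f_after)

# On-disk sizes in bytes, next to the in-memory size.
print(file.info(c(f_before, f_after))$size)
print(object.size(DT3), units = "Mb")

# Load the keyed file back into a separate environment and inspect it.
e <- new.env()
load(f_after, envir = e)
print(dim(e$DT3))
print(sapply(e$DT3, class))
```

Running the same steps on the real DT3 and sending the printed output
would show whether the column types or dimensions change around the setkey.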

Matthew

> I've been using data.table heavily since UseR! 2012.
>
> For the most part it's been nothing short of a magical panacea, until
> today.
>
> I had a strange problem where using setkey on a data table makes the file
> huge (in comparison to what it should be) when it saves.
>
> I spent hours trying to reproduce this with similar data that I could
> share, but I couldn't get it to happen on simulated data.
>
> Here is an outline of my process:
> Data table 1 (DT1) is about 80 mb when I save
> Data table 2 (DT2) is about 10 mb when I save
> Data table 3 (DT3) = cbind(DT1, DT2)
>
> Data table 3 is about 90 mb when I save  (so far so good)
>
> If I set the key of DT3 to be a particular column (for me it's isotime),
> suddenly the table takes 212 mb of disk space
> If I change the key to something else, or set it to NULL, it still takes
> 212 mb
>
> HOWEVER, if I never set DT3's key to isotime, but I set it to another
> column instead (like a "name" field), then the file only takes about 90 mb
> as expected
>
> The memory ballooning only happens with the save.  The actual "in memory"
> values for these data sets are about the right size.
>
> I need to step away, but I can give more information tomorrow if you
> would like.
>
> I'm using R 2.15.1 "Roasted Marshmallows" and a Windows 7 machine.  The
> package version is data.table 1.8.2



