[datatable-help] data.table segfaulting, need help verifying the reason

Chris Neff caneff at gmail.com
Tue Sep 10 19:32:20 CEST 2013


I'm pretty sure the issue is a column that thinks it is bigger than it
actually is.  I have tried, so far in vain, to make a reproducible
example that I can share.  I have one that triggers the crash, but it uses data I can't share.

What happens is this:

A data.frame is made:

> d = data.frame(...)

Then I call apply over every row, passing each row to a function, func, that
also takes a data.table, DT:

l = apply(d, 1, function(x) func(x[1], x[2], DT))

Each call to func returns a data.frame, so l is a list of data.frames.  If I rbindlist this:

a = rbindlist(l)
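
(For concreteness, here is a toy sketch of the pattern, with stand-in versions
of d, func and DT -- it does not reproduce the crash, it just shows the shape
of the code:)

library(data.table)

## Toy stand-ins -- my real d, func and DT are different:
DT = data.table(x = rep(letters[1:5], each = 2), v = 1:10, key = "x")
d  = data.frame(x = letters[1:5], n = 1:5, stringsAsFactors = FALSE)

## func does a keyed lookup in DT and returns a small data.frame
func = function(key, n, DT) {
  res = DT[J(key)]
  data.frame(x = res$x, v = res$v, n = as.integer(n))
}

l = apply(d, 1, function(x) func(x[1], x[2], DT))  # list of data.frames
a = rbindlist(l)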

I can print a just fine, and it shows me all the data as normal.  But if I
just do

a$x

where x is one of the columns that was a key in DT, then it segfaults.  If I
ask for a column that was created by func and wasn't a column in DT, it works
fine.  If I ask for only the first 10 rows and then ask for x:

a[1:10]$x

it works fine.

So somewhere these key columns think they are a different length than they
really are, and when I try to access one I read memory I shouldn't, hence the
segfault.  How can I verify this?  Is there something about the data.table I
can check to see what lengths it thinks these columns have?
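
(Roughly what I had in mind checking, if these are even the right things to
look at:)

nrow(a)                            # what rbindlist thinks the row count is
sapply(a, length)                  # per-column lengths -- should all equal nrow(a)
data.table::truelength(a)          # allocated column slots of the table itself
sapply(a, data.table::truelength)  # truelength of each column vector (normally 0)
.Internal(inspect(a))              # low-level view: type, reported length and first few values of each column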


Also, if instead of apply I build the list with lapply:

l = lapply(1:nrow(d), function(i) func(d[i, 1], d[i, 2], DT))

and rbindlist that, it works fine too.
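
(One difference I can see between the two versions: apply goes through
as.matrix first, so each row x arrives in func as a character vector when d
has mixed column types, while the lapply version passes the original values.
A quick way to see it, on a toy d:)

d2 = data.frame(x = c("a", "b"), n = 1:2, stringsAsFactors = FALSE)
str(apply(d2, 1, function(x) x[2]))                   # "1" "2" -- n arrives as character
str(lapply(seq_len(nrow(d2)), function(i) d2[i, 2]))  # int 1, int 2 -- n keeps its type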