<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix"><br>
Yes, it seems the columns themselves have names attributes, and their
length is inconsistent with the column length.<br>
<br>
lapply(a,names) should reveal the "hidden" names<br>
<br>
To remove them:<br>
<br>
for (i in 1:ncol(a)) setattr(a[[i]],"names",NULL)<br>
<br>
Then lapply(a,names) should return NULL for every column.<br>
<br>
Then retry the operations that segfaulted before.<br>
<br>
If this fixes it, we'll need to establish how the erroneous names
got in there.<br>
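<br>
A minimal sketch of the check and the fix above, assuming a toy
data.table. The table and the stray names are made up for
illustration, and the names deliberately match the column length so
the example is safe to run:<br>
<pre>library(data.table)

# Hypothetical toy table standing in for the real 'a'
a = data.table(k1 = c("x", "y", "z"), v1 = 1:3)

# Attach a stray "names" attribute to one column, by reference
setattr(a[["k1"]], "names", c("k1", "k1", "k1"))

# Which columns carry hidden names?
has_names = !sapply(lapply(a, names), is.null)

# Remove the names from every column, by reference
for (i in seq_len(ncol(a))) setattr(a[[i]], "names", NULL)

# No column should report names any more
still_named = !sapply(lapply(a, names), is.null)</pre>
Here has_names flags only k1, and still_named should be FALSE
throughout once the loop has run.<br>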
<br>
<br>
On 10/09/13 19:51, Chris Neff wrote:<br>
</div>
<blockquote
cite="mid:CAAuY0RUbX5o-cmdsBjGpZWHqpUwMguv9xGyhONiZqpuddkCwAQ@mail.gmail.com"
type="cite">
<div dir="ltr">
<div><br>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On Tue, Sep 10, 2013 at 2:02 PM,
Matthew Dowle <span dir="ltr">&lt;<a moz-do-not-send="true"
href="mailto:mdowle@mdowle.plus.com" target="_blank">mdowle@mdowle.plus.com</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div><br>
Nothing springs to mind. Latest version v1.8.10 from
CRAN right? Or v1.8.11 on R-Forge?<br>
</div>
</div>
</blockquote>
<div><br>
</div>
<div>Both. And 1.8.8.</div>
<div> </div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div> <br>
On this bit :
<div class="im"><br>
> So somewhere these key columns think they are
different lengths than they really are, and<br>
> when I try to access it I go into memory I
shouldn't so I segfault. How can I verify this? Is<br>
> there something about the DT I can check to see
what DT thinks these columns are?<br>
<br>
</div>
.Internal(inspect(DT)) reveals the internal structure
including length and truelength on the column pointer
vector as well as each column.<br>
<br>
But it's a really odd way of using data.table.
Iterating by row is going to kill performance;
data.table likes by column.<br>
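<br>
For reference, a small sketch of that inspection on a made-up table.
DT here is hypothetical, and the printed addresses will differ in
every session:<br>
<pre>library(data.table)

# Hypothetical table with two key-like columns and one value column
DT = data.table(k1 = c("a", "b"), k2 = c("c", "d"), v1 = 1:2)

# Dump R's internal view of the object: type, flags, len and tl
# for the column pointer vector and for each column
.Internal(inspect(DT))

# The same two numbers for the column pointer vector, without the dump
n_cols = length(DT)       # columns in use
n_alloc = truelength(DT)  # column slots allocated</pre>
A column whose ATTRIB section lists a "names" entry with a len that
differs from the column's own len is the kind of mismatch to look
for.<br>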
</div>
</div>
</blockquote>
<div><br>
</div>
<div>Trust me I know this, this isn't my code :) I'm just
the data.table guy who helps debug. I am helping him with
better ways, but I think we can agree that it should at
least not segfault.</div>
<div><br>
</div>
<div><br>
</div>
<div>I ran inspect on the two versions of the data.table,
the one that crashes that is made by doing
rbindlist(apply(d,1,...)) and the one that doesn't that
gets made by doing rbindlist(lapply(1:nrow(d),...)), and
changed the variable names and censored out values.</div>
<div><br>
</div>
<div>First the one that fails (accessing either a$k1 or a$k2
will segfault):</div>
<div><br>
</div>
<div>
<div>> .Internal(inspect(a))</div>
<div>@2cc5be0 19 VECSXP g0c7 [OBJ,NAM(2),ATT] (len=13,
tl=100)</div>
<div> @3b643d0 16 STRSXP g0c7 [NAM(2),ATT] (len=326,
tl=0)</div>
<div> @253e488 09 CHARSXP g1c3 [MARK,gp=0x20,ATT]
"#########"</div>
<div> @253e488 09 CHARSXP g1c3 [MARK,gp=0x20,ATT]
"#########"</div>
<div> @253e3f8 09 CHARSXP g1c3 [MARK,gp=0x20]
"#########"</div>
<div> @253e3f8 09 CHARSXP g1c3 [MARK,gp=0x20]
"#########"</div>
<div> @253e3f8 09 CHARSXP g1c3 [MARK,gp=0x20]
"#########"</div>
<div> ...</div>
<div> ATTRIB:</div>
<div> @ac6c20 02 LISTSXP g1c0 [MARK] </div>
<div> TAG: @963418 01 SYMSXP g1c0 [MARK,gp=0x4000]
"names"</div>
<div> @3ba6ad8 16 STRSXP g1c2 [MARK,NAM(2)] (len=2,
tl=0)</div>
<div> @184aed0 09 CHARSXP g1c3 [MARK,gp=0x21,ATT]
"k1"</div>
<div> @184aed0 09 CHARSXP g1c3 [MARK,gp=0x21,ATT]
"k1"</div>
<div> @3b64e30 16 STRSXP g0c7 [NAM(2),ATT] (len=326,
tl=0)</div>
<div>
@253e440 09 CHARSXP g1c3 [MARK,gp=0x20] "#########"</div>
<div> @253e440 09 CHARSXP g1c3 [MARK,gp=0x20]
"#########"</div>
<div> @253e440 09 CHARSXP g1c3 [MARK,gp=0x20]
"#########"</div>
<div> @253e440 09 CHARSXP g1c3 [MARK,gp=0x20]
"#########"</div>
<div> @253e3b0 09 CHARSXP g1c3 [MARK,gp=0x20]
"#########"</div>
<div> ...</div>
<div> ATTRIB:</div>
<div> @ac6cc8 02 LISTSXP g1c0 [MARK] </div>
<div> TAG: @963418 01 SYMSXP g1c0 [MARK,gp=0x4000]
"names"</div>
<div> @3ba6a68 16 STRSXP g1c2 [MARK,NAM(2)] (len=2,
tl=0)</div>
<div> @bf8578 09 CHARSXP g1c2 [MARK,gp=0x21] "k2"</div>
<div>
@bf8578 09 CHARSXP g1c2 [MARK,gp=0x21] "k2"</div>
<div> @3b65890 16 STRSXP g0c7 [NAM(2)] (len=326, tl=0)</div>
<div> @24eeb68 09 CHARSXP g1c1 [MARK,gp=0x20]
"#########"</div>
<div> @24eeb08 09 CHARSXP g1c1 [MARK,gp=0x20]
"#########"</div>
<div> @24eeb68 09 CHARSXP g1c1 [MARK,gp=0x20]
"#########"</div>
<div> @24eeb08 09 CHARSXP g1c1 [MARK,gp=0x20]
"#########"</div>
<div> @24eeb68 09 CHARSXP g1c1 [MARK,gp=0x20]
"#########"</div>
<div> ...</div>
<div> @1ff5850 13 INTSXP g0c7 [NAM(2)] (len=326, tl=0)
3,3,3,3,3,...</div>
<div> @1fc6600 13 INTSXP g0c7 [NAM(2)] (len=326, tl=0)
2,1,2,1,3,...</div>
<div> ...</div>
<div>ATTRIB:</div>
<div> @21f6d48 02 LISTSXP g0c0 [] </div>
<div> TAG: @963418 01 SYMSXP g1c0 [MARK,gp=0x4000]
"names"</div>
<div> @3efc1f0 16 STRSXP g0c7 [NAM(2)] (len=13, tl=100)</div>
<div> @184aed0 09 CHARSXP g1c3 [MARK,gp=0x21,ATT]
"k1"</div>
<div>
@bf8578 09 CHARSXP g1c2 [MARK,gp=0x21] "k2"</div>
<div> @108be30 09 CHARSXP g1c2 [MARK,gp=0x21] "v1"</div>
<div> @108be68 09 CHARSXP g1c2 [MARK,gp=0x21] "v2"</div>
<div> @108bf10 09 CHARSXP g1c2 [MARK,gp=0x21] "v3"</div>
<div> ...</div>
<div> TAG: @96d200 01 SYMSXP g1c0 [MARK,gp=0x4000]
"row.names"</div>
<div> @2556908 13 INTSXP g0c1 [] (len=2, tl=0)
-2147483648,-326</div>
<div> TAG: @9638e8 01 SYMSXP g1c0 [MARK,gp=0x4000]
"class"</div>
<div> @2701b38 16 STRSXP g0c2 [NAM(2)] (len=2, tl=0)</div>
<div> @bf8460 09 CHARSXP g1c2 [MARK,gp=0x21]
"data.table"</div>
<div> @9f2688 09 CHARSXP g1c2 [MARK,gp=0x21,ATT]
"data.frame"</div>
<div> TAG: @1e75218 01 SYMSXP g1c0 [MARK]
".internal.selfref"</div>
<div> @21f6e28 22 EXTPTRSXP g0c0 [] </div>
<div><br>
</div>
<div>
Second, the one that works (all values can be accessed
fine):</div>
<div><br>
</div>
<div>> .Internal(inspect(a))</div>
<div>@45b4850 19 VECSXP g0c7 [OBJ,NAM(2),ATT] (len=13,
tl=100)</div>
<div> @33a53a0 16 STRSXP g0c7 [NAM(2)] (len=326, tl=0)</div>
<div> @253e488 09 CHARSXP g1c3 [MARK,gp=0x20,ATT]
"#########"</div>
<div> @253e488 09 CHARSXP g1c3 [MARK,gp=0x20,ATT]
"#########"</div>
<div> @253e3f8 09 CHARSXP g1c3 [MARK,gp=0x20]
"#########"</div>
<div> @253e3f8 09 CHARSXP g1c3 [MARK,gp=0x20]
"#########"</div>
<div> @253e3f8 09 CHARSXP g1c3 [MARK,gp=0x20]
"#########"</div>
<div> ...</div>
<div> @33a5e00 16 STRSXP g0c7 [NAM(2)] (len=326, tl=0)</div>
<div> @253e440 09 CHARSXP g1c3 [MARK,gp=0x20]
"#########"</div>
<div> @253e440 09 CHARSXP g1c3 [MARK,gp=0x20]
"#########"</div>
<div> @253e440 09 CHARSXP g1c3 [MARK,gp=0x20]
"#########"</div>
<div> @253e440 09 CHARSXP g1c3 [MARK,gp=0x20]
"#########"</div>
<div> @253e3b0 09 CHARSXP g1c3 [MARK,gp=0x20]
"#########"</div>
<div> ...</div>
<div> @33a6860 16 STRSXP g0c7 [NAM(2)] (len=326, tl=0)</div>
<div> @24eeb68 09 CHARSXP g1c1 [MARK,gp=0x20]
"#########"</div>
<div> @24eeb08 09 CHARSXP g1c1 [MARK,gp=0x20]
"#########"</div>
<div> @24eeb68 09 CHARSXP g1c1 [MARK,gp=0x20]
"#########"</div>
<div> @24eeb08 09 CHARSXP g1c1 [MARK,gp=0x20]
"#########"</div>
<div> @24eeb68 09 CHARSXP g1c1 [MARK,gp=0x20]
"#########"</div>
<div> ...</div>
<div> @1ff10f0 13 INTSXP g0c7 [NAM(2)] (len=326, tl=0)
3,3,3,3,3,...</div>
<div> @3a6d0d0 13 INTSXP g0c7 [NAM(2)] (len=326, tl=0)
2,1,2,1,3,...</div>
<div> ...</div>
<div>ATTRIB:</div>
<div> @276c360 02 LISTSXP g0c0 [] </div>
<div> TAG: @963418 01 SYMSXP g1c0 [MARK,gp=0x4000]
"names"</div>
<div> @1fe5670 16 STRSXP g0c7 [NAM(2)] (len=13, tl=100)</div>
<div>
@184aed0 09 CHARSXP g1c3 [MARK,gp=0x21,ATT] "k1"</div>
<div> @bf8578 09 CHARSXP g1c2 [MARK,gp=0x21] "k2"</div>
<div> @108be30 09 CHARSXP g1c2 [MARK,gp=0x21] "v1"</div>
<div> @108be68 09 CHARSXP g1c2 [MARK,gp=0x21] "v2"</div>
<div> @108bf10 09 CHARSXP g1c2 [MARK,gp=0x21] "v3"</div>
<div> ...</div>
<div> TAG: @96d200 01 SYMSXP g1c0 [MARK,gp=0x4000]
"row.names"</div>
<div> @29cbf38 13 INTSXP g0c1 [] (len=2, tl=0)
-2147483648,-326</div>
<div> TAG: @9638e8 01 SYMSXP g1c0 [MARK,gp=0x4000]
"class"</div>
<div> @2d539a0 16 STRSXP g0c2 [NAM(2)] (len=2, tl=0)</div>
<div> @bf8460 09 CHARSXP g1c2 [MARK,gp=0x21]
"data.table"</div>
<div>
@9f2688 09 CHARSXP g1c2 [MARK,gp=0x21,ATT]
"data.frame"</div>
<div> TAG: @1e75218 01 SYMSXP g1c0 [MARK]
".internal.selfref"</div>
<div> @276c440 22 EXTPTRSXP g0c0 [] </div>
<div><br>
</div>
<div>
<br>
</div>
</div>
<div><br>
</div>
<div><br>
</div>
<div>It looks to me like there are some differences in the
attributes attached to k1 and k2 in the first case. I can't
parse this output as well as you can.</div>
<div><br>
</div>
<div> </div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div> If it really has to be by row then DT[,
fun(.SD,...), by=1:nrow(DT)] should be better than
apply().<span class=""><font color="#888888"><br>
<br>
Matthew</font></span>
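<div><br>
A rough sketch of that row-wise form, assuming a made-up DT and a
hypothetical fun that takes the one-row .SD and returns a list:<br>
<pre>library(data.table)

# Hypothetical table; column names follow the censored output
DT = data.table(k1 = c("a", "b", "c"), v1 = 1:3, v2 = c(10, 20, 30))

# Hypothetical per-row function: receives a one-row data.table
fun = function(row) list(total = row$v1 + row$v2)

# One group per row; .SD holds that row's columns
res = DT[, fun(.SD), by = 1:nrow(DT)]</pre>
The result is already a single data.table, so no rbindlist() step is
needed afterwards.<br>
</div>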
<div>
<div class="h5"><br>
<br>
On 10/09/13 18:47, Chris Neff wrote:<br>
</div>
</div>
</div>
<blockquote type="cite">
<div>
<div class="h5">
<div dir="ltr">Narrowing it down further,
<div><br>
</div>
<div>a$x</div>
<div><br>
</div>
<div>segfaults and</div>
<div><br>
</div>
<div>a[,x]</div>
<div><br>
</div>
<div>segfaults but</div>
<div><br>
</div>
<div>a[,"x", with=FALSE]</div>
<div><br>
</div>
<div>doesn't.</div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On Tue, Sep 10, 2013 at
1:32 PM, Chris Neff <span dir="ltr">&lt;<a
moz-do-not-send="true"
href="mailto:caneff@gmail.com"
target="_blank">caneff@gmail.com</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div dir="ltr">I'm pretty sure it is some
issue of a column that thinks it is bigger
than it actually is. I have tried, so far
in vain, to make a reproducible example
that I can share. I have one, but can't
share it.
<div> <br>
</div>
<div>What happens is this: </div>
<div><br>
</div>
<div>A data.frame is made:</div>
<div><br>
</div>
<div>> d = data.frame(...)</div>
<div><br>
</div>
<div>Then I call apply over every row,
calling a different function that takes
in a DT as well:</div>
<div><br>
</div>
<div>l = apply(d, 1, function(x)
func(x[1], x[2], DT))</div>
<div><br>
</div>
<div>This returns a data.frame. If I
rbindlist this:</div>
<div><br>
</div>
<div>a = rbindlist(l)</div>
<div><br>
</div>
<div>I can print a just fine, and it will
show me all data like normal. But if I
try to just do </div>
<div><br>
</div>
<div>a$x</div>
<div><br>
</div>
<div>where x is one of the columns that was a
key in DT, it segfaults. If I ask
for a column that was made by "func" and
wasn't a column in DT, it works fine.
If I ask for only the first 10 rows and
then ask for x:</div>
<div><br>
</div>
<div>a[1:10]$x</div>
<div><br>
</div>
<div>it works fine.</div>
<div><br>
</div>
<div>So somewhere these key columns think
they are different lengths than they
really are, and when I try to access it
I go into memory I shouldn't so I
segfault. How can I verify this? Is
there something about the DT I can check
to see what DT thinks these columns are?</div>
<div><br>
</div>
<div><br>
</div>
<div>Also, if instead of apply when making
the list, I do</div>
<div><br>
</div>
<div>l = lapply(1:nrow(d), function(i)
func(d[i,1],d[i,2],DT))</div>
<div><br>
</div>
<div>and rbindlist that, it works fine
too.<br>
<br>
</div>
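<div>One plausible source of the difference, not confirmed in
this thread: apply() converts the data.frame to a matrix and
passes each row to the function as a named vector, so x[1]
carries the column name as a names attribute, while indexing
the data.frame directly returns a bare value. A base R sketch
with a made-up d:<br>
<pre>d = data.frame(k1 = c("a", "b"), k2 = c("c", "d"),
               stringsAsFactors = FALSE)

# apply(): each row arrives as a named vector, so x[1] keeps its name
via_apply = apply(d, 1, function(x) names(x[1]))

# Direct indexing: the extracted value has no names attribute
via_lapply = lapply(seq_len(nrow(d)), function(i) names(d[i, 1]))</pre>
Every element of via_apply is "k1", while every element of
via_lapply is NULL.<br>
</div>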
</div>
</blockquote>
</div>
<br>
</div>
<br>
<br>
</div>
</div>
</blockquote>
<br>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</blockquote>
<br>
</body>
</html>