<html><body><div style="color:#000; background-color:#fff; font-family:HelveticaNeue, Helvetica Neue, Helvetica, Arial, Lucida Grande, sans-serif;font-size:16px"><div id="yui_3_16_0_1_1413288092843_5703" dir="ltr">I have a very strange row-filtering issue in front of me that I can only reproduce on a very large data set. Let me start off by giving you the end symptoms and then I will talk through some hacks which will avoid the bug.<br></div><div id="yui_3_16_0_1_1413288092843_5702" dir="ltr"><br></div><div id="yui_3_16_0_1_1413288092843_5704" dir="ltr">I have two fields of interest -- pred_bad_t_f and weight.</div><div id="yui_3_16_0_1_1413288092843_5747" dir="ltr">- pred_bad_t_f is of class "integer" with two unique values, 0 and 1</div><div id="yui_3_16_0_1_1413288092843_5826" dir="ltr">- weight is of class "numeric"</div><div id="yui_3_16_0_1_1413288092843_5825" dir="ltr"><br></div><div id="yui_3_16_0_1_1413288092843_5818" dir="ltr"><span id="yui_3_16_0_1_1413288092843_5838" class="" style="border-collapse: separate; color: rgb(0, 0, 0); font-family: 'Lucida Console'; font-size: 13px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: 15px; orphans: 2; text-align: -webkit-left; text-indent: 0px; text-transform: none; white-space: pre-wrap; widows: 2; word-spacing: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; -webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; background-color: rgb(225, 226, 229); "></span></div><pre tabindex="0" class="" id="rstudio_console_output" style="font-family: 'Lucida Console'; font-size: 10pt !important; outline-style: none; outline-width: initial; outline-color: initial; border-top-style: none; border-right-style: none; border-bottom-style: none; border-left-style: none; border-width: initial; border-color: initial; white-space: pre-wrap !important; word-break: break-all; margin-top: 0px; 
margin-right: 0px; margin-bottom: 0px; margin-left: 0px; -webkit-user-select: text; line-height: 1.2; "><span id="yui_3_16_0_1_1413288092843_5848" class="" style="white-space: pre; -webkit-user-select: text; color: blue; ">> </span><span id="yui_3_16_0_1_1413288092843_5845" class="" style="color: blue; ">dt[pred_bad_t_f == 1, sum(weight)]
</span>[1] 6580818130
<span id="yui_3_16_0_1_1413288092843_5847" class="" style="white-space: pre; -webkit-user-select: text; color: blue; ">> </span><span id="yui_3_16_0_1_1413288092843_5846" class="" style="color: blue; ">dt[pred_bad_t_f == 1L, sum(weight)]
</span>[1] 5414941720</pre><div></div><div id="yui_3_16_0_1_1413288092843_5817" dir="ltr"><br></div><div id="yui_3_16_0_1_1413288092843_5816" dir="ltr">As you can see, the two results differ even though the only change is comparing against 1L instead of 1, so they should be identical. I believe the first value is correct because slight changes to the filtering logic consistently produce that value. Below are some examples:<br></div><div id="yui_3_16_0_1_1413288092843_5875" dir="ltr"><br></div><div id="yui_3_16_0_1_1413288092843_5882" dir="ltr"><span id="yui_3_16_0_1_1413288092843_5921" class="" style="border-collapse: separate; color: rgb(0, 0, 0); font-family: 'Lucida Console'; font-size: 13px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: 15px; orphans: 2; text-align: -webkit-left; text-indent: 0px; text-transform: none; white-space: pre-wrap; widows: 2; word-spacing: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; -webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; background-color: rgb(225, 226, 229); "></span></div><pre tabindex="0" class="" id="rstudio_console_output" style="font-family: 'Lucida Console'; font-size: 10pt !important; outline-style: none; outline-width: initial; outline-color: initial; border-top-style: none; border-right-style: none; border-bottom-style: none; border-left-style: none; border-width: initial; border-color: initial; white-space: pre-wrap !important; word-break: break-all; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; -webkit-user-select: text; line-height: 1.2; "><span class="" style="white-space: pre; -webkit-user-select: text; color: blue; ">> </span><span id="yui_3_16_0_1_1413288092843_5928" class="" style="color: blue; ">dt[1:nrow( dt)][pred_bad_t_f == 1L, sum(weight)]
</span>[1] 6580818130</pre><div></div><div id="yui_3_16_0_1_1413288092843_5872" dir="ltr"><span class="" style="border-collapse: separate; color: rgb(0, 0, 0); font-family: 'Lucida Console'; font-size: 13px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: 15px; orphans: 2; text-align: -webkit-left; text-indent: 0px; text-transform: none; white-space: pre-wrap; widows: 2; word-spacing: 0px; -webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; -webkit-text-decorations-in-effect: none; -webkit-text-size-adjust: auto; -webkit-text-stroke-width: 0px; background-color: rgb(225, 226, 229); "></span></div><pre tabindex="0" class="" id="rstudio_console_output" style="font-family: 'Lucida Console'; font-size: 10pt !important; outline-style: none; outline-width: initial; outline-color: initial; border-top-style: none; border-right-style: none; border-bottom-style: none; border-left-style: none; border-width: initial; border-color: initial; white-space: pre-wrap !important; word-break: break-all; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; -webkit-user-select: text; line-height: 1.2; "><span class="" style="white-space: pre; -webkit-user-select: text; color: blue; ">> </span><span class="" style="color: blue; ">dt[TRUE & pred_bad_t_f == 1L, sum(weight)]
</span>[1] 6580818130</pre><div></div><div dir="ltr"><br></div><div></div></div></body></html>