<div dir="ltr"><p style="margin:0px 0px 10px;color:rgb(51,51,51);font-family:'Helvetica Neue',Helvetica,Arial,sans-serif;font-size:14px;line-height:20px"><span style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;line-height:normal">Hi Matt,</span></p>
<div style="font-family:arial,sans-serif;font-size:13px"><br></div><div style="font-family:arial,sans-serif;font-size:13px">Thanks for the suggestion. I am placing an example below that I hope illustrates the problem more clearly. Please let me know if I can provide additional detail or clarification.</div>
<div style="font-family:arial,sans-serif;font-size:13px"><br></div><div style="font-family:arial,sans-serif;font-size:13px"><br></div><div style="font-family:arial,sans-serif;font-size:13px">Regards,</div><div style="font-family:arial,sans-serif;font-size:13px">
Matt</div><p style="margin:0px 0px 10px;color:rgb(51,51,51);font-family:'Helvetica Neue',Helvetica,Arial,sans-serif;font-size:14px;line-height:20px"><br></p><p style="margin:0px 0px 10px;color:rgb(51,51,51);font-family:'Helvetica Neue',Helvetica,Arial,sans-serif;font-size:14px;line-height:20px">
<br></p><p style="margin:0px 0px 10px;color:rgb(51,51,51);font-family:'Helvetica Neue',Helvetica,Arial,sans-serif;font-size:14px;line-height:20px">First we create a dummy dataset of one million words spread across ten documents. There are three unique words in the set.</p>
<pre class="" style="padding:9.5px;font-size:13px;color:rgb(51,51,51);border-top-left-radius:4px;border-top-right-radius:4px;border-bottom-right-radius:4px;border-bottom-left-radius:4px;margin-top:0px;margin-bottom:10px;line-height:20px;word-break:break-all;word-wrap:break-word;white-space:pre-wrap;border:1px solid rgba(0,0,0,0.14902);background-color:rgb(245,245,245)">
<code class="" style="padding-top:0px;padding-right:0px;padding-left:0px;font-size:12px;color:inherit;border-top-left-radius:3px;border-top-right-radius:3px;border-bottom-right-radius:3px;border-bottom-left-radius:3px;border:0px;background-color:transparent"><span class="" style="color:rgb(153,0,0);font-weight:bold">library</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,0,0)">data.table</span><span class="" style="color:rgb(104,118,135)">)</span>
<span class="" style="color:rgb(0,0,0)">options</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,0,0)">scipen</span><span class="" style="color:rgb(104,118,135)">=</span><span class="" style="color:rgb(0,153,153)">2</span><span class="" style="color:rgb(104,118,135)">)</span>
<span class="" style="color:rgb(0,0,0)">set.seed</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,153,153)">1000</span><span class="" style="color:rgb(104,118,135)">)</span>
<span class="" style="color:rgb(0,0,0)">DT</span><span class="" style="color:rgb(104,118,135)"><-</span><span class="" style="color:rgb(0,0,0)">data.table</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,0,0)">wordindex</span><span class="" style="color:rgb(104,118,135)">=</span><span class="" style="color:rgb(0,0,0)">sample</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,153,153)">1</span><span class="" style="color:rgb(104,118,135)">:</span><span class="" style="color:rgb(0,153,153)">3</span>,<span class="" style="color:rgb(0,153,153)">1000000</span>,<span class="" style="color:rgb(0,0,0)">replace</span><span class="" style="color:rgb(104,118,135)">=</span><span class="" style="color:rgb(153,0,115)">T</span><span class="" style="color:rgb(104,118,135)">)</span>,<span class="" style="color:rgb(0,0,0)">docindex</span><span class="" style="color:rgb(104,118,135)">=</span><span class="" style="color:rgb(0,0,0)">sample</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,153,153)">1</span><span class="" style="color:rgb(104,118,135)">:</span><span class="" style="color:rgb(0,153,153)">10</span>,<span class="" style="color:rgb(0,153,153)">1000000</span>,<span class="" style="color:rgb(0,0,0)">replace</span><span class="" style="color:rgb(104,118,135)">=</span><span class="" style="color:rgb(153,0,115)">T</span><span class="" style="color:rgb(104,118,135)">)</span><span class="" style="color:rgb(104,118,135)">)</span>
<span class="" style="color:rgb(0,0,0)">setkey</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,0,0)">DT</span>,<span class="" style="color:rgb(0,0,0)">docindex</span><span class="" style="color:rgb(104,118,135)">)</span>
<span class="" style="color:rgb(0,0,0)">DT</span><span class="" style="color:rgb(104,118,135)">[</span>,<span class="" style="color:rgb(0,0,0)">position</span><span class="" style="color:rgb(104,118,135)">:</span><span class="" style="color:rgb(104,118,135)">=</span><span class="" style="color:rgb(0,0,0)">seq_len</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,0,0)">.N</span><span class="" style="color:rgb(104,118,135)">)</span>,<span class="" style="color:rgb(0,0,0)">by</span><span class="" style="color:rgb(104,118,135)">=</span><span class="" style="color:rgb(0,0,0)">docindex</span><span class="" style="color:rgb(104,118,135)">]</span></code></pre>
<pre style="padding:9.5px;font-size:13px;color:rgb(51,51,51);border-top-left-radius:4px;border-top-right-radius:4px;border-bottom-right-radius:4px;border-bottom-left-radius:4px;margin-top:0px;margin-bottom:10px;line-height:20px;word-break:break-all;word-wrap:break-word;white-space:pre-wrap;border:1px solid rgba(0,0,0,0.14902)">
<code style="padding:0px;font-size:12px;color:inherit;border-top-left-radius:3px;border-top-right-radius:3px;border-bottom-right-radius:3px;border-bottom-left-radius:3px;border:0px;background-color:transparent">##          wordindex docindex position
##       1:         1        1        1
##       2:         1        1        2
##       3:         3        1        3
##       4:         3        1        4
##       5:         1        1        5
##      ---                            
##  999996:         2       10    99811
##  999997:         2       10    99812
##  999998:         3       10    99813
##  999999:         1       10    99814
## 1000000:         3       10    99815</code></pre><p style="margin:0px 0px 10px;color:rgb(51,51,51);font-family:'Helvetica Neue',Helvetica,Arial,sans-serif;font-size:14px;line-height:20px">This query counts the occurrences of the first unique word in each document. It is also beautiful.</p>
<pre class="" style="padding:9.5px;font-size:13px;color:rgb(51,51,51);border-top-left-radius:4px;border-top-right-radius:4px;border-bottom-right-radius:4px;border-bottom-left-radius:4px;margin-top:0px;margin-bottom:10px;line-height:20px;word-break:break-all;word-wrap:break-word;white-space:pre-wrap;border:1px solid rgba(0,0,0,0.14902);background-color:rgb(245,245,245)">
<code class="" style="padding-top:0px;padding-right:0px;padding-left:0px;font-size:12px;color:inherit;border-top-left-radius:3px;border-top-right-radius:3px;border-bottom-right-radius:3px;border-bottom-left-radius:3px;border:0px;background-color:transparent"><span class="" style="color:rgb(0,0,0)">setkey</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,0,0)">DT</span>,<span class="" style="color:rgb(0,0,0)">wordindex</span><span class="" style="color:rgb(104,118,135)">)</span>
<span class="" style="color:rgb(0,0,0)">count</span><span class="" style="color:rgb(104,118,135)"><-</span><span class="" style="color:rgb(0,0,0)">DT</span><span class="" style="color:rgb(104,118,135)">[</span><span class="" style="color:rgb(0,0,0)">J</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,153,153)">1</span><span class="" style="color:rgb(104,118,135)">)</span>,<span class="" style="color:rgb(0,0,0)">list</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,0,0)">count.1</span><span class="" style="color:rgb(104,118,135)">=</span><span class="" style="color:rgb(0,0,0)">.N</span><span class="" style="color:rgb(104,118,135)">)</span>,<span class="" style="color:rgb(0,0,0)">by</span><span class="" style="color:rgb(104,118,135)">=</span><span class="" style="color:rgb(0,0,0)">docindex</span><span class="" style="color:rgb(104,118,135)">]</span>
<span class="" style="color:rgb(0,0,0)">count</span></code></pre><pre style="padding:9.5px;font-size:13px;color:rgb(51,51,51);border-top-left-radius:4px;border-top-right-radius:4px;border-bottom-right-radius:4px;border-bottom-left-radius:4px;margin-top:0px;margin-bottom:10px;line-height:20px;word-break:break-all;word-wrap:break-word;white-space:pre-wrap;border:1px solid rgba(0,0,0,0.14902)">
<code style="padding:0px;font-size:12px;color:inherit;border-top-left-radius:3px;border-top-right-radius:3px;border-bottom-right-radius:3px;border-bottom-left-radius:3px;border:0px;background-color:transparent">##     docindex count.1
##  1:        1   33533
##  2:        2   33067
##  3:        3   33538
##  4:        4   33053
##  5:        5   33231
##  6:        6   33002
##  7:        7   33369
##  8:        8   33353
##  9:        9   33485
## 10:       10   33225</code></pre><p style="margin:0px 0px 10px;color:rgb(51,51,51);font-family:'Helvetica Neue',Helvetica,Arial,sans-serif;font-size:14px;line-height:20px">It gets messier when we have to take the next position into account. This query counts the occurrences of the first unique word in each document UNLESS it is followed by the second unique word. We create a new column containing the word one position ahead and then key on both words. Because the join matches only a following word of 1 or 3, an occurrence of the first word at the very end of a document (where the following word is NA) is not counted.</p>
<pre class="" style="padding:9.5px;font-size:13px;color:rgb(51,51,51);border-top-left-radius:4px;border-top-right-radius:4px;border-bottom-right-radius:4px;border-bottom-left-radius:4px;margin-top:0px;margin-bottom:10px;line-height:20px;word-break:break-all;word-wrap:break-word;white-space:pre-wrap;border:1px solid rgba(0,0,0,0.14902);background-color:rgb(245,245,245)">
<code class="" style="padding-top:0px;padding-right:0px;padding-left:0px;font-size:12px;color:inherit;border-top-left-radius:3px;border-top-right-radius:3px;border-bottom-right-radius:3px;border-bottom-left-radius:3px;border:0px;background-color:transparent"><span class="" style="color:rgb(0,0,0)">setkey</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,0,0)">DT</span>,<span class="" style="color:rgb(0,0,0)">docindex</span>,<span class="" style="color:rgb(0,0,0)">position</span><span class="" style="color:rgb(104,118,135)">)</span>
<span class="" style="color:rgb(0,0,0)">DT</span><span class="" style="color:rgb(104,118,135)">[</span>,<span class="" style="color:rgb(0,0,0)">lead_wordindex</span><span class="" style="color:rgb(104,118,135)">:</span><span class="" style="color:rgb(104,118,135)">=</span><span class="" style="color:rgb(0,0,0)">DT</span><span class="" style="color:rgb(104,118,135)">[</span><span class="" style="color:rgb(0,0,0)">list</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,0,0)">docindex</span>,<span class="" style="color:rgb(0,0,0)">position</span><span class="" style="color:rgb(104,118,135)">+</span><span class="" style="color:rgb(0,153,153)">1</span><span class="" style="color:rgb(104,118,135)">)</span><span class="" style="color:rgb(104,118,135)">]</span><span class="" style="color:rgb(104,118,135)">[</span>,<span class="" style="color:rgb(0,0,0)">wordindex</span><span class="" style="color:rgb(104,118,135)">]</span><span class="" style="color:rgb(104,118,135)">]</span></code></pre>
<pre style="padding:9.5px;font-size:13px;color:rgb(51,51,51);border-top-left-radius:4px;border-top-right-radius:4px;border-bottom-right-radius:4px;border-bottom-left-radius:4px;margin-top:0px;margin-bottom:10px;line-height:20px;word-break:break-all;word-wrap:break-word;white-space:pre-wrap;border:1px solid rgba(0,0,0,0.14902)">
<code style="padding:0px;font-size:12px;color:inherit;border-top-left-radius:3px;border-top-right-radius:3px;border-bottom-right-radius:3px;border-bottom-left-radius:3px;border:0px;background-color:transparent">##          wordindex docindex position lead_wordindex
##       1:         1        1        1              1
##       2:         1        1        2              3
##       3:         3        1        3              3
##       4:         3        1        4              1
##       5:         1        1        5              2
##      ---                                           
##  999996:         2       10    99811              2
##  999997:         2       10    99812              3
##  999998:         3       10    99813              1
##  999999:         1       10    99814              3
## 1000000:         3       10    99815             NA</code></pre><pre class="" style="padding:9.5px;font-size:13px;color:rgb(51,51,51);border-top-left-radius:4px;border-top-right-radius:4px;border-bottom-right-radius:4px;border-bottom-left-radius:4px;margin-top:0px;margin-bottom:10px;line-height:20px;word-break:break-all;word-wrap:break-word;white-space:pre-wrap;border:1px solid rgba(0,0,0,0.14902);background-color:rgb(245,245,245)">
<code class="" style="padding-top:0px;padding-right:0px;padding-left:0px;font-size:12px;color:inherit;border-top-left-radius:3px;border-top-right-radius:3px;border-bottom-right-radius:3px;border-bottom-left-radius:3px;border:0px;background-color:transparent"><span class="" style="color:rgb(0,0,0)">setkey</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,0,0)">DT</span>,<span class="" style="color:rgb(0,0,0)">wordindex</span>,<span class="" style="color:rgb(0,0,0)">lead_wordindex</span><span class="" style="color:rgb(104,118,135)">)</span>
<span class="" style="color:rgb(0,0,0)">countr2</span><span class="" style="color:rgb(104,118,135)"><-</span><span class="" style="color:rgb(0,0,0)">DT</span><span class="" style="color:rgb(104,118,135)">[</span><span class="" style="color:rgb(0,0,0)">J</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,0,0)">c</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,153,153)">1</span>,<span class="" style="color:rgb(0,153,153)">1</span><span class="" style="color:rgb(104,118,135)">)</span>,<span class="" style="color:rgb(0,0,0)">c</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,153,153)">1</span>,<span class="" style="color:rgb(0,153,153)">3</span><span class="" style="color:rgb(104,118,135)">)</span><span class="" style="color:rgb(104,118,135)">)</span>,<span class="" style="color:rgb(0,0,0)">list</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,0,0)">count.1</span><span class="" style="color:rgb(104,118,135)">=</span><span class="" style="color:rgb(0,0,0)">.N</span><span class="" style="color:rgb(104,118,135)">)</span>,<span class="" style="color:rgb(0,0,0)">by</span><span class="" style="color:rgb(104,118,135)">=</span><span class="" style="color:rgb(0,0,0)">docindex</span><span class="" style="color:rgb(104,118,135)">]</span>
<span class="" style="color:rgb(0,0,0)">countr2</span></code></pre><pre style="padding:9.5px;font-size:13px;color:rgb(51,51,51);border-top-left-radius:4px;border-top-right-radius:4px;border-bottom-right-radius:4px;border-bottom-left-radius:4px;margin-top:0px;margin-bottom:10px;line-height:20px;word-break:break-all;word-wrap:break-word;white-space:pre-wrap;border:1px solid rgba(0,0,0,0.14902)">
<code style="padding:0px;font-size:12px;color:inherit;border-top-left-radius:3px;border-top-right-radius:3px;border-bottom-right-radius:3px;border-bottom-left-radius:3px;border:0px;background-color:transparent">##     docindex count.1
##  1:        1   22301
##  2:        2   21835
##  3:        3   22490
##  4:        4   21830
##  5:        5   22218
##  6:        6   21914
##  7:        7   22370
##  8:        8   22265
##  9:        9   22211
## 10:       10   22190</code></pre><p style="margin:0px 0px 10px;color:rgb(51,51,51);font-family:'Helvetica Neue',Helvetica,Arial,sans-serif;font-size:14px;line-height:20px">I have a very large dataset for which the above query fails with a memory allocation error. As an alternative, we can create the new column for only the relevant subset of the data: filter the original dataset first, then join back to it on the desired position:</p>
<pre class="" style="padding:9.5px;font-size:13px;color:rgb(51,51,51);border-top-left-radius:4px;border-top-right-radius:4px;border-bottom-right-radius:4px;border-bottom-left-radius:4px;margin-top:0px;margin-bottom:10px;line-height:20px;word-break:break-all;word-wrap:break-word;white-space:pre-wrap;border:1px solid rgba(0,0,0,0.14902);background-color:rgb(245,245,245)">
<code class="" style="padding-top:0px;padding-right:0px;padding-left:0px;font-size:12px;color:inherit;border-top-left-radius:3px;border-top-right-radius:3px;border-bottom-right-radius:3px;border-bottom-left-radius:3px;border:0px;background-color:transparent"><span class="" style="color:rgb(0,0,0)">setkey</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,0,0)">DT</span>,<span class="" style="color:rgb(0,0,0)">wordindex</span><span class="" style="color:rgb(104,118,135)">)</span>
<span class="" style="color:rgb(0,0,0)">filter</span><span class="" style="color:rgb(104,118,135)"><-</span><span class="" style="color:rgb(0,0,0)">DT</span><span class="" style="color:rgb(104,118,135)">[</span><span class="" style="color:rgb(0,0,0)">J</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,153,153)">1</span><span class="" style="color:rgb(104,118,135)">)</span>,<span class="" style="color:rgb(0,0,0)">list</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,0,0)">wordindex</span>,<span class="" style="color:rgb(0,0,0)">docindex</span>,<span class="" style="color:rgb(0,0,0)">position</span><span class="" style="color:rgb(104,118,135)">)</span><span class="" style="color:rgb(104,118,135)">]</span>
<span class="" style="color:rgb(0,0,0)">filter</span><span class="" style="color:rgb(104,118,135)">[</span>,<span class="" style="color:rgb(0,0,0)">lead_position</span><span class="" style="color:rgb(104,118,135)">:</span><span class="" style="color:rgb(104,118,135)">=</span><span class="" style="color:rgb(0,0,0)">position</span><span class="" style="color:rgb(104,118,135)">+</span><span class="" style="color:rgb(0,153,153)">1</span><span class="" style="color:rgb(104,118,135)">]</span></code></pre>
<pre style="padding:9.5px;font-size:13px;color:rgb(51,51,51);border-top-left-radius:4px;border-top-right-radius:4px;border-bottom-right-radius:4px;border-bottom-left-radius:4px;margin-top:0px;margin-bottom:10px;line-height:20px;word-break:break-all;word-wrap:break-word;white-space:pre-wrap;border:1px solid rgba(0,0,0,0.14902)">
<code style="padding:0px;font-size:12px;color:inherit;border-top-left-radius:3px;border-top-right-radius:3px;border-bottom-right-radius:3px;border-bottom-left-radius:3px;border:0px;background-color:transparent">##         wordindex wordindex docindex position lead_position
##      1:         1         1        2    99717         99718
##      2:         1         1        3    99807         99808
##      3:         1         1        4   100243        100244
##      4:         1         1        1        1             2
##      5:         1         1        1       42            43
##     ---                                                    
## 332852:         1         1       10    99785         99786
## 332853:         1         1       10    99787         99788
## 332854:         1         1       10    99798         99799
## 332855:         1         1       10    99804         99805
## 332856:         1         1       10    99814         99815</code></pre><pre class="" style="padding:9.5px;font-size:13px;color:rgb(51,51,51);border-top-left-radius:4px;border-top-right-radius:4px;border-bottom-right-radius:4px;border-bottom-left-radius:4px;margin-top:0px;margin-bottom:10px;line-height:20px;word-break:break-all;word-wrap:break-word;white-space:pre-wrap;border:1px solid rgba(0,0,0,0.14902);background-color:rgb(245,245,245)">
<code class="" style="padding-top:0px;padding-right:0px;padding-left:0px;font-size:12px;color:inherit;border-top-left-radius:3px;border-top-right-radius:3px;border-bottom-right-radius:3px;border-bottom-left-radius:3px;border:0px;background-color:transparent"><span class="" style="color:rgb(0,0,0)">setkey</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,0,0)">DT</span>,<span class="" style="color:rgb(0,0,0)">docindex</span>,<span class="" style="color:rgb(0,0,0)">position</span><span class="" style="color:rgb(104,118,135)">)</span>
<span class="" style="color:rgb(0,0,0)">filter</span><span class="" style="color:rgb(104,118,135)">[</span>,<span class="" style="color:rgb(0,0,0)">lead_wordindex</span><span class="" style="color:rgb(104,118,135)">:</span><span class="" style="color:rgb(104,118,135)">=</span><span class="" style="color:rgb(0,0,0)">DT</span><span class="" style="color:rgb(104,118,135)">[</span><span class="" style="color:rgb(0,0,0)">J</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,0,0)">filter</span><span class="" style="color:rgb(104,118,135)">[</span>,<span class="" style="color:rgb(0,0,0)">list</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,0,0)">docindex</span>,<span class="" style="color:rgb(0,0,0)">lead_position</span><span class="" style="color:rgb(104,118,135)">)</span><span class="" style="color:rgb(104,118,135)">]</span><span class="" style="color:rgb(104,118,135)">)</span><span class="" style="color:rgb(104,118,135)">]</span><span class="" style="color:rgb(104,118,135)">[</span>,<span class="" style="color:rgb(0,0,0)">wordindex</span><span class="" style="color:rgb(104,118,135)">]</span><span class="" style="color:rgb(104,118,135)">]</span></code></pre>
<pre style="padding:9.5px;font-size:13px;color:rgb(51,51,51);border-top-left-radius:4px;border-top-right-radius:4px;border-bottom-right-radius:4px;border-bottom-left-radius:4px;margin-top:0px;margin-bottom:10px;line-height:20px;word-break:break-all;word-wrap:break-word;white-space:pre-wrap;border:1px solid rgba(0,0,0,0.14902)">
<code style="padding:0px;font-size:12px;color:inherit;border-top-left-radius:3px;border-top-right-radius:3px;border-bottom-right-radius:3px;border-bottom-left-radius:3px;border:0px;background-color:transparent">##         wordindex wordindex docindex position lead_position lead_wordindex
##      1:         1         1        2    99717         99718             NA
##      2:         1         1        3    99807         99808             NA
##      3:         1         1        4   100243        100244             NA
##      4:         1         1        1        1             2              1
##      5:         1         1        1       42            43              1
##     ---                                                                   
## 332852:         1         1       10    99785         99786              3
## 332853:         1         1       10    99787         99788              3
## 332854:         1         1       10    99798         99799              3
## 332855:         1         1       10    99804         99805              3
## 332856:         1         1       10    99814         99815              3</code></pre><pre class="" style="padding:9.5px;font-size:13px;color:rgb(51,51,51);border-top-left-radius:4px;border-top-right-radius:4px;border-bottom-right-radius:4px;border-bottom-left-radius:4px;margin-top:0px;margin-bottom:10px;line-height:20px;word-break:break-all;word-wrap:break-word;white-space:pre-wrap;border:1px solid rgba(0,0,0,0.14902);background-color:rgb(245,245,245)">
<code class="" style="padding-top:0px;padding-right:0px;padding-left:0px;font-size:12px;color:inherit;border-top-left-radius:3px;border-top-right-radius:3px;border-bottom-right-radius:3px;border-bottom-left-radius:3px;border:0px;background-color:transparent"><span class="" style="color:rgb(0,0,0)">setkey</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,0,0)">filter</span>,<span class="" style="color:rgb(0,0,0)">wordindex</span>,<span class="" style="color:rgb(0,0,0)">lead_wordindex</span><span class="" style="color:rgb(104,118,135)">)</span>
<span class="" style="color:rgb(0,0,0)">countr2.1</span><span class="" style="color:rgb(104,118,135)"><-</span><span class="" style="color:rgb(0,0,0)">filter</span><span class="" style="color:rgb(104,118,135)">[</span><span class="" style="color:rgb(0,0,0)">J</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,0,0)">c</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,153,153)">1</span>,<span class="" style="color:rgb(0,153,153)">1</span><span class="" style="color:rgb(104,118,135)">)</span>,<span class="" style="color:rgb(0,0,0)">c</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,153,153)">1</span>,<span class="" style="color:rgb(0,153,153)">3</span><span class="" style="color:rgb(104,118,135)">)</span><span class="" style="color:rgb(104,118,135)">)</span>,<span class="" style="color:rgb(0,0,0)">list</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,0,0)">count.1</span><span class="" style="color:rgb(104,118,135)">=</span><span class="" style="color:rgb(0,0,0)">.N</span><span class="" style="color:rgb(104,118,135)">)</span>,<span class="" style="color:rgb(0,0,0)">by</span><span class="" style="color:rgb(104,118,135)">=</span><span class="" style="color:rgb(0,0,0)">docindex</span><span class="" style="color:rgb(104,118,135)">]</span>
<span class="" style="color:rgb(0,0,0)">countr2.1</span></code></pre><pre style="padding:9.5px;font-size:13px;color:rgb(51,51,51);border-top-left-radius:4px;border-top-right-radius:4px;border-bottom-right-radius:4px;border-bottom-left-radius:4px;margin-top:0px;margin-bottom:10px;line-height:20px;word-break:break-all;word-wrap:break-word;white-space:pre-wrap;border:1px solid rgba(0,0,0,0.14902)">
<code style="padding:0px;font-size:12px;color:inherit;border-top-left-radius:3px;border-top-right-radius:3px;border-bottom-right-radius:3px;border-bottom-left-radius:3px;border:0px;background-color:transparent">##     docindex count.1
##  1:        1   22301
##  2:        2   21835
##  3:        3   22490
##  4:        4   21830
##  5:        5   22218
##  6:        6   21914
##  7:        7   22370
##  8:        8   22265
##  9:        9   22211
## 10:       10   22190</code></pre><p style="margin:0px 0px 10px;color:rgb(51,51,51);font-family:'Helvetica Neue',Helvetica,Arial,sans-serif;font-size:14px;line-height:20px">Pretty ugly, I think. In addition, we may want to look more than one word ahead, which means creating yet another column. The easy but costly way is:</p>
<pre class="" style="padding:9.5px;font-size:13px;color:rgb(51,51,51);border-top-left-radius:4px;border-top-right-radius:4px;border-bottom-right-radius:4px;border-bottom-left-radius:4px;margin-top:0px;margin-bottom:10px;line-height:20px;word-break:break-all;word-wrap:break-word;white-space:pre-wrap;border:1px solid rgba(0,0,0,0.14902);background-color:rgb(245,245,245)">
<code class="" style="padding-top:0px;padding-right:0px;padding-left:0px;font-size:12px;color:inherit;border-top-left-radius:3px;border-top-right-radius:3px;border-bottom-right-radius:3px;border-bottom-left-radius:3px;border:0px;background-color:transparent"><span class="" style="color:rgb(0,0,0)">setkey</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,0,0)">DT</span>,<span class="" style="color:rgb(0,0,0)">docindex</span>,<span class="" style="color:rgb(0,0,0)">position</span><span class="" style="color:rgb(104,118,135)">)</span>
<span class="" style="color:rgb(0,0,0)">DT</span><span class="" style="color:rgb(104,118,135)">[</span>,<span class="" style="color:rgb(0,0,0)">lead_lead_wordindex</span><span class="" style="color:rgb(104,118,135)">:</span><span class="" style="color:rgb(104,118,135)">=</span><span class="" style="color:rgb(0,0,0)">DT</span><span class="" style="color:rgb(104,118,135)">[</span><span class="" style="color:rgb(0,0,0)">list</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,0,0)">docindex</span>,<span class="" style="color:rgb(0,0,0)">position</span><span class="" style="color:rgb(104,118,135)">+</span><span class="" style="color:rgb(0,153,153)">2</span><span class="" style="color:rgb(104,118,135)">)</span><span class="" style="color:rgb(104,118,135)">]</span><span class="" style="color:rgb(104,118,135)">[</span>,<span class="" style="color:rgb(0,0,0)">wordindex</span><span class="" style="color:rgb(104,118,135)">]</span><span class="" style="color:rgb(104,118,135)">]</span></code></pre>
<pre style="padding:9.5px;font-size:13px;color:rgb(51,51,51);border-top-left-radius:4px;border-top-right-radius:4px;border-bottom-right-radius:4px;border-bottom-left-radius:4px;margin-top:0px;margin-bottom:10px;line-height:20px;word-break:break-all;word-wrap:break-word;white-space:pre-wrap;border:1px solid rgba(0,0,0,0.14902)">
<code style="padding:0px;font-size:12px;color:inherit;border-top-left-radius:3px;border-top-right-radius:3px;border-bottom-right-radius:3px;border-bottom-left-radius:3px;border:0px;background-color:transparent">##          wordindex docindex position lead_wordindex lead_lead_wordindex
##       1:         1        1        1              1                   3
##       2:         1        1        2              3                   3
##       3:         3        1        3              3                   1
##       4:         3        1        4              1                   2
##       5:         1        1        5              2                   3
##      ---                                                               
##  999996:         2       10    99811              2                   3
##  999997:         2       10    99812              3                   1
##  999998:         3       10    99813              1                   3
##  999999:         1       10    99814              3                  NA
## 1000000:         3       10    99815             NA                  NA</code></pre><pre class="" style="padding:9.5px;font-size:13px;color:rgb(51,51,51);border-top-left-radius:4px;border-top-right-radius:4px;border-bottom-right-radius:4px;border-bottom-left-radius:4px;margin-top:0px;margin-bottom:10px;line-height:20px;word-break:break-all;word-wrap:break-word;white-space:pre-wrap;border:1px solid rgba(0,0,0,0.14902);background-color:rgb(245,245,245)">
<code class="" style="padding-top:0px;padding-right:0px;padding-left:0px;font-size:12px;color:inherit;border-top-left-radius:3px;border-top-right-radius:3px;border-bottom-right-radius:3px;border-bottom-left-radius:3px;border:0px;background-color:transparent"><span class="" style="color:rgb(0,0,0)">setkey</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,0,0)">DT</span>,<span class="" style="color:rgb(0,0,0)">wordindex</span>,<span class="" style="color:rgb(0,0,0)">lead_wordindex</span>,<span class="" style="color:rgb(0,0,0)">lead_lead_wordindex</span><span class="" style="color:rgb(104,118,135)">)</span>
<span class="" style="color:rgb(0,0,0)">countr23</span><span class="" style="color:rgb(104,118,135)">&lt;-</span><span class="" style="color:rgb(0,0,0)">DT</span><span class="" style="color:rgb(104,118,135)">[</span><span class="" style="color:rgb(0,0,0)">J</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,153,153)">1</span>,<span class="" style="color:rgb(0,153,153)">2</span>,<span class="" style="color:rgb(0,153,153)">3</span><span class="" style="color:rgb(104,118,135)">)</span>,<span class="" style="color:rgb(0,0,0)">list</span><span class="" style="color:rgb(104,118,135)">(</span><span class="" style="color:rgb(0,0,0)">count.1</span><span class="" style="color:rgb(104,118,135)">=</span><span class="" style="color:rgb(0,0,0)">.N</span><span class="" style="color:rgb(104,118,135)">)</span>,<span class="" style="color:rgb(0,0,0)">by</span><span class="" style="color:rgb(104,118,135)">=</span><span class="" style="color:rgb(0,0,0)">docindex</span><span class="" style="color:rgb(104,118,135)">]</span>
<span class="" style="color:rgb(0,0,0)">countr23</span></code></pre><pre style="padding:9.5px;font-size:13px;color:rgb(51,51,51);border-top-left-radius:4px;border-top-right-radius:4px;border-bottom-right-radius:4px;border-bottom-left-radius:4px;margin-top:0px;margin-bottom:10px;line-height:20px;word-break:break-all;word-wrap:break-word;white-space:pre-wrap;border:1px solid rgba(0,0,0,0.14902)">
<code style="padding:0px;font-size:12px;color:inherit;border-top-left-radius:3px;border-top-right-radius:3px;border-bottom-right-radius:3px;border-bottom-left-radius:3px;border:0px;background-color:transparent">##     docindex count.1
##  1:        1    3684
##  2:        2    3746
##  3:        3    3717
##  4:        4    3727
##  5:        5    3700
##  6:        6    3779
##  7:        7    3702
##  8:        8    3756
##  9:        9    3702
## 10:       10    3744</code></pre><p style="margin:0px 0px 10px;color:rgb(51,51,51);font-family:'Helvetica Neue',Helvetica,Arial,sans-serif;font-size:14px;line-height:20px">However, because of the size of my data, I currently have to use the ugly filter-and-join approach.</p>
<p style="margin:0px 0px 10px;color:rgb(51,51,51);font-family:'Helvetica Neue',Helvetica,Arial,sans-serif;font-size:14px;line-height:20px">So the question is, is there an easier and more beautiful way?</p></div><div class="gmail_extra">
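<p style="margin:0px 0px 10px;color:rgb(51,51,51);font-family:'Helvetica Neue',Helvetica,Arial,sans-serif;font-size:14px;line-height:20px">One possible direction, offered as a minimal sketch and not as a tested answer for the full-size data: once the table is sorted by document and position, the word k positions ahead of a row sits k rows below it, so candidate rows can be filtered first and their neighbours checked by row index, without adding lead columns. The small dummy table and the names <code>i1</code>, <code>hit</code> and <code>counts</code> are assumptions made for illustration, and the sketch assumes that positions within each document are consecutive with no gaps.</p>
<pre class="" style="padding:9.5px;font-size:13px;color:rgb(51,51,51);border-top-left-radius:4px;border-top-right-radius:4px;border-bottom-right-radius:4px;border-bottom-left-radius:4px;margin-top:0px;margin-bottom:10px;line-height:20px;word-break:break-all;word-wrap:break-word;white-space:pre-wrap;border:1px solid rgba(0,0,0,0.14902);background-color:rgb(245,245,245)">
<code class="" style="padding-top:0px;padding-right:0px;padding-left:0px;font-size:12px;color:inherit;border-top-left-radius:3px;border-top-right-radius:3px;border-bottom-right-radius:3px;border-bottom-left-radius:3px;border:0px;background-color:transparent">library(data.table)

# Small dummy table (sizes are illustrative): 10 documents of 100 words, 3 unique words.
set.seed(1)
DT &lt;- data.table(wordindex = sample(3L, 1000L, replace = TRUE),
                 docindex  = rep(1:10, each = 100L))
DT[, position := seq_len(.N), by = docindex]

# Sort by document and position so that "k words ahead" means "k rows below".
setkey(DT, docindex, position)

# Step 1: filter candidate rows whose word is 1 and that have two rows after them.
i1 &lt;- which(DT$wordindex == 1L)
i1 &lt;- i1[i1 + 2L &lt;= nrow(DT)]

# Step 2: keep candidates followed by words 2 and 3 inside the same document.
hit &lt;- DT$wordindex[i1 + 1L] == 2L &amp;
       DT$wordindex[i1 + 2L] == 3L &amp;
       DT$docindex[i1 + 2L] == DT$docindex[i1]

# Step 3: count the surviving rows per document.
counts &lt;- DT[i1[hit], list(count.1 = .N), by = docindex]
counts</code></pre>
<p style="margin:0px 0px 10px;color:rgb(51,51,51);font-family:'Helvetica Neue',Helvetica,Arial,sans-serif;font-size:14px;line-height:20px">Steps 2 and 3 only touch the candidate rows, so no extra column is created over the whole table. Checking further ahead would mean adding another indexed comparison, not another column.</p>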
<br><br><div class="gmail_quote">On Sat, Jun 28, 2014 at 6:00 PM, Matt Dowle <span dir="ltr">&lt;<a href="mailto:mdowle@mdowle.plus.com" target="_blank">mdowle@mdowle.plus.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

  
    
  
  <div bgcolor="#FFFFFF" text="#000000">
    <div><br>
      Hi Matt,<br>
      <br>
      Great.  If you can prepare some dummy data with the appropriate
      properties and a parameter or two to scale up the size (or just
      provide an online large example to download) and a query that gets
      to the right answer but is slow or ugly,   then we've got
      something to chew on ...<span class="HOEnZb"><font color="#888888"><br>
      <br>
      Matt</font></span><div><div class="h5"><br>
      <br>
      On 28/06/14 10:55, Matthew DeAngelis wrote:<br>
    </div></div></div><div><div class="h5">
    <blockquote type="cite">
      <div dir="ltr">Hi Matt,
        <div><br>
        </div>
        <div>You have the right of it. The problem is somewhat
          complicated, however, since I would want to substitute
          "DT[word=="good"..." with "DT[J("good")..." after setting the
          key to word and reordering the rows. Hence the two-step
          process I have now where I key by document and position first,
          create the lag_word column, key by the word and lag_word
          columns and query by row.</div>
        <div><br>
        </div>
        <div><br>
        </div>
        <div>Matt</div>
      </div>
      <div class="gmail_extra"><br>
        <br>
        <div class="gmail_quote">On Fri, Jun 27, 2014 at 3:17 PM, Matt
          Dowle <span dir="ltr">&lt;<a href="mailto:mdowle@mdowle.plus.com" target="_blank">mdowle@mdowle.plus.com</a>&gt;</span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div bgcolor="#FFFFFF" text="#000000">
              <div><br>
                Hi,<br>
                <br>
                Not sure exactly what you need but looks interesting.<br>
                <br>
                Something a bit like this ?<br>
                <br>
                DT[ word == "good", .SD[ lag(word, N) != "not" ], 
                by=document]<br>
                <br>
                Your idea being you don't want to have to repeat all the
                pre and post words alongside each word but rather
                express it in the query. Makes sense.   Leads to
                classifying "not good" and "not very good" as both
                negative phrases I guess.<br>
                <br>
                Matt
                <div>
                  <div><br>
                    <br>
                    <br>
                    On 26/06/14 21:56, Matthew DeAngelis wrote:<br>
                  </div>
                </div>
              </div>
              <blockquote type="cite">
                <div>
                  <div>
                    <div dir="ltr">Hello data.table gurus,
                      <div><br>
                      </div>
                      <div>I have been using data.table to efficiently
                        work with textual data and I love it for that
                        purpose. I have transformed my data so that it
                        looks something like this:</div>
                      <div><br>
                      </div>
                      <div>
                        <table dir="ltr" style="table-layout:fixed;font-size:13px;font-family:arial,sans,sans-serif;border-collapse:collapse;border:1px solid rgb(204,204,204)" cellpadding="0" cellspacing="0" border="1">
                          <colgroup><col width="100"><col width="100"><col width="100"></colgroup><tbody>
                            <tr style="height:21px">
                              <td style="padding:2px 3px;vertical-align:bottom">word</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">document</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">position</td>
                            </tr>
                            <tr style="height:21px">
                              <td style="padding:2px 3px;vertical-align:bottom">I</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">1</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">
                                1</td>
                            </tr>
                            <tr style="height:21px">
                              <td style="padding:2px 3px;vertical-align:bottom">have</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">1</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">
                                2</td>
                            </tr>
                            <tr style="height:21px">
                              <td style="padding:2px 3px;vertical-align:bottom">transformed</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">1</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">
                                3</td>
                            </tr>
                            <tr style="height:21px">
                              <td style="padding:2px 3px;vertical-align:bottom">my</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">1</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">
                                4</td>
                            </tr>
                            <tr style="height:21px">
                              <td style="padding:2px 3px;vertical-align:bottom">data</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">1</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">
                                5</td>
                            </tr>
                            <tr style="height:21px">
                              <td style="padding:2px 3px;vertical-align:bottom">so</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">2</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">
                                1</td>
                            </tr>
                            <tr style="height:21px">
                              <td style="padding:2px 3px;vertical-align:bottom">that</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">2</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">
                                2</td>
                            </tr>
                            <tr style="height:21px">
                              <td style="padding:2px 3px;vertical-align:bottom">it</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">2</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">
                                3</td>
                            </tr>
                            <tr style="height:21px">
                              <td style="padding:2px 3px;vertical-align:bottom">looks</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">2</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">
                                4</td>
                            </tr>
                            <tr style="height:21px">
                              <td style="padding:2px 3px;vertical-align:bottom">something</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">2</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">
                                5</td>
                            </tr>
                            <tr style="height:21px">
                              <td style="padding:2px 3px;vertical-align:bottom">like</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">2</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">
                                6</td>
                            </tr>
                            <tr style="height:21px">
                              <td style="padding:2px 3px;vertical-align:bottom">this</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">2</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">
                                7</td>
                            </tr>
                          </tbody>
                        </table>
                        <br>
                      </div>
                      <div>(I actually use a unique number for each
                        word, so that I am able to use data.table's
                        excellent features to do lightning-fast word
                        counts. This has revolutionized my workflow over
                        looping through text files with Perl.)</div>
                      <div><br>
                      </div>
                      <div>My problem is that I sometimes need to search
                        for phrases or to select words based on their
                        context (for instance, I may want to exclude a
                        word if it is preceded by "not" or followed by a
                        word that changes its meaning). Currently, I am
                        using the solution <a href="http://stackoverflow.com/questions/11397771/r-data-table-grouping-for-lagged-regression" target="_blank">here</a> to create a new
                        column for a word in another position, like
                        this:</div>
                      <div><br>
                      </div>
                      <div>
                        <table dir="ltr" style="table-layout:fixed;font-size:13px;font-family:arial,sans,sans-serif;border-collapse:collapse;border:1px solid rgb(204,204,204)" cellpadding="0" cellspacing="0" border="1">
                          <colgroup><col width="100"><col width="100"><col width="100"><col width="100"></colgroup><tbody>
                            <tr style="height:21px">
                              <td style="padding:2px 3px;vertical-align:bottom">word</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">document</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">position</td>
                              <td style="padding:2px 3px;vertical-align:bottom">lead_word</td>
                            </tr>
                            <tr style="height:21px">
                              <td style="padding:2px 3px;vertical-align:bottom">I</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">1</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">1</td>
                              <td style="padding:2px 3px;vertical-align:bottom">have</td>
                            </tr>
                            <tr style="height:21px">
                              <td style="padding:2px 3px;vertical-align:bottom">have</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">1</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">2</td>
                              <td style="padding:2px 3px;vertical-align:bottom">transformed</td>
                            </tr>
                            <tr style="height:21px">
                              <td style="padding:2px 3px;vertical-align:bottom">transformed</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">1</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">3</td>
                              <td style="padding:2px 3px;vertical-align:bottom">my</td>
                            </tr>
                            <tr style="height:21px">
                              <td style="padding:2px 3px;vertical-align:bottom">my</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">1</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">
                                4</td>
                              <td style="padding:2px 3px;vertical-align:bottom">data</td>
                            </tr>
                            <tr style="height:21px">
                              <td style="padding:2px 3px;vertical-align:bottom">data</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">
                                1</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">5</td>
                              <td style="padding:2px 3px;vertical-align:bottom">NA</td>
                            </tr>
                            <tr style="height:21px">
                              <td style="padding:2px 3px;vertical-align:bottom">so</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">2</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">1</td>
                              <td style="padding:2px 3px;vertical-align:bottom">that</td>
                            </tr>
                            <tr style="height:21px">
                              <td style="padding:2px 3px;vertical-align:bottom">that</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">2</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">2</td>
                              <td style="padding:2px 3px;vertical-align:bottom">it</td>
                            </tr>
                            <tr style="height:21px">
                              <td style="padding:2px 3px;vertical-align:bottom">it</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">2</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">
                                3</td>
                              <td style="padding:2px 3px;vertical-align:bottom">looks</td>
                            </tr>
                            <tr style="height:21px">
                              <td style="padding:2px 3px;vertical-align:bottom">looks</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">
                                2</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">4</td>
                              <td style="padding:2px 3px;vertical-align:bottom">something</td>
                            </tr>
                            <tr style="height:21px">
                              <td style="padding:2px 3px;vertical-align:bottom">something</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">2</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">5</td>
                              <td style="padding:2px 3px;vertical-align:bottom">like</td>
                            </tr>
                            <tr style="height:21px">
                              <td style="padding:2px 3px;vertical-align:bottom">like</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">2</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">
                                6</td>
                              <td style="padding:2px 3px;vertical-align:bottom">this</td>
                            </tr>
                            <tr style="height:21px">
                              <td style="padding:2px 3px;vertical-align:bottom">this</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">
                                2</td>
                              <td style="padding:2px 3px;vertical-align:bottom;text-align:center">7</td>
                              <td style="padding:2px 3px;vertical-align:bottom">NA</td>
                            </tr>
                          </tbody>
                        </table>
                        <br>
                        using a command like:
                        DT[,lead_word:=DT[list(document,position+1),word]].<br>
                        <br>
                      </div>
                      <div>This approach has two problems, however.
                        First, it consumes more resources as the dataset
                        grows. I am currently working with a file
                        containing over 150 million rows, so adding a
                        column is costly. Second, I may want to check
                        both one and two words ahead, so that I have to
                        add two columns, and this can quickly get out of
                        hand.</div>
                      <div><br>
                      </div>
                      <div>Is there a better way to use data.table to
                        check the value in a row N distance from the row
                        of interest within a group and select a row
                        based on that value? Perhaps the .I variable
                        could be useful here?</div>
                      <div><br>
                      </div>
                      <div>I appreciate any suggestions.</div>
                      <div><br>
                      </div>
                      <div><br>
                      </div>
                      <div>Regards,</div>
                      <div>Matt</div>
                    </div>
                    <br>
                    <br>
                  </div>
                </div>
                <pre>_______________________________________________
datatable-help mailing list
<a href="mailto:datatable-help@lists.r-forge.r-project.org" target="_blank">datatable-help@lists.r-forge.r-project.org</a>
<a href="https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help" target="_blank">https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help</a></pre>
              </blockquote>
              <br>
            </div>
          </blockquote>
        </div>
        <br>
      </div>
    </blockquote>
    <br>
  </div></div></div>

</blockquote></div><br></div>