[datatable-help] Filtering Based on Previous Observation

Michael Smith my.r.help at gmail.com
Thu May 1 14:42:56 CEST 2014


Awesome, thanks to all of you who have replied. I learned some nice new
data.table/programming tricks!

M



On 04/30/2014 08:00 PM, Gabor Grothendieck wrote:
> On Tue, Apr 29, 2014 at 10:04 AM, Michael Smith <my.r.help at gmail.com> wrote:
>> All,
>>
>> Is there some data.table-idiomatic way to filter based on a previous
>> observation/row? For example, I want to remove a row if
>> DT$a[row]==DT$a[row-1].
>>
>> It could be done by first calculating the lag and then filtering based
>> on that, but I wonder if there's a more direct way.
>>
>> The following example works, but my feeling is there should be a more
>> elegant solution:
>>
>> ( DT <- data.table(a = c(1, 2, 2, 3), b = 8:5) )
>> DT[, L.a := c(NA, head(a, -1))][a != L.a | is.na(L.a)][, L.a := NULL][]
> 
> If the unique elements always appear consecutively then the following
> would work.
> 
> (For example, this holds if `a` is in ascending order (as in the example)
> or descending order. If DT were keyed on 'a' then this would always be
> the case.)
> 
> DT[ !duplicated(a) ]
> 
> Note that 'a' need not be numeric.
> 
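For readers of the archive: a minimal sketch of the case where equal values
of `a` do not all sit next to each other, so that !duplicated(a) and the
"differs from the previous row" filter give different results. The fifth row
of the data and the names res_lag and res_dup are made up for illustration
and are not from the thread. The sketch assumes DT has at least one row and
that `a` contains no NA.

```r
library(data.table)

# Same data as in the thread, plus one extra (hypothetical) row in which
# a = 2 reappears without being adjacent to the earlier 2s.
DT <- data.table(a = c(1, 2, 2, 3, 2), b = 8:4)

# Keep the first row, and every later row whose `a` differs from the `a`
# of the row directly before it. No helper column is added to DT.
res_lag <- DT[c(TRUE, a[-1] != head(a, -1))]

# For comparison: duplicated() drops every repeat of a value,
# including the non-adjacent one in row 5.
res_dup <- DT[!duplicated(a)]

print(res_lag)
print(res_dup)
```

Here res_lag keeps rows 1, 2, 4 and 5, while res_dup keeps rows 1, 2 and 4.
When equal values are always consecutive, as in the original example, the
two filters agree.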

