[datatable-help] subset data table in i with multiple criteria for multiple variables

Bacou, Melanie mel at mbacou.com
Thu Dec 10 03:42:26 CET 2015


Carl,
Are you just looking for the following syntax?

dt[a < 35 & b > 35, lapply(.SD, median, na.rm = TRUE), by = d]
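
For comparison with the attempt quoted below, here is a minimal base R sketch of why `&` is needed rather than `c()`. It uses the `a` and `b` vectors from the quoted code: `c()` joins the two logical vectors end to end, while `&` combines them element by element, giving one TRUE/FALSE per row.

```r
a <- seq(2L, 40L, by = 4L)
b <- seq(15L, 105L, by = 10L)

# c() concatenates: 10 values for a < 35 followed by 10 values for b > 35
length(c(a < 35, b > 35))

# & works element-wise: one logical value per row of the table
length(a < 35 & b > 35)

# Row numbers that satisfy both conditions at once
which(a < 35 & b > 35)
```

A logical vector used in `i` should have one element per row, which is what the `&` form provides.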

You can include as many conditions as necessary in `i`. You can also 
chain data.tables:

dt[a < 35][b > 35][, lapply(.SD, median, na.rm = TRUE), by = d]
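
A minimal sketch, assuming the `dt` built in the quoted code, to check that the combined and chained forms keep the same rows. It also illustrates a likely cause of the type error quoted below: base R's `median()` returns an integer for an odd number of integer values but a double for an even number, so groups of different sizes can come back with different column types. The `as.numeric()` wrapper is an assumption added here for illustration, not something from the original post.

```r
library(data.table)

dt <- data.table(a = seq(2L, 40L, by = 4L),
                 b = seq(15L, 105L, by = 10L),
                 c = 1:10,
                 d = rep(c(100L, 150L), 5L),
                 e = 101:110)

# Both forms select the same rows
combined <- dt[a < 35 & b > 35]
chained  <- dt[a < 35][b > 35]
all.equal(combined, chained)

# median() of integers: the type depends on how many values there are
typeof(median(c(1L, 2L, 3L)))      # odd count, the middle value is returned
typeof(median(c(1L, 2L, 3L, 4L)))  # even count, the mean of the two middle values

# Converting to double first gives every group the same column type
res <- dt[a < 35 & b > 35,
          lapply(.SD, function(x) median(as.numeric(x), na.rm = TRUE)),
          by = d]
res
```

Chaining is convenient for readability, but each `[...]` creates an intermediate table, so the single combined condition is usually the simpler choice.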

--Mel.

On 12/9/2015 1:02 PM, carlsutton wrote:
> Is there a way to subset a data table using "i" with multiple criteria using
> multiple variables(columns)?  I have some test code shown below on what I
> have tried.  And yes, I have read the documentation, taken the DataCamp
> class (multiple viewings, I'm a slow learner) and have not seen anything
> relevant.  Also checked for questions on this topic in this forum and did
> not find an answer for my query.
> Attempting to upload R file
> dataTableExamples.R
> <http://r.789695.n4.nabble.com/file/n4715347/dataTableExamples.R>
> Probably should have stayed in bed today the way things are going.
>
> A cut and paste from RStudio
> #  Data Table exercises
> require(data.table)
> a <- seq(2L,40L, by = 4L)
> b <- seq(15L,105L,by = 10L)
> c <- 1:10L
> d <- rep(c(100L,150L),5L)
> e <- 101:110L
> dt <- data.table(a,b,c,d,e)
> dt
> dta <- subset(dt, a < 35)
> dtb <- subset(dta, b > 35)
> dtb
> dtb[, lapply(.SD,median), by = d]
> #  Now attempt to subset the rows in i
> vec <- c(a<35, b>35)
> dtvec <- dt[vec, lapply(.SD, median, na.rm = TRUE), by = d]
> dtvec
>
> And console output
>> #  Now attempt to subset the rows in i
>> vec <- c(a<35, b>35)
>> dtvec <- dt[vec, lapply(.SD, median, na.rm = TRUE), by = d]
> Error in `[.data.table`(dt, vec, lapply(.SD, median, na.rm = TRUE), by = d)
> :
>    Column 1 of result for group 2 is type 'double' but expecting type
> 'integer'. Column types must be consistent for each group.
>> dtvec
>       a  b  c   d   e
>   1:  2 15  1 100 101
>   2:  6 25  2 150 102
>   3: 10 35  3 100 103
>   4: 14 45  4 150 104
>   5: 18 55  5 100 105
>   6: 22 65  6 150 106
>   7: 26 75  7 100 107
>   8: 30 85  8 150 108
>   9: 34 95  9 100 109
> 10: NA NA NA  NA  NA
> 11: NA NA NA  NA  NA
> 12: NA NA NA  NA  NA
> 13: NA NA NA  NA  NA
> 14: NA NA NA  NA  NA
> 15: NA NA NA  NA  NA
> 16: NA NA NA  NA  NA
>
> The error message has me confused.
>   /Column 1 of result for group 2 is type/
> What group 2?  I have only grouped on column "d".  Result 1 is ????  No idea
> what "result 1" is referring to, is it the subset in "i", the median of col
> a??  No clue.
>
> I have only created integer variables for the data table, so why the
> rejection " Column 1 of result for group 2 is type 'double' but expecting
> type 'integer'. Column types must be consistent for each group."  What
> double?  I have not created any double numbers.
>
> Carl Sutton
