[datatable-help] subset data table in i with multiple criteria for multiple variables

carlsutton suttoncarl at ymail.com
Thu Dec 10 05:14:30 CET 2015


Gosh, that is simple, elegant and wonderful.  I feel kinda sorta silly for not thinking of it.  Thank you for enlightening me.

My Dad once said that if there was a hard way to do a simple task, he was confident I would find it.  That malady has stuck with me oh so many years.  But in 10 years as an aerospace engineer and 35 as a CPA, there was nothing simple, and an unknown unknown could be devastating.  BTW, I worked as a programmer (Fortran in the 60's) to pay for college, and did some programming at Lockheed for the flutter group.

Arun covered chaining in the DataCamp course and I use it frequently.  On my personal project I am mired down in data exploration and investigating variable distributions, means, medians, et al.  Some behave as expected, others have me scratching my head and muttering.  Somewhere, somehow, it all will make sense, but the big picture is eluding me.

Carl Sutton CPA
 


    On Wednesday, December 9, 2015 6:43 PM, mbacou [via R] <ml-node+s789695n4715360h54 at n4.nabble.com> wrote:
 
 

  Carl,
Are you just looking for the following syntax?

dt[a < 35 & b > 35, lapply(.SD, median, na.rm = TRUE), by = d]

You can include as many conditions as necessary in `i`. You can also 
chain data.tables:

dt[a < 35][b > 35][, lapply(.SD, median, na.rm = TRUE), by = d]

--Mel.
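
For anyone who wants to try the two forms side by side, here is a minimal sketch using the same `dt` as in the original post quoted below. It assumes the data.table package is installed; `res_and` and `res_chain` are names picked here purely for illustration.

```r
library(data.table)

# Same example table as in the original post
dt <- data.table(
  a = seq(2L, 40L, by = 4L),
  b = seq(15L, 105L, by = 10L),
  c = 1:10,
  d = rep(c(100L, 150L), 5L),
  e = 101:110
)

# Both conditions combined with & inside i
res_and <- dt[a < 35 & b > 35, lapply(.SD, median, na.rm = TRUE), by = d]

# Chained: each [] filters the result of the previous one
res_chain <- dt[a < 35][b > 35][, lapply(.SD, median, na.rm = TRUE), by = d]

# Both forms keep the same rows, so the grouped medians should match
stopifnot(isTRUE(all.equal(res_and, res_chain)))
print(res_and)
```

With this particular table both conditions hold for rows 4 through 9, so each value of `d` ends up with three rows.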

On 12/9/2015 1:02 PM, carlsutton wrote:
> Is there a way to subset a data table using "i" with multiple criteria on
> multiple variables (columns)?  I have some test code shown below on what I
> have tried.  And yes, I have read the documentation and am taking the
> DataCamp class (multiple viewings, I'm a slow learner) and have not seen
> anything relevant.  I also checked for questions on this topic in this forum
> and did not find an answer to my query.
> Attempting to upload R file
> dataTableExamples.R
> <http://r.789695.n4.nabble.com/file/n4715347/dataTableExamples.R>
> Probably should have stayed in bed today the way things are going.
>
> A cut and paste from RStudio
> #  Data Table exercises
> require(data.table)
> a <- seq(2L,40L, by = 4L)
> b <- seq(15L,105L,by = 10L)
> c <- 1:10L
> d <- rep(c(100L,150L),5L)
> e <- 101:110L
> dt <- data.table(a,b,c,d,e)
> dt
> dta <- subset(dt, a < 35)
> dtb <- subset(dta, b > 35)
> dtb
> dtb[, lapply(.SD,median), by = d]
> #  Now attempt to subset the rows in i
> vec <- c(a<35, b>35)
> dtvec <- dt[vec, lapply(.SD, median, na.rm = TRUE), by = d]
> dtvec
>
> And console output
>> #  Now attempt to subset the rows in i
>> vec <- c(a<35, b>35)
>> dtvec <- dt[vec, lapply(.SD, median, na.rm = TRUE), by = d]
> Error in `[.data.table`(dt, vec, lapply(.SD, median, na.rm = TRUE), by = d)
> :
>    Column 1 of result for group 2 is type 'double' but expecting type
> 'integer'. Column types must be consistent for each group.
>> dtvec
>       a  b  c   d   e
>   1:  2 15  1 100 101
>   2:  6 25  2 150 102
>   3: 10 35  3 100 103
>   4: 14 45  4 150 104
>   5: 18 55  5 100 105
>   6: 22 65  6 150 106
>   7: 26 75  7 100 107
>   8: 30 85  8 150 108
>   9: 34 95  9 100 109
> 10: NA NA NA  NA  NA
> 11: NA NA NA  NA  NA
> 12: NA NA NA  NA  NA
> 13: NA NA NA  NA  NA
> 14: NA NA NA  NA  NA
> 15: NA NA NA  NA  NA
> 16: NA NA NA  NA  NA
>
> The error message has me confused.
>   /Column 1 of result for group 2 is type/
> What group 2?  I have only grouped on column "d".  Result 1 is ????  No idea
> what "result 1" is referring to, is it the subset in "i", the median of col
> a??  No clue.
>
> I have only created integer variable for the data table, so why the
> rejection " Column 1 of result for group 2 is type 'double' but expecting
> type 'integer'. Column types must be consistent for each group."  What
> double?  I have not created any double numbers.
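
A likely reading of the error, sketched in base R (no data.table needed): `c(a<35, b>35)` concatenates the two tests into one logical vector of length 20 rather than combining them row by row, so `i` selects rows 1 to 9 plus out-of-range positions that come back as NA rows. "Column 1 of result" then appears to be the first column produced by `j` (the median of `a`), and "group 2" the second value of `d`. The double comes from `median()` itself, which averages the two middle values when a group has an even number of rows. The variable names `med_odd` and `med_even` below are illustrative only.

```r
a <- seq(2L, 40L, by = 4L)
b <- seq(15L, 105L, by = 10L)

# c() glues the two tests end to end: 20 values for a 10-row table
vec <- c(a < 35, b > 35)
length(vec)

# Among rows 1 to 9, d == 100 has five rows and d == 150 has four.
# An odd count returns the middle value and keeps the integer type;
# an even count returns the mean of the two middle values, a double.
med_odd  <- median(c(2L, 10L, 18L, 26L, 34L))  # column a where d == 100
med_even <- median(c(6L, 14L, 22L, 30L))       # column a where d == 150
typeof(med_odd)   # "integer"
typeof(med_even)  # "double"

# Coercing the result gives the same type regardless of group size
typeof(as.double(med_odd))  # "double"
```

Combining the tests with `&` inside `i` avoids the stray NA rows. If groups of mixed odd and even size still trigger the type complaint, wrapping the median in `as.double()` inside `j` is one possible workaround.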
>
> Carl Sutton
>
>
>
> -----
> Carl Sutton
> --
> View this message in context: http://r.789695.n4.nabble.com/subset-data-table-in-i-with-multiple-criteria-for-multiple-variables-tp4715347.html
> Sent from the datatable-help mailing list archive at Nabble.com.
> _______________________________________________
> datatable-help mailing list
> [hidden email]
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
-----
Carl Sutton
--
View this message in context: http://r.789695.n4.nabble.com/subset-data-table-in-i-with-multiple-criteria-for-multiple-variables-tp4715347p4715361.html
Sent from the datatable-help mailing list archive at Nabble.com.