[datatable-help] Sistematically construction of 2 variables in data table

Michael Smith my.r.help at gmail.com
Sat Oct 25 03:12:14 CEST 2014


Would help if you add dsum to your example data for reproducibility.

On 10/25/2014 01:09 AM, Frank S. wrote:
>  Dear all,
> I'm writing to you because I'm not able to construct in a long data 2
> new variables
> based on other columns for each id. Small example with only one subject:
> 
> dt <- data.table(
>   id = rep(1,7),
>   bday = rep(as.Date("1960-10-29"),7),
>   start = rep(as.Date("2005-02-27"),7),
>   marker0 = rep(125,7),
>   datep =
> as.Date(c('2005-04-20','2005-10-28','2005-12-31','2006-08-10','2006-12-31','2007-02-19','2007-05-15')),
>   marker = c(10,2,0,5,3,7,1)
>  )
> 
>  
> 
> I would want to construct sistematically a new data table from three
> conditions (I have the first):
> 
>  
> 
> 1) It only keeps rows whose datep variable is 31st december or is the
> last date within id (I can get it)
> newdt <- unique(rbind(dt[which(month(datep)==12 &
> as.POSIXlt(datep)$mday==31)],
>  dt[, .SD[.N],
> by='id'])[,list(id,init=0,marker0,sum=0,dsum=datep)])[order(id,dsum)]
> 
>    id init marker0   sum             dsum
> 1:  1    0     125          0     2005-12-31
> 2:  1    0     125          0     2006-12-31
> 3:  1    0     125          0     2007-05-15   
> 
>  
> 
> 2) It has a so-called init variable, whose value is defined in 1st row
> as difference (yr) between
>    start and bday, in 2nd row as difference between dsum of 1st row and
> bday and finally the 3rd
>    row as difference between dsum of 2nd row and bday:
>    id      
> bday                                                               
>  ini                         marker0    sum       dsum
> 1:  1 1960-10-29                                difftime(start,
> bday)/365.25        125            0  2005-12-31    
> 2:  1 1960-10-29   difftime(as.Date("2005-12-31"), bday)/365.25    
> 125            0  2006-12-31    
> 3:  1 1960-10-29   difftime(as.Date("2006-12-31"), bday)/365.25    
> 125            0  2007-05-15    
> 
>  
> 
> 3) It has also a sum variable, whose value is defined as follows:
>    3a) For first row within each id: The corresponding marker0 value.
>    3b) For each of the following rows within id: Previous sum value plus
>   the sum of marker's values across the previous year.
> 
>    id     init  marker0          sum                        dsum
> 1:  1   44.3      125             125                    2005-12-31    
> 2:  1   45.2      125    125+10+2=137        2006-12-31    
> 3:  1   46.2      125     137+5+3=145         2007-05-15
> 
>  
> 
> Thanks to all R-data table help community for your continuous help!
> 
> 
> 
> _______________________________________________
> datatable-help mailing list
> datatable-help at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
> 


More information about the datatable-help mailing list