[datatable-help] Joins in data.table

Short, Tom TShort at epri.com
Fri Jul 16 13:37:46 CEST 2010


Harish, I'm not sure on the answer to your first question.

I'm not sure what VLOOKUP does, but based on your description, is this
what you want?

> x <- data.table( a=c("a","b","c"), b=1:3, key="a" )
> y <- data.table( a=c("b","d","e"), d=c(8,9,10), key="a" ) # key added
> 
> x$d <- y[x, d]     # or  y[x]$d
> x
     a b  d
[1,] a 1 NA
[2,] b 2  8
[3,] c 3 NA


- Tom


 

> -----Original Message-----
> From: datatable-help-bounces at lists.r-forge.r-project.org 
> [mailto:datatable-help-bounces at lists.r-forge.r-project.org] 
> On Behalf Of Harish
> Sent: Friday, July 16, 2010 03:04
> To: datatable-help at lists.r-forge.r-project.org
> Subject: [datatable-help] Joins in data.table
> 
> Question 1 -- Why get a full row of NAs with x[y]?  Bug?
> Question 2 -- What is an easy way to do the equivalent of 
> VLOOKUP in Excel?
> 
> ====================================
> 
> Question 1 -- Why get a full row of NAs with x[y]?  Bug?
> 
> I was experimenting with x[y] syntax and I don't understand 
> the use case for getting NA with nomatch=NA.
> 
> x <- data.table( a=c("a","b","c"), b=1:3, key="a" ) y <- 
> data.table( a=c("b","d","e"), d=c(8,9,10) )
> 
> x[y] returns:
> 
>         a  b
> [1,]    b  2
> [2,] <NA> NA
> [3,] <NA> NA
> 
> 
> What is the point of getting an entire row of NA?  I don't 
> even know what the NA row corresponds to in data.table y.  If 
> I got the values in column "a", then it would make more sense.
> 
> I would expect:
> 
>      a  b
> [1,] b  2
> [2,] d NA
> [3,] e NA
> 
> Am I misunderstanding this join?
> 
> (I do realize that nomatch=0 will prevent the NA rows.)
> 
> Also, is there still a plan to allowing access to y's columns 
> within the "j" expression as indicated in FAQ 1.10?
>    x[y, d]
> 
> (I did ask this before and I believe Matthew said that it is 
> in the works.  I just want to make sure I didn't misunderstand.)
> 
> Otherwise, the whole x[y] join is only a subset.  (I do 
> understand that I can use merge() to do a full merge.)
> 
> ====================================
> 
> Question 2 -- What is an easy way to do the equivalent of 
> VLOOKUP in Excel?
> 
> Now, suppose both my data.tables (x and y) had a lot of 
> columns in it.  I want to add a new column "d" to x.  The 
> value "d" will be from the data.table y such that x$a == y$a.
> 
> What is the easiest way for me to do it?  Do I have to 
> merge() both the data.tables and remove the unwanted columns 
> of y from the result?
> 
> If the feature in the FAQ 1.10 were implemented, could I could do
>    x[y, list(.SD,d)]
> or something to that effect?
> 
> 
> Regards,
> Harish
> 
> 
> 
>       
> _______________________________________________
> datatable-help mailing list
> datatable-help at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/d
atatable-help
> 


More information about the datatable-help mailing list