[datatable-help] Joins in data.table

Harish harishv_99 at yahoo.com
Fri Jul 16 19:56:00 CEST 2010


Thanks Tom.  That answers my question #2.

It looks rather obvious once I see the solution.  :)


Harish


--- On Fri, 7/16/10, Short, Tom <TShort at epri.com> wrote:

> From: Short, Tom <TShort at epri.com>
> Subject: RE: [datatable-help] Joins in data.table
> To: "Harish" <harishv_99 at yahoo.com>
> Cc: datatable-help at lists.r-forge.r-project.org
> Date: Friday, July 16, 2010, 4:37 AM
> Harish, I'm not sure on the answer to
> your first question.
> 
> I'm not sure what VLOOKUP does, but based on your
> description, is this
> what you want?
> 
> > x <- data.table( a=c("a","b","c"), b=1:3, key="a"
> )
> > y <- data.table( a=c("b","d","e"), d=c(8,9,10),
> key="a" ) # key added
> > 
> > x$d <- y[x, d]     # or 
> y[x]$d
> > x
>      a b  d
> [1,] a 1 NA
> [2,] b 2  8
> [3,] c 3 NA
> 
> 
> - Tom
> 
> 
>  
> 
> > -----Original Message-----
> > From: datatable-help-bounces at lists.r-forge.r-project.org
> 
> > [mailto:datatable-help-bounces at lists.r-forge.r-project.org]
> 
> > On Behalf Of Harish
> > Sent: Friday, July 16, 2010 03:04
> > To: datatable-help at lists.r-forge.r-project.org
> > Subject: [datatable-help] Joins in data.table
> > 
> > Question 1 -- Why get a full row of NAs with
> x[y]?  Bug?
> > Question 2 -- What is an easy way to do the equivalent
> of 
> > VLOOKUP in Excel?
> > 
> > ====================================
> > 
> > Question 1 -- Why get a full row of NAs with
> x[y]?  Bug?
> > 
> > I was experimenting with x[y] syntax and I don't
> understand 
> > the use case for getting NA with nomatch=NA.
> > 
> > x <- data.table( a=c("a","b","c"), b=1:3, key="a" )
> y <- 
> > data.table( a=c("b","d","e"), d=c(8,9,10) )
> > 
> > x[y] returns:
> > 
> >         a  b
> > [1,]    b  2
> > [2,] <NA> NA
> > [3,] <NA> NA
> > 
> > 
> > What is the point of getting an entire row of
> NA?  I don't 
> > even know what the NA row corresponds to in data.table
> y.  If 
> > I got the values in column "a", then it would make
> more sense.
> > 
> > I would expect:
> > 
> >      a  b
> > [1,] b  2
> > [2,] d NA
> > [3,] e NA
> > 
> > Am I misunderstanding this join?
> > 
> > (I do realize that nomatch=0 will prevent the NA
> rows.)
> > 
> > Also, is there still a plan to allowing access to y's
> columns 
> > within the "j" expression as indicated in FAQ 1.10?
> >    x[y, d]
> > 
> > (I did ask this before and I believe Matthew said that
> it is 
> > in the works.  I just want to make sure I didn't
> misunderstand.)
> > 
> > Otherwise, the whole x[y] join is only a subset. 
> (I do 
> > understand that I can use merge() to do a full
> merge.)
> > 
> > ====================================
> > 
> > Question 2 -- What is an easy way to do the equivalent
> of 
> > VLOOKUP in Excel?
> > 
> > Now, suppose both my data.tables (x and y) had a lot
> of 
> > columns in it.  I want to add a new column "d" to
> x.  The 
> > value "d" will be from the data.table y such that x$a
> == y$a.
> > 
> > What is the easiest way for me to do it?  Do I
> have to 
> > merge() both the data.tables and remove the unwanted
> columns 
> > of y from the result?
> > 
> > If the feature in the FAQ 1.10 were implemented, could
> I could do
> >    x[y, list(.SD,d)]
> > or something to that effect?
> > 
> > 
> > Regards,
> > Harish
> > 
> > 
> > 
> >       
> > _______________________________________________
> > datatable-help mailing list
> > datatable-help at lists.r-forge.r-project.org
> > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/d
> atatable-help
> > 
> 


      


More information about the datatable-help mailing list