[datatable-help] Joins in data.table
Harish
harishv_99 at yahoo.com
Fri Jul 16 19:56:00 CEST 2010
Thanks Tom. That answers my question #2.
It looks rather obvious once I see the solution. :)
Harish
--- On Fri, 7/16/10, Short, Tom <TShort at epri.com> wrote:
> From: Short, Tom <TShort at epri.com>
> Subject: RE: [datatable-help] Joins in data.table
> To: "Harish" <harishv_99 at yahoo.com>
> Cc: datatable-help at lists.r-forge.r-project.org
> Date: Friday, July 16, 2010, 4:37 AM
> Harish, I'm not sure on the answer to
> your first question.
>
> I'm not sure what VLOOKUP does, but based on your
> description, is this
> what you want?
>
> > x <- data.table( a=c("a","b","c"), b=1:3, key="a"
> )
> > y <- data.table( a=c("b","d","e"), d=c(8,9,10),
> key="a" ) # key added
> >
> > x$d <- y[x, d] # or
> y[x]$d
> > x
> a b d
> [1,] a 1 NA
> [2,] b 2 8
> [3,] c 3 NA
>
>
> - Tom
>
>
>
>
> > -----Original Message-----
> > From: datatable-help-bounces at lists.r-forge.r-project.org
>
> > [mailto:datatable-help-bounces at lists.r-forge.r-project.org]
>
> > On Behalf Of Harish
> > Sent: Friday, July 16, 2010 03:04
> > To: datatable-help at lists.r-forge.r-project.org
> > Subject: [datatable-help] Joins in data.table
> >
> > Question 1 -- Why get a full row of NAs with
> x[y]? Bug?
> > Question 2 -- What is an easy way to do the equivalent
> of
> > VLOOKUP in Excel?
> >
> > ====================================
> >
> > Question 1 -- Why get a full row of NAs with
> x[y]? Bug?
> >
> > I was experimenting with x[y] syntax and I don't
> understand
> > the use case for getting NA with nomatch=NA.
> >
> > x <- data.table( a=c("a","b","c"), b=1:3, key="a" )
> y <-
> > data.table( a=c("b","d","e"), d=c(8,9,10) )
> >
> > x[y] returns:
> >
> > a b
> > [1,] b 2
> > [2,] <NA> NA
> > [3,] <NA> NA
> >
> >
> > What is the point of getting an entire row of
> NA? I don't
> > even know what the NA row corresponds to in data.table
> y. If
> > I got the values in column "a", then it would make
> more sense.
> >
> > I would expect:
> >
> > a b
> > [1,] b 2
> > [2,] d NA
> > [3,] e NA
> >
> > Am I misunderstanding this join?
> >
> > (I do realize that nomatch=0 will prevent the NA
> rows.)
> >
> > Also, is there still a plan to allowing access to y's
> columns
> > within the "j" expression as indicated in FAQ 1.10?
> > x[y, d]
> >
> > (I did ask this before and I believe Matthew said that
> it is
> > in the works. I just want to make sure I didn't
> misunderstand.)
> >
> > Otherwise, the whole x[y] join is only a subset.
> (I do
> > understand that I can use merge() to do a full
> merge.)
> >
> > ====================================
> >
> > Question 2 -- What is an easy way to do the equivalent
> of
> > VLOOKUP in Excel?
> >
> > Now, suppose both my data.tables (x and y) had a lot
> of
> > columns in it. I want to add a new column "d" to
> x. The
> > value "d" will be from the data.table y such that x$a
> == y$a.
> >
> > What is the easiest way for me to do it? Do I
> have to
> > merge() both the data.tables and remove the unwanted
> columns
> > of y from the result?
> >
> > If the feature in the FAQ 1.10 were implemented, could
> I could do
> > x[y, list(.SD,d)]
> > or something to that effect?
> >
> >
> > Regards,
> > Harish
> >
> >
> >
> >
> > _______________________________________________
> > datatable-help mailing list
> > datatable-help at lists.r-forge.r-project.org
> > https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/d
> atatable-help
> >
>
More information about the datatable-help
mailing list