[datatable-help] merging data tables on date ranges

Nathaniel Graham npgraham1 at gmail.com
Wed Feb 5 02:36:38 CET 2014


I'm trying to figure out how to merge two data tables using the dates in
both.  One table a set of people, which has the to and from dates and their
address at each place they've lived.  The other has their work history,
also with to and from dates.  Obviously, there isn't a one-to-one
relationship; individuals may have several jobs while staying in the same
place, several homes over the course of a job, and any sort of overlapping
you can imagine.  Both tables are reasonably large; the residences has
about 950k rows, and the employment has about 1.2M rows.

To give you a bit of flavor, the first ten rows of each:
> work.history[1:10, list(icrdn, fromdate, todate, state, postalcode)]
    icrdn fromdate   todate state postalcode
 1:   145 Apr 1988 Jan 1990    FL      33432
 2:   145 Jan 1990 Jan 1997    FL      33432
 3:   145 Jan 1997 Dec 2011    FL      33444
 4:   145 Jan 1997 Dec 2011    FL      33444
 5:   145 Jan 1997 Dec 2011    FL      33444
 6:   170 Oct 1983 Apr 2002    NE      68114
 7:   170 Sep 1972 Dec 2011    IL      60443
 8:   170 Sep 1972 Dec 2011    IL 61821-3066
 9:   183 Aug 2000 Dec 2011    GA      30305
10:   183 Aug 2000 Dec 2011    GA      30305
> residences[1:10]
    icrdn fromdate  todate state postalcode
 1:   145  10/1992 03/2004    FL      33432
 2:   145  03/2004            FL      33487
 3:   170  09/1995            IL      61821
 4:   183  05/1993 08/2000    GA      30342
 5:   183  08/2000 09/2001    GA      30342
 6:   183  09/2001 08/2004    GA      30305
 7:   183  08/2004            GA      30073
 8:   183  02/2005            GA      30342
 9:   183  06/2006            GA      30075
10:   183  07/1974 05/1993    GA      30338

The 'icrdn' column is an identifier unique to each person.

What I'm looking for is a data table with a row for each residence-job
pair.  Any residence that doesn't have a job in the sample can be safely
dropped, and vice-versa.

Thanks in advance for any help anyone can offer.
-------
Nathaniel Graham
npgraham1 at gmail.com
npgraham1 at uky.edu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/datatable-help/attachments/20140204/5b74f166/attachment.html>


More information about the datatable-help mailing list