[datatable-help] merging data tables on date ranges
Nathaniel Graham
npgraham1 at gmail.com
Wed Feb 5 02:36:38 CET 2014
I'm trying to figure out how to merge two data tables using the dates in
both. One table a set of people, which has the to and from dates and their
address at each place they've lived. The other has their work history,
also with to and from dates. Obviously, there isn't a one-to-one
relationship; individuals may have several jobs while staying in the same
place, several homes over the course of a job, and any sort of overlapping
you can imagine. Both tables are reasonably large; the residences has
about 950k rows, and the employment has about 1.2M rows.
To give you a bit of flavor, the first ten rows of each:
> work.history[1:10, list(icrdn, fromdate, todate, state, postalcode)]
icrdn fromdate todate state postalcode
1: 145 Apr 1988 Jan 1990 FL 33432
2: 145 Jan 1990 Jan 1997 FL 33432
3: 145 Jan 1997 Dec 2011 FL 33444
4: 145 Jan 1997 Dec 2011 FL 33444
5: 145 Jan 1997 Dec 2011 FL 33444
6: 170 Oct 1983 Apr 2002 NE 68114
7: 170 Sep 1972 Dec 2011 IL 60443
8: 170 Sep 1972 Dec 2011 IL 61821-3066
9: 183 Aug 2000 Dec 2011 GA 30305
10: 183 Aug 2000 Dec 2011 GA 30305
> residences[1:10]
icrdn fromdate todate state postalcode
1: 145 10/1992 03/2004 FL 33432
2: 145 03/2004 FL 33487
3: 170 09/1995 IL 61821
4: 183 05/1993 08/2000 GA 30342
5: 183 08/2000 09/2001 GA 30342
6: 183 09/2001 08/2004 GA 30305
7: 183 08/2004 GA 30073
8: 183 02/2005 GA 30342
9: 183 06/2006 GA 30075
10: 183 07/1974 05/1993 GA 30338
The 'icrdn' column is an identifier unique to each person.
What I'm looking for is a data table with a row for each residence-job
pair. Any residence that doesn't have a job in the sample can be safely
dropped, and vice-versa.
Thanks in advance for any help anyone can offer.
-------
Nathaniel Graham
npgraham1 at gmail.com
npgraham1 at uky.edu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/datatable-help/attachments/20140204/5b74f166/attachment.html>
More information about the datatable-help
mailing list