[Recordlinkage-commits] compare.dedup

E. Lin exl022 at gmail.com
Tue Dec 11 21:03:37 CET 2012


Hi

I'm new to record linkage and am trying to deduplicate a field of company
names, using compare.dedup() on the company name field alone. I have about
20,000 dirty records. When I use the entire set, I run into memory limits
(can't allocate a vector of size ...). Are there good approaches for doing
this in pieces, or other methods people have used successfully with this
many records?

Thanks!
E