[datatable-help] Question on data.table::chmatch and fastmatch::fmatch
stat quant
statquant at outlook.com
Tue Jan 29 18:46:50 CET 2013
Hello all,
I have a lot of character columns in my data.table (usually with only a
few distinct values, e.g. 5e6 values drawn from {"A","B","C","D"}).
Looking at pages 7-8 of the package vignette, M. Dowle mentions that:
- the package fastmatch is a faster alternative for string lookups:
fastmatch::fmatch builds a hash map, which speeds things up
considerably
- ...but points out that the first pass is less efficient (compared to
data.table::chmatch)
- and finishes by saying that he suggested to Simon Urbanek (the fastmatch
package maintainer) that fmatch adopt chmatch for the first call.
I have a few questions regarding data.table/fastmatch:
- if I use something like DT[ fmatch(X,"A"),...], should I expect
lightning-quick subsequent selects? I mean, would DT[
fmatch(X,c("B","D")),...] be much quicker (the select part of it)?
- Are M.D. or Simon Urbanek planning to use each other's code to enhance
both packages?
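
To make the first question concrete, here is a minimal sketch of the kind
of select I have in mind. The names DT, X, V and keep are made up for
illustration, and the table is much smaller than my real data. Two things
I am assuming from the package documentation rather than from testing:
fmatch and chmatch both return match()-style positions (so a row filter
needs !is.na() or an %in%-style operator), and fmatch attaches its hash to
the `table` argument, so the lookup table is kept in a named object.

```r
library(data.table)
library(fastmatch)

set.seed(42)
n  <- 1e5  # small for a quick run; my real columns hold about 5e6 values
DT <- data.table(X = sample(c("A", "B", "C", "D"), n, replace = TRUE),
                 V = seq_len(n))

# Keep the lookup table in a named object so that any hash fmatch
# attaches to it can be reused by later calls (assumption, see above).
keep <- c("B", "D")

# Both calls return, for each element of X, its position in `keep`
# (1 or 2 here) or NA when there is no match.
pos_f <- fmatch(DT$X, keep)
pos_c <- chmatch(DT$X, keep)

# Selecting rows needs a logical index, hence !is.na() around fmatch;
# %chin% is data.table's character-only counterpart of %in%.
sel_f <- DT[!is.na(fmatch(X, keep))]
sel_c <- DT[X %chin% keep]
```

Wrapping the two selects in system.time() on the full-size table is how I
would compare them, but I have not included any timings here.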
Thanks for reading
Regards