[Rcpp-devel] Regular Expressions
Dirk Eddelbuettel
edd at debian.org
Sat Mar 2 02:56:41 CET 2013
Gabor,
Here is a quick variant of one of the Boost regexp examples, particularly
http://www.boost.org/doc/libs/1_53_0/libs/regex/example/snippets/credit_card_example.cpp
// cf www.boost.org/doc/libs/1_53_0/libs/regex/example/snippets/credit_card_example.cpp

#include <Rcpp.h>
#include <string>
#include <boost/regex.hpp>

// TRUE only for four groups of four digits, each separated by '-' or ' '
bool validate_card_format(const std::string& s) {
    static const boost::regex e("(\\d{4}[- ]){3}\\d{4}");
    return boost::regex_match(s, e);
}

// looser pattern used for reformatting: separators are optional and the
// first group may have three or four digits
const boost::regex e("\\A(\\d{3,4})[- ]?(\\d{4})[- ]?(\\d{4})[- ]?(\\d{4})\\z");
const std::string machine_format("\\1\\2\\3\\4");
const std::string human_format("\\1-\\2-\\3-\\4");

// digits only, no separators
std::string machine_readable_card_number(const std::string& s) {
    return boost::regex_replace(s, e, machine_format, boost::match_default | boost::format_sed);
}

// digit groups joined by '-'
std::string human_readable_card_number(const std::string& s) {
    return boost::regex_replace(s, e, human_format, boost::match_default | boost::format_sed);
}

// [[Rcpp::export]]
Rcpp::DataFrame regexDemo(std::vector<std::string> s) {
    int n = s.size();
    std::vector<bool> valid(n);
    std::vector<std::string> machine(n);
    std::vector<std::string> human(n);

    for (int i=0; i<n; i++) {
        valid[i]   = validate_card_format(s[i]);
        machine[i] = machine_readable_card_number(s[i]);
        human[i]   = human_readable_card_number(s[i]);
    }

    return Rcpp::DataFrame::create(Rcpp::Named("input")   = s,
                                   Rcpp::Named("valid")   = valid,
                                   Rcpp::Named("machine") = machine,
                                   Rcpp::Named("human")   = human);
}
which we can test with the same input the Boost example uses:
R> Rcpp::sourceCpp('/tmp/boostreex.cpp')
R> s <- c("0000111122223333", "0000 1111 2222 3333", "0000-1111-2222-3333", "000-1111-2222-3333")
R> regexDemo(s)
                input valid          machine               human
1    0000111122223333 FALSE 0000111122223333 0000-1111-2222-3333
2 0000 1111 2222 3333  TRUE 0000111122223333 0000-1111-2222-3333
3 0000-1111-2222-3333  TRUE 0000111122223333 0000-1111-2222-3333
4  000-1111-2222-3333 FALSE  000111122223333  000-1111-2222-3333
R>
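On Linux, you generally don't have to do anything to get the Boost headers
once the distribution's Boost development package is installed, as they end
up in /usr/include (or /usr/local/include), so for me, this just builds.
Boost.Regex in this release is a compiled library rather than header-only,
so if the linker complains about undefined boost::regex symbols you also
need to link against it, e.g. with PKG_LIBS set to -lboost_regex. For R on
Windows, you are quite likely to get by with the CRAN-provided boost tarball
and an additional -I$(BOOSTLIB) etc.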
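As an aside, if getting Boost onto a given machine is a hurdle, roughly the
same logic can be sketched with std::regex from C++11. This is a swapped-in
technique, not what the code above uses. The *_std function names and card_re
are made up for illustration, and it assumes a compiler with a working
<regex> (for g++ that means 4.9 or later). The default ECMAScript grammar has
no \A and \z, so ^ and $ stand in, and the replacement strings use $1 rather
than the sed-style \1.

```cpp
#include <regex>
#include <string>

// Hypothetical std::regex counterpart of validate_card_format():
// the whole string must be four groups of four digits split by '-' or ' '.
bool validate_card_format_std(const std::string& s) {
    static const std::regex e("(\\d{4}[- ]){3}\\d{4}");
    return std::regex_match(s, e);
}

// Looser pattern for reformatting; ^ and $ replace Boost's \A and \z.
static const std::regex card_re(
    "^(\\d{3,4})[- ]?(\\d{4})[- ]?(\\d{4})[- ]?(\\d{4})$");

// Digits only. Input that does not match card_re is returned unchanged.
std::string machine_readable_card_number_std(const std::string& s) {
    return std::regex_replace(s, card_re, std::string("$1$2$3$4"));
}

// Digit groups joined by '-'. Non-matching input is returned unchanged.
std::string human_readable_card_number_std(const std::string& s) {
    return std::regex_replace(s, card_re, std::string("$1-$2-$3-$4"));
}
```

These could be dropped into an exported function the same way regexDemo()
wraps the Boost versions; with sourceCpp that would presumably also need
C++11 enabled, e.g. via the cpp11 plugin.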
Hth, Dirk
--
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com