[Rcpp-devel] regular expression in Rcpp

Mark Leeds markleeds2 at gmail.com
Tue Jan 13 07:27:22 CET 2015


Hi All: I was trying to do something with regular expressions in Rcpp, so I
piggybacked heavily off Dirk's boost.regex example in the Rcpp Gallery, where
he takes streams of digits and checks them for machine and human readability.

My problem is actually pretty different and simpler. Essentially, if a
character string ends in "rhofixed" or "norhofixed", then that part of the
character string should be removed. The R code below illustrates what I'm
trying to do.

But when I write the Rcpp code to do the same thing and set
Sys.setenv("PKG_LIBS"="-lboost_regex"), I don't get the same result. I
don't know whether that is because the regex engine in boost behaves
differently or because I am doing something else wrong. Thanks for any help.

#=====================================================================
# R CODE TO ILLUSTRATE WHAT I WANT
#=====================================================================

s <- c("lngimbintrhofixed","lngimbnointnorhofixed","test")
result <- sub("(no)?rhofixed$","",s)
print(result)
# [1] "lngimbint"   "lngimbnoint" "test"

#=====================================================================
# Rcpp CODE ATTEMPT HEAVILY BASED OFF DIRK'S EXAMPLE
#=====================================================================

#include <Rcpp.h>
#include <string>
#include <boost/regex.hpp>

using namespace Rcpp;
using namespace std;

bool validate_modelstring(const std::string& s) {
     static const boost::regex e("^(.*)((no)?rhofixed)$");
     return boost::regex_match(s, e);
}

const std::string model_format("\\1");
const boost::regex e("\\A^(.*)((no)?rhofixed)$\\z");

std::string filtered_rhopart(const std::string& s) {
     return boost::regex_replace(s, e, model_format,
                                 boost::match_default | boost::format_sed);
}

// [[Rcpp::export]]
Rcpp::DataFrame regexTest(std::vector<std::string> s) {

     int n = s.size();
     std::vector<bool> valid(n);
     std::vector<std::string> outmodel(n);

     for (int i=0; i<n; i++) {
          valid[i] = validate_modelstring(s[i]);
          if (valid[i]) {
               outmodel[i] = filtered_rhopart(s[i]);
          } else {
               outmodel[i] = s[i];
          }
     }

     return Rcpp::DataFrame::create(Rcpp::Named("input") = s,
                                    Rcpp::Named("valid") = valid,
                                    Rcpp::Named("output") = outmodel);
}

/*** R
s <- c("lngimbintrhofixed","lngimbnointnorhofixed","test")
regexTest(s)
*/
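
One plausible cause of the mismatch, offered as a sketch and not a confirmed
diagnosis: in "^(.*)((no)?rhofixed)$" the leading (.*) is greedy and (no)? is
optional, so for "lngimbnointnorhofixed" the first group can absorb the "no"
and the first capture becomes "lngimbnointno". The R call has no leading
group, so sub() removes the whole suffix. The example below uses std::regex
from the C++ standard library instead of Boost so that it compiles without
extra linker flags; Boost's default Perl-style syntax is expected to treat
these two patterns the same way, but that is an assumption. The function
names strip_rho_suffix and strip_rho_greedy are hypothetical and do not come
from the code above.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the original post): removes a trailing
// "rhofixed" or "norhofixed", mirroring sub("(no)?rhofixed$", "", s) in R.
// Uses std::regex (ECMAScript syntax) rather than boost::regex.
std::string strip_rho_suffix(const std::string& s) {
    static const std::regex suffix("(no)?rhofixed$");
    // Strings without the suffix are returned unchanged.
    return std::regex_replace(s, suffix, "");
}

// Hypothetical helper that keeps the shape of the posted pattern:
// a greedy leading group followed by an optional "no".
// The greedy (.*) takes the "no" before (no)? gets a chance to match it.
std::string strip_rho_greedy(const std::string& s) {
    static const std::regex whole("^(.*)((no)?rhofixed)$");
    // "$1" is the ECMAScript spelling of the first capture group.
    return std::regex_replace(s, whole, "$1");
}
```

With these definitions, strip_rho_suffix("lngimbnointnorhofixed") gives
"lngimbnoint", while strip_rho_greedy("lngimbnointnorhofixed") gives
"lngimbnointno". Both leave "test" unchanged. Replacing only the suffix also
removes the need for a separate validation pass before the replacement.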