[Rcpp-devel] trying to insert a number as first element of already existing vector
Serguei Sokol
serguei.sokol at gmail.com
Wed Dec 12 11:07:55 CET 2018
On 12/12/2018 at 02:25, Mark Leeds wrote:
>
> Just to close this thread out, I did a more comprehensive benchmark
> using 8 different approaches
> and it looks like
>
> A) Jan's solution using memcpy and NumericVector.
>
> B) A push_front solution using NumericVector
>
> C) Serguei's const trick solution using NumericVector
>
> are the top 3 solutions in terms of speed with B) push_front
> technically the winner !!!!!
Hm. I wonder how much the jury was paid under the table by the
promoters of push_front()? ;)
More seriously, in your results push_front() is mybar7, which comes
second after mybar3 in every timing column (min, lq, etc.):
> Unit: milliseconds
>                       expr      min       lq     mean   median       uq      max neval
>  mybar3(testvec, testelem) 27.93793 31.94207 36.76242 37.17255 41.52102 47.31534   100
>  mybar7(testvec, testelem) 30.80926 33.41609 38.45877 37.71916 43.70371 48.88513   100
To be honest, on my machine mybar3 is far behind the other champions
(72 ms vs 60 ms), and there is no clear leader among the latter (mybar4
to mybar8), as the lead changes from one run to the next.
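
A minimal plain C++ sketch of why the std::copy and memcpy variants (mybar3 and mybar4 in the quoted code) can be expected to land so close together. It assumes std::vector<double> in place of NumericVector so that it compiles without R, and the function names are hypothetical, not part of the benchmark. For a contiguous array of doubles both calls copy the same bytes, and standard library implementations commonly lower such a std::copy to a memmove-style call, so any remaining difference is likely noise:

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <vector>

// Prepend via std::copy (same idea as mybar3, but on std::vector).
std::vector<double> prepend_copy(const std::vector<double>& x, double first) {
    std::vector<double> out(x.size() + 1);
    out[0] = first;
    std::copy(x.begin(), x.end(), out.begin() + 1);
    return out;
}

// Prepend via std::memcpy (same idea as mybar4, but on std::vector).
std::vector<double> prepend_memcpy(const std::vector<double>& x, double first) {
    std::vector<double> out(x.size() + 1);
    out[0] = first;
    if (!x.empty())  // skip the call for empty input, where data() may be null
        std::memcpy(out.data() + 1, x.data(), x.size() * sizeof(double));
    return out;
}

// Hypothetical self-check: both variants must produce identical vectors.
void check_equivalence() {
    const std::vector<double> x = {1.0, 2.0, 3.0};
    assert(prepend_copy(x, 7.0) == prepend_memcpy(x, 7.0));
}
```

Calling check_equivalence() only confirms that the two variants agree on their result; it says nothing about timing, which still has to be measured as in the microbenchmark runs quoted in this thread.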
Best,
Serguei.
> Thanks to everyone for their help. I learned much.
>
> #======================================================================
>
> #include <Rcpp.h>
> #include <algorithm>  // std::copy
> #include <cstring>    // std::memcpy
> using namespace Rcpp;
>
> // Explicit loop on std::vector (converts to and from the R vector).
> // [[Rcpp::export]]
> std::vector<double> mybar(const std::vector<double>& x, double firstelem) {
>   std::vector<double> tmp(x.size() + 1);
>   tmp[0] = firstelem;
>   for (std::size_t i = 1; i < x.size() + 1; i++)
>     tmp[i] = x[i-1];
>   return tmp;
> }
>
> // std::copy on std::vector.
> // [[Rcpp::export]]
> std::vector<double> mybar2(const std::vector<double>& x, double firstelem) {
>   std::vector<double> tmp(x.size() + 1);
>   tmp[0] = firstelem;
>   std::copy(x.begin(), x.end(), tmp.begin()+1);
>   return tmp;
> }
>
> // std::copy on NumericVector.
> // [[Rcpp::export]]
> NumericVector mybar3(NumericVector x, double firstelem) {
>   NumericVector tmp(x.size() + 1);
>   tmp[0] = firstelem;
>   std::copy(x.begin(), x.end(), tmp.begin()+1);
>   return tmp;
> }
>
> // memcpy on NumericVector.
> // [[Rcpp::export]]
> NumericVector mybar4(NumericVector x, double firstelem) {
>   NumericVector result(x.size() + 1);
>   result[0] = firstelem;
>   std::memcpy(result.begin()+1, x.begin(), x.size()*sizeof(double));
>   return result;
> }
>
> // General concatenation: x first, then y. Note that
> // mybar5(testvec, testelem) therefore appends testelem.
> // [[Rcpp::export]]
> NumericVector mybar5(NumericVector x, NumericVector y) {
>   NumericVector result(x.size() + y.size());
>   std::memcpy(result.begin(), x.begin(), x.size()*sizeof(double));
>   std::memcpy(result.begin()+x.size(), y.begin(),
>               y.size()*sizeof(double));
>   return result;
> }
>
> // insert() at position 0.
> // [[Rcpp::export]]
> NumericVector mybar6(NumericVector x, double firstelem) {
>   x.insert(0, firstelem);
>   return x;
> }
>
> // push_front().
> // [[Rcpp::export]]
> NumericVector mybar7(NumericVector x, double firstelem) {
>   x.push_front(firstelem);
>   return x;
> }
>
> // Same as mybar5, with the "const &" trick.
> // [[Rcpp::export]]
> NumericVector mybar8(const NumericVector &x, const NumericVector &y) {
>   NumericVector result(x.size() + y.size());
>   std::memcpy(result.begin(), x.begin(), x.size()*sizeof(double));
>   std::memcpy(result.begin()+x.size(), y.begin(),
>               y.size()*sizeof(double));
>   return result;
> }
>
>
> /*** R
>
> library(microbenchmark)
>
> n=1E7
> testvec = c(1,seq_len(n))
> testelem <- 7
> microbenchmark(c(testelem, testvec), mybar(testvec,testelem),
> mybar2(testvec,testelem),
> mybar3(testvec,testelem),
> mybar4(testvec,testelem),
> mybar5(testvec,testelem),
> mybar6(testvec,testelem),
> mybar7(testvec,testelem),
> mybar8(testvec,testelem)
> )
>
>
> */
>
> microbenchmark(c(testelem, testvec), mybar(testvec,testelem),
> + mybar2(testvec,testelem),
> + mybar3(testvec,testelem),
> + mybar4(testvec,testelem) .... [TRUNCATED]
> Unit: milliseconds
>                       expr      min        lq      mean    median        uq       max neval
>       c(testelem, testvec) 33.82390  37.41429  42.70048  42.48487  47.72840  81.53239   100
>   mybar(testvec, testelem) 93.35373 100.67106 105.30134 105.67559 109.62234 125.15337   100
>  mybar2(testvec, testelem) 88.00770  94.62231  98.84161  98.51031 102.49516 114.58349   100
>  mybar3(testvec, testelem) 27.93793  31.94207  36.76242  37.17255  41.52102  47.31534   100
>  mybar4(testvec, testelem) 31.37486  34.73718  39.72786  40.83917  44.21151  49.48883   100
>  mybar5(testvec, testelem) 30.90608  35.25496  40.24085  40.59592  44.88581  50.33709   100
>  mybar6(testvec, testelem) 33.24435  38.32075  43.11721  43.46578  47.93726  52.72538   100
>  mybar7(testvec, testelem) 30.80926  33.41609  38.45877  37.71916  43.70371  48.88513   100
>  mybar8(testvec, testelem) 30.88067  35.01826  40.38411  40.02501  44.49641  73.84147   100
> >
>
>
>
>
> On Mon, Dec 10, 2018 at 8:42 AM Serguei Sokol <serguei.sokol at gmail.com> wrote:
>
> On 10/12/2018 at 13:04, Jan van der Laan wrote:
> > Small addendum: A large part of the performance gain in my example comes
> > from using NumericVector instead of std::vector<double>, which avoids a
> > conversion. An example using std::copy with NumericVector runs in the
> > same time as the version using memcpy.
>
> Yep.
> A few more percent of mean CPU time can be saved by using the "const &"
> trick:
>
> // [[Rcpp::export]]
> NumericVector mybar5(const NumericVector &x, const NumericVector &y) {
> NumericVector result(x.size() + y.size());
> std::memcpy(result.begin(), x.begin(), x.size()*sizeof(double));
> std::memcpy(result.begin()+x.size(), y.begin(),
> y.size()*sizeof(double));
> return result;
> }
>
> # output
> Unit: microseconds
>                       expr     min       lq     mean   median       uq      max
>       c(testelem, testvec) 258.343 338.3110 418.0047 343.4450 378.7850 3077.347
>   mybar(testvec, testelem) 352.699 366.8770 498.3948 374.6635 450.4420 3046.408
>  mybar2(testvec, testelem) 334.820 348.3685 425.0098 354.7240 366.5270 3024.128
>  mybar3(testvec, testelem) 233.689 244.8640 315.7256 247.5180 255.0955 2945.068
>  mybar4(testvec, testelem) 232.083 241.9655 340.0751 245.0035 252.8260 2934.312
>  mybar5(testvec, testelem) 150.787 242.7685 285.4264 245.9465 254.1880 2049.493
>
> Serguei.
>
> >
> > Jan
> >
> >
> >
> > On 10-12-18 12:28, Jan van der Laan wrote:
> >>
> >> For performance, memcpy is probably fastest. This gives the same
> >> performance as c().
> >>
> >> // [[Rcpp::export]]
> >> NumericVector mybar3(NumericVector x, double firstelem) {
> >>   NumericVector result(x.size() + 1);
> >>   result[0] = firstelem;
> >>   std::memcpy(result.begin()+1, x.begin(), x.size()*sizeof(double));
> >>   return result;
> >> }
> >>
> >>
> >> Or a more general version concatenating vectors of arbitrary lengths:
> >>
> >>
> >> // [[Rcpp::export]]
> >> NumericVector mybar4(NumericVector x, NumericVector y) {
> >>   NumericVector result(x.size() + y.size());
> >>   std::memcpy(result.begin(), x.begin(), x.size()*sizeof(double));
> >>   std::memcpy(result.begin()+x.size(), y.begin(),
> >>               y.size()*sizeof(double));
> >>   return result;
> >> }
> >>
> >>
> >>
> >> > n=1E7
> >> > testvec = c(1,seq_len(n))
> >> > testelem <- 7
> >> > microbenchmark(c(testelem, testvec), mybar(testvec,testelem),
> >> + mybar2(testvec,testelem),
> >> + mybar3(testvec,testelem),
> >> + mybar4(testvec,testelem)
> >> + )
> >> Unit: milliseconds
> >>                       expr       min        lq      mean    median        uq       max neval
> >>       c(testelem, testvec)  36.48577  36.93754  41.10550  43.76742  44.20709  46.09741   100
> >>   mybar(testvec, testelem) 102.54042 103.21756 106.88749 104.32033 110.31527 119.55512   100
> >>  mybar2(testvec, testelem)  95.64696  96.19447 100.24691 102.61380 103.58189 109.28290   100
> >>  mybar3(testvec, testelem)  36.45794  36.87915  40.43486  37.18063  43.49643  95.49049   100
> >>  mybar4(testvec, testelem)  36.51334  37.05409  41.39680  43.20627  43.57958  94.95482   100
> >>
> >>
> >> Best,
> >> Jan
> >>
> >>
> >>
> >> On 10-12-18 12:10, Serguei Sokol wrote:
> >>> On 09/12/2018 at 09:35, Mark Leeds wrote:
> >>>> Hi All: I wrote below and it works but I have a strong feeling
> >>>> there's a better way to do it.
> >>> If performance is an issue, you can save a few percent of CPU time by
> >>> using std::copy() instead of an explicit for loop. Yet, for this
> >>> operation, R's c() remains the best bet. It is more than twice as fast
> >>> as both Rcpp versions below:
> >>>
> >>> #include <Rcpp.h>
> >>> using namespace Rcpp;
> >>>
> >>> // [[Rcpp::export]]
> >>> std::vector<double> mybar(const std::vector<double>& x, double
> >>> firstelem) {
> >>> std::vector<double> tmp(x.size() + 1);
> >>> tmp[0] = firstelem;
> >>> for (int i = 1; i < (x.size()+1); i++)
> >>> tmp[i] = x[i-1];
> >>> return tmp;
> >>> }
> >>> // [[Rcpp::export]]
> >>> std::vector<double> mybar2(const std::vector<double>& x, double
> >>> firstelem) {
> >>> std::vector<double> tmp(x.size() + 1);
> >>> tmp[0] = firstelem;
> >>> std::copy(x.begin(), x.end(), tmp.begin()+1);
> >>> return tmp;
> >>> }
> >>>
> >>> /*** R
> >>> library(microbenchmark)
> >>> n=100000
> >>> testvec = c(1,seq_len(n))
> >>> testelem <- 7
> >>> microbenchmark(c(testelem, testvec), mybar(testvec,testelem),
> >>> mybar2(testvec,testelem))
> >>> */
> >>>
> >>> # Output
> >>> Unit: microseconds
> >>>                       expr     min       lq      mean   median        uq       max neval
> >>>       c(testelem, testvec) 247.098 248.5655  444.8657 257.3300  630.7725  7587.977   100
> >>>   mybar(testvec, testelem) 594.978 622.3560 1226.5683 637.0230 1386.8385 22149.605   100
> >>>  mybar2(testvec, testelem) 576.191 604.7565 1029.2124 616.1055 1351.6740 11651.831   100
> >>>
> >>>
> >>> Best,
> >>> Serguei.
> >>>
> >>>> I looked on the net and found some material from back in ~2014 about
> >>>> concatenating vectors, but I didn't see anything final about it.
> >>>> Thanks for any insights.
> >>>>
> >>>> Also, the documentation for Rcpp is beyond incredible (thanks to
> >>>> Dirk, Romain, Kevin and all the other people I'm leaving out), but
> >>>> is there a general methodology for finding equivalents of R
> >>>> functions? For example, if I want a cumsum function in Rcpp, how do
> >>>> I know whether to use the STL with accumulate or whether there's
> >>>> already one built in, so that I can just call cumsum?
> >>>>
> >>>> Thanks.
> >>>>
> >>>> #=======================================================
> >>>>
> >>>> #include <Rcpp.h>
> >>>> using namespace Rcpp;
> >>>>
> >>>> // [[Rcpp::export]]
> >>>> std::vector<double> mybar(const std::vector<double>& x, double
> >>>> firstelem) {
> >>>> std::vector<double> tmp(x.size() + 1);
> >>>> tmp[0] = firstelem;
> >>>> for (int i = 1; i < (x.size()+1); i++)
> >>>> tmp[i] = x[i-1];
> >>>> return tmp;
> >>>> }
> >>>>
> >>>> /*** R
> >>>>
> >>>> testvec = c(1,2,3)
> >>>> testelem <- 7
> >>>> mybar(testvec,testelem)
> >>>>
> >>>> */
> >>>>
> >>>> #===============================
> >>>> # OUTPUT FROM RUNNING ABOVE
> >>>> #=================================
> >>>> > testvec <- c(1,2,3)
> >>>> > testelem <- 7
> >>>> > mybar(testvec,testelem)
> >>>> [1] 7 1 2 3
> >>>> >
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> Rcpp-devel mailing list
> >>>> Rcpp-devel at lists.r-forge.r-project.org
> >>>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
> >>>>