[Rcpp-devel] Rcpp "version" of R's match function
willem.ligtenberg at openanalytics.eu
Fri Nov 16 09:11:33 CET 2012
I agree that we might want to have these things in a package, say RcppBase.
Then Rcpp can remain the R to cpp layer. And anybody who wants to implement
an R base function in cpp can contribute it to RcppBase.
On Fri, Nov 16, 2012 at 8:56 AM, Romain Francois
<romain at r-enthusiasts.com>wrote:
> We need to find the right compromise between bloating Rcpp (which is
> already quite huge:
> wc src/* inst/include/**/** inst/include/* 2> /dev/null | tail -n1
> 66180 784183 6425152 total
> ) and supporting generic enough things.
> I can see things like union and setdiff being generic enough (we already
> have unique btw).
> Then for other things, being in another package is not that bad.
> And after all, this is what Rcpp really is about: giving others the tools.
> On 15/11/12 20:07, Søren Højsgaard wrote:
>> Dear list,
>> I am not sure if Hadley's remark below was an invitation to make a
>> "wishlist", but I'll take the risk:
>> 1) I have made several packages related to graphical models for
>> multivariate data. Much of the code in these packages deals with "book keeping":
>> operations on sets of subsets of a finite set of variables, so in these
>> packages there is much use of union(), setdiff(), etc., and these functions
>> all rely heavily on match(). The same applies to unique(), which is also based
>> on match(). It would be very nice to have these in c++ form. Hence, with a
>> c++ version of match() these should be low-hanging apples.
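To make the point in item 1 concrete, here is a minimal sketch of a hash-based match() and a setdiff() built on top of it. It is written in plain standard C++ rather than against the Rcpp API, and the names match_sketch, setdiff_sketch and NO_MATCH are hypothetical, chosen only for illustration. It assumes that 0 stands in for "no match", as in R's match(x, table, nomatch = 0L).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical sentinel for "no match", mirroring match(x, table, nomatch = 0L).
const int NO_MATCH = 0;

// For each element of x, return the 1-based position of its first
// occurrence in table, or NO_MATCH if it does not occur.
template <typename T>
std::vector<int> match_sketch(const std::vector<T>& x,
                              const std::vector<T>& table) {
    std::unordered_map<T, int> first_pos;
    first_pos.reserve(table.size());
    for (std::size_t i = 0; i < table.size(); ++i) {
        // emplace never overwrites an existing key, so the first occurrence wins.
        first_pos.emplace(table[i], static_cast<int>(i) + 1);
    }
    std::vector<int> out;
    out.reserve(x.size());
    for (const T& v : x) {
        auto it = first_pos.find(v);
        out.push_back(it == first_pos.end() ? NO_MATCH : it->second);
    }
    return out;
}

// Unique elements of x that do not occur in y, in order of first appearance.
template <typename T>
std::vector<T> setdiff_sketch(const std::vector<T>& x,
                              const std::vector<T>& y) {
    const std::vector<int> in_y = match_sketch(x, y);
    const std::vector<int> first_in_x = match_sketch(x, x);
    assert(in_y.size() == x.size());
    std::vector<T> out;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const bool absent_from_y = (in_y[i] == NO_MATCH);
        const bool first_occurrence =
            (first_in_x[i] == static_cast<int>(i) + 1);
        if (absent_from_y && first_occurrence) out.push_back(x[i]);
    }
    return out;
}
```

A union could presumably be assembled the same way, by concatenating both inputs and keeping only first occurrences.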
>> 2) Also of relevance to the graphical model packages is a c++ version of
>> aperm() for permuting an array.
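As a rough illustration of item 2, the following sketch permutes the dimensions of a dense array stored in column-major order, which is how R lays out arrays. The Array struct and aperm_sketch are hypothetical names, the permutation is 0-based (unlike R's 1-based perm argument), and dimnames are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container: dense array in column-major order (first index fastest).
struct Array {
    std::vector<double> data;
    std::vector<std::size_t> dims;
};

// Dimension k of the result is dimension perm[k] of the input (perm is 0-based).
Array aperm_sketch(const Array& a, const std::vector<std::size_t>& perm) {
    const std::size_t n = a.dims.size();
    assert(perm.size() == n);

    // Strides of the input array: stride[k] is the step for index k.
    std::vector<std::size_t> stride(n, 1);
    for (std::size_t k = 1; k < n; ++k) {
        stride[k] = stride[k - 1] * a.dims[k - 1];
    }

    Array out;
    out.dims.resize(n);
    for (std::size_t k = 0; k < n; ++k) out.dims[k] = a.dims[perm[k]];
    out.data.resize(a.data.size());

    // Walk the output in storage order, tracking its multi-index.
    std::vector<std::size_t> idx(n, 0);
    for (std::size_t lin = 0; lin < out.data.size(); ++lin) {
        std::size_t src = 0;
        for (std::size_t k = 0; k < n; ++k) src += idx[k] * stride[perm[k]];
        out.data[lin] = a.data[src];
        // Advance the multi-index like an odometer, first index fastest.
        for (std::size_t k = 0; k < n; ++k) {
            if (++idx[k] < out.dims[k]) break;
            idx[k] = 0;
        }
    }
    return out;
}
```

For a 2x3 matrix, passing the permutation {1, 0} yields its 3x2 transpose.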
>> 3) There are operations on such arrays which I imagine could be
>> conveniently made in the Rcpp framework. Consider a 2x2x2 contingency table
>> with dimnames a,b,c. Call this table n(a,b,c). The all-two-factor
>> log-linear model will have generators (a,b)(a,c)(c,b). Iterative
>> proportional fitting works as follows: Let m(a,b,c) denote the array of
>> fitted values (at the current iteration). Then the update for the (c,b)
>> generator is
>> m(a,b,c) <- m(a,b,c) * n(c,b)/m(c,b)
>> To do this one must have
>> marginalization: n(a,b,c) -> n(b,c)
>> permutation: n(b,c) -> n(c,b)
>> division: n(c,b)/m(c,b)
>> multiplication: m(a,b,c) * ( n(c,b)/m(c,b) )
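The four steps above can be sketched for a three-way table as follows. This is only an illustration in plain C++, not the gRbase or loglin implementation: the function names are hypothetical, the tables are stored column-major with a as the fastest index, and the margin is indexed directly as (b,c), so the explicit permutation to (c,b) is not needed here. Cells whose fitted margin is zero are left unchanged, which is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum a column-major A x B x C table over a, giving the (b,c) margin
// stored column-major as a B x C table.
std::vector<double> margin_bc(const std::vector<double>& t,
                              std::size_t A, std::size_t B, std::size_t C) {
    assert(t.size() == A * B * C);
    std::vector<double> out(B * C, 0.0);
    for (std::size_t c = 0; c < C; ++c)
        for (std::size_t b = 0; b < B; ++b)
            for (std::size_t a = 0; a < A; ++a)
                out[b + B * c] += t[a + A * (b + B * c)];
    return out;
}

// One IPF update for the (b,c) generator: m(a,b,c) <- m(a,b,c) * n(b,c) / m(b,c).
void ipf_update_bc(std::vector<double>& m, const std::vector<double>& n,
                   std::size_t A, std::size_t B, std::size_t C) {
    assert(m.size() == n.size());
    const std::vector<double> n_bc = margin_bc(n, A, B, C);  // marginalization
    const std::vector<double> m_bc = margin_bc(m, A, B, C);
    for (std::size_t c = 0; c < C; ++c) {
        for (std::size_t b = 0; b < B; ++b) {
            const double denom = m_bc[b + B * c];
            if (denom == 0.0) continue;  // assumption: leave these cells as they are
            const double ratio = n_bc[b + B * c] / denom;    // division
            for (std::size_t a = 0; a < A; ++a)
                m[a + A * (b + B * c)] *= ratio;             // multiplication
        }
    }
}
```

After one call, the (b,c) margin of m agrees with that of n wherever the fitted margin was nonzero; a full fit would cycle such updates over all generators until convergence.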
>> I am aware that iterative proportional fitting is already implemented in
>> loglin, but there are other kinds of (graphical) models where similar
>> updates are needed. In connection with message passing in Bayesian
>> networks, one operation often needed is
>> m(a,b,c) <- n(a,b) * n(c,b)
>> which will result in an array with dimensions (a,b,c). All of this stuff
>> is implemented in the gRbase package as R functions, and it would be very
>> convenient to have these operations as c++ functions. In the gRbase
>> implementation it is required that the arrays have dimnames, and I guess
>> the same must hold in c++.
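The product m(a,b,c) <- n(a,b) * n(c,b) mentioned above might look roughly like this for fixed dimension order. The sketch is plain C++ with a hypothetical function name, column-major storage, and positional indices in place of dimnames; a general version would presumably match variables by name, as the gRbase functions do.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Multiply an A x B table n_ab (index a + A*b) by a C x B table n_cb
// (index c + C*b), matching on the shared variable b. The result is an
// A x B x C table with index a + A*(b + B*c).
std::vector<double> table_product_sketch(const std::vector<double>& n_ab,
                                         const std::vector<double>& n_cb,
                                         std::size_t A, std::size_t B,
                                         std::size_t C) {
    assert(n_ab.size() == A * B);
    assert(n_cb.size() == C * B);
    std::vector<double> out(A * B * C);
    for (std::size_t c = 0; c < C; ++c)
        for (std::size_t b = 0; b < B; ++b)
            for (std::size_t a = 0; a < A; ++a)
                out[a + A * (b + B * c)] = n_ab[a + A * b] * n_cb[c + C * b];
    return out;
}
```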
>> I am perfectly aware that I should program these facilities in c++ using
>> Rcpp, but I just can't resist mentioning these wishes, in case they are
>> "almost there" in c++.
>> Best regards
>> Hmmm - see http://cran.r-project.org/web/packages/fastmatch/index.html
>> PS. Would you be interested in a set of R functions that, from a quick
>> skim of the R sources, I think could be much much faster if implemented
>> in Rcpp?
>> RStudio / Rice University
> Romain Francois
> Professional R Enthusiast
> +33(0) 6 28 91 30 30
> R Graph Gallery: http://gallery.r-enthusiasts.com
> `- http://bit.ly/SweN1Z : SuperStorm Sandy
> blog: http://romainfrancois.blog.free.fr
> |- http://bit.ly/RE6sYH : OOP with Rcpp modules
> `- http://bit.ly/Thw7IK : Rcpp modules more flexible
> Rcpp-devel mailing list
> Rcpp-devel at lists.r-forge.r-project.org