[Rcpp-devel] Rcpp "version" of R's match function

Willem Ligtenberg willem.ligtenberg at openanalytics.eu
Fri Nov 16 09:11:33 CET 2012


I agree that we might want to have these things in a package, say RcppBase.
Then Rcpp can remain the R-to-C++ layer, and anybody who wants to implement
an R base function in C++ can contribute it to RcppBase.

Willem


On Fri, Nov 16, 2012 at 8:56 AM, Romain Francois
<romain at r-enthusiasts.com> wrote:

> We need to find the right compromise between bloating Rcpp and
> supporting generic enough things. Rcpp is already quite huge:
>
> wc src/* inst/include/**/** inst/include/* 2> /dev/null | tail -n1
>    66180  784183 6425152 total
>
> I can see things like union and setdiff being generic enough (we already
> have unique btw).
>
>
> Then for other things, being in another package is not that bad.
> And after all, this is what Rcpp really is about: give others the tools.
>
> Romain
>
> On 15/11/12 20:07, Søren Højsgaard wrote:
>
>> Dear list
>>
>> I am not sure if Hadley's remark below was an invitation to make a
>> "wishlist", but I'll take the risk:
>>
>> 1) I have made several packages related to graphical models for
>> multivariate data. Much of the code in these packages deals with
>> "book keeping": operations on sets of subsets of a finite set of
>> variables, so in these packages there is much use of union(),
>> setdiff(), etc., and these functions
>> all heavily use match(). The same applies to unique() which is also based
>> on match(). It would be very nice to have these in c++ form. Hence, with a
>> c++ version of match() these should be low-hanging apples.
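
A minimal sketch of point 1, in plain standard C++ rather than Rcpp so that it stands alone. The names match_pos, unique_values, set_diff and set_union are hypothetical, and 0 is used in place of NA; this is not how R or Rcpp implement these functions, only an illustration of how unique(), setdiff() and union() can be layered on a hash-based match().

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// 1-based position of each element of x in table (first occurrence wins,
// as in R's match()); 0 stands in for NA when there is no match.
std::vector<int> match_pos(const std::vector<std::string>& x,
                           const std::vector<std::string>& table) {
    std::unordered_map<std::string, int> first_pos;
    for (std::size_t i = 0; i < table.size(); ++i) {
        // emplace() does not overwrite, so the first occurrence is kept
        first_pos.emplace(table[i], static_cast<int>(i) + 1);
    }
    std::vector<int> out;
    out.reserve(x.size());
    for (const std::string& v : x) {
        auto it = first_pos.find(v);
        out.push_back(it == first_pos.end() ? 0 : it->second);
    }
    return out;
}

// unique(x): keep an element only where it matches its own position
std::vector<std::string> unique_values(const std::vector<std::string>& x) {
    const std::vector<int> pos = match_pos(x, x);
    assert(pos.size() == x.size());
    std::vector<std::string> out;
    for (std::size_t i = 0; i < x.size(); ++i) {
        if (pos[i] == static_cast<int>(i) + 1) out.push_back(x[i]);
    }
    return out;
}

// setdiff(x, y): unique elements of x that have no match in y
std::vector<std::string> set_diff(const std::vector<std::string>& x,
                                  const std::vector<std::string>& y) {
    const std::vector<std::string> ux = unique_values(x);
    const std::vector<int> pos = match_pos(ux, y);
    std::vector<std::string> out;
    for (std::size_t i = 0; i < ux.size(); ++i) {
        if (pos[i] == 0) out.push_back(ux[i]);
    }
    return out;
}

// union(x, y): unique(c(x, y))
std::vector<std::string> set_union(std::vector<std::string> x,
                                   const std::vector<std::string>& y) {
    x.insert(x.end(), y.begin(), y.end());
    return unique_values(x);
}
```

Building the hash table is linear in the length of table, so each of the set operations is linear in the total input length on average.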
>>
>> 2) Also of relevance to the graphical model packages is a c++ version of
>> aperm() for permuting an array.
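
For point 2, a rough sketch of what an aperm()-like routine could look like in standard C++. It assumes the array is stored column-major (first index fastest, as in R) and that perm is 0-based, so output dimension k is input dimension perm[k]. The name permute_array is hypothetical, and dimnames are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Permute the dimensions of a column-major array.
// The result has dimensions dim[perm[0]], dim[perm[1]], ...
std::vector<double> permute_array(const std::vector<double>& a,
                                  const std::vector<std::size_t>& dim,
                                  const std::vector<std::size_t>& perm) {
    const std::size_t nd = dim.size();
    assert(perm.size() == nd);

    // Strides of the input array in column-major order
    std::vector<std::size_t> stride(nd, 1);
    for (std::size_t k = 1; k < nd; ++k) stride[k] = stride[k - 1] * dim[k - 1];

    // Dimensions of the output and a running multi-index into it
    std::vector<std::size_t> new_dim(nd), idx(nd, 0);
    std::size_t total = 1;
    for (std::size_t k = 0; k < nd; ++k) {
        assert(perm[k] < nd);
        new_dim[k] = dim[perm[k]];
        total *= new_dim[k];
    }
    assert(a.size() == total);

    std::vector<double> out(total);
    for (std::size_t o = 0; o < total; ++o) {
        // Output index k walks along input dimension perm[k]
        std::size_t in = 0;
        for (std::size_t k = 0; k < nd; ++k) in += idx[k] * stride[perm[k]];
        out[o] = a[in];

        // Advance the multi-index like an odometer, first index fastest
        for (std::size_t k = 0; k < nd; ++k) {
            if (++idx[k] < new_dim[k]) break;
            idx[k] = 0;
        }
    }
    return out;
}
```

With dim = {2, 3} and perm = {1, 0} this is an ordinary matrix transpose.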
>>
>> 3) There are operations on such arrays which I imagine could be
>> conveniently implemented in the Rcpp framework. Consider a 2x2x2
>> contingency table with dimnames a,b,c. Call this table n(a,b,c). The
>> all-two-factor log-linear model will have generators (a,b)(a,c)(c,b).
>> Iterative proportional fitting works as follows: Let m(a,b,c) denote the
>> array of fitted values (at the current iteration). Then the update for
>> the (c,b) generator is
>>
>>   m(a,b,c) <- m(a,b,c) * n(c,b)/m(c,b)
>>
>> To do this one must have
>>   marginalization: n(a,b,c) -> n(b,c)
>>   permutation: n(b,c) -> n(c,b)
>>   division: n(c,b)/m(c,b)
>>   multiplication: m(a,b,c) * ( n(c,b)/m(c,b) )
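
The four steps above can be sketched for a three-way table in standard C++. This assumes an A x B x C table stored column-major, so cell (a,b,c) sits at a + A * (b + B * c); the names margin_bc and ipf_update_bc are hypothetical. Because both margins are indexed the same way here, the explicit permutation step is not needed in this sketch, and cells whose fitted margin is zero are simply left untouched, which is an assumption rather than something stated in the message.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Marginalization t(a,b,c) -> t(b,c): sum over a.
// The result is indexed by b + B * c.
std::vector<double> margin_bc(const std::vector<double>& t,
                              std::size_t A, std::size_t B, std::size_t C) {
    assert(t.size() == A * B * C);
    std::vector<double> out(B * C, 0.0);
    for (std::size_t bc = 0; bc < B * C; ++bc) {
        for (std::size_t a = 0; a < A; ++a) out[bc] += t[a + A * bc];
    }
    return out;
}

// One IPF update for the (b,c) generator:
//   m(a,b,c) <- m(a,b,c) * n(b,c) / m(b,c)
void ipf_update_bc(std::vector<double>& m, const std::vector<double>& n,
                   std::size_t A, std::size_t B, std::size_t C) {
    const std::vector<double> n_bc = margin_bc(n, A, B, C);
    const std::vector<double> m_bc = margin_bc(m, A, B, C);
    for (std::size_t bc = 0; bc < B * C; ++bc) {
        if (m_bc[bc] == 0.0) continue;  // assumption: skip empty fitted margins
        const double ratio = n_bc[bc] / m_bc[bc];  // division step
        for (std::size_t a = 0; a < A; ++a) m[a + A * bc] *= ratio;  // multiplication step
    }
}
```

After one call, the (b,c) margin of m agrees with the (b,c) margin of n, which is the defining property of the update.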
>>
>> I am aware that iterative proportional fitting is already implemented in
>> loglin, but there are other kinds of (graphical) models where similar
>> updates are needed. In connection with message passing in Bayesian
>> networks, one operation often needed is
>>
>>   m(a,b,c) <- n(a,b) * n(c,b)
>>
>> which will result in an array with dimensions (a,b,c). All of this stuff
>> is implemented in the gRbase package as R functions, and it would be very
>> convenient to have these operations as c++ functions. In the gRbase
>> implementation it is required that the arrays have dimnames, and I guess
>> the same would be needed in c++.
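
A sketch of the table product m(a,b,c) <- n(a,b) * n(c,b) in standard C++, under the assumption that each table carries its variable names so that shared variables can be aligned. The Table struct and the function names are hypothetical and are not the gRbase implementation; only variable names are tracked, not level names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Table {
    std::vector<std::string> vars;  // variable name of each dimension
    std::vector<std::size_t> dim;   // number of levels of each variable
    std::vector<double> data;       // cells, column-major (first variable fastest)
};

// Stride inside t of each variable in vars (0 if t does not contain it)
std::vector<std::size_t> strides_in(const Table& t,
                                    const std::vector<std::string>& vars) {
    std::vector<std::size_t> s(vars.size(), 0);
    std::size_t stride = 1;
    for (std::size_t k = 0; k < t.vars.size(); ++k) {
        auto it = std::find(vars.begin(), vars.end(), t.vars[k]);
        assert(it != vars.end());
        s[static_cast<std::size_t>(it - vars.begin())] = stride;
        stride *= t.dim[k];
    }
    return s;
}

// Product of two tables; the result has x's variables followed by
// those variables of y that x does not have.
Table table_product(const Table& x, const Table& y) {
    Table out{x.vars, x.dim, {}};
    for (std::size_t k = 0; k < y.vars.size(); ++k) {
        auto it = std::find(x.vars.begin(), x.vars.end(), y.vars[k]);
        if (it == x.vars.end()) {
            out.vars.push_back(y.vars[k]);
            out.dim.push_back(y.dim[k]);
        } else {
            // Shared variables must have the same number of levels
            assert(x.dim[static_cast<std::size_t>(it - x.vars.begin())] == y.dim[k]);
        }
    }
    const std::vector<std::size_t> sx = strides_in(x, out.vars);
    const std::vector<std::size_t> sy = strides_in(y, out.vars);

    std::size_t total = 1;
    for (std::size_t d : out.dim) total *= d;
    out.data.resize(total);

    std::vector<std::size_t> idx(out.vars.size(), 0);
    for (std::size_t o = 0; o < total; ++o) {
        std::size_t ix = 0, iy = 0;
        for (std::size_t k = 0; k < idx.size(); ++k) {
            ix += idx[k] * sx[k];
            iy += idx[k] * sy[k];
        }
        out.data[o] = x.data[ix] * y.data[iy];
        // Advance the multi-index, first variable fastest
        for (std::size_t k = 0; k < idx.size(); ++k) {
            if (++idx[k] < out.dim[k]) break;
            idx[k] = 0;
        }
    }
    return out;
}
```

Multiplying a table over (a,b) by a table over (c,b) gives a table over (a,b,c), matching the dimensions described above.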
>>
>> I am perfectly aware that I should program these facilities in c++ using
>> Rcpp, but I just can't resist mentioning these wishes, in case they are
>> "almost there" in c++.
>>
>> Best regards
>> Søren
>>
>>
>> Hmmm - see http://cran.r-project.org/web/packages/fastmatch/index.html
>>
>> Hadley
>>
>> PS.  Would you be interested in a list of R functions that, from a quick
>> skim of the R sources, I think could be much, much faster if implemented
>> in Rcpp?
>>
>>
>> --
>> RStudio / Rice University
>> http://had.co.nz/
>>
>>
>
> --
> Romain Francois
> Professional R Enthusiast
> +33(0) 6 28 91 30 30
>
> R Graph Gallery: http://gallery.r-enthusiasts.com
> `- http://bit.ly/SweN1Z : SuperStorm Sandy
>
> blog:            http://romainfrancois.blog.free.fr
> |- http://bit.ly/RE6sYH : OOP with Rcpp modules
> `- http://bit.ly/Thw7IK : Rcpp modules more flexible
>
> _______________________________________________
> Rcpp-devel mailing list
> Rcpp-devel at lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
>