[Rcpp-devel] getting ncol(DF) in Rcpp
Silkworth,David J.
SILKWODJ at airproducts.com
Tue Jun 28 05:31:46 CEST 2011
I figured that getting the ncol(DF) information would be simpler than what I resorted to.
As it turned out, my impression of how long it took to convert the dataframe to a matrix was skewed by running it through Excel, using RExcel. In the R console the conversion was nearly instantaneous, even for the 43,000 line dataframe. No matter how I tried to approach this, the conversion was necessary, and R could not be beaten at it. Then, with access to the matrix math functions in R and a few sapply calls, everything I wanted to do ended up best done in R.
It just goes to show: if you can avoid an explicit loop in R, there is a chance that even the interpreted language can do you many favors.
-----Original Message-----
From: dmbates at gmail.com [mailto:dmbates at gmail.com] On Behalf Of Douglas Bates
Sent: Monday, June 27, 2011 1:47 PM
To: Silkworth,David J.
Cc: rcpp-devel at r-forge.wu-wien.ac.at
Subject: Re: [Rcpp-devel] getting ncol(DF) in Rcpp
A data.frame in R is a curious object that is really a list of the columns. So
myDF.size()
returns the number of columns.
Try the enclosed R source file.
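To illustrate why the length of the list is the column count, here is a minimal sketch in plain C++. It does not use Rcpp: ToyDataFrame, add(), nrows() and example_frame() are hypothetical names made up for this illustration, and only size() is meant to mirror the myDF.size() call above. The data reproduces the vec1/vec2 example from the quoted message.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an R data.frame: a list of named,
// equal-length numeric columns. This is NOT the Rcpp::DataFrame class.
struct ToyDataFrame {
    std::vector<std::string> names;            // one name per column
    std::vector<std::vector<double>> columns;  // one vector per column

    // Append a named column to the list.
    void add(std::string name, std::vector<double> values) {
        names.push_back(std::move(name));
        columns.push_back(std::move(values));
    }

    // Length of the list, i.e. the number of columns (the ncol(DF) analogue).
    std::size_t size() const { return columns.size(); }

    // Number of rows: the length of the first column, or 0 if there is none.
    std::size_t nrows() const {
        return columns.empty() ? 0 : columns.front().size();
    }
};

// Mirrors the test data from the quoted message:
//   vec1 <- rep(5,5); vec2 <- c(1:5); DF <- data.frame(vec1, vec2)
ToyDataFrame example_frame() {
    ToyDataFrame df;
    df.add("vec1", {5.0, 5.0, 5.0, 5.0, 5.0});
    df.add("vec2", {1.0, 2.0, 3.0, 4.0, 5.0});
    return df;
}
```

In this model, example_frame().size() counts the two columns and nrows() counts the five rows, which is the same reasoning that makes myDF.size() usable as ncol(DF) on the Rcpp side.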
On Mon, Jun 27, 2011 at 12:30 PM, Silkworth,David J.
<SILKWODJ at airproducts.com> wrote:
> You guys know I am here just to give you a chuckle.
>
> I wanted to build a function passing just a dataframe to Rcpp. In
> order to use this dataframe, I need to know how many columns it has at
> runtime. My attempts at getting this ncol information were thwarted
> on several counts. The Dimension class appears to only work on STL
> containers, which Rcpp::DataFrame is not. I resorted to the
> Environment facility to attempt a feeble-minded RInside (since I
> can't understand RInside anyway).
>
> Environment base("package:base");
> Function ncol = base["ncol"];
> Rcpp::NumericVector test(1);
> test[0]=ncol(myDF);
>
> This fails to compile with the following error:
> error: cannot convert 'SEXPREC*' to
> 'Rcpp::traits::storage_type<14>::type'
>
> However, just short of sending another single element vector with this
> information as an argument to Rcpp I tried the following, AND IT WORKED!
>
> (My debug technique is to send items back to R for inspection. This
> is just some test code to show that an integer value of myNames.size()
> will be useful as a proxy for ncol(DF) in further code development.)
>
> src <- '
> Rcpp::DataFrame myDF=(arg1);
> Environment base("package:base");
> Function names = base["names"];
> Rcpp::CharacterVector myNames(names(myDF));
> Rcpp::NumericVector ncol(1);
> ncol[0]=myNames.size();
> return(ncol);
> '
>
> fun <- cxxfunction(signature(arg1 = "numeric"),
> src, plugin = "Rcpp")
>
> vec1<-rep(5,5)
> vec2<-c(1:5)
> DF<-data.frame(vec1,vec2)
> test<-fun(DF)
>
> Okay, how's that for a laugher.
>
> In my real case I am using the same dataframe that I needed to clean
> up in my 'redimension' chain. My solution there works quite fine.
> Now I have yet to decompose this dataframe back into vectors and a
> matrix to enable entries to be accessed in Rcpp. But at least I have
> the dimensions for the matrix now.
>
> It takes about 3 seconds for R to extract a matrix based on
> DF[,3:ncol(DF)] on a dataframe with 46,000 rows. I am counting on
> Rcpp code to execute this more efficiently. One could argue that I
> should never have left Rcpp in the first place. But that is another story.