[Rcpp-devel] getting ncol(DF) in Rcpp

Silkworth,David J. SILKWODJ at airproducts.com
Tue Jun 28 05:31:46 CEST 2011


I figured that getting ncol(DF) would turn out to be simpler than what I resorted to.

As it turned out, my impression of how long it took to convert the dataframe to a matrix was skewed by running it through Excel with RExcel.  In the R console the conversion was nearly instantaneous, even for the 43,000 line dataframe.  However I tried to approach this, the conversion was necessary, and R could not be beaten at it.  Then, with access to R's matrix math functions and a few sapply calls, everything I wanted to do ended up being best done in R.

It just goes to show: if you can avoid an explicit loop in R, even the interpreted language may do you many favors.

-----Original Message-----
From: dmbates at gmail.com [mailto:dmbates at gmail.com] On Behalf Of Douglas Bates
Sent: Monday, June 27, 2011 1:47 PM
To: Silkworth,David J.
Cc: rcpp-devel at r-forge.wu-wien.ac.at
Subject: Re: [Rcpp-devel] getting ncol(DF) in Rcpp

A data.frame in R is a curious object that is really a list of the columns.  So

myDF.size()

returns the number of columns.
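
The list-of-columns point can be illustrated outside R with a toy model in plain C++.  This is only a sketch: ToyDataFrame and make_example are hypothetical names, not part of Rcpp, and the struct merely mimics the layout described above, using the two-column frame from the original post as data.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an R data.frame: a list of equal-length,
// named columns. This is NOT Rcpp::DataFrame; it only mirrors the layout.
struct ToyDataFrame {
    std::vector<std::pair<std::string, std::vector<double>>> columns;

    // Length of the list, i.e. the number of columns.
    std::size_t size() const { return columns.size(); }

    // Number of rows: the length of any one column (0 if there are none).
    std::size_t nrows() const {
        return columns.empty() ? 0 : columns.front().second.size();
    }
};

// Builds the frame from the original post:
//   vec1 <- rep(5,5); vec2 <- c(1:5); DF <- data.frame(vec1, vec2)
ToyDataFrame make_example() {
    ToyDataFrame df;
    df.columns.emplace_back("vec1", std::vector<double>(5, 5.0));
    df.columns.emplace_back("vec2", std::vector<double>{1.0, 2.0, 3.0, 4.0, 5.0});
    return df;
}
```

Here make_example().size() plays the role of ncol(DF), because the container being measured is the list of columns rather than the rows.  With Rcpp itself, the corresponding call is simply myDF.size() on the Rcpp::DataFrame.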

Try the enclosed R source file.

On Mon, Jun 27, 2011 at 12:30 PM, Silkworth,David J.
<SILKWODJ at airproducts.com> wrote:
> You guys know I am here just to give you a chuckle.
>
> I wanted to build a function passing just a dataframe to Rcpp.  In 
> order to use this dataframe, I need to know how many columns it has at 
> runtime.  My attempts at getting this ncol information were thwarted 
> on several counts.  The Dimension class appears to only work on STL 
> containers, which Rcpp::DataFrame is not.  I resorted to the 
> Environment facility to attempt a feeble-minded RInside (since I 
> can't understand RInside anyway).
>
> Environment base("package:base");
> Function ncol = base["ncol"];
> Rcpp::NumericVector test(1);
> test[0]=ncol(myDF);
>
> This fails to compile with the following error:
> error: cannot convert 'SEXPREC*' to
> 'Rcpp::traits::storage_type<14>::type'
>
> However, just short of sending another single element vector with this 
> information as an argument to Rcpp I tried the following, AND IT WORKED!
>
> (My debug technique is to send items back to R for inspection.  This 
> is just some test code to show that an integer value of myNames.size() 
> will be useful as a proxy for ncol(DF) in further code development.)
>
> src <- '
> Rcpp::DataFrame myDF=(arg1);
> Environment base("package:base");
> Function names = base["names"];
> Rcpp::CharacterVector myNames(names(myDF));
> Rcpp::NumericVector ncol(1);
> ncol[0]=myNames.size();
> return(ncol);
> '
>
>  fun <- cxxfunction(signature(arg1 = "numeric"),
>  src, plugin = "Rcpp")
>
> vec1<-rep(5,5)
> vec2<-c(1:5)
> DF<-data.frame(vec1,vec2)
> test<-fun(DF)
>
> Okay, how's that for a laugher.
>
> In my real case I am using the same dataframe that I needed to clean 
> up in my 'redimension' chain.  My solution there works quite fine.  
> Now I have yet to decompose this dataframe back into vectors and a 
> matrix to enable entries to be accessed in Rcpp.  But at least I have 
> the dimensions for the matrix now.
>
> It takes about 3 seconds for R to extract a matrix based on 
> DF[,3:ncol(DF)] on a dataframe with 46,000 rows.  I am counting on 
> Rcpp code to execute this more efficiently.  One could argue that I 
> should never have left Rcpp in the first place.  But that is another story.
>


