[Rcpp-devel] Rcpp can not return big DataFrame

Dirk Eddelbuettel edd at debian.org
Wed Mar 27 14:41:11 CET 2013


Hi,

On 27 March 2013 at 21:19, 该走了 wrote:
| Dear Rcpp developer,
|   I tried to return a big DataFrame from Rcpp to R, but ran into a problem!

If you check the list archives you will see that this has been discussed before.
 
| ### begin dataframetest.cpp
| 
| #include <Rcpp.h>
| using namespace Rcpp;
| using namespace std;
| 
| // [[Rcpp::export]]
| DataFrame dataframetest(NumericVector close){
|   int nrow = close.size();
|   vector<double> txn_qty = vector<double>(nrow);
|   vector<double> txn_prc = vector<double>(nrow);
|   vector<double> txn_fee = vector<double>(nrow);
|   vector<double> pos_qty = vector<double>(nrow);
|   vector<double> close_prc = as<vector<double> >(close);
|   vector<double> PL = vector<double>(nrow);
|   DataFrame PLrecord = DataFrame::create(Named("txn.qty", txn_qty),
|                                          Named("txn.prc", txn_prc),
|                                          Named("txn.fee", txn_fee),
|                                          Named("pos.qty", pos_qty),
|                                          Named("close.prc", close_prc),
|                                          Named("PL", PL));
|   return PLrecord;
| }
| #### end  dataframetest.cpp
| 
| ### R code 
| n <- 4e5
| x.prc <- 1:n
| library(Rcpp)
| sourceCpp("./dataframetest.cpp")
| aa <- dataframetest(x.prc)
| 
| ##### end R code 
| 
|  When n is big, like 4e5, it exhausts the memory or crashes; when n is
| small, like 4e3, it returns the correct DataFrame. I was wondering if

I agree. 

But it probably "just" has to do with temp objects, which are co-managed by
R, so this is hard to sort out.

| Rcpp::DataFrame can handle such a big DataFrame. In my opinion, n = 4e5 is not big;
| I can create such a long data.frame from R code easily, without any problem.
| Why can't Rcpp? Or am I missing something?

You are welcome to debug it.  Maybe valgrind will help.

Or if you don't want to or can't, just return a list of vectors and call
as.data.frame() on it when you are back in R.  

That's what we used to do anyway before we added the convenience wrapping. 
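
A minimal sketch of that workaround, assuming the same column names as in
your example. The function name dataframetest_list is hypothetical, and I
have not checked that this avoids the problem at n = 4e5; it only shows the
shape of the list-returning version.

```cpp
#include <Rcpp.h>
using namespace Rcpp;

// Hypothetical variant of the original function: it returns a plain named
// list and leaves the conversion to a data.frame to R.
// [[Rcpp::export]]
List dataframetest_list(NumericVector close) {
  int nrow = close.size();

  // NumericVector(n) allocates n zero-initialised doubles in R-managed
  // memory, so no std::vector has to be copied when the result is returned.
  NumericVector txn_qty(nrow);
  NumericVector txn_prc(nrow);
  NumericVector txn_fee(nrow);
  NumericVector pos_qty(nrow);
  NumericVector PL(nrow);

  // The input vector is reused as-is for the close.prc column.
  return List::create(Named("txn.qty")   = txn_qty,
                      Named("txn.prc")   = txn_prc,
                      Named("txn.fee")   = txn_fee,
                      Named("pos.qty")   = pos_qty,
                      Named("close.prc") = close,
                      Named("PL")        = PL);
}
```

After sourceCpp(), the R side would then be
aa <- as.data.frame(dataframetest_list(x.prc)).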

Dirk

| 
| ### R code
| n <- 4e5
| x.prc <- rnorm(n)
| a <- data.frame(x = x.prc,
|                 y = x.prc,
|                 d = x.prc,
|                 e = x.prc,
|                 f = x.prc,
|                 k = x.prc)
| head(a)
|             x           y           d           e           f           k
| 1 -0.45145433 -0.45145433 -0.45145433 -0.45145433 -0.45145433 -0.45145433
| 2 -0.55851370 -0.55851370 -0.55851370 -0.55851370 -0.55851370 -0.55851370
| 3  0.18209145  0.18209145  0.18209145  0.18209145  0.18209145  0.18209145
| 4 -0.56092768 -0.56092768 -0.56092768 -0.56092768 -0.56092768 -0.56092768
| 5  0.25689622  0.25689622  0.25689622  0.25689622  0.25689622  0.25689622
| 6 -0.04558792 -0.04558792 -0.04558792 -0.04558792 -0.04558792 -0.04558792
| 
| #### sessionInfo
| sessionInfo()
| R version 2.15.3 (2013-03-01)
| Platform: x86_64-suse-linux-gnu (64-bit)
| 
| locale:
|  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
|  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
|  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
|  [7] LC_PAPER=C                 LC_NAME=C                 
|  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
| [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
| 
| attached base packages:
| [1] stats     graphics  grDevices utils     datasets  methods   base     
| 
| other attached packages:
| [1] Rcpp_0.10.3      data.table_1.8.8
| 
| loaded via a namespace (and not attached):
| [1] compiler_2.15.3 tools_2.15.3   
| 
| 
-- 
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com  

