[Rcpp-devel] Rcpp can not return big DataFrame
Dirk Eddelbuettel
edd at debian.org
Wed Mar 27 14:41:11 CET 2013
Hi,
On 27 March 2013 at 21:19, 该走了 wrote:
| Dear Rcpp developer,
| I tried to return a big DataFrame from Rcpp to R, but ran into a problem.
If you check the list archives you will see that this has been discussed before.
| ### begin dataframetest.cpp
|
| #include <Rcpp.h>
| using namespace Rcpp;
| using namespace std;
|
| // [[Rcpp::export]]
| DataFrame dataframetest(NumericVector close){
|   int nrow = close.size();
|   vector<double> txn_qty = vector<double>(nrow);
|   vector<double> txn_prc = vector<double>(nrow);
|   vector<double> txn_fee = vector<double>(nrow);
|   vector<double> pos_qty = vector<double>(nrow);
|   vector<double> close_prc = as<vector<double> >(close);
|   vector<double> PL = vector<double>(nrow);
|   DataFrame PLrecord = DataFrame::create(Named("txn.qty", txn_qty),
|                                          Named("txn.prc", txn_prc),
|                                          Named("txn.fee", txn_fee),
|                                          Named("pos.qty", pos_qty),
|                                          Named("close.prc", close_prc),
|                                          Named("PL", PL));
|   return PLrecord;
| }
| #### end dataframetest.cpp
|
| ### R code
| n <- 4e5
| x.prc <- 1:n
| library(Rcpp)
| sourceCpp("./dataframetest.cpp")
| aa <- dataframetest(x.prc)
|
| ##### end R code
|
| When n is big, like 4e5, it exhausts the memory or crashes; when n is
| small, like 4e3, it returns the correct DataFrame. I was wondering if
I agree.
But it probably "just" has to do with temp objects, which are co-managed by
R, so this is hard to sort out.
| Rcpp::DataFrame can handle such a big DataFrame. In my opinion, n = 4e5 is not big;
| I can easily create a data.frame that long in R code, without any problem.
| Why can't Rcpp? Or am I missing something?
You are welcome to debug it. Maybe valgrind will help.
Or if you don't want to or can't, just return a list of vectors and call
as.data.frame() on it when you are back in R.
That's what we used to do anyway before we added the convenience wrapping.
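
A minimal sketch of that workaround, assuming the same six columns as in the
quoted example. The names PLColumns, make_columns and dataframetest_list are
made up for illustration and are not part of Rcpp. The columns are filled in
plain C++, and the Rcpp wrapper is only compiled when Rcpp.h is on the include
path (as it is under sourceCpp()), so the filling logic can also be checked
without R:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container for the six columns of the quoted example.
struct PLColumns {
    std::vector<double> txn_qty, txn_prc, txn_fee, pos_qty, close_prc, PL;
};

// Plain C++: allocate zero-filled columns and copy the closing prices.
PLColumns make_columns(const std::vector<double>& close) {
    const std::size_t n = close.size();
    PLColumns c;
    c.txn_qty.assign(n, 0.0);
    c.txn_prc.assign(n, 0.0);
    c.txn_fee.assign(n, 0.0);
    c.pos_qty.assign(n, 0.0);
    c.close_prc = close;
    c.PL.assign(n, 0.0);
    return c;
}

#if defined(__has_include)
#if __has_include(<Rcpp.h>)
#include <Rcpp.h>

// Return a plain named list instead of a DataFrame; the conversion to a
// data.frame is left to as.data.frame() on the R side.
// [[Rcpp::export]]
Rcpp::List dataframetest_list(Rcpp::NumericVector close) {
    const PLColumns c = make_columns(Rcpp::as<std::vector<double> >(close));
    return Rcpp::List::create(Rcpp::Named("txn.qty")   = c.txn_qty,
                              Rcpp::Named("txn.prc")   = c.txn_prc,
                              Rcpp::Named("txn.fee")   = c.txn_fee,
                              Rcpp::Named("pos.qty")   = c.pos_qty,
                              Rcpp::Named("close.prc") = c.close_prc,
                              Rcpp::Named("PL")        = c.PL);
}
#endif
#endif
```

Back in R, after sourceCpp(), something along the lines of
aa <- as.data.frame(dataframetest_list(x.prc)) should then give the
data.frame. Whether this sidesteps the memory problem at n = 4e5 on a given
setup would still need to be tried.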
Dirk
|
| ### R code
| n <- 4e5
| x.prc <- rnorm(n)
| a <- data.frame(x = x.prc,
| y = x.prc,
| d = x.prc,
| e = x.prc,
| f = x.prc,
| k = x.prc)
| head(a)
|             x           y           d           e           f           k
| 1 -0.45145433 -0.45145433 -0.45145433 -0.45145433 -0.45145433 -0.45145433
| 2 -0.55851370 -0.55851370 -0.55851370 -0.55851370 -0.55851370 -0.55851370
| 3  0.18209145  0.18209145  0.18209145  0.18209145  0.18209145  0.18209145
| 4 -0.56092768 -0.56092768 -0.56092768 -0.56092768 -0.56092768 -0.56092768
| 5  0.25689622  0.25689622  0.25689622  0.25689622  0.25689622  0.25689622
| 6 -0.04558792 -0.04558792 -0.04558792 -0.04558792 -0.04558792 -0.04558792
|
| #### sessionInfo
| sessionInfo()
| R version 2.15.3 (2013-03-01)
| Platform: x86_64-suse-linux-gnu (64-bit)
|
| locale:
| [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
| [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
| [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
| [7] LC_PAPER=C LC_NAME=C
| [9] LC_ADDRESS=C LC_TELEPHONE=C
| [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
|
| attached base packages:
| [1] stats graphics grDevices utils datasets methods base
|
| other attached packages:
| [1] Rcpp_0.10.3 data.table_1.8.8
|
| loaded via a namespace (and not attached):
| [1] compiler_2.15.3 tools_2.15.3
|
|
--
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com