<html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /></head><body style='font-size: 10pt; font-family: Verdana,Geneva,sans-serif'>
<p>Hi,</p>
<p>Sorry to get back to this, but I can't figure out how to solve it. Do not hesitate to direct me to another forum if this one is not appropriate. I don't think there's anything wrong with Rcpp, but I might not proceed with the loop in Rcpp the right way.</p>
<p>So I tried various regressions in the Rcpp for loop (fastglm, Rfast::glm_poisson, even fastlm; see code at end of email) and they all output the same saved matrix, but a different one than without including a regression step. Compared to a pure R code, the R code with or without a regression output the same matrix as the Rcpp loop without a regression (and therefore different from the ones with a regression), i.e:</p>
<p>- R loop without regression, R loop with regression (glm or fastglm), Rcpp loop without regression = all.equal TRUE.</p>
<p>- Rcpp loop with regression (fastglm, Rfast, or fastlm) = all.equal TRUE but different from above.</p>
<p>I'm totally puzzled on why I've got these results. How can I get the same output with Rcpp + regression as with the R loop +/- regression?</p>
<p>Denis</p>
<p><br /></p>
<p>NumericMatrix define_e(int iter, NumericMatrix obs_mat, NumericVector draws) {<br /> Rcpp::NumericVector d = obs_mat(_, 1);<br /> Rcpp::NumericVector pr = no_init(d.length());<br /> Rcpp::NumericMatrix e(d.length(), iter);<br /> for (int i = 0; i < iter; i++) {<br /> pr = obs_mat(_, 0) * draws(i);<br /> e(_, i) = cpprbinom(d.length(), 1, pr);<br /> }<br /> return e;<br />}<br /><br />NumericMatrix define_ereg1(int iter, NumericMatrix obs_mat, NumericVector draws) {<br /> Environment pkg = Environment::namespace_env("fastglm");<br /> Function f = pkg["fastglm"];<br /><br /> Rcpp::NumericVector d = obs_mat(_, 1);<br /> Rcpp::NumericVector pr = no_init(d.length());<br /> Rcpp::NumericMatrix e(d.length(), iter);<br /> Rcpp::NumericMatrix ematrix(d.length(), 2);<br /> Rcpp::NumericVector v(d.length(), 1);<br /> ematrix(_, 0) = v;<br /> Rcpp::List mod_pois;<br /> for (int i = 0; i < iter; i++) {<br /> pr = obs_mat(_, 0) * draws(i);<br /> e(_, i) = cpprbinom(d.length(), 1, pr);<br /><br /> ematrix(_, 1) = e;<br /> mod_pois = f(Named("x", ematrix), Named("y", d), Named("family", "poisson"));<br /> }<br /> return e;<br />}<br /><br />NumericMatrix define_ereg2(int iter, NumericMatrix obs_mat, NumericVector draws) {<br /> Environment pkg = Environment::namespace_env("Rfast");<br /> Function f = pkg["glm_poisson"];<br /><br /> Rcpp::NumericVector d = obs_mat(_, 1);<br /> Rcpp::NumericVector pr = no_init(d.length());<br /> Rcpp::NumericMatrix e(d.length(), iter);<br /> Rcpp::NumericMatrix ematrix(d.length(), 2);<br /> Rcpp::NumericVector v(d.length(), 1);<br /> ematrix(_, 0) = v;<br /> Rcpp::List mod_pois;<br /> for (int i = 0; i < iter; i++) {<br /> pr = obs_mat(_, 0) * draws(i);<br /> e(_, i) = cpprbinom(d.length(), 1, pr);<br /><br /> ematrix(_, 1) = e;<br /> mod_pois = f(Named("x", ematrix(_, 1)), Named("y", d));<br /> }<br /> return e;<br />}<br /><br />NumericMatrix define_ereg3(int iter, NumericMatrix obs_mat, NumericVector draws) {<br /> Environment pkg = Environment::namespace_env("RcppArmadillo");<br /> Function f = pkg["fastLm"];<br /><br /> Rcpp::NumericVector d = obs_mat(_, 1);<br /> Rcpp::NumericVector pr = no_init(d.length());<br /> Rcpp::NumericMatrix e(d.length(), iter);<br /> Rcpp::NumericMatrix ematrix(d.length(), 2);<br /> Rcpp::NumericVector v(d.length(), 1);<br /> ematrix(_, 0) = v;<br /> Rcpp::List mod_pois;<br /> for (int i = 0; i < iter; i++) {<br /> pr = obs_mat(_, 0) * draws(i);<br /> e(_, i) = cpprbinom(d.length(), 1, pr);<br /><br /> ematrix(_, 1) = e;<br /> mod_pois = f(Named("X", ematrix), Named("y", d));<br /> }<br /> return e;<br />}</p>
<p>draws <- rbeta(200, 100, 60)<br />obs_mat <- cbind(c(rep(1, 40), rep(0, 60)), c(rep(1, 20), rep(0, 80)))<br />iter <- length(draws)<br />set.seed(1234)<br />t0 <- define_e(iter, obs_mat, draws)<br /><br />e <- matrix(NA, nrow = nrow(obs_mat), ncol = iter)<br />e_obs <- obs_mat[, 1]<br />p <- rep(NA, nrow(obs_mat))<br />set.seed(1234)<br />for (i in 1:iter) {<br /> p <- e_obs * draws[i]<br /> e[, i] <- rbinom(nrow(obs_mat), 1, p)<br />}<br />all.equal(t0, e) # TRUE<br /><br />e1 <- matrix(NA, nrow = nrow(obs_mat), ncol = iter)<br />e_obs <- obs_mat[, 1]<br />p <- rep(NA, nrow(obs_mat))<br />set.seed(1234)<br />for (i in 1:iter) {<br /> p <- e_obs * draws[i]<br /> e1[, i] <- rbinom(nrow(obs_mat), 1, p)<br /> X <- cbind(rep(1, nrow(obs_mat)), e1[, i])<br /> d <- obs_mat[, 2]<br /> mod_pois <- glm(d ~ X, family = poisson(link = "log"))<br />}<br />all.equal(e, e1) # TRUE<br />all.equal(e1, t0) # TRUE</p>
<p>e2 <- matrix(NA, nrow = nrow(obs_mat), ncol = iter)<br />e_obs <- obs_mat[, 1]<br />p <- rep(NA, nrow(obs_mat))<br />set.seed(1234)<br />for (i in 1:iter) {<br /> p <- e_obs * draws[i]<br /> e2[, i] <- rbinom(nrow(obs_mat), 1, p)<br /> X <- cbind(rep(1, nrow(obs_mat)), e2[, i])<br /> d <- obs_mat[, 2]<br /> mod_pois <- fastglm::fastglm(X, d, family = poisson(link = "log"))<br />}<br />all.equal(e, e2) # TRUE<br />all.equal(e2, t0) # TRUE</p>
<p>set.seed(1234)<br />t1 <- define_ereg1(iter, obs_mat, draws)<br />all.equal(t1, e) # [1] "Mean relative difference: 1.788299"</p>
<p>set.seed(1234)<br />t2 <- define_ereg2(iter, obs_mat, draws)<br />all.equal(t2, e) # [1] "Mean relative difference: 1.788299"<br />all.equal(t2, t1) # TRUE</p>
<p>set.seed(1234)<br />t3 <- define_ereg3(iter, obs_mat, draws)<br />all.equal(t3, e) # "Mean relative difference: 1.788299"<br />all.equal(t3, t1) # TRUE<br />all.equal(t3, t2) # TRUE</p>
<div id="signature"></div>
<p><br /></p>
<p id="reply-intro">Le 2024-10-01 13:19, Dirk Eddelbuettel a écrit :</p>
<blockquote type="cite" style="padding: 0 0.4em; border-left: #1010ff 2px solid; margin: 0">
<div class="pre" style="margin: 0; padding: 0; font-family: monospace"><br />On 1 October 2024 at 11:57, Denis Haine wrote:<br />| But adding the regression to define_e1() (as define_e2() in previous email)<br />| does not give the same. The first iteration in the loop is identical in both,<br />| but not the subsequent ones. That's why I was thinking it could be linked to<br />| the seed, but (sorry for my limited knowledge of C++) it would be strange to<br />| manually "re-seed" at each iteration. Moreover, I believe the seed set in R is<br />| transferred to Rcpp and so does not need to be taken specifically care of.<br /><br />Nice. So it seems to me that you identified fastglm as the source of your<br />problem. When we call it, your results differ and there is a side-effect.<br /><br />But when we don't it is still true. I added<br /><br /> Environment pkg2 = Environment::namespace_env("base");<br /> Function f2 = pkg2["dim"];<br /><br /> // ...<br /><br /> RObject ro = f2(ematrix);<br /><br />while keeping the lines relevant to fastglm calls and result storage<br />commented out. Now we get TRUE from all.equal(t0,t1).<br /><br />That seems to rule out Rcpp::Function, and may indicate you need to look into<br />what fastglm may be doing.<br /><br />Cheers, Dirk</div>
</blockquote>
</body></html>