Tim - 11 months ago 39

R Question

I am working on code implementing random generation algorithm for sampling from tails of normal distribution proposed by Christian Robert. The problem is that while code in R worked properly, then after translating it to C++ if fails. I can't see any reason for that and I'd be grateful for explaining me what went wrong and why.

*Notice that the code below is far from elegant and efficient, it is simplified to make reproducible example.*

Here is the function in R:

`rtnormR <- function(mean = 0, sd = 1, lower = -Inf, upper = Inf) {`

lower <- (lower - mean) / sd

upper <- (upper - mean) / sd

if (lower < upper && lower >= 0) {

while (TRUE) {

astar <- (lower + sqrt(lower^2 + 4)) / 2

z <- rexp(1, astar) + lower

u <- runif(1)

if ((u <= exp(-(z - astar)^2 / 2)) && (z <= upper)) break

}

} else {

z <- NaN

}

z*sd + mean

}

and here C++ version:

`#include <Rcpp.h>`

using namespace Rcpp;

// [[Rcpp::export]]

double rtnormCpp(double mean, double sd, double lower, double upper) {

double z_lower = (lower - mean) / sd;

double z_upper = (upper - mean) / sd;

bool stop = false;

double astar, z, u;

if (z_lower < z_upper && z_lower >= 0) {

while (!stop) {

astar = (z_lower + std::sqrt(std::pow(z_lower, 2) + 4)) / 2;

z = R::exp_rand() * astar + z_lower;

u = R::unif_rand();

if ((u <= std::exp(-std::pow(z-astar, 2) / 2)) && (z <= z_upper))

stop = true;

}

} else {

z = NAN;

}

return z*sd + mean;

}

Now compare the samples obtained using both functions (they are compared to

`dtnorm`

`xx = seq(-6, 6, by = 0.001)`

hist(replicate(5000, rtnormR(mean = 0, sd = 1, lower = 3, upper = 5)), freq= FALSE, ylab = "", xlab = "", main = "rtnormR")

lines(xx, msm::dtnorm(xx, mean = 0, sd = 1, lower = 3, upper = 5), col = "red")

hist(replicate(5000, rtnormCpp(mean = 0, sd = 1, lower = 3, upper = 5)), freq= FALSE, ylab = "", xlab = "", main = "rtnormCpp")

lines(xx, msm::dtnorm(xx, mean = 0, sd = 1, lower = 3, upper = 5), col = "red")

As you can see,

`rtnormCpp`

Answer Source

While one can use either `scale`

or `rate`

in `rexp()`

, the default parameterization is `rate`

- so `rexp(1,astar)`

has a mean of `1/astar`

, not `astar`

.

If you change the relevant line of C++ code to

```
z = R::exp_rand() / astar + z_lower;
```

everything seems to work fine.