Massimo - 7 months ago 36

R Question

In R, I need to calculate the conditional expectation *f(z)=E[x|x < z]*, where *x* is distributed according to a parametric distribution (namely, a log-normal).

To calculate e.g. *f(2)* I've done the following:

`zz <- rlnorm(1000,meanlog=.7,sdlog=.5)`

mean(zz[zz<2])

However, I wonder whether there's a more direct way, not requiring to generate samples.

Answer

You are looking at truncated distribution. Integrate `x * f(x)`

on `(-Inf, z)`

, then divide this integral by `F(z)`

. [* f(x) is unconditional PDF; F(x) is unconditional CDF.*]

```
## integrand
f <- function(x, mu, sigma) x * dlnorm(x, mu, sigma)
## conditional expectation
g <- function(z, mu, sigma) {
int <- integrate(f, lower = -Inf, upper = z, mu = mu, sigma = sigma)
int$value / plnorm(z, mu, sigma)
}
## theoretical value
g(2, 0.7, 0.5)
# [1] 1.401472
## sample estimate
set.seed(0)
zz <- rlnorm(1000,meanlog=.7,sdlog=.5)
mean(zz[zz<2])
# [1] 1.40316
```

I had planed to write one or two lines of LaTeX to show why we want an integral as above, but it looks like that the Wikipedia link is informative enough.

For some reason, I am not able to plot the resulting function

`g`

.`plot(1, g(1:10,0.7,0.5))`

is giving an error.

To plot `g`

you need to make it a vectorized function first. There has been some posts about plotting integral, like R plotting integral. Here is what we can do:

```
vg <- Vectorize(g, vectorize.args = "z")
plot(1:10, vg(1:10, 0.7, 0.5), type = "l")
```