upman - 1 year ago 60

R Question

I am working on a project for class: It's about the statistical evaluation of four different French Roulette (n=37) strategies.

The first two are very simple:

- A. Betting on Red one Time
- B. Betting on a Number one Time

Please find the code below:

```
BettingOnRed <- function() {
  ball <- sample(1:37, 1, replace=TRUE)
  # outcomes 1..18 (18 of 37) count as red
  if (ball <= 18) amount_won <- 1
  else amount_won <- -1
  c(amount_won, 1)  # (amount won, number of bets made)
}

BettingOnNumber <- function() {
  myNumber <- 17
  ball <- sample(0:36, 1, replace=TRUE)
  if (myNumber == ball) amount_won <- 35
  else amount_won <- -1
  c(amount_won, 1)  # (amount won, number of bets made)
}
```

Each function returns a vector of length=2 containing the amount won and the number of bets made (which is always equal to one in these two functions: this value plays a role in the other strategies...).

Even though the strategies appear simple, when I calculate the percentage error of the expected winnings and of the proportion of wins per game, some of the errors are huge and I can't explain them. Please see the table below:

To estimate these values, I set up a function `simulation()`.

What I can't understand is: why is the percentage error of the winnings per game for B so huge, whereas the percentage error of the proportion of games won for B is so small?

Please find here the formulas I used to calculate the exact values (=expected values) and the percentage error for game B:

- Let `EstWin` be the estimate of the winnings per game B.
- Let `EstProp` be the estimate of the proportion of games B won.

The respective exact values are:

- ExactWin = 1/37*35 - 36/37 = -1/37
- ExactProp = 1/37

Percentage Errors:

- PercErrorWin = (EstWin - ExactWin)/ExactWin
- PercErrorProp = (EstProp - ExactProp)/ExactProp

I don't understand this: why are the two errors not the same? Am I missing an important fact about probability here?

Please find here the relevant part of my function `simulation` (it takes one of the two functions above as its first argument):

```
simulation <- function(f, n = 100000) {
  result <- numeric(8)
  winnings <- numeric(n)
  games_won <- numeric(n)
  for (i in 1:n) {
    fnct <- f()
    winnings[i] <- fnct[1]
    games_won[i] <- ifelse(fnct[1] > 0, 1, 0)
  }
  result[1] <- mean(winnings)   # estimated winnings per game
  result[2] <- mean(games_won)  # estimated proportion of games won
  result
}
```

Note that the function is bigger, but I just deleted the unnecessary part for this problem.

Answer Source

**tl;dr** your results seem correct; there's more variation than you think (variation in bet-on-number is *much* greater than variation in bet-on-red ...)

There are lots of aspects of your simulations that could be streamlined, but I think your basic framework is correct. Really the only thing that you're missing is the amount of variation expected in the output; if you examine this you'll see that the deviations between observed and expected are really not surprising. (You could actually compute this variance analytically, but here I'll do it by brute force.)
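For reference, the analytic version of that variance calculation can be sketched as follows. This is a minimal sketch, assuming independent spins and the payoffs used in the question (+35 with probability 1/37, otherwise -1); the variable names are mine, not part of the original code.

```r
n <- 100000                      # games per simulation run, as in the question

# Bet on a single number: win 35 with probability 1/37, lose 1 otherwise
p_num    <- 1 / 37
mean_num <- 35 * p_num - (1 - p_num)                  # exact value, -1/37
var_num  <- 35^2 * p_num + (1 - p_num) - mean_num^2   # E[X^2] - E[X]^2
se_win   <- sqrt(var_num / n)                         # standard error of mean winnings

# Proportion of games won: the mean of Bernoulli(1/37) outcomes
se_prop  <- sqrt(p_num * (1 - p_num) / n)

# Standard errors relative to the size of the exact values
rel_se_win  <- se_win / abs(mean_num)
rel_se_prop <- se_prop / p_num

round(c(se_win = se_win, se_prop = se_prop,
        rel_se_win = rel_se_win, rel_se_prop = rel_se_prop), 4)
```

Under these assumptions the relative standard error of the mean winnings comes out at roughly 68%, against about 1.9% for the proportion of wins. The factor of 36 between them follows from the fact that the winnings in one game equal 36 times the win indicator minus 1, while both exact values have the same magnitude, 1/37.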

Simulate 100 runs, each with 100,000 games. I'm using `plyr::raply()` for convenience (it assembles your results automatically and implements a progress bar), but you could do it just as well with `replicate()`, or with a `for` loop.

```
set.seed(101)
library(plyr)
rr <- raply(100,simulation(BettingOnNumber,100000),.progress="text")
```
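A base-R variant with `replicate()` might look like the sketch below. To keep it self-contained and fast, it uses a hypothetical vectorized helper, `mean_win_number()`, in place of the original `simulation(BettingOnNumber, ...)`; it draws different random numbers than the `raply()` call, so individual values will not match.

```r
set.seed(101)

# Hypothetical helper (not from the original post): mean winnings over
# n single-number bets, computed in one vectorized step
mean_win_number <- function(n = 100000, my_number = 17) {
  ball <- sample(0:36, n, replace = TRUE)      # n independent spins
  mean(ifelse(ball == my_number, 35, -1))      # payoff per spin, then average
}

# 100 runs of 100,000 games each, collected into a numeric vector
rr_mean <- replicate(100, mean_win_number())
summary(rr_mean)
sd(rr_mean)   # spread of the 100 run means
```

Here `rr_mean` plays the role of the first column of `rr`, the mean winnings per run.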

Plot the distribution of mean winnings: blue=expected, red=observed from your single simulation.

```
par(las=1,bty="l")
hist(rr[,1],col="gray",breaks=30,
     xlab="mean amount won in 100,000 games",
     ylab="Frequency (100 runs)")
exp_val <- -0.02703
obs_val <- -0.04852
abline(v=c(obs_val,exp_val),col=c("red","blue"),lwd=2)
```

Here's a computation of how surprising this degree of deviation is:

```
mean(abs(rr[,1]-exp_val)>abs(obs_val-exp_val)) ## 0.21
```

This means that you'd get a deviation at least as large as the one you observed about 21% of the time.
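As a rough cross-check, the same tail probability can be approximated without simulation. This is a sketch under the assumption that the mean of 100,000 independent bets is close to normally distributed; the variable names are mine.

```r
exp_val <- -1 / 37      # exact expected winnings per game (about -0.02703)
obs_val <- -0.04852     # observed mean from the single simulation above

# Standard deviation of one single-number bet: +35 w.p. 1/37, -1 otherwise
sd_bet <- sqrt(35^2 / 37 + 36 / 37 - exp_val^2)
se_win <- sd_bet / sqrt(100000)        # standard error of the run mean

z <- (obs_val - exp_val) / se_win      # standardized deviation
p_two_sided <- 2 * pnorm(-abs(z))      # chance of a deviation at least this large
p_two_sided
```

The approximation gives roughly 0.24, in the same range as the 0.21 estimated from only 100 simulated runs.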

Try this experiment with the betting-on-red strategy and you'll see how much smaller the variance is ...
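One way to run that comparison is sketched below, again with a hypothetical vectorized helper (`mean_win_red()`) rather than the original `simulation(BettingOnRed, ...)`, and following the question's convention that outcomes 1 to 18 of 37 count as red.

```r
set.seed(101)

# Hypothetical helper (not from the original post): mean winnings over
# n bets on red, with outcomes 1..18 of 37 treated as red
mean_win_red <- function(n = 100000) {
  ball <- sample(1:37, n, replace = TRUE)   # n independent spins
  mean(ifelse(ball <= 18, 1, -1))           # +1 on red, -1 otherwise
}

# 100 runs of 100,000 games each
rr_red <- replicate(100, mean_win_red())
sd(rr_red)               # spread of the 100 run means
sd(rr_red) / (1 / 37)    # spread relative to the size of the exact value, -1/37
```

A single bet on red has a standard deviation of just under 1, compared with about 5.8 for a single-number bet, so the run means should scatter far less around -1/37.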