abvaekvnl - 8 months ago 53

R Question

I want to tune a neural network with dropout using h2o in R. Here I provide a reproducible example for the iris dataset. I'm avoiding tuning `eta` and `epsilon`.

```
require(h2o)
h2o.init()
data(iris)
iris = iris[sample(1:nrow(iris)), ]
irisTrain = as.h2o(iris[1:90, ])
irisValid = as.h2o(iris[91:120, ])
irisTest = as.h2o(iris[121:150, ])
hyper_params <- list(
  input_dropout_ratio = list(0, 0.15, 0.3),
  hidden_dropout_ratios = list(0, 0.15, 0.3, c(0,0), c(0.15,0.15),c(0.3,0.3)),
  hidden = list(64, c(32,32)))
grid = h2o.grid("deeplearning", x=colnames(iris)[1:4], y=colnames(iris)[5],
                training_frame = irisTrain, validation_frame = irisValid,
                hyper_params = hyper_params, adaptive_rate = TRUE,
                variable_importances = TRUE, epochs = 50, stopping_rounds=5,
                stopping_tolerance=0.01, activation=c("RectifierWithDropout"),
                seed=1, reproducible=TRUE)
```

The output is:

`Details: ERRR on field: _hidden_dropout_ratios: Must have 1 hidden layer dropout ratios.`

The problem is in `hidden_dropout_ratios`. Plain `activation="Rectifier"` would sidestep it, since that activation takes no `hidden_dropout_ratios`, but then dropout could not be tuned.

Attempt 1: Unsuccessful, and I'm not tuning `hidden`:

```
hyper_params <- list(
  input_dropout_ratio = c(0, 0.15, 0.3),
  hidden_dropout_ratios = list(c(0.3,0.3), c(0.5,0.5)),
  hidden = c(32,32))
```

`ERRR on field: _hidden_dropout_ratios: Must have 1 hidden layer dropout ratios.`

Attempt 2: Successful, but I'm not tuning `hidden`:

```
hyper_params <- list(
  input_dropout_ratio = c(0, 0.15, 0.3),
  hidden_dropout_ratios = c(0.3,0.3),
  hidden = c(32,32))
```

Answer

You have to fix the number of hidden layers in a grid, if experimenting with hidden_dropout_ratios. At first I messed around with combining multiple grids; then, when researching for my H2O book, I saw someone mention, in passing, how grids get combined automatically if you give them the same name.

So, you still need to call `h2o.grid()` for each number of hidden layers, but they can all be in the same grid at the end. Here is your example modified for that:

```
require(h2o)
h2o.init()

# Shuffle iris, then split into train / validation / test frames
data(iris)
iris = iris[sample(1:nrow(iris)), ]
irisTrain = as.h2o(iris[1:90, ])
irisValid = as.h2o(iris[91:120, ])
irisTest = as.h2o(iris[121:150, ])

# One hidden layer: each dropout entry has exactly one ratio
hyper_params1 <- list(
  hidden_dropout_ratios = list(0, 0.15, 0.3),
  hidden = list(64)
)

# Two hidden layers: each dropout entry has exactly two ratios
hyper_params2 <- list(
  hidden_dropout_ratios = list(c(0,0), c(0.15,0.15),c(0.3,0.3)),
  hidden = list(c(32,32))
)

# First call creates the grid "stackoverflow" with the one-layer models
grid = h2o.grid("deeplearning", x=colnames(iris)[1:4], y=colnames(iris)[5],
                grid_id = "stackoverflow",
                training_frame = irisTrain, validation_frame = irisValid,
                hyper_params = hyper_params1, adaptive_rate = TRUE,
                variable_importances = TRUE, epochs = 50, stopping_rounds=5,
                stopping_tolerance=0.01, activation=c("RectifierWithDropout"),
                seed=1, reproducible=TRUE)

# Second call reuses the same grid_id, so the two-layer models join that grid
grid = h2o.grid("deeplearning", x=colnames(iris)[1:4], y=colnames(iris)[5],
                grid_id = "stackoverflow",
                training_frame = irisTrain, validation_frame = irisValid,
                hyper_params = hyper_params2, adaptive_rate = TRUE,
                variable_importances = TRUE, epochs = 50, stopping_rounds=5,
                stopping_tolerance=0.01, activation=c("RectifierWithDropout"),
                seed=1, reproducible=TRUE)
```
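The two hyper-parameter lists above were written by hand. As a rough sketch of how they could be generated for any set of architectures, the base-R helper below groups layouts by depth and repeats each dropout level once per hidden layer. The name `make_hyper_params` and its arguments are hypothetical and not part of the h2o package; it also assumes the same ratio is wanted on every layer.

```r
# Hypothetical helper (not part of h2o): build one hyper-parameter list per
# network depth, so every hidden_dropout_ratios entry matches its hidden layout.
make_hyper_params <- function(hidden_layouts, dropout_levels) {
  depths <- lengths(hidden_layouts)            # number of hidden layers per layout
  lapply(sort(unique(depths)), function(d) {
    list(
      hidden = hidden_layouts[depths == d],    # only layouts with d layers
      # assumption: the same dropout ratio is applied to each of the d layers
      hidden_dropout_ratios = lapply(dropout_levels, function(p) rep(p, d))
    )
  })
}

per_depth <- make_hyper_params(
  hidden_layouts = list(64, c(32, 32)),
  dropout_levels = c(0, 0.15, 0.3)
)
str(per_depth)
```

Each element of `per_depth` could then be passed as `hyper_params` to its own `h2o.grid()` call, keeping `grid_id` the same across calls so that the results end up in one grid.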

When I went to print the grid, I was reminded there is a bug with grid output when using list hyper-parameters, such as `hidden` or `hidden_dropout_ratios`. Your code is a nice self-contained example, so I'll report that now. In the meantime, here is a one-liner to show the values of those hyper-parameters for each model:

```
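# Fetch the model objects in the order the grid lists them
models <- lapply(grid@model_ids, h2o.getModel)
sapply(models, function(m) c(
  paste(m@parameters$hidden, collapse = ","),
  paste(m@parameters$hidden_dropout_ratios, collapse=",")
))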
```

Which gives:

```
     [,1]    [,2] [,3]        [,4]   [,5]      [,6]
[1,] "32,32" "64" "32,32"     "64"   "32,32"   "64"
[2,] "0,0"   "0"  "0.15,0.15" "0.15" "0.3,0.3" "0.3"
```

I.e., reading the columns from left to right as best to worst, no hidden dropout is better than a little, which is better than a lot. And at each dropout level, two hidden layers are better than one.

P.S. The only other change I made to your code is to remove `input_dropout_ratio`, as I guessed your intention was for it to be zero.

`input_dropout_ratio`: controls dropout between the input layer and the first hidden layer. Can be used independently of the activation function.

`hidden_dropout_ratios`: controls dropout between each hidden layer and the next layer (which is either the next hidden layer, or the output layer). If specified, you must specify one of the "WithDropout" activation functions.
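To make the constraints on `hidden_dropout_ratios` concrete, here is a small base-R sketch that checks a single parameter combination against the two rules discussed in this answer: one ratio per hidden layer, and a "WithDropout" activation. The function `check_dropout` is hypothetical and only mimics the idea; it is not how h2o itself validates parameters, and its messages are not h2o's.

```r
# Hypothetical checker (not part of the h2o package): returns a character
# vector of problems, which is empty when the combination looks consistent.
check_dropout <- function(hidden, hidden_dropout_ratios, activation) {
  problems <- character(0)
  # Rule 1: one dropout ratio per hidden layer
  if (length(hidden_dropout_ratios) != length(hidden)) {
    problems <- c(problems, sprintf(
      "need %d hidden dropout ratio(s), got %d",
      length(hidden), length(hidden_dropout_ratios)))
  }
  # Rule 2: hidden dropout requires a "WithDropout" activation
  if (!grepl("WithDropout$", activation)) {
    problems <- c(problems, "activation must be a 'WithDropout' variant")
  }
  problems
}

# One hidden layer paired with two ratios: the kind of mix that breaks a
# grid combining list(64, c(32,32)) with ratios of both lengths.
check_dropout(64, c(0.15, 0.15), "RectifierWithDropout")

# Two hidden layers with two ratios: consistent, so no problems are reported.
check_dropout(c(32, 32), c(0.15, 0.15), "RectifierWithDropout")
```

Running such a check over every combination before calling `h2o.grid()` is one way to spot mismatched depths early, which is the situation the separate per-depth grids avoid.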