Nick - 4 months ago 40

Python Question

I'm really finding it hard to understand the input dimensions to the convolutional 1D layer in Keras:

Input shape

3D tensor with shape: (samples, steps, input_dim).

Output shape

3D tensor with shape: (samples, new_steps, nb_filter). steps value might have changed due to padding.

I want my network to take in a time series of prices (101, in order) and output 4 probabilities. My current non-convolutional network which does this fairly well (with a training set of 28000) looks like this:

`standardModel = Sequential()`

`standardModel.add(Dense(input_dim=101, output_dim=100, W_regularizer=l2(0.5), activation='sigmoid'))`

`standardModel.add(Dense(4, W_regularizer=l2(0.7), activation='softmax'))`

To improve this, I want to make a feature map from the input layer with a local receptive field of length 10 (and therefore 10 shared weights and 1 shared bias). I then want to use max pooling, feed the result into a hidden layer of 40 or so neurons, and output through 4 softmax neurons in the output layer.

picture (it's quite awful sorry!)

So ideally, the convolutional layer would take a 2d tensor of dimensions:

(minibatch_size, 101)

and output a 3d tensor of dimensions

(minibatch_size, 92, no_of_featuremaps)

However, the Keras layer seems to require a dimension in the input called steps. I've tried to understand this and still don't quite get it. In my case, should steps be 1, as each step in the vector is an increase in time by 1? Also, what is new_steps?

In addition, how do you turn the output of the pooling layers (a 3D tensor) into input suitable for a standard hidden layer (i.e. a Dense Keras layer) in the form of a 2D tensor?

Update: After the very helpful suggestions given, I tried making a convolutional network like so:

`conv = Sequential()`

`conv.add(Convolution1D(64, 10, input_shape=(1,101)))`

`conv.add(Activation('relu'))`

`conv.add(MaxPooling1D(2))`

`conv.add(Flatten())`

`conv.add(Dense(10))`

`conv.add(Activation('tanh'))`

`conv.add(Dense(4))`

`conv.add(Activation('softmax'))`

The line `conv.add(Flatten())` throws a "range exceeds valid bounds" error. Interestingly, this error is not thrown for just this code:

`conv = Sequential()`

`conv.add(Convolution1D(64, 10, input_shape=(1,101)))`

`conv.add(Activation('relu'))`

`conv.add(MaxPooling1D(2))`

`conv.add(Flatten())`

Doing

`print conv.input_shape`

`print conv.output_shape`

results in

`(None, 1, 101)`

`(None, -256)`

being returned.

Update 2:

Changed

`conv.add(Convolution1D(64, 10, input_shape=(1,101)))`

to

`conv.add(Convolution1D(10, 10, input_shape=(101,1)))`

and it started working. However, is there any important difference between inputting (None, 101, 1) and (None, 1, 101) to a 1D conv layer that I should be aware of? Why does (None, 1, 101) not work?

Answer

The layer looks this way because the Keras designers intended the 1-dimensional convolution to be interpreted as a tool for dealing with sequences. To fully understand the difference, imagine that you have a sequence of multiple feature vectors. Each sample is then at least two-dimensional: the first dimension is connected with time and the other dimensions are connected with features. The 1-dimensional convolution was designed to emphasize this time dimension and to find recurring patterns along it, rather than to perform a classical multidimensional convolutional transformation.
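To make the "sequence of feature vectors" picture concrete, here is a minimal pure-Python sketch of an unpadded 1D convolution over the time axis. It is an illustration of the idea only, not Keras's actual implementation; the function name `conv1d_valid` and the averaging filter are made up for the example.

```python
def conv1d_valid(sequence, filters, biases):
    """Slide each filter along the time axis without padding.

    sequence: list of `steps` feature vectors, each of length input_dim
    filters:  list of nb_filter filters; each filter is a list of
              filter_length weight vectors of length input_dim
    biases:   list of nb_filter floats (one shared bias per filter)
    Returns a list of new_steps rows, each holding nb_filter values.
    """
    filter_length = len(filters[0])
    new_steps = len(sequence) - filter_length + 1
    output = []
    for t in range(new_steps):
        window = sequence[t:t + filter_length]  # local receptive field
        row = []
        for weights, bias in zip(filters, biases):
            total = bias
            for w_vec, x_vec in zip(weights, window):
                total += sum(w * x for w, x in zip(w_vec, x_vec))
            row.append(total)
        output.append(row)
    return output


# Hypothetical data: 101 timesteps with 1 feature each, i.e. shape (101, 1).
prices = [[float(t)] for t in range(101)]

# One filter of length 10: 10 shared weights and 1 shared bias.
filters = [[[0.1] for _ in range(10)]]
biases = [0.0]

feature_map = conv1d_valid(prices, filters, biases)
print(len(feature_map), len(feature_map[0]))  # new_steps, nb_filter
```

The time axis shrinks from 101 to 92 steps, and the last axis now indexes filters instead of input features.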

In your case you simply need to reshape your data to have shape (dataset_size, 101, 1), because you have only one feature. This is easily done with the `numpy.reshape` function. To understand what new_steps means, keep in mind that you are doing the convolution over time, so you change the temporal structure of your data, which leads to a new time-connected structure (without padding, new_steps = steps - filter_length + 1). To get your data into a format suitable for dense / static layers, use a `keras.layers.Flatten` layer, the same as in the classic convolutional case.
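A minimal sketch of both reshapes, using NumPy only. The array names and `dataset_size = 5` are placeholders, and the second half merely imitates on a dummy array what a flatten layer does to the shape; it does not call Keras.

```python
import numpy as np

dataset_size = 5  # placeholder; the question mentions a training set of 28000

# Hypothetical price data: one row of 101 consecutive prices per sample.
prices = np.arange(dataset_size * 101, dtype=np.float32).reshape(dataset_size, 101)

# Add a trailing feature axis: (samples, steps, input_dim) = (5, 101, 1).
X = np.reshape(prices, (dataset_size, 101, 1))
print(X.shape)  # (5, 101, 1)

# Flattening merges the time and filter axes of a 3D pooling output,
# e.g. (samples, 46, 10) becomes (samples, 460), ready for a dense layer.
pooled = np.zeros((dataset_size, 46, 10), dtype=np.float32)
flat = pooled.reshape(dataset_size, -1)
print(flat.shape)  # (5, 460)
```

The values themselves are untouched by the reshape: `X[i, t, 0]` is the same number as `prices[i, t]`.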

**UPDATE:** As I mentioned before, the first dimension of each input sample is connected with time. So the difference between `(1, 101)` and `(101, 1)` is that in the first case you have one timestep with 101 features, and in the second you have 101 timesteps with 1 feature. The problem you mentioned after your first change comes from pooling with size 2 on such input. With only one timestep you cannot pool over a time window of size 2, simply because there are not enough timesteps to do that (a filter of length 10 does not fit into a single timestep either, which is why the reported output size is negative).
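The shape arithmetic can be checked by hand. The sketch below assumes an unpadded ("valid") convolution with stride 1 and non-overlapping pooling (stride equal to the pool length), which I believe are the defaults for these layers; the helper names are made up for illustration.

```python
def conv1d_valid_length(steps, filter_length):
    """Time-axis length after an unpadded 1D convolution with stride 1."""
    return steps - filter_length + 1


def pool1d_length(steps, pool_length):
    """Time-axis length after non-overlapping pooling (stride == pool_length)."""
    return (steps - pool_length) // pool_length + 1


def flattened_size(input_shape, nb_filter, filter_length, pool_length):
    """Return (steps after conv, steps after pooling, size after flatten)."""
    steps, _input_dim = input_shape
    after_conv = conv1d_valid_length(steps, filter_length)
    after_pool = pool1d_length(after_conv, pool_length)
    return after_conv, after_pool, after_pool * nb_filter


# One timestep with 101 features, 64 filters of length 10, pooling of 2:
print(flattened_size((1, 101), 64, 10, 2))   # (-8, -4, -256)

# 101 timesteps with 1 feature, 10 filters of length 10, pooling of 2:
print(flattened_size((101, 1), 10, 10, 2))   # (92, 46, 460)
```

The first result reproduces the `(None, -256)` output shape from the question: the time axis is already negative after the convolution, so nothing downstream can work. The second is the configuration that started working.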