
Deep Convolutional Neural Network in Keras

Hi, I'm trying to increase the depth of an existing convolutional network in Keras. Below is the existing network:

model = Sequential()

model.add(Convolution2D(32, nb_conv, nb_conv, border_mode='valid', input_shape=(1, img_rows, img_cols)))
model.add(Activation('relu'))
model.add(Convolution2D(32, nb_conv, nb_conv))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(nb_pool, nb_pool)))
model.add(Dropout(0.25))

model.add(Convolution2D(64, nb_conv, nb_conv, border_mode='valid'))
model.add(Activation('relu'))
model.add(Convolution2D(64, nb_conv, nb_conv))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(nb_pool, nb_pool)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))

model.add(Dense(256))
model.add(Activation('relu'))
model.add(Dropout(0.5))

model.add(Dense(nb_classes))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adadelta')


I am trying to increase the depth of the network by adding a few convolution layers, as below:

model = Sequential()

model.add(Convolution2D(32, nb_conv, nb_conv, border_mode='valid', input_shape=(1, img_rows, img_cols)))
model.add(Activation('relu'))
model.add(Convolution2D(32, nb_conv, nb_conv))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(nb_pool, nb_pool)))
model.add(Dropout(0.25))

model.add(Convolution2D(64, nb_conv, nb_conv, border_mode='valid'))
model.add(Activation('relu'))
model.add(Convolution2D(64, nb_conv, nb_conv))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(nb_pool, nb_pool)))
model.add(Dropout(0.25))

model.add(Convolution2D(128, nb_conv, nb_conv, border_mode='valid'))
model.add(Activation('relu'))
model.add(Convolution2D(128, nb_conv, nb_conv))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(nb_pool, nb_pool)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))

model.add(Dense(256))
model.add(Activation('relu'))
model.add(Dropout(0.5))

model.add(Dense(nb_classes))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adadelta')


Keras is giving me an error. I'm not sure what is wrong, but it seems like a tensor shape problem. Below is the error:

This could be a known bug in CUDA, please see the GpuCorrMM() documentation.

Apply node that caused the error: GpuCorrMM{valid, (1, 1)}(GpuContiguous.0, GpuContiguous.0)
Toposort index: 181
Inputs types: [CudaNdarrayType(float32, 4D), CudaNdarrayType(float32, 4D)]
Inputs shapes: [(128, 128, 2, 2), (128, 128, 3, 3)]
Inputs strides: [(512, 4, 2, 1), (1152, 9, 3, 1)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[GpuElemwise{Add}[(0, 0)](GpuCorrMM{valid, (1, 1)}.0, GpuReshape{4}.0)]]

HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.


My input is a 28 by 28 pixel image.
Can anyone point out what is wrong with my model?

Answer

The answer most probably has to do with the image size. My input images are only 28x28 pixels. Each convolution and pooling operation (without zero padding) shrinks the feature maps, so the number of convolution and pooling layers you can stack is limited by the dimensions of the input image.

Following the formula from http://cs231n.github.io/convolutional-networks/

spatial feature map dimension = (W−F+2P)/S + 1, where W is the input volume size, F the receptive field size of the conv layer neurons, S the stride with which they are applied, P the amount of zero padding used on the border, and K the depth of the conv layer (the number of filters), which only sets the depth of the output volume.
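As a quick sanity check, here is a minimal Python sketch of that formula (the helper names are just for illustration, and I assume the 3x3 kernels and 2x2 pooling used in the network above):

def conv_output_size(w, f, p=0, s=1):
    # spatial size after a convolution: (W - F + 2P)/S + 1
    return (w - f + 2 * p) // s + 1

def pool_output_size(w, pool=2):
    # spatial size after non-overlapping max pooling (floor division)
    return w // pool

# a 3x3 'valid' convolution (no padding, stride 1) on a 28-pixel-wide input
print(conv_output_size(28, 3))  # 26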

Let's not worry about the depth of the conv layer for now. We just want to calculate the height and width of the resulting feature maps after each [CONV -> CONV -> POOL] block.

For the first network at the top, I applied [CONV -> CONV -> POOL] twice. Let's calculate the resulting feature map sizes.

Given F=3, P=0, S=1 and W=28, the output of the first [CONV -> CONV -> POOL] block is:

[CONV]

feature map dimension = (W−F+2P)/S + 1 = (28 − 3 + 0)/1 + 1 = 26

[CONV]

feature map dimension = (W−F+2P)/S + 1 = (26 − 3 + 0)/1 + 1 = 24

[POOL]

applying the 2x2 pooling operation results in 24/2 = 12

This means that after the first [CONV -> CONV -> POOL] block, the feature maps are 12x12 pixels.

Applying the second [CONV -> CONV -> POOL] block to the 12x12 feature maps gives 12 → 10 → 8 after the two convolutions, and 4x4 after pooling.

Now, if we try to apply the third [CONV -> CONV -> POOL] block, as I wanted to do in the second network, the first CONV shrinks the 4x4 feature maps to 2x2, and the second CONV would need a 3x3 receptive field on a 2x2 input, which is impossible. This matches the error above, where a (2, 2) input meets a (3, 3) kernel.
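To make this concrete, here is a small standalone sketch (the function name is illustrative) that traces the spatial size through repeated [CONV -> CONV -> POOL] blocks, assuming 3x3 'valid' convolutions and 2x2 pooling as above:

def trace_blocks(size, n_blocks=3, f=3, pool=2):
    # trace the feature map size through repeated [CONV -> CONV -> POOL]
    # blocks built from unpadded ('valid') convolutions
    for block in range(1, n_blocks + 1):
        for conv in range(2):
            if size < f:
                print("block %d: %dx%d input too small for a %dx%d conv"
                      % (block, size, size, f, f))
                return
            size = size - f + 1   # (W - F)/1 + 1 with no padding
        size = size // pool       # non-overlapping max pooling
        print("block %d -> %dx%d" % (block, size, size))

trace_blocks(28)   # fails in the third block, just like the error above
trace_blocks(64)   # a larger input survives all three blocks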

I guess this is the reason for the error.

Based on the reasoning above, I tried to train the second network on bigger images and the error no longer appears.
