marcman - 11 months ago 263

Python Question

I am using PyCaffe to implement a neural network inspired by the VGG 16 layer network. I want to use the pre-trained model available from their GitHub page. Generally this works by matching layer names.

For my `"fc6"` layer:

```
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  inner_product_param {
    num_output: 4096
  }
}
```

Here is the prototxt file for the VGG-16 deploy architecture. Note that the

`"fc6"`

I have been following this tutorial pretty closely, and the block of code that's giving me an issue is the following:

```
solver = caffe.SGDSolver(osp.join(model_root, 'solver.prototxt'))
solver.net.copy_from(model_root + 'VGG_ILSVRC_16_layers.caffemodel')
solver.test_nets[0].share_with(solver.net)
solver.step(1)
```

The first line loads my solver prototxt and then the second line copies the weights from the pre-trained model (`VGG_ILSVRC_16_layers.caffemodel`). The copy fails with this error:

```
Cannot copy param 0 weights from layer 'fc6'; shape mismatch. Source param
shape is 1 1 4096 25088 (102760448); target param shape is 4096 32768 (134217728).
To learn this layer's parameters from scratch rather than copying from a saved
net, rename the layer.
```

The gist of it is that their model expects the layer to be of size 1x1x4096 while mine is just 4096. But I don't get how I can change this?

I found this answer in the Users Google group instructing me to do net surgery to reshape the pre-trained model before copying, but in order to do that I need the

`lmdb`

Answer Source

The problem is not with 4096 but rather with 25088. You need to calculate the output feature map size of each layer of your network from its input feature map size. Note that an `fc` layer takes an input of fixed size, so the output of the previous `conv` layer must match the input size required by the `fc` layer. Calculate your fc6 input feature map size (this is the output feature map of the previous `conv` layer) using the input feature map size of that previous `conv` layer. Here's the formula:

```
H_out = (H_in + 2 * Padding_Height - Kernel_Height) / Stride_Height + 1
W_out = (W_in + 2 * Padding_Width - Kernel_Width) / Stride_Width + 1
```
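To see where 25088 and 32768 come from, here is a minimal sketch that applies the formula layer by layer. It assumes the standard VGG-16 layout (five blocks of 3x3 convolutions with padding 1 and stride 1, each followed by a 2x2 max pool with stride 2, and 512 channels at `pool5`) and a square input. The helper names (`conv_out`, `pool_out`, `fc6_input_size`) are made up for this illustration, and the 256x256 input is only a guess that happens to match the number in your error message.

```python
import math


def conv_out(size, kernel, stride=1, pad=0):
    """Output size of a convolution along one axis (Caffe rounds down)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride=1, pad=0):
    """Output size of a pooling layer along one axis (Caffe rounds up)."""
    return math.ceil((size + 2 * pad - kernel) / stride) + 1


# Assumed VGG-16 layout: number of 3x3 convs in each of the five blocks.
# Every block ends with a 2x2 max pool of stride 2.
VGG16_BLOCKS = [2, 2, 3, 3, 3]


def fc6_input_size(input_hw, channels=512):
    """Number of values fed into fc6 for a square input of side input_hw."""
    size = input_hw
    for n_convs in VGG16_BLOCKS:
        for _ in range(n_convs):
            size = conv_out(size, kernel=3, stride=1, pad=1)
        size = pool_out(size, kernel=2, stride=2)
    return channels * size * size


print(fc6_input_size(224))  # 25088 = 512 * 7 * 7, what the pre-trained fc6 expects
print(fc6_input_size(256))  # 32768 = 512 * 8 * 8, the target shape in the error
```

If your data layer really does produce 256x256 images, then feeding 224x224 inputs (for example by cropping) should make the `fc6` shapes agree. Otherwise, as the error message says, you can rename the layer and train it from scratch.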