MD. Nazmul Kibria - 24 days ago
C++ Question

Drop layers and update a Caffe model

I need to create a new caffe model from an existing one by dropping its last two layers. This is needed to reduce the model's size so that it is easier and cheaper to deploy. Say my existing model is A1.caffemodel, which has 5 convolution layers and 3 fully connected layers. I want to generate from it a new model named B1.caffemodel, which will have 5 convolution layers and 1 fully connected layer (the last 2 fc layers discarded).

I appreciate all your valuable suggestions and helpful code snippets.

Update:

I have implemented the accepted answer below in C++; I felt it needs to be shared:

#include "caffe/caffe.hpp"

// Build net B from its reduced prototxt; layers whose names match A
// receive A's trained weights, all others are freshly initialized
caffe::Net<float> caffe_net("B.prototxt", caffe::TEST);
caffe_net.CopyTrainedLayersFrom("A.caffemodel");

// Serialize the smaller net back out as B.caffemodel
caffe::NetParameter net_param;
caffe_net.ToProto(&net_param);
caffe::WriteProtoToBinaryFile(net_param, "B.caffemodel");

Answer

Fully connected layers can indeed be very heavy. Please look at section "3.1 Truncated SVD for faster detection" in Girshick, R., "Fast R-CNN", ICCV 2015, which describes how to use the SVD trick to significantly reduce the burden of fully connected layers. Each fully connected layer is replaced by two much thinner layers, so your three fully connected layers become six very thin ones.
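The truncated-SVD trick can be sketched with plain NumPy (a toy illustration with made-up layer sizes, not Caffe code): a fully connected layer's weight matrix W of shape m×n is replaced by two thin layers of shapes t×n and m×t, cutting the parameter count from m·n to t·(m+n).

```python
import numpy as np

# Toy sizes (assumptions): an fc layer with 512 outputs and 1024 inputs,
# truncated to its top t = 64 singular values
m, n, t = 512, 1024, 64
rng = np.random.default_rng(0)
W = rng.standard_normal((m, n)).astype(np.float32)

# Truncated SVD: W ~= (U_t * S_t) @ Vt_t
U, S, Vt = np.linalg.svd(W, full_matrices=False)
first_thin = Vt[:t, :]           # weights of the first thin fc layer  (t x n)
second_thin = U[:, :t] * S[:t]   # weights of the second thin fc layer (m x t)

# Parameter count drops from m*n to t*(m+n): 524288 -> 98304, ~5.3x smaller
print(m * n, t * (m + n))
```

Applying the two thin layers in sequence, `second_thin @ (first_thin @ x)`, computes exactly the rank-t approximation of `W @ x`.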

Steps to go from model A to B:

  1. Create B.prototxt that has the 5 convolution layers with the same "name"s as A.

  2. Give the single fully connected layer in B a new "name" that does not exist in A.

  3. In Python:

    import caffe
    # Layers whose names match A get A's trained weights;
    # the renamed fc layer is left randomly initialized
    B = caffe.Net('/path/to/B.prototxt', '/path/to/weights_A.caffemodel', caffe.TEST)
    B.save('/path/to/weights_B.caffemodel')
    
  4. Now you have weights for B that are the same as the weights of A for all convolutional layers and random for the new single fully connected layer.

  5. Fine-tune model B starting from '/path/to/weights_B.caffemodel' to learn the weights of the new single fully connected layer.
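For steps 1–2, the tail of B.prototxt could look like the following sketch (layer names, blob names, and num_output here are assumptions for illustration): the renamed fc layer's "name" does not exist in A, so Caffe leaves it randomly initialized instead of copying weights into it.

    # ... the 5 convolution layers, with the same "name"s as in A ...
    layer {
      name: "fc6_b"              # new name, not present in A
      type: "InnerProduct"
      bottom: "conv5"            # assumed top blob of the last conv layer
      top: "fc6_b"
      inner_product_param {
        num_output: 1024         # assumed output size of the remaining fc layer
      }
    }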
