Demonedge - 1 year ago 125

Python Question

I want to transfer some weights trained by another network to TensorFlow. The weights are stored in a single vector like this:

`[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]`

By using numpy, I can reshape it to two 3 by 3 filters like this:

`1 2 3    10 11 12`

`4 5 6    13 14 15`

`7 8 9    16 17 18`

Thus, the shape of my filters is `(1,2,3,3)`. To use them in TensorFlow, the tensor needs the shape `(3,3,2,1)`:

`tf_weights = tf.Variable(tf.random_normal([3,3,2,1]))`

After reshaping the tf_weights to the expected shape, the weight becomes a mess and I can't get the expected convolution result.
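To make the symptom concrete, here is a minimal sketch (NumPy only, using the 18-element vector from above) of what a direct reshape to the TensorFlow layout does. The variable names are illustrative and not from the original code.

```python
import numpy as np

flat = np.arange(1, 19)            # the 18 stored weights
nchw = flat.reshape(1, 2, 3, 3)    # [number, channel, height, width]

# Reshaping straight to (3, 3, 2, 1) only regroups the flat sequence;
# it does not move any axis.
reshaped = nchw.reshape(3, 3, 2, 1)

# First input channel of the only filter, as conv2d would see it.
first_channel = reshaped[:, :, 0, 0]
print(first_channel)
# [[ 1  3  5]
#  [ 7  9 11]
#  [13 15 17]]
```

The slice interleaves values from both 3 by 3 filters, whereas `nchw[0, 0]` holds the first nine values in order.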

To be specific, I wrote a convolution function for images and filters shaped `[number, channel, size, size]`. It gives the correct answer, but it's too slow:

    import numpy as np

    def convol(images, weights, biases, stride):
        """
        Args:
            images: input images or features, 4-D array shaped [n, c, h, w]
            weights: weights, 4-D array shaped [n, c, size, size]
            biases: biases, 1-D array
            stride: stride, an integer
        Returns:
            conv_features: convolved feature map
        """
        image_num = images.shape[0]    # number of input images or feature maps
        channel = images.shape[1]      # channels per image
        weight_num = weights.shape[0]  # number of filters
        ksize = weights.shape[2]
        h = images.shape[2]
        w = images.shape[3]
        pad = ksize // 2
        # output size for a padding of ksize // 2 on every side
        out_h = (h + 2 * pad - ksize) // stride + 1
        out_w = out_h  # assumes square images
        conv_features = np.zeros([image_num, weight_num, out_h, out_w])
        for i in range(image_num):
            image = images[i]
            for j in range(weight_num):
                sum_convol_feature = np.zeros([out_h, out_w])
                for c in range(channel):
                    # extract a single-channel image
                    channel_image = image[c]
                    # pad the image (im_pad is defined elsewhere, not shown)
                    padded_image = im_pad(channel_image, pad)
                    # unroll the image patches (im2col is defined elsewhere, not shown)
                    im_col = im2col(padded_image, ksize, stride)
                    weight = weights[j, c]
                    weight_col = np.reshape(weight, [-1])
                    mul = np.dot(im_col, weight_col)
                    convol_feature = np.reshape(mul, [out_h, out_w])
                    sum_convol_feature = sum_convol_feature + convol_feature
                conv_features[i, j] = sum_convol_feature + biases[j]
        return conv_features
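The function above relies on `im_pad` and `im2col`, which the post does not show. The sketch below is one guess at what such helpers might look like, assuming zero padding and patches unrolled in row-major order. Both bodies are hypothetical, not the asker's actual code.

```python
import numpy as np

def im_pad(image, pad):
    """Hypothetical helper: zero-pad a 2-D image by `pad` pixels per side."""
    return np.pad(image, pad, mode="constant")

def im2col(image, ksize, stride):
    """Hypothetical helper: one row per ksize x ksize patch, row-major scan."""
    h, w = image.shape
    rows = []
    for top in range(0, h - ksize + 1, stride):
        for left in range(0, w - ksize + 1, stride):
            patch = image[top:top + ksize, left:left + ksize]
            rows.append(patch.reshape(-1))
    return np.array(rows)

# Sanity check: a kernel with a single 1 in the centre reproduces the image.
image = np.arange(16.0).reshape(4, 4)
kernel = np.zeros((3, 3))
kernel[1, 1] = 1.0
cols = im2col(im_pad(image, 1), 3, 1)
result = cols.dot(kernel.reshape(-1)).reshape(4, 4)
```

With these definitions the dot product does not flip the kernel, which is also how `tf.nn.conv2d` treats its filters.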

Instead, I tried TensorFlow's `conv2d` like this:

    import numpy as np
    import tensorflow as tf

    img = np.zeros([1, 3, 224, 224])
    img = img - 1
    img = np.rollaxis(img, 1, 4)  # [n, c, h, w] -> [n, h, w, c]
    weight_array = googleNet.layers[1].weights
    weight_array = np.reshape(weight_array, [64, 3, 7, 7])
    biases_array = googleNet.layers[1].biases
    tf_weight = tf.Variable(weight_array)
    tf_img = tf.Variable(img)
    tf_img = tf.cast(tf_img, tf.float32)
    tf_biases = tf.Variable(biases_array)
    conv_feature = tf.nn.bias_add(
        tf.nn.conv2d(tf_img, tf_weight, strides=[1, 2, 2, 1], padding='SAME'),
        tf_biases)
    sess = tf.Session()
    sess.run(tf.initialize_all_variables())
    feature = sess.run(conv_feature)

The feature map I got is wrong.

Answer Source

Don't use `np.reshape` to change the axis order. It keeps the values in the same flat order and only regroups them, so it can't move an axis to a new position. Use `np.rollaxis` instead:

```
>>> a = np.array([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18])
>>> a = a.reshape((1,2,3,3))
>>> a
array([[[[ 1,  2,  3],
         [ 4,  5,  6],
         [ 7,  8,  9]],

        [[10, 11, 12],
         [13, 14, 15],
         [16, 17, 18]]]])
>>> b = np.rollaxis(a, 1, 4)
>>> b.shape
(1, 3, 3, 2)
>>> b = np.rollaxis(b, 0, 4)
>>> b.shape
(3, 3, 2, 1)
```

Note that the order of the two axes with size 3 hasn't changed. If I label them `3a` and `3b`, the two `rollaxis` operations change the shape as `(1, 2, 3a, 3b)` -> `(1, 3a, 3b, 2)` -> `(3a, 3b, 2, 1)`. Your final array looks like:

```
>>> b
array([[[[ 1],
         [10]],

        [[ 2],
         [11]],

        [[ 3],
         [12]]],


       [[[ 4],
         [13]],

        [[ 5],
         [14]],

        [[ 6],
         [15]]],


       [[[ 7],
         [16]],

        [[ 8],
         [17]],

        [[ 9],
         [18]]]])
```
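As an alternative sketch, the two `rollaxis` calls can be written as a single `np.transpose` with an explicit axis order. The example also applies the same permutation to an array of the size used in the question, assuming those weights really are stored as `[number, channel, size, size]`. The zero-filled `w` is a stand-in, not the real GoogLeNet weights.

```python
import numpy as np

a = np.arange(1, 19).reshape(1, 2, 3, 3)   # [out, in, height, width]

# Two rollaxis calls, as in the session above.
b = np.rollaxis(np.rollaxis(a, 1, 4), 0, 4)

# One transpose: the new axis order is (height, width, in, out).
t = np.transpose(a, (2, 3, 1, 0))

assert b.shape == t.shape == (3, 3, 2, 1)
assert np.array_equal(b, t)

# Same permutation at the size from the question (stand-in values).
w = np.zeros((64, 3, 7, 7), dtype=np.float32)
w_hwio = np.transpose(w, (2, 3, 1, 0))
print(w_hwio.shape)  # (7, 7, 3, 64)
```

Every element keeps its meaning under this permutation: `a[n, c, i, j]` ends up at `t[i, j, c, n]`.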