BettaMG - 3 months ago 111

Python Question

I have a matrix A, defined as a tensor in TensorFlow, with n rows and p columns. I also have k matrices B1, ..., Bk, each with p rows and q columns. My goal is to obtain a matrix C with n rows and q columns, where each row of C is the product of the corresponding row of A with one of the B matrices. Which B to use is determined by a given index vector I of dimension n, whose entries range from 1 to k (0 to k-1 in the zero-based example below). In my case, the B matrices are weight variables, while I is another tensor given as input.

An example of code in numpy would look as follows:

```
import numpy as np

A = np.array([[1, 0, 1],
              [0, 0, 1],
              [1, 1, 0],
              [0, 1, 0]])

B1 = np.array([[1, 1],
               [2, 1],
               [3, 6]])

B2 = np.array([[1, 5],
               [3, 2],
               [0, 2]])

B = [B1, B2]
I = [1, 0, 0, 1]  # zero-based: 0 selects B1, 1 selects B2

n = A.shape[0]
p = A.shape[1]
q = B1.shape[1]

C = np.zeros(shape=(n, q))
for i in range(n):
    # Row i of A times the B matrix selected by I[i]
    C[i, :] = np.dot(A[i, :], B[I[i]])
```

How can this be translated to TensorFlow?

In my specific case the variables are defined as:

```
A = tf.placeholder("float", [None, p])
B1 = tf.Variable(tf.random_normal([p, q]))  # shape must be passed as a list
B2 = tf.Variable(tf.random_normal([p, q]))
I = tf.placeholder("int32", [None])  # indices need an integer dtype
```

Answer

This is a bit tricky and there are probably better solutions. Taking your first example, my approach computes C as follows:

```
C = diag([0,1,1,0]) * A * B1 + diag([1,0,0,1]) * A * B2
```

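where `*` denotes matrix multiplication and `diag([0,1,1,0])` is the diagonal matrix with the vector `[0,1,1,0]` on its diagonal. Left-multiplying by such a matrix keeps the rows whose index selects the corresponding B and zeroes out the others, so each row of C receives exactly one contribution. The diagonal matrix can be built with `tf.diag()` in TensorFlow.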
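As a quick sanity check of the formula above, here is a minimal NumPy sketch (not TensorFlow) using the matrices from the question. The names `C_loop`, `C_masked`, `mask_B1` and `mask_B2` are illustrative and do not appear in the original post.

```python
import numpy as np

A = np.array([[1, 0, 1],
              [0, 0, 1],
              [1, 1, 0],
              [0, 1, 0]])
B1 = np.array([[1, 1], [2, 1], [3, 6]])
B2 = np.array([[1, 5], [3, 2], [0, 2]])
I = np.array([1, 0, 0, 1])  # zero-based: 0 selects B1, 1 selects B2

# Row-by-row reference: row i of A times the matrix selected by I[i].
B = [B1, B2]
C_loop = np.stack([A[i] @ B[I[i]] for i in range(A.shape[0])])

# Masked form: each diagonal matrix keeps only the rows assigned to one B.
mask_B1 = np.diag((I == 0).astype(int))  # diag([0, 1, 1, 0])
mask_B2 = np.diag((I == 1).astype(int))  # diag([1, 0, 0, 1])
C_masked = mask_B1 @ A @ B1 + mask_B2 @ A @ B2

print(C_masked)  # rows: [1 7], [3 6], [3 2], [3 2]
print(np.array_equal(C_loop, C_masked))  # True
```

Both computations agree, which is why summing one masked product per B matrix reproduces the row-wise selection.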

For convenience, let me assume that k<=n (otherwise some B matrices would remain unused). The following script obtains those diagonal values from vector I and computes C as mentioned above:

```
import numpy as np
import tensorflow as tf  # TensorFlow 1.x graph-mode API (placeholders and sessions)

k = 2
n = 4
p = 3
q = 2

a = np.array([[1, 0, 1],
              [0, 0, 1],
              [1, 1, 0],
              [0, 1, 0]])
index_input = [1, 0, 0, 1]

# Creates a dim·dim tensor having the same vector 'vector' in every row
def square_matrix(vector, dim):
    return tf.reshape(tf.tile(vector, [dim]), [dim, dim])

A = tf.placeholder(tf.float32, [None, p])
B = tf.Variable(tf.random_normal(shape=[k, p, q]))
# For the first example (with k=2): B = tf.constant([[[1, 1],[2, 1],[3, 6]],[[1, 5],[3, 2],[0, 2]]], tf.float32)
C = tf.Variable(tf.zeros((n, q)))
I = tf.placeholder(tf.int32, [None])

# Create a n·n tensor 'indices_matrix' having indices_matrix[i]=I for 0<=i<n (each row vector is I)
indices_matrix = square_matrix(I, n)

# Create a n·n tensor 'row_matrix' having row_matrix[i]=[i,...,i] for 0<=i<n (each row vector is a vector of i's)
row_matrix = tf.transpose(square_matrix(tf.range(0, n, 1), n))

# Find diagonal values by comparing tensors indices_matrix and row_matrix:
# equal[i][j] is 1.0 when I[j] == i, so row i marks the rows of A that use B[i]
equal = tf.cast(tf.equal(indices_matrix, row_matrix), tf.float32)

# Compute C as the sum of one masked product per B matrix
for i in range(k):
    diag = tf.diag(tf.gather(equal, i))
    mul = tf.matmul(diag, tf.matmul(A, tf.gather(B, i)))
    C = C + mul

sess = tf.Session()
sess.run(tf.initialize_all_variables())
print(sess.run(C, feed_dict={A: a, I: index_input}))
```

As an improvement, C may be computed using a vectorized implementation instead of using a for loop.
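One possible vectorized form, sketched here in NumPy rather than TensorFlow, gathers the selected B matrix for every row and then contracts over the shared dimension p in a single call. The stacked array `B` of shape (k, p, q) matches the layout used in the script above; `B_per_row` is an illustrative name. In TensorFlow the analogous steps would presumably be `tf.gather(B, I)` followed by a batched product such as `tf.einsum`, but that translation is an untested assumption.

```python
import numpy as np

A = np.array([[1, 0, 1],
              [0, 0, 1],
              [1, 1, 0],
              [0, 1, 0]])
# Stack the k weight matrices into one array of shape (k, p, q).
B = np.array([[[1, 1], [2, 1], [3, 6]],
              [[1, 5], [3, 2], [0, 2]]])
I = np.array([1, 0, 0, 1])  # zero-based index of the B matrix for each row

# Select one (p, q) matrix per row of A; the result has shape (n, p, q).
B_per_row = B[I]

# For each row n, sum over p: C[n, q] = sum_p A[n, p] * B_per_row[n, p, q].
C = np.einsum('np,npq->nq', A, B_per_row)

print(C)  # rows: [1 7], [3 6], [3 2], [3 2]
```

The trade-off is memory: the gathered array holds n·p·q entries, whereas the loop over k only ever materializes n·q products at a time, so the loop may still be preferable when n is large and k is small.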