user3147590 - 1 year ago 1518

Python Question

I've recently reviewed an interesting implementation for convolutional text classification. However, all the TensorFlow code I've reviewed uses random (not pre-trained) embedding vectors, like the following:

    with tf.device('/cpu:0'), tf.name_scope("embedding"):
        W = tf.Variable(
            tf.random_uniform([vocab_size, embedding_size], -1.0, 1.0),
            name="W")
        self.embedded_chars = tf.nn.embedding_lookup(W, self.input_x)
        self.embedded_chars_expanded = tf.expand_dims(self.embedded_chars, -1)

Does anybody know how to use the results of Word2vec or a GloVe pre-trained word embedding instead of a random one?

Answer

There are a few ways that you can use a pre-trained embedding in TensorFlow. Let's say that you have the embedding in a NumPy array called `embedding`, with `vocab_size` rows and `embedding_dim` columns, and you want to create a tensor `W` that can be used in a call to `tf.nn.embedding_lookup()`.
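For readers starting from a downloaded GloVe file rather than a ready-made array, here is a minimal sketch of one way the `embedding` matrix and a matching word-to-row index might be built. It assumes the plain-text layout in which each line holds a token followed by its space-separated vector components; the function name `load_glove_text` and the two-word sample are hypothetical, made up for illustration.

```python
import io


def load_glove_text(lines):
    """Parse GloVe-style text lines into (word_to_row, rows).

    Hypothetical helper: assumes each line is "<token> <v1> <v2> ... <vN>".
    """
    word_to_row = {}
    rows = []
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word = parts[0]
        values = [float(v) for v in parts[1:]]
        if rows and len(values) != len(rows[0]):
            raise ValueError("inconsistent vector length for token %r" % word)
        word_to_row[word] = len(rows)  # row index used later for lookups
        rows.append(values)
    return word_to_row, rows


# Tiny in-memory sample standing in for a real file opened with open(path).
sample = io.StringIO("the 0.1 0.2 0.3\ncat 0.4 0.5 0.6\n")
word_to_row, rows = load_glove_text(sample)
vocab_size, embedding_dim = len(rows), len(rows[0])
```

The row index stored in `word_to_row` is what `self.input_x` would need to contain for `tf.nn.embedding_lookup()` to pick the right vectors. If NumPy is available, `np.asarray(rows, dtype=np.float32)` turns `rows` into the `embedding` array that the options below expect.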

1. Simply create `W` as a `tf.constant()` that takes `embedding` as its value:

       W = tf.constant(embedding, name="W")

   This is the easiest approach, but it is not memory efficient because the value of a `tf.constant()` is stored multiple times in memory. Since `embedding` can be very large, you should only use this approach for toy examples.

2. Create `W` as a `tf.Variable` and initialize it from the NumPy array via a `tf.placeholder()`:

       W = tf.Variable(tf.constant(0.0, shape=[vocab_size, embedding_dim]),
                       trainable=False, name="W")

       embedding_placeholder = tf.placeholder(tf.float32, [vocab_size, embedding_dim])
       embedding_init = W.assign(embedding_placeholder)

       # ...
       sess = tf.Session()

       sess.run(embedding_init, feed_dict={embedding_placeholder: embedding})

   This avoids storing a copy of `embedding` in the graph, but it does require enough memory to keep two copies of the matrix in memory at once (one for the NumPy array, and one for the `tf.Variable`). Note that I've assumed that you want to hold the embedding matrix constant during training, so `W` is created with `trainable=False`.

3. If the embedding was trained as part of another TensorFlow model, you can use a `tf.train.Saver` to load the value from the other model's checkpoint file. This means that the embedding matrix can bypass Python altogether. Create `W` as in option 2, then do the following:

       W = tf.Variable(...)

       embedding_saver = tf.train.Saver({"name_of_variable_in_other_model": W})

       # ...
       sess = tf.Session()
       embedding_saver.restore(sess, "checkpoint_filename.ckpt")
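To put the memory remarks in options 1 and 2 into numbers, here is a back-of-the-envelope sketch. The vocabulary size and dimensionality are illustrative assumptions, not figures from the question, and `embedding_bytes` is a hypothetical helper.

```python
def embedding_bytes(vocab_size, embedding_dim, bytes_per_value=4):
    """Size in bytes of one dense copy of the matrix (float32 by default)."""
    return vocab_size * embedding_dim * bytes_per_value


# Assumed example size: 3,000,000 words with 300 dimensions each.
one_copy = embedding_bytes(3_000_000, 300)
two_copies = 2 * one_copy  # option 2: NumPy array plus the tf.Variable

print(f"one copy:   {one_copy / 1e9:.1f} GB")    # one copy:   3.6 GB
print(f"two copies: {two_copies / 1e9:.1f} GB")  # two copies: 7.2 GB
```

At that scale even a single extra copy is expensive, which is why option 1 is best kept for small matrices.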