Kevin - 3 months ago
Python Question

Can't read data in TensorFlow

Prior to this, I converted my input images to TFRecord files (a sketch of that conversion is included at the end of the question). Now I have the following methods, which I've mostly gathered from the tutorials and modified slightly:

def read_and_decode(filename_queue):
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)
    features = tf.parse_single_example(
        serialized_example,
        # Defaults are not specified since both keys are required.
        features={
            'image/encoded': tf.FixedLenFeature([], tf.string),
            'image/class/label': tf.FixedLenFeature([], tf.int64),
        })
    image = tf.decode_raw(features['image/encoded'], tf.uint8)
    label = tf.cast(features['image/class/label'], tf.int32)

    reshaped_image = tf.reshape(image, [size[0], size[1], 3])
    reshaped_image = tf.image.resize_images(reshaped_image, size[0], size[1], method=0)
    reshaped_image = tf.image.per_image_whitening(reshaped_image)
    return reshaped_image, label

def inputs(train, batch_size, num_epochs):
    filename = os.path.join(FLAGS.train_dir,
                            TRAIN_FILE if train else VALIDATION_FILE)

    filename_queue = tf.train.string_input_producer(
        [filename], num_epochs=num_epochs)

    # Even when reading in multiple threads, share the filename queue.
    image, label = read_and_decode(filename_queue)

    # Shuffle the examples and collect them into batch_size batches.
    # (Internally uses a RandomShuffleQueue.)
    # We run this in two threads to avoid being a bottleneck.
    images, sparse_labels = tf.train.shuffle_batch(
        [image, label], batch_size=batch_size, num_threads=2,
        capacity=1000 + 3 * batch_size,
        # Ensures a minimum amount of shuffling of examples.
        min_after_dequeue=1000)
    return images, sparse_labels


But when I try to evaluate a batch in IPython/Jupyter, the call never returns (it seems to hang in a loop). I call it this way:

batch_x, batch_y = inputs(True, 100, 1)
print batch_x.eval()
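
For context, here is a minimal sketch of the kind of conversion script that would produce records with the 'image/encoded' and 'image/class/label' keys read above. The helper name is made up for illustration, and it assumes the raw uint8 pixel bytes are stored directly, which is what tf.decode_raw expects later:

import tensorflow as tf

def convert_to_tfrecords(image_arrays, labels, output_path):
    # image_arrays: list of HxWx3 uint8 numpy arrays; labels: list of ints.
    writer = tf.python_io.TFRecordWriter(output_path)
    for image, label in zip(image_arrays, labels):
        example = tf.train.Example(features=tf.train.Features(feature={
            # Raw pixel bytes, so tf.decode_raw can recover them at read time.
            'image/encoded': tf.train.Feature(
                bytes_list=tf.train.BytesList(value=[image.tobytes()])),
            'image/class/label': tf.train.Feature(
                int64_list=tf.train.Int64List(value=[int(label)])),
        }))
        writer.write(example.SerializeToString())
    writer.close()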

Answer

It looks like you are missing a call to tf.train.start_queue_runners(), which starts the background threads that drive the input pipeline (for example, the threads implied by num_threads=2 in the call to tf.train.shuffle_batch(), and the background thread that tf.train.string_input_producer() requires). The following small change should unblock things:

batch_x, batch_y = inputs(True, 100, 1)
tf.initialize_all_variables().run()    # Initializes variables.
tf.initialize_local_variables().run()  # Needed in TF 0.10 and later (num_epochs creates a local variable).
tf.train.start_queue_runners()         # Starts the necessary background threads.
print batch_x.eval()
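
If you are not running inside a tf.InteractiveSession(), the .run() and .eval() calls above need an explicit default session. Here is a slightly fuller sketch (the explicit session and the tf.train.Coordinator are illustrative additions, not something your pipeline strictly requires) that also stops the queue-runner threads cleanly when you are done:

batch_x, batch_y = inputs(True, 100, 1)
sess = tf.Session()
sess.run(tf.initialize_all_variables())
sess.run(tf.initialize_local_variables())   # For the epoch counter created by num_epochs.
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)
try:
    print sess.run(batch_x)
finally:
    coord.request_stop()   # Ask the queue-runner threads to stop...
    coord.join(threads)    # ...and wait for them to finish.
    sess.close()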