I'm implementing a regression network that maps images to poses, using TensorFlow's Python API, and I'm trying to process the output of a FixedLengthRecordReader.
I'm trying to adapt the cifar10 example minimally for my purposes.
The cifar10 example reads the raw bytes, decodes them, then splits the result:
result.key, value = reader.read(filename_queue)

# Convert from a string to a vector of uint8 that is record_bytes long.
record_bytes = tf.decode_raw(value, tf.uint8)

# The first bytes represent the label, which we convert from uint8->int32.
result.label = tf.cast(
    tf.slice(record_bytes, [0], [label_bytes]), tf.int32)

# The remaining bytes after the label represent the image, which we reshape
# from [depth * height * width] to [depth, height, width].
depth_major = tf.reshape(tf.slice(record_bytes, [label_bytes], [image_bytes]),
                         [result.depth, result.height, result.width])

# Convert from [depth, height, width] to [height, width, depth].
result.uint8image = tf.transpose(depth_major, [1, 2, 0])
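For intuition, the same byte-layout split can be sketched in plain NumPy (CIFAR-10 sizes assumed: 1 label byte followed by a depth-major 3x32x32 image; the NumPy calls mirror tf.decode_raw, tf.slice, tf.reshape and tf.transpose):

```python
import numpy as np

label_bytes, depth, height, width = 1, 3, 32, 32
image_bytes = depth * height * width

# Build one fake fixed-length record: label byte, then depth-major image bytes.
raw = (np.arange(label_bytes + image_bytes) % 256).astype(np.uint8)
record = raw.tobytes()

record_bytes = np.frombuffer(record, dtype=np.uint8)          # ~ tf.decode_raw
label = record_bytes[:label_bytes].astype(np.int32)           # ~ tf.slice + tf.cast
depth_major = record_bytes[label_bytes:label_bytes + image_bytes].reshape(
    depth, height, width)                                     # ~ tf.slice + tf.reshape
image = depth_major.transpose(1, 2, 0)                        # [height, width, depth]
```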
key, value = reader.read(filename_queue)

In my case the pose labels are floats rather than a single byte, so one tf.decode_raw into uint8 followed by a cast won't recover them. I found a solution here: decode the record twice, once per dtype, and throw out the half you don't need each time. It's not very efficient (and if anyone has a better solution, I'd be glad to hear it), but it seems to work.
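For reference, here is the decode-twice idea sketched in plain NumPy, under a hypothetical record layout of a 7-float pose followed by a 3x32x32 uint8 image (in the graph you would do the analogous thing with two tf.decode_raw calls on the same value string, one with tf.float32 and one with tf.uint8, slicing away the unwanted half of each):

```python
import numpy as np

pose_floats, depth, height, width = 7, 3, 32, 32
pose_bytes = pose_floats * 4            # pose stored as float32
image_bytes = depth * height * width    # image stored as uint8

# Build one fake record: pose floats followed by depth-major image bytes.
pose = np.linspace(0.0, 1.0, pose_floats, dtype=np.float32)
image = (np.arange(image_bytes) % 256).astype(np.uint8)
record = pose.tobytes() + image.tobytes()

# Decode 1: read the whole record as float32, keep only the pose part.
as_floats = np.frombuffer(record, dtype=np.float32)
decoded_pose = as_floats[:pose_floats]

# Decode 2: read the whole record as uint8, keep only the image part.
as_bytes = np.frombuffer(record, dtype=np.uint8)
decoded_image = as_bytes[pose_bytes:].reshape(
    depth, height, width).transpose(1, 2, 0)
```

Note the float32 decode only works cleanly if the total record length is a multiple of 4, which it is here (28 + 3072 bytes).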