Lukasz - 1 year ago 109

# Theano Using `scan` Instead Of `for` Loop In Linear Regression

I'm trying to get a better grasp of the `scan` functionality in `theano`. My understanding, based on this document, is that it behaves like a `for` loop. I've created a very simple working example that finds the weight and bias when performing linear regression.

``````
#### Libraries
# Third Party Libraries
import numpy as np
import theano
import theano.tensor as T

# not intended for mini-batch
def gen_data(num_points=50, slope=1, bias=10, x_max=50):
    f = lambda z: slope * z + bias
    x = np.zeros(shape=(num_points), dtype=theano.config.floatX)
    y = np.zeros(shape=(num_points), dtype=theano.config.floatX)

    for i in range(num_points):
        x_temp = np.random.uniform()*x_max
        x[i] = x_temp
        y[i] = f(x_temp) + np.random.normal(scale=3.0)

    return (x, y)

#############################################################
#############################################################
train_x, train_y = gen_data(num_points=50, slope=2, bias=5)
epochs = 50

# Declaring variable
learn_rate = T.scalar(name='learn_rate', dtype=theano.config.floatX)
x = T.vector(name='x', dtype=theano.config.floatX)
y = T.vector(name='y', dtype=theano.config.floatX)
# Variables that will be updated
theta = theano.shared(np.random.rand(), name='theta')
bias = theano.shared(np.random.rand(), name='bias')

hyp = T.dot(theta, x) + bias
cost = T.mean((hyp - y)**2)/2
f_cost = theano.function(inputs=[x, y], outputs=cost)

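# Assumed: the `updates` argument was cut off in the post; plain gradient
# descent on `theta` and `bias` is used here to complete the call.
train = theano.function(inputs=[x, y, learn_rate], outputs=cost,
                        updates=[(theta, theta - learn_rate * T.grad(cost, theta)),
                                 (bias, bias - learn_rate * T.grad(cost, bias))])
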
print('weight: {}, bias: {}'.format(theta.get_value(), bias.get_value()))

for i in range(epochs): # Try changing this to a `scan`
    train(train_x, train_y, 0.001)

print('------------------------------')
print('weight: {}, bias: {}'.format(theta.get_value(), bias.get_value()))
``````
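For reference, the update that each `train` call performs can be written out by hand. The NumPy sketch below is not part of the original post: it assumes plain batch gradient descent on the cost defined above, and the helper names `cost_and_grads` and `gradient_descent` are made up for illustration.

```python
import numpy as np

def cost_and_grads(theta, bias, x, y):
    """Half mean squared error and its analytic gradients."""
    residual = theta * x + bias - y
    cost = np.mean(residual ** 2) / 2
    grad_theta = np.mean(residual * x)  # d(cost)/d(theta)
    grad_bias = np.mean(residual)       # d(cost)/d(bias)
    return cost, grad_theta, grad_bias

def gradient_descent(x, y, theta, bias, learn_rate, epochs):
    """Run `epochs` full-batch updates and record the cost before each one."""
    costs = []
    for _ in range(epochs):
        cost, g_theta, g_bias = cost_and_grads(theta, bias, x, y)
        costs.append(cost)
        theta = theta - learn_rate * g_theta
        bias = bias - learn_rate * g_bias
    return theta, bias, costs

# Hypothetical demo data shaped like the output of gen_data(slope=2, bias=5).
rng = np.random.RandomState(0)
demo_x = rng.uniform(0, 50, size=50)
demo_y = 2 * demo_x + 5 + rng.normal(scale=3.0, size=50)

theta_fit, bias_fit, costs = gradient_descent(
    demo_x, demo_y, theta=0.5, bias=0.5, learn_rate=0.001, epochs=50)
```

The list `costs` plays the role of the per-step losses that a `scan`-based version would return in a single call.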

I would like to change that `for` loop to a `theano.scan` function, but every attempt I've made has yielded one error message after the next.

In order to use `theano.scan`, I imported `OrderedDict` from `collections` to hold the updates for the shared variables. Using a plain `dict` results in the following error message:

``````
Expected OrderedDict or OrderedUpdates, got <class 'dict'>. This can make your script non-deterministic.
``````
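As a small illustration of the container the message asks for, an `OrderedDict` can be built from a list of `(key, value)` pairs and keeps them in insertion order. This is only a sketch: the string keys and values are stand-ins, whereas in the Theano code the keys would be the shared variables and the values their update expressions.

```python
from collections import OrderedDict

# Stand-in entries: real code would map shared variables (theta, bias)
# to symbolic update expressions rather than strings.
updates = OrderedDict([
    ('theta', 'theta - learn_rate * grad_theta'),
    ('bias', 'bias - learn_rate * grad_bias'),
])

# The keys come back in exactly the order they were inserted.
keys_in_order = list(updates.keys())
```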

Secondly, I defined a function that computes the loss and the gradients. It returns the `loss` and an `OrderedDict()` of updates:

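``````
def cost(inputs, outputs, learn_rate, theta, bias):
    hyp = T.dot(theta, inputs) + bias
    loss = T.mean((hyp - outputs)**2)/2
    # Assumed: the rest was cut off in the post; rebuilt from the description.
    grad_t, grad_b = T.grad(loss, [theta, bias])
    return loss, OrderedDict([(theta, theta - learn_rate*grad_t),
                              (bias, bias - learn_rate*grad_b)])
``````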

This was followed by defining `theano.scan()` as such:

``````
results, updates = theano.scan(fn=cost,
                               non_sequences=[x, y, learn_rate, theta, bias],
                               n_steps=epochs)
``````
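To make the roles of `n_steps`, the per-step output and the per-step updates concrete, here is a pure-Python emulation of the looping pattern. It is a sketch of the behaviour only, not how Theano implements `scan`; `scan_like`, `step` and the `state` dictionary are hypothetical names standing in for `theano.scan`, the step function and the shared variables.

```python
from collections import OrderedDict

def scan_like(fn, state, n_steps):
    """Call fn n_steps times, collecting outputs and applying its updates."""
    outputs = []
    for _ in range(n_steps):
        out, updates = fn(state)
        outputs.append(out)
        state.update(updates)  # the next step sees the updated values
    return outputs

def step(state):
    # Output the current counter and request that it be incremented.
    return state['a'], OrderedDict([('a', state['a'] + 1)])

state = {'a': 1}
outputs = scan_like(step, state, n_steps=10)
```

After the call, `outputs` holds one entry per step and `state['a']` reflects all ten updates, which mirrors how `results` collects one loss per epoch while `theta` and `bias` are updated along the way.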

I chose to pass `x` and `y` as `non_sequences` because this toy example is relatively small and because doing so is about twice as fast as passing them as `sequences`.
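The difference between the two options can be sketched in plain Python. The helpers below are hypothetical emulations, not Theano calls: with `non_sequences` the step function receives the full vectors on every iteration, while with `sequences` it receives one element of each vector per iteration.

```python
def scan_non_sequences(fn, non_sequences, n_steps):
    """Call fn n_steps times, passing the same full arguments each time."""
    return [fn(*non_sequences) for _ in range(n_steps)]

def scan_sequences(fn, sequences, non_sequences):
    """Call fn once per position, passing one element of each sequence."""
    return [fn(*(items + tuple(non_sequences))) for items in zip(*sequences)]

def describe(*args):
    # Return the arguments unchanged so they can be inspected.
    return args

xs = [1.0, 2.0, 3.0]
ys = [7.0, 9.0, 11.0]

full = scan_non_sequences(describe, [xs, ys], n_steps=2)
per_item = scan_sequences(describe, [xs, ys], non_sequences=[])
```

Here `full` contains the whole `xs` and `ys` lists twice, whereas `per_item` contains one `(x_i, y_i)` pair per step, so the `sequences` variant computes the loss on a single point at a time rather than on the whole batch.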

Lastly, `theano.function()` was defined using the `results` and `updates` returned by `theano.scan()`:

``````
train = theano.function(inputs=[x, y, learn_rate, epochs], outputs=results,
                        updates=updates)
``````

Putting it all together, we have:

``````
#### Libraries
# Standard Libraries
from collections import OrderedDict

# Third Party Libraries
# import matplotlib.pyplot as plt
import numpy as np
# from sklearn import linear_model
import theano
import theano.tensor as T

# def gen_data(num_points=50, slope=1, bias=10, x_max=50):
#     pass # Use the code in the above post to generate sample points

########################################################################
# Generate Data
train_x, train_y = gen_data(num_points=50, slope=2)

# Declaring variable
x = T.vector(name='x', dtype=theano.config.floatX)
y = T.vector(name='y', dtype=theano.config.floatX)

learn_rate = T.scalar(name='learn_rate', dtype=theano.config.floatX)
epochs = T.iscalar(name='epochs')

# Variables that will be updated, hence are declared with `theano.shared`
theta = theano.shared(np.random.rand(), name='theta')
bias = theano.shared(np.random.rand(), name='bias')

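def cost(inputs, outputs, learn_rate, theta, bias):
    hyp = T.dot(theta, inputs) + bias
    loss = T.mean((hyp - outputs)**2)/2
    # Assumed: the rest was cut off in the post; rebuilt from the description.
    grad_t, grad_b = T.grad(loss, [theta, bias])
    return loss, OrderedDict([(theta, theta - learn_rate*grad_t),
                              (bias, bias - learn_rate*grad_b)])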

results, updates = theano.scan(fn=cost,
                               non_sequences=[x, y, learn_rate, theta, bias],
                               n_steps=epochs)

# results, updates = theano.scan(fn=cost,
#                                sequences=[x, y],
#                                non_sequences = [learn_rate, theta, bias],
#                                n_steps=epochs)

train = theano.function(inputs=[x, y, learn_rate, epochs], outputs=results,
                        updates=updates)
``````

I've included the code to pass `x` and `y` as `sequences` for completeness. Simply uncomment that part of the code and comment out the other instance of `theano.scan()`.