So I'm trying to implement linear regression using the gradient descent method from scratch for learning purposes. One part of my code is really bugging me. For some reason the variable x is being altered after I run a line of code and I'm not sure why.
The variables are as follows. x and y are numpy arrays, and I've given them random numbers for this example.
theta = [0,0]
alpha = .01
m = len(x)
theta = theta - alpha*1/m*sum([((theta+theta*x) - y)**2 for (x,y) in zip(x,y)])
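What is happening is that Python computes the list zip(x,y), and then each iteration of the loop inside your list comprehension overwrites (x,y) with the corresponding element of zip(x,y). In Python 2, the loop variables of a list comprehension leak into the enclosing scope, so when the loop terminates (x,y) contains zip(x,y)[-1] and x is no longer your array. Use different names for the loop variables: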
theta = theta - alpha*1/m*sum([((theta+theta*xi) - yi)**2 for (xi,yi) in zip(x,y)])
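Here is a minimal sketch of the rebinding, assuming small made-up lists in place of your numpy arrays. The names a, b, rebound_a, total and x_unchanged are invented for the illustration. It uses a plain for loop, which rebinds its loop variables in the enclosing scope on both Python 2 and Python 3, so you can reproduce the effect on whatever version you have:

```python
# Made-up data standing in for the numpy arrays in the question.
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]

# Reusing the container names as loop targets: zip(a, b) is evaluated once,
# then every iteration rebinds a and b to the next pair of elements.
a, b = x, y
for (a, b) in zip(a, b):
    pass
rebound_a = a  # the last element of x, no longer a list

# Distinct loop names (xi, yi) leave x and y untouched.
total = sum([(xi - yi) ** 2 for (xi, yi) in zip(x, y)])
x_unchanged = (x == [1.0, 2.0, 3.0])

print(rebound_a, total, x_unchanged)
```

After the first loop, a holds a single number instead of the original sequence, which is the same thing that happened to x in the question. With xi and yi as the loop names, x keeps all of its elements.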