O.rka - 5 months ago 249x
Python Question

# Use attribute and target matrices for TensorFlow Linear Regression Python

I'm trying to follow this tutorial.

TensorFlow just came out and I'm really trying to understand it. I'm familiar with penalized linear regression like Lasso, Ridge, and ElasticNet and its usage in `scikit-learn`.

For `scikit-learn` Lasso regression, all I need to input into the regression algorithm is `DF_X` [an M x N dimensional attribute matrix (pd.DataFrame)] and `SR_y` [an M dimensional target vector (pd.Series)]. The `Variable` structure in TensorFlow is a bit new to me and I'm not sure how to structure my input data into what it wants.

It seems as if softmax regression is for classification. How can I restructure my `DF_X` (M x N attribute matrix) and `SR_y` (M dimensional target vector) to input into `tensorflow` for linear regression?

My current method for doing a Linear Regression uses pandas, numpy, and sklearn and it's shown below. I think this question will be really helpful for people getting familiar with TensorFlow:

``````#!/usr/bin/python
import pandas as pd
import numpy as np
import tensorflow as tf
from sklearn.linear_model import LassoCV

#Create DataFrames for attribute and target matrices
DF_X = pd.DataFrame(np.array([[0,0,1],[2,3,1],[4,5,1],[3,4,1]]),columns=["att1","att2","att3"],index=["s1","s2","s3","s4"])
SR_y = pd.Series(np.array([3,2,5,8]),index=["s1","s2","s3","s4"],name="target")

print(DF_X)
#    att1  att2  att3
#s1     0     0     1
#s2     2     3     1
#s3     4     5     1
#s4     3     4     1

print(SR_y)
#s1    3
#s2    2
#s3    5
#s4    8
#Name: target, dtype: int64

#Create Linear Model (Lasso Regression)
model = LassoCV()
model.fit(DF_X,SR_y)

print(model)
#LassoCV(alphas=None, copy_X=True, cv=None, eps=0.001, fit_intercept=True,
#max_iter=1000, n_alphas=100, n_jobs=1, normalize=False, positive=False,
#precompute='auto', random_state=None, selection='cyclic', tol=0.0001,
#verbose=False)

print(model.coef_)
#[ 0.         0.3833346  0.       ]
``````

Softmax is only an additional function (used in logistic regression, for example); it is not a model like

``````model = LassoCV()
model.fit(DF_X,SR_y)
``````

Therefore you can't simply hand it data through a `fit` method. However, you can build the model yourself from TensorFlow functions.

First of all, you have to create a computational graph. For linear regression, you create placeholder tensors with the same shapes as your data. They are only symbolic tensors at this point; you will supply the actual arrays in another part of the program.

``````import tensorflow as tf
x = tf.placeholder("float", [4, 3])
y_ = tf.placeholder("float",[4])
``````
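The placeholders above expect a 4 x 3 array and a length-4 vector. A minimal sketch of getting arrays of those shapes out of the question's `DF_X` and `SR_y`, assuming pandas and NumPy are installed; the names `X` and `y_true` are illustrative and not part of the original post:

```python
import numpy as np
import pandas as pd

# Same data as in the question
DF_X = pd.DataFrame(
    np.array([[0, 0, 1], [2, 3, 1], [4, 5, 1], [3, 4, 1]]),
    columns=["att1", "att2", "att3"],
    index=["s1", "s2", "s3", "s4"],
)
SR_y = pd.Series(np.array([3, 2, 5, 8]), index=["s1", "s2", "s3", "s4"], name="target")

# Plain float32 arrays; the row and column labels are dropped
X = np.asarray(DF_X, dtype=np.float32)       # shape (4, 3), matches x
y_true = np.asarray(SR_y, dtype=np.float32)  # shape (4,), matches y_

print("X shape: %s, y shape: %s" % (X.shape, y_true.shape))
```

Checking the shapes before feeding the arrays makes shape mismatches with the placeholders easier to spot.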

Then you create two variables that will hold the initial weights of the model:

``````W = tf.Variable(tf.zeros([3,1]))
b = tf.Variable(tf.zeros([1]))
``````

And now you can create the model (you want regression, not classification, so you don't need `tf.nn.softmax`):

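``````# flatten the [4, 1] result to shape [4] so it lines up with y_ (otherwise y_ - y broadcasts to [4, 4])
y = tf.reshape(tf.matmul(x, W) + b, [-1])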
``````

Because this is a linear regression model, you can use the sum of squared errors as the loss:

``````loss=tf.reduce_sum(tf.square(y_ - y))
``````
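The shapes matter here: `y_` was declared with shape `[4]`, while `tf.matmul(x, W)` produces shape `[4, 1]`. If the two are subtracted without flattening one of them, broadcasting gives a `[4, 4]` matrix and the loss sums 16 terms instead of 4. TensorFlow's broadcasting follows NumPy's rules, so the effect can be checked with a small NumPy sketch (the arrays below are stand-ins for the tensors, and the zero predictions correspond to zero initial weights):

```python
import numpy as np

y_true = np.array([3.0, 2.0, 5.0, 8.0])  # shape (4,), stands in for y_
y_pred = np.zeros((4, 1))                # shape (4, 1), stands in for matmul(x, W) + b

# Mismatched shapes: (4,) - (4, 1) broadcasts to (4, 4)
bad_diff = y_true - y_pred
bad_loss = np.sum(np.square(bad_diff))

# Flatten the prediction first so both sides have shape (4,)
good_diff = y_true - y_pred.reshape(-1)
good_loss = np.sum(np.square(good_diff))

print("mismatched: shape %s, loss %.1f" % (bad_diff.shape, bad_loss))   # shape (4, 4), loss 408.0
print("matched:    shape %s, loss %.1f" % (good_diff.shape, good_loss)) # shape (4,), loss 102.0
```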

Then we will train our model with the same step as in the tutorial

``````train_step = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
``````
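Each run of a gradient descent training step computes the gradient of the loss with respect to `W` and `b` and moves both a small distance against it. A hand-written NumPy sketch of one such update, assuming the sum-of-squares loss, zero initial weights, and the learning rate of 0.01 used above; the gradient formulas are derived by hand for this loss and are not taken from TensorFlow itself:

```python
import numpy as np

X = np.array([[0, 0, 1], [2, 3, 1], [4, 5, 1], [3, 4, 1]], dtype=float)
y_true = np.array([3, 2, 5, 8], dtype=float)

W = np.zeros(3)
b = 0.0
learning_rate = 0.01


def loss_fn(W, b):
    # sum of squared errors, as in the loss above
    return np.sum(np.square(y_true - (X.dot(W) + b)))


loss_before = loss_fn(W, b)

# Gradients of sum((y_true - (X W + b))^2)
residual = y_true - (X.dot(W) + b)
grad_W = -2.0 * X.T.dot(residual)
grad_b = -2.0 * np.sum(residual)

# One gradient descent update
W = W - learning_rate * grad_W
b = b - learning_rate * grad_b

loss_after = loss_fn(W, b)
print("loss before: %.3f, after one step: %.3f" % (loss_before, loss_after))
# loss before: 102.000, after one step: 59.484
```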

Now that you have created the computational graph, you have to write one more part of the program, where you use this graph to work with your data.

``````init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)
sess.run(train_step, feed_dict={x:np.asarray(DF_X),y_:np.asarray(SR_y)})
``````

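Here you pass your data to the computational graph through `feed_dict`. In TensorFlow you provide the data as NumPy arrays. Each call to `sess.run(train_step, ...)` performs a single gradient descent update, so you will usually call it repeatedly in a loop. To see the current error (the value of the loss), you can run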

``````sess.run(loss,feed_dict={x:np.asarray(DF_X),y_:np.asarray(SR_y)})
``````
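Since one training call is one update, the loss only gets small after many repetitions. The NumPy sketch below mirrors such a loop outside TensorFlow, under the same assumptions as the answer (sum-of-squares loss, zero initial weights, learning rate 0.01), and compares the result with a closed-form least-squares fit from `np.linalg.lstsq`. The step count of 5000 is an arbitrary choice for illustration:

```python
import numpy as np

X = np.array([[0, 0, 1], [2, 3, 1], [4, 5, 1], [3, 4, 1]], dtype=float)
y_true = np.array([3, 2, 5, 8], dtype=float)

W = np.zeros(3)
b = 0.0
learning_rate = 0.01  # same rate as GradientDescentOptimizer(0.01)
n_steps = 5000        # arbitrary choice for this sketch

losses = []
for step in range(n_steps):
    residual = y_true - (X.dot(W) + b)
    losses.append(np.sum(np.square(residual)))  # loss before this update
    # one gradient descent update on the sum-of-squares loss
    W = W + learning_rate * 2.0 * X.T.dot(residual)
    b = b + learning_rate * 2.0 * np.sum(residual)

final_loss = np.sum(np.square(y_true - (X.dot(W) + b)))

# Reference: closed-form least squares with an explicit intercept column
A = np.hstack([X, np.ones((4, 1))])
coef = np.linalg.lstsq(A, y_true, rcond=None)[0]
best_loss = np.sum(np.square(y_true - A.dot(coef)))

print("first loss: %.3f" % losses[0])
print("final loss: %.3f (least-squares optimum: %.3f)" % (final_loss, best_loss))
```

In the TensorFlow version, the body of the loop would be the `sess.run(train_step, feed_dict=...)` call from above, with the loss evaluated every so often to monitor progress. How close the final loss gets to the least-squares value depends on the number of steps and the learning rate.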