Pragyaditya Das - 11 months ago

Python Question

I am trying to implement linear regression in Python.

I took the following steps:

    import pandas as p
    import numpy as n

    data = p.read_csv("...path\Housing.csv", usecols=[1])   # I want the first col
    data1 = p.read_csv("...path\Housing.csv", usecols=[3])  # I want the 3rd col

    x = data
    y = data1

Then I try to obtain the coefficients with the following:

`regression_coeff = n.polyfit(x,y,1)`

And then I get the following error:

    raise TypeError("expected 1D vector for x")
    TypeError: expected 1D vector for x

I am unable to get my head around this, because when I print `x` and `y`, both look like one-dimensional columns to me.

Can someone please help?

Dataset can be found here: DataSets

The original code is:

    import pandas as pd
    import numpy as np

    data = pd.read_csv('...\housing.csv', usecols=[1])
    data1 = pd.read_csv('...\housing.csv', usecols=[3])

    x = data
    y = data1

    regression = np.polyfit(x, y, 1)

Answer

This should work:

```
np.polyfit(data.values.flatten(), data1.values.flatten(), 1)
```

`data` is a DataFrame, so its values are 2D:

```
>>> data.values.shape
(546, 1)
```
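To see where the 2D shape comes from, here is a minimal sketch. It assumes a tiny made-up CSV (the `csv_text` string and its values are hypothetical stand-ins for Housing.csv) and shows that `usecols` with a one-element list still returns a DataFrame, which is what triggers the error:

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for Housing.csv; the values are made up for illustration.
csv_text = """id,price,lotsize,bedrooms
1,42000,5850,3
2,38500,4000,2
3,49500,3060,3
4,60500,6650,3
"""

# usecols with a one-element list still yields a DataFrame (2D).
data = pd.read_csv(io.StringIO(csv_text), usecols=[1])
print(type(data).__name__, data.shape)

# Selecting the single column by name yields a Series (1D).
price = data["price"]
print(type(price).__name__, price.shape)

# Passing the 2D DataFrame to polyfit reproduces the error from the question.
error_message = None
try:
    np.polyfit(data, data, 1)
except TypeError as exc:
    error_message = str(exc)
print(error_message)
```

Printing a one-column DataFrame looks much like printing a Series, which is probably why `x` and `y` seemed 1D.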

`flatten()` turns it into a 1D array:

```
>>> data.values.flatten().shape
(546,)
```

which is what `polyfit()` requires.
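`flatten()` is not the only way to get a 1D input. As a rough sketch (the two small frames below are hypothetical stand-ins for `data` and `data1`), `to_numpy().ravel()` or `iloc[:, 0]` should work as well:

```python
import numpy as np
import pandas as pd

# Hypothetical one-column frames standing in for `data` and `data1`.
data = pd.DataFrame({"price": [42000.0, 38500.0, 49500.0, 60500.0]})
data1 = pd.DataFrame({"bedrooms": [3, 2, 3, 3]})

x = data.to_numpy().ravel()  # 2D array of shape (4, 1) -> 1D array of shape (4,)
y = data1.iloc[:, 0]         # first (and only) column as a 1D Series

# Both inputs are now 1D, so polyfit accepts them.
slope, intercept = np.polyfit(x, y, 1)
print(x.shape, y.shape)
```

`ravel()` avoids a copy when it can, while `flatten()` always copies; for a dataset of this size the difference should not matter.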

Simpler alternative: read the whole file and select the columns by name, since each selected column is a 1D Series:

```
import numpy as np
import pandas as pd
df = pd.read_csv("Housing.csv")
np.polyfit(df['price'], df['bedrooms'], 1)
```
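Once the call succeeds, the result for degree 1 is an array holding the slope followed by the intercept. The sketch below uses made-up numbers that lie exactly on a line (not the real Housing.csv values) and evaluates the fitted line with `np.polyval`:

```python
import numpy as np
import pandas as pd

# Made-up numbers chosen to lie exactly on a line; not the real Housing.csv values.
df = pd.DataFrame({
    "price": [10.0, 20.0, 30.0, 40.0],
    "bedrooms": [1.0, 2.0, 3.0, 4.0],
})

# For degree 1, polyfit returns the coefficients highest power first:
# [slope, intercept].
slope, intercept = np.polyfit(df["price"], df["bedrooms"], 1)

# Evaluate the fitted line at a new x value.
predicted = np.polyval([slope, intercept], 25.0)
print(slope, intercept, predicted)
```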

Source (Stackoverflow)