user3079834 - 3 months ago 27

Python Question

I've been stuck on this issue for two days now. I have some data points that I put in a `scatter plot`, which is nice, but now I also want to add a regression line, so I had a look at this example from sklearn and changed the code to this:

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import PolynomialFeatures
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_score

    degrees = [3, 4, 5]
    X = combined[['WPI score']]
    y = combined[['CPI score']]

    plt.figure(figsize=(14, 5))
    for i in range(len(degrees)):
        ax = plt.subplot(1, len(degrees), i + 1)
        plt.setp(ax, xticks=(), yticks=())

        polynomial_features = PolynomialFeatures(degree=degrees[i], include_bias=False)
        linear_regression = LinearRegression()
        pipeline = Pipeline([("polynomial_features", polynomial_features), ("linear_regression", linear_regression)])
        pipeline.fit(X, y)

        # Evaluate the models using crossvalidation
        scores = cross_val_score(pipeline, X, y, scoring="neg_mean_squared_error", cv=10)

        X_test = X  # np.linspace(0, 1, len(combined))
        plt.plot(X, pipeline.predict(X_test), label="Model")
        plt.scatter(X, y, label="CPI-WPI")
        plt.xlabel("X")
        plt.ylabel("y")
        plt.legend(loc="best")
        plt.title("Degree {}\nMSE = {:.2e}(+/- {:.2e})".format(degrees[i], -scores.mean(), scores.std()))

    plt.savefig(pic_path + 'multi.png', bbox_inches='tight')
    plt.show()

which has the following output:

Note that `X` and `y` are both `DataFrames` of shape `(151, 1)`.

What I want is a nice smooth line, but I can't figure out how to do this.

The question here is: how do I get a single smooth, curved polynomial line instead of multiple lines with a seemingly random pattern?

The problem is, when I use `linspace`:

    X_test = np.linspace(1, 4, 151)
    X_test = X_test[:, np.newaxis]

I get an even more random pattern:

Answer
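A likely cause of the zigzag is that `plt.plot` connects points in the order they are given, and the rows of `X` are not sorted. The sketch below illustrates one possible fix under stated assumptions: it predicts on an evenly spaced, increasing grid and pairs the predictions with that same grid. The helper `prediction_grid` is a hypothetical name, the data is synthetic (the real `combined` DataFrame is not available here), and `np.polyfit` stands in for the fitted pipeline.

```python
import numpy as np


def prediction_grid(x_values, n_points=200):
    """Evenly spaced, increasing x values spanning the data, shaped (n_points, 1).

    Hypothetical helper for illustration; not part of sklearn or matplotlib.
    """
    x = np.asarray(x_values, dtype=float).ravel()
    return np.linspace(x.min(), x.max(), n_points)[:, np.newaxis]


# Synthetic, unsorted stand-in for the 'WPI score' and 'CPI score' columns (assumption).
rng = np.random.default_rng(0)
x = rng.uniform(1, 4, size=151)
y = 0.5 * x**3 - 2 * x + rng.normal(scale=0.5, size=151)

# Stand-in for the fitted pipeline: a degree-3 least-squares polynomial.
coeffs = np.polyfit(x, y, deg=3)

# Predict on the grid, then plot the grid (not the raw x) against the predictions.
x_grid = prediction_grid(x)
y_grid = np.polyval(coeffs, x_grid.ravel())

# The grid increases monotonically, so a line through (x_grid, y_grid) never
# doubles back; the raw x values are in arbitrary order and would.
grid_is_sorted = bool(np.all(np.diff(x_grid.ravel()) > 0))
data_is_sorted = bool(np.all(np.diff(x) > 0))
```

With the objects from the question, the corresponding calls would presumably be `X_grid = prediction_grid(X)` followed by `plt.plot(X_grid, pipeline.predict(X_grid), label="Model")`. Both arguments to `plt.plot` use the grid; if the first argument stays `X` while the predictions come from a `linspace` grid, the x and y values no longer correspond, which would explain the even more random pattern.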