So I built a simple linear regression model with a handful of features. When I try to predict for new input, the output is inconsistent. For example:
In : model.predict(X_new)
Out: array([ 7.15993216e+08, 1.13548305e+09])
In : model.predict(X_training[:1].append(X_new))[1:]
Out: array([ 272682.59925699, 1179906.89475647])
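This seems to be an issue with the column order of the pandas data frames rather than with the model itself. The model matches features by position, not by column name, so if the columns of X_new are in a different order from the training data, each coefficient gets multiplied by the wrong feature. The second call gives sensible numbers because append aligns X_new to the columns of X_training by name before predicting. A solution is to put both the training and testing data sets in the same column order before fitting and predicting. Something along the lines of: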
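A minimal sketch of that idea, assuming both frames contain the same feature columns. The column names, the data, and the feature_order variable are made up for illustration, and a plain NumPy least-squares fit stands in for whatever model is actually used:

```python
import numpy as np
import pandas as pd

# Hypothetical training data; the column names are invented for this example.
X_training = pd.DataFrame({"sqft": [1000.0, 1500.0, 2000.0, 2500.0],
                           "rooms": [2.0, 3.0, 3.0, 4.0]})
y_training = np.array([200000.0, 295000.0, 370000.0, 465000.0])

# Same features, but the columns arrive in a different order.
X_new = pd.DataFrame({"rooms": [3.0, 4.0], "sqft": [1800.0, 2200.0]})

# Fix one column order and apply it to every frame before fit/predict.
feature_order = sorted(X_training.columns)
X_training = X_training[feature_order]
X_new = X_new[feature_order]

# Stand-in for model.fit: ordinary least squares on positional arrays,
# with a column of ones appended for the intercept.
A = np.column_stack([X_training.to_numpy(), np.ones(len(X_training))])
coef, *_ = np.linalg.lstsq(A, y_training, rcond=None)

def predict(frame):
    # Re-select the columns so a frame in any order is handled safely.
    values = frame[feature_order].to_numpy()
    return values @ coef[:-1] + coef[-1]

predictions = predict(X_new)
```

With a real estimator the same two reindexing lines would go right before model.fit(X_training, y_training) and model.predict(X_new).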
This cements the column order in the training and testing arrays.