user12202013 - 3 months ago
Python Question

Easy way to apply transformation from `pandas.get_dummies` to new data?

Suppose I have a data frame with string columns that I want converted to indicators. I use `pandas.get_dummies` to convert this to a dataset that I can now use for building a model.

Now I have a single new observation that I want to run through my model. Obviously I can't use `pandas.get_dummies` on it directly, because it doesn't contain all of the classes and won't produce the same indicator columns. Is there a good way to do this?
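To make the mismatch concrete, here is a minimal sketch. The frame and its column names (`color`, `size`) are hypothetical and are not the data from the question:

```python
import pandas as pd

# Hypothetical training data; the column names are illustrative only
train = pd.DataFrame({'color': ['red', 'green', 'blue'], 'size': [1, 2, 3]})
train_dummies = pd.get_dummies(train)

# A single new observation contains only one of the classes
new_obs = pd.DataFrame({'color': ['red'], 'size': [4]})
new_dummies = pd.get_dummies(new_obs)

# The two encodings end up with different column sets
train_cols = list(train_dummies.columns)
new_cols = list(new_dummies.columns)
print(train_cols)
print(new_cols)
```

The new observation gets an indicator column only for the class it contains, so it cannot be passed to a model that was fit on the full set of training columns.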


You can create the dummies from the single new observation, and then reindex that frame's columns using the columns from the original indicator matrix. Any indicator column missing from the new frame is added and filled with 0:

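import pandas as pd

# Original data: 'cat' holds the string classes to encode
df = pd.DataFrame({'cat': ['a', 'b', 'c', 'd'], 'val': [1, 2, 5, 10]})
dummies_frame = pd.get_dummies(df)

# Encode the new observation, then align its columns with the original indicator matrix
df1 = pd.get_dummies(pd.DataFrame({'cat': ['a'], 'val': [1]}))
df1.reindex(columns=dummies_frame.columns, fill_value=0)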


        val     cat_a   cat_b   cat_c   cat_d
  0     1       1       0       0       0
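If you need to do this repeatedly, the same idea can be wrapped in a small helper. This is only a sketch: `encode_like` is a hypothetical name, not a pandas function. It also assumes you pass `dtype=int` to `get_dummies` so the indicators come out as 0/1 on pandas versions that default to booleans.

```python
import pandas as pd

def encode_like(new_data, train_columns):
    """Hypothetical helper: one-hot encode new_data and align it to train_columns."""
    dummies = pd.get_dummies(new_data, dtype=int)
    # Indicator columns missing from new_data are added and filled with 0;
    # columns for classes not seen in training are dropped by reindex.
    return dummies.reindex(columns=train_columns, fill_value=0)

train = pd.DataFrame({'cat': ['a', 'b', 'c', 'd'], 'val': [1, 2, 5, 10]})
train_columns = pd.get_dummies(train, dtype=int).columns

# A new observation with a known class
new_obs = pd.DataFrame({'cat': ['b'], 'val': [7]})
encoded = encode_like(new_obs, train_columns)

# A new observation with a class that was never seen in training
unseen = pd.DataFrame({'cat': ['z'], 'val': [3]})
encoded_unseen = encode_like(unseen, train_columns)
```

One thing to watch for: because `reindex` keeps only the training columns, an unseen class such as `'z'` ends up as a row with all indicators set to 0 rather than raising an error. You may want to check for that case explicitly.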