user12202013 - 10 months ago
Python Question

Easy way to apply transformation from `pandas.get_dummies` to new data?

Suppose I have a data frame with strings that I want converted to indicators. I use `pandas.get_dummies` to convert this to a dataset that I can now use for building a model.

Now I have a single new observation that I want to run through my model. Obviously I can't use `pandas.get_dummies` on it alone, because it doesn't contain all of the classes and won't produce the same indicator columns. Is there a good way to do this?

Answer

You can create the dummies from the single new observation and then reindex that frame's columns using the columns from the original indicator matrix, filling any missing indicator columns with 0:

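import pandas as pd

# Training data and its indicator matrix (dtype=int keeps 0/1 columns on newer pandas, which defaults to bool)
df = pd.DataFrame({'cat': ['a', 'b', 'c', 'd'], 'val': [1, 2, 5, 10]})
dummies_frame = pd.get_dummies(df, dtype=int)

# Dummies for the single new observation, aligned to the training columns
df1 = pd.get_dummies(pd.DataFrame({'cat': ['a'], 'val': [1]}), dtype=int)
df1.reindex(columns=dummies_frame.columns, fill_value=0)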


        val     cat_a   cat_b   cat_c   cat_d
  0     1       1       0       0       0
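
If you need this more than once, the same reindex idea can be wrapped in a small helper. The following is only a sketch: the names `encode_like_training`, `train`, and `new_obs` are made up for illustration, and `dtype=int` is an assumption added so the indicators stay 0/1 on recent pandas versions. It also shows what happens to a category that never appeared in the training data: its indicator column is dropped by `reindex`, so the row ends up with all-zero indicators.

```python
import pandas as pd


def encode_like_training(new_data, training_columns):
    """One-hot encode new_data and align it to the training indicator columns.

    Hypothetical helper for illustration, not part of pandas.
    """
    encoded = pd.get_dummies(new_data, dtype=int)
    # reindex adds training columns missing from the new data (filled with 0)
    # and drops indicator columns that were not present during training.
    return encoded.reindex(columns=training_columns, fill_value=0)


train = pd.DataFrame({'cat': ['a', 'b', 'c', 'd'], 'val': [1, 2, 5, 10]})
train_dummies = pd.get_dummies(train, dtype=int)

# 'e' was never seen in training, so its row gets all-zero indicators.
new_obs = pd.DataFrame({'cat': ['b', 'e'], 'val': [3, 7]})
aligned = encode_like_training(new_obs, train_dummies.columns)
print(aligned)
```

Silently zeroing out unseen categories may or may not be what you want for your model, so it can be worth checking for them before calling `reindex`.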