Alex Alex - 3 years ago 249
Python Question

Apply a custom function on columns in a pandas dataframe

I want to do something equivalent to

SELECT x, y, z FROM data WHERE f(x, y);

Here f is my custom function, which looks at the values of specific columns in a row and returns True or False. I tried the following:

df = df.ix[_is_detection_in_window(df['Product'], df['CreatedDate'])== True]

But I get:

TypeError: 'Series' objects are mutable, thus they cannot be hashed

I think it does not iterate over the rows.
I also tried:

i = 0
for index, row in df.iterrows():
    if _is_detection_in_window(row['Product'], row['CreatedDate']):
        print 'in range '
        new_df.iloc[i] = row
        i += 1
df = new_df

but I get:

IndexError: single positional indexer is out-of-bounds

Answer Source

It seems your function only accepts scalar values, not whole Series. You can wrap it with np.vectorize so that it is called once per element:

v = np.vectorize(_is_detection_in_window)
df = df.loc[v(df['Product'], df['CreatedDate'])]
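To make that concrete, here is a small self-contained sketch. The body of _is_detection_in_window, the WINDOWS dict and the sample data are all made up for illustration (the real function isn't shown in the question). I'm assuming it does something like a dict lookup on the product, which would explain the hashing error when a Series is passed in. Dates are kept as ISO-format strings just to keep the example simple.

```python
import numpy as np
import pandas as pd

# Hypothetical per-product date windows (ISO date strings sort correctly as text).
WINDOWS = {
    "A": ("2017-01-01", "2017-01-31"),
    "B": ("2017-03-01", "2017-03-31"),
}

def _is_detection_in_window(product, created_date):
    # Scalar-only stand-in: the dict lookup needs a hashable key,
    # so calling this with a whole Series raises a TypeError.
    start, end = WINDOWS.get(product, ("", ""))
    return start <= created_date <= end

df = pd.DataFrame({
    "Product": ["A", "A", "B", "C"],
    "CreatedDate": ["2017-01-15", "2017-02-10", "2017-03-05", "2017-01-15"],
})

# otypes=[bool] tells NumPy the output type up front.
v = np.vectorize(_is_detection_in_window, otypes=[bool])
mask = v(df["Product"], df["CreatedDate"])  # boolean NumPy array, one entry per row
result = df.loc[mask]                       # keeps only the rows where the function returned True
```

Keep in mind that np.vectorize is a convenience wrapper: under the hood it is essentially a Python loop, so don't expect it to be faster than iterating yourself.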

Furthermore, you should avoid .ix, which has been deprecated since pandas 0.20 and was removed in pandas 1.0. Use .loc (label-based) or .iloc (position-based) instead.
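If you'd rather stay within pandas, a row-wise DataFrame.apply with axis=1 is another way to build the boolean mask, and .loc can also pick the columns in the same step, which is closer to the SELECT ... WHERE in the question. This is only a sketch: the predicate body, the Score column and the data below are invented for the example.

```python
import pandas as pd

def _is_detection_in_window(product, created_date):
    # Hypothetical scalar predicate standing in for the asker's function.
    return product == "A" and created_date >= "2017-01-01"

df = pd.DataFrame({
    "Product": ["A", "B", "A"],
    "CreatedDate": ["2017-01-15", "2017-01-20", "2016-12-31"],
    "Score": [1, 2, 3],
})

# axis=1 passes each row to the lambda as a Series, so the function sees scalars.
mask = df.apply(
    lambda row: _is_detection_in_window(row["Product"], row["CreatedDate"]),
    axis=1,
)

# Row filter and column selection in one .loc call.
selected = df.loc[mask, ["Product", "CreatedDate"]]
```

Like np.vectorize, this still calls the function once per row, so it is about readability rather than speed.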
