Pythonic Pythonic - 1 year ago 365
Python Question

Pandas - What is the fastest way to count values greater than x in a rolling window?

Given a pandas Series a, for each value a[i] I need to count how many values in a[i-window:i-1] are greater than a[i].

The code below does the job with a Python for loop, which is slow for large computations.

Does pandas offer similar functionality, possibly wrapping an optimised NumPy function?

import numpy as np
import pandas

window = 30 # any arbitrary window
a = pandas.Series(np.random.rand(100)) # dummy variable, arbitrary length

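# Result series, NaN until enough history is available
counter = pandas.Series(data=np.nan, index=a.index)

for i in a.index[window:]:
    # compare the preceding slice against the current value
    counter[i] = (a[i-window:i-1] < a[i]).sum()

print(counter)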

Answer Source

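You can use rolling(...).apply (older pandas releases exposed this as pd.rolling_apply, which has since been removed):

import numpy as np
import pandas as pd

window = 30
df = pd.DataFrame(np.random.randn(100), columns=['Data'])

# raw=True passes each window as a NumPy array, so s[-1] is the current value
counts = df.rolling(window + 1).apply(lambda s: (s < s[-1]).sum(), raw=True)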

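Make sure to add one to the window size, since each rolling window contains the current value in addition to the preceding values it is compared against.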
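
Since the question asks about an optimised NumPy route, here is a minimal sketch of a fully vectorised alternative. It is not part of the original answer: it assumes NumPy 1.20 or later (for sliding_window_view), and the helper name count_less_than_last is made up for illustration. Like the rolling apply above, it counts values strictly less than the current one; flipping the comparison to > would count the greater ones instead.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view  # assumes NumPy >= 1.20


def count_less_than_last(values, window):
    """Hypothetical helper: for each position i >= window, count how many of
    the `window` preceding values are strictly less than values[i]."""
    values = np.asarray(values, dtype=float)
    out = np.full(values.shape, np.nan)
    if len(values) <= window:
        return out  # not enough history anywhere
    # Each row holds `window` preceding values followed by the current value.
    windows = sliding_window_view(values, window + 1)
    # Compare every preceding value in a row with that row's last element.
    out[window:] = (windows[:, :-1] < windows[:, -1:]).sum(axis=1)
    return out


rng = np.random.default_rng(0)
data = pd.Series(rng.standard_normal(100))
window = 30

# Vectorised result and the rolling-apply result, for comparison
fast = pd.Series(count_less_than_last(data.to_numpy(), window), index=data.index)
slow = data.rolling(window + 1).apply(lambda s: (s < s[-1]).sum(), raw=True)
```

Both series leave the first 30 positions as NaN and agree on the rest. The sketch builds an intermediate boolean array with one row per window, so memory use grows with length times window size; for very long series or very wide windows the rolling apply may be the safer choice.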