Shang Wang - 1 year ago 138
Python Question

# How to calculate running mean traffic for last minute

I have a python server that accepts time series data. Now I need to calculate the average traffic for the last minute, output like 90 samples/minute. I'm currently using a python list to hold all time stamps and use a pretty awful way(in my opinion) to calculate this. The code roughly looks like this:

``````class TrafficCalculator(object):
timestamps = []

def run():
while True:
# this gets one record of traffic
data = self.accept_data()
# get record's timestamp
timestamp = data.timestamp
# add to list
self.timestamps.append(timestamp)
# get the time one minute ago
minute_ago = timestamp - datetime.timedelta(minutes=1)
# find out the first index of the timestamp in the past that's within 1 minute
for i, t in enumerate(self.timestamp):
if t > minute_ago:
break
# see how many records are within last minute
result = len(self.timestamp[i:])
# throw away the earlier data
self.timestamp = self.timestamp[i:]
``````

As you can see, I have to do this for every record, if my traffic gets intense, the performance is miserable.

Is there a better data structure or algorithm I can use to make this more performant? Even further, how do I write a test to verify my algorithm? Thanks!

Use Queue to hold `<traffic, timestamp>` pair. Here `timestamp` is the time it has been pushed on Queue(arrives from server). Track the `sum` of the traffics of Queue. When a new traffic arrives and the difference between its timestamp and Queue's front element's timestamp more than 1 minute, pop front from Queue. And subtract the poped traffic value from sum. Push the new traffic into queue and add to sum.
The space complexity is `O(maximum traffic can be arrived within 1 minute)`. Time complexity is `O(1)` for getting average at any time.