ifsession ifsession - 2 years ago 55
Python Question

Most efficient way to add an item and at the same time remove one from a list with a fixed length

I'm parsing some data which can have duplicates. To get rid of them, I use a small list with the last five non-duplicate items and check if the current item is not in the list. I have a solution that works, but there should be a better way. Any ideas?

My current code to achieve this:

activities = []
index = 0

# Open file
# Loop lines (each line is an activity)
    # Parse line to activity object

    if activity not in activities:
        # session is part of SQLAlchemy but this isn't that important

        # The part from here on is the one I want changed
        if len(activities) == 5:
            # list is full: remove the item in the slot that is about to be reused
            activities.pop(index)

        activities.insert(index, activity)

        if index == 4:
            index = 0
        else:
            index = index + 1

EDIT: The problem is not removing duplicates from this list. The list only exists to check whether the new activity is among the most recently added ones. I'm parsing A LOT of data, and checking each new activity against all old ones would be a huge bottleneck. The data is sorted by date, so a duplicate can only show up among the last few activities (hence checking the last 5). Getting the unique values is not the problem; I'm just asking for a solution that does the same thing as mine already does, but better.

Answer Source

A collections.deque with a limited maxlen handles the insert-plus-delete operation efficiently:

from collections import deque

activities = deque(maxlen=5)

# some code in-between

activities.append(activity)
# if len(activities) == 5, the leftmost item is dropped automatically before the append

However, the "# some code in-between" part may need some changes, because the data now shifts on every step, which changes the indices.
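To make the deque approach concrete, here is a minimal sketch of the whole check-and-append loop. It assumes activities can be compared with `==` (plain strings stand in for the parsed activity objects), and `dedupe_recent` and its `window` parameter are hypothetical names used only for illustration:

```python
from collections import deque


def dedupe_recent(items, window=5):
    """Yield each item that is not among the last `window` yielded items."""
    # With maxlen set, appending to a full deque discards the oldest entry.
    recent = deque(maxlen=window)
    for item in items:
        if item not in recent:
            recent.append(item)
            yield item


lines = ["a", "b", "a", "c", "c", "d", "e", "f", "g", "a"]
print(list(dedupe_recent(lines)))
# → ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'a']
```

The final "a" is let through again because the first "a" has already fallen out of the five-item window, which matches the "only check the last 5" behaviour asked for in the question.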


Alternatively, you may prefill activities with Nones and then simply do

activities = [None] * 5
index = 0

# some code in-between

activities[index] = activity

if index == 4:
    index = 0
else:
    index = index + 1

(assuming None is never a valid activity).
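The prefilled-list variant can be wrapped up as a small ring buffer. This is only a sketch under the same assumption that `None` never occurs as an activity; the class name `RecentWindow` and the method `add_if_new` are hypothetical, and the modulo step is just a compact way of writing the wrap-around of `index`:

```python
class RecentWindow:
    """Fixed-size ring buffer remembering the most recently added items."""

    def __init__(self, size=5):
        self.slots = [None] * size  # None marks an empty slot
        self.index = 0              # next slot to overwrite

    def add_if_new(self, activity):
        """Store activity and return True unless it is already in the window."""
        if activity in self.slots:
            return False
        self.slots[self.index] = activity
        # Advance and wrap around to 0 after the last slot.
        self.index = (self.index + 1) % len(self.slots)
        return True


window = RecentWindow(size=3)
results = [window.add_if_new(x) for x in ["a", "b", "a", "c", "d", "a"]]
print(results)       # → [True, True, False, True, True, True]
print(window.slots)  # → ['d', 'a', 'c']
```

Unlike the deque, the slots are not kept in insertion order, but for a pure membership check that does not matter.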
