I am working on a list object containing several tokens with different frequencies:
from collections import Counter
s = {'book',
'car',
'bird',
'cup',
'book',
'cup',
'river'}
print(Counter(s))
[('book', 2), ('cup', 2), ('river', 1), ('car', 1), ('bird', 1)]
select = [word for word in s if list(s).count(word) >= 2]
select
If s is a list and not a set (as you wrote in your question, but not in the code in your example), you can use the most_common method of the Counter object to get the top X elements in your list:
In [67]: s = ['book',
...: 'car',
...: 'bird',
...: 'cup',
...: 'book',
...: 'cup',
...: 'river']
In [68]: s
Out[68]: ['book', 'car', 'bird', 'cup', 'book', 'cup', 'river']
In [69]: c = Counter(s)
In [70]: c.most_common(2)
Out[70]: [('book', 2), ('cup', 2)]
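The difference between the list and the set matters because a set discards duplicates before Counter ever sees them. The following is a minimal sketch of that point, reusing the tokens from the question; the names tokens_list, tokens_set, list_counts and set_counts are only illustrative and do not appear in the original post.

```python
from collections import Counter

# Same tokens as in the question, kept as a list so duplicates survive.
tokens_list = ['book', 'car', 'bird', 'cup', 'book', 'cup', 'river']

# Converting to a set drops the repeated 'book' and 'cup'.
tokens_set = set(tokens_list)

list_counts = Counter(tokens_list)
set_counts = Counter(tokens_set)

# most_common(n) returns the n highest counts as (item, count) tuples.
top_two = list_counts.most_common(2)
print(top_two)                    # [('book', 2), ('cup', 2)]

# Every element of a set is unique, so no count can exceed 1.
print(max(set_counts.values()))   # 1
```

On Python 3.7 and later, elements with equal counts come back in the order they were first encountered, which is why 'book' precedes 'cup' here.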
If you want the elements that appear at least Y times (here Y = 2), you can use:
In [71]: [x[0] for x in c.items() if x[1] >= 2]
Out[71]: ['book', 'cup']
Here x[0] is the item (from the list) and x[1] is its frequency.
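The same filter can be written with tuple unpacking instead of x[0] and x[1], which some readers find easier to follow. This is only a sketch: the helper name tokens_with_min_count and its min_count parameter are hypothetical and introduced here for illustration.

```python
from collections import Counter

def tokens_with_min_count(tokens, min_count):
    """Return the distinct tokens that occur at least min_count times."""
    counts = Counter(tokens)
    # Unpack each (token, frequency) pair instead of indexing with x[0], x[1].
    return [token for token, freq in counts.items() if freq >= min_count]

s = ['book', 'car', 'bird', 'cup', 'book', 'cup', 'river']
frequent = tokens_with_min_count(s, 2)
print(frequent)  # ['book', 'cup']
```

Because the Counter is built once, this avoids calling list.count for every element, which would rescan the whole list each time.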