Nina Nina - 3 months ago 8
Python Question

Counting occurrences of tuples

I have a list of (token, tag) tuples that looks like the following:

token_tags = [('book', 'noun'),
              ('run', 'noun'),
              (',', ','),
              ('book', 'verb'),
              ('run', 'adj'),
              ('run', 'verb')]

I am trying to find out how many times a token was first tagged as a 'noun' and then as a 'verb' in its next appearance in the list. So I should not count 'run', because it was tagged as an adjective between its 'noun' and 'verb' tags. Any suggestions on how to do that?

I have converted the list of tuples into a dict as follows:

d = {}
for x, y in token_tags:
    d.setdefault(x, []).append(y)

So, now d contains:

{'book': ['noun', 'verb'], 'run': ['noun', 'adj', 'verb'], ',': [',']}

I tried using regular expressions to solve this, but it did not work.


Now that you have the tags in a dictionary, counting how many times a given pair appears is simple: take every two consecutive elements in a token's tag list and check whether they are the desired pair. For example:

>>> data = {'book': ['noun', 'verb'], 'run': ['noun', 'adj', 'verb'], ',': [',']}
>>> result = {}
>>> for token, tag_list in data.items():
...     count = 0
...     # compare each tag with the one immediately before it
...     for i in range(1, len(tag_list)):
...         if tag_list[i-1] == "noun" and tag_list[i] == "verb":
...             count += 1
...     result[token] = count
...
>>> result
{',': 0, 'book': 1, 'run': 0}
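
As a variation on the loop above, here is a minimal sketch that starts from the original list of tuples and pairs each tag with its successor using zip instead of indexing. The function name count_transitions and its first/second parameters are my own choices for illustration, not something from the question.

```python
from collections import defaultdict

def count_transitions(token_tags, first="noun", second="verb"):
    """For each token, count how often `first` is immediately followed by
    `second` among that token's own appearances (hypothetical helper)."""
    # Group the tags by token, preserving the order in which they appear.
    tags_by_token = defaultdict(list)
    for token, tag in token_tags:
        tags_by_token[token].append(tag)

    # zip(tags, tags[1:]) yields each pair of consecutive tags.
    return {
        token: sum(1 for a, b in zip(tags, tags[1:]) if (a, b) == (first, second))
        for token, tags in tags_by_token.items()
    }

token_tags = [('book', 'noun'),
              ('run', 'noun'),
              (',', ','),
              ('book', 'verb'),
              ('run', 'adj'),
              ('run', 'verb')]

counts = count_transitions(token_tags)
total = sum(counts.values())  # 1: only 'book' goes straight from noun to verb
```

If you only need the overall number rather than a per-token breakdown, summing the values as in the last line should be enough.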