LearningToPython, 4 years ago
Python Question

Assigning words a unique number identifier

Task

I am trying to assign a numeric identifier to each word in a string.

Code

I have currently done the following:

mystr = 'who are you you are who'

str_values = [i for i in mystr.split()]
list_values = [str(i) for i, w in enumerate(mystr.split())]


Output:

>>> str_values
['who', 'are', 'you', 'you', 'are', 'who']
>>> list_values
['0', '1', '2', '3', '4', '5']


Query/Desired Output

mystr contains repeating words, so I would like to assign each distinct word a single number rather than a different number each time it appears, but I am not sure how to begin. Therefore, I would like list_values to output something along the lines of:

['0', '1', '2', '2', '1', '0']

Answer Source

You could do this with the help of another list:

n = []
output = [n.index(i) for i in mystr.split() if i in n or not n.append(i)]

First, n is an empty list. The list comprehension iterates over every element of mystr.split() and, when the condition is met, adds that element's index in n to the output.

Now for the condition. It has two parts joined by or. The first part checks whether the element is already in n; if it is, the condition is true and the comprehension takes the element's index. If it is not, evaluation moves on to the second part, which appends the element to n. Since append() returns None, I put a not in front of it so that the condition is still satisfied, and the comprehension yields the index of the newly inserted element.
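To make the short-circuit logic easier to follow, here is a sketch of the same steps written as an ordinary loop. It is meant as a rough equivalent of the one-liner above, not a replacement for it; the loop variable word is just an illustrative name.

```python
mystr = 'who are you you are who'

n = []       # distinct words, in order of first appearance
output = []  # one index per word in mystr

for word in mystr.split():
    if word not in n:
        # First sighting: the word takes the next free position in n.
        n.append(word)
    # Whether old or new, the word's identifier is its position in n.
    output.append(n.index(word))

print(output)  # [0, 1, 2, 2, 1, 0]
```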

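Basically, the first part of the if condition prevents duplicate elements from being added to n, and the second part does the adding. Note that output holds integers ([0, 1, 2, 2, 1, 0]); wrap n.index(i) in str() if you need strings as in the question.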
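If the input gets long, n.index() has to scan the list for every word. One possible alternative, sketched here with a dictionary instead of the helper list, looks each word up directly. The function name word_ids is made up for this example and does not come from the answer above.

```python
def word_ids(text):
    """Return, for each word in text, the number assigned at its first appearance."""
    ids = {}     # word -> identifier, filled in order of first appearance
    result = []
    for word in text.split():
        # setdefault stores len(ids) only if the word is not yet a key,
        # and returns the stored identifier either way.
        result.append(ids.setdefault(word, len(ids)))
    return result


mystr = 'who are you you are who'
list_values = [str(i) for i in word_ids(mystr)]
print(list_values)  # ['0', '1', '2', '2', '1', '0']
```

The str() conversion is only there to match the string output asked for in the question; drop it if integers are fine.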
