I have a list populated with words from a dictionary. I want to remove every word that begins with another valid word from the list; only root words at the beginning of the target word count.
For example, the word "rodeo" would be removed from the list because it contains the English-valid word "rode." "Typewriter" would be removed because it contains the English-valid word "type." However, the word "snicker" is still valid even if it contains the word "nick" because "nick" is in the middle and not at the beginning of the word.
I was thinking something like this:
    for line in wordlist:
        if line.find(...) --
I'm assuming that you only have one list from which you want to remove any elements that have prefixes in that same list.
    # Important assumption here... wordlist is sorted
    base = wordlist[0]                 # consider the first word in the list
    for word in wordlist:              # loop through the entire list checking if
        if not word.startswith(base):  # the word we're considering starts with the base
            print(base)                # if not... we have a new base, print the current
            base = word                # one and move to this new one
        # else: word starts with base,
        # so don't output word, and go on to the next item in the list
    print(base)                        # finish by printing the last base
EDIT: Added some comments to make the logic more obvious
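If you'd rather get a list back than print the results, here is a minimal sketch of the same idea wrapped in a function. The name `remove_prefixed_words` and the sample words are made up for illustration. It sorts a copy of the input, so the "wordlist is sorted" assumption no longer has to hold for the caller. It also assumes there are no empty strings in the list, since an empty string is a prefix of every word.

```python
def remove_prefixed_words(wordlist):
    """Return the words that do not start with another word in the list.

    Hypothetical helper: sorts a copy of the input so that every word
    appears directly after the shorter words it starts with.
    """
    kept = []
    base = None  # the most recent word we decided to keep
    for word in sorted(wordlist):
        if base is None or not word.startswith(base):
            # word does not extend the current base, so it becomes the new base
            kept.append(word)
            base = word
        # otherwise word starts with base and is dropped
    return kept


words = ["typewriter", "rode", "snicker", "type", "rodeo", "nick"]
print(remove_prefixed_words(words))
# ['nick', 'rode', 'snicker', 'type']
```

Note that "snicker" survives because "nick" is not at its start, and that exact duplicates are dropped as well, since a word always starts with itself.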