n1c9 - 19 days ago
Python Question

How can I make my Python string match non-greedily?

Given a text file that looks like so:

Samsung Galaxy S6 active SM-G890A 32GB Camo White (AT&T) *AS-IS* Cracked Screen
Samsung Galaxy S6 SM-G920 - 32GB - White Verizon Cracked screen
Samsung Galaxy S6 edge as is cracked screen


I've tried to think of a number of different ways to have the string 'Samsung Galaxy S6' not match 'Samsung Galaxy S6 edge', but can't seem to come up with a way that works. There's no point in the string where it's clear that the name of the phone has ended and the extraneous information begins, so splitting them up that way and comparing to a dictionary or something like that wouldn't work.

I tried to think of some way to write the following:

phones = ['Samsung Galaxy S6', 'Samsung Galaxy S6 Edge']
lines = open('phones.txt', 'r').readlines()
for line in lines:
    for phone in phones:
        # pseudocode: this is the condition I don't know how to express
        if phone in line and no other phone in phones is in line:
            print('match found')


but I can't think of the right way to structure it - anyone have any ideas? I'm sure that I'm missing something simple here, but just can't figure out what.

Answer

Start by sorting your phones by length, longest first, so that the longer names are checked before the shorter ones:

phones.sort(key=len, reverse=True)

Then break as soon as you find a match:

for phone in phones:
    if phone in line:
        print("FOUND:", repr(phone), "IN", repr(line))
        break  # we don't need to keep looking for other phones in this line

maybe?

This way "Samsung Galaxy S6 Edge" comes before "Samsung Galaxy S6" in your checks and you will match the longest one... without requiring more knowledge of your phone list, like the regex answer does.
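Putting the two steps together, here is a minimal sketch of the whole thing. The helper name find_phone is made up for illustration, and the lowercasing is my own assumption rather than part of the snippets above: the list says 'Edge' while the sample line says 'edge', and a plain `in` test is case-sensitive, so without it the third line would match the shorter name.

```python
def find_phone(line, phones):
    """Return the longest name in `phones` that occurs in `line`, or None.

    find_phone is a hypothetical helper; the case-insensitive comparison
    is an assumption added to cope with 'Edge' vs 'edge'.
    """
    lowered = line.lower()
    # Check longer names first so the more specific name wins.
    for phone in sorted(phones, key=len, reverse=True):
        if phone.lower() in lowered:
            return phone  # stop at the first (longest) match
    return None


phones = ['Samsung Galaxy S6', 'Samsung Galaxy S6 Edge']
lines = [
    'Samsung Galaxy S6 active SM-G890A 32GB Camo White (AT&T) *AS-IS* Cracked Screen',
    'Samsung Galaxy S6 SM-G920 - 32GB - White Verizon Cracked screen',
    'Samsung Galaxy S6 edge as is cracked screen',
]
for line in lines:
    print("FOUND:", repr(find_phone(line, phones)), "IN", repr(line))
```

Note that the first sample line is reported as 'Samsung Galaxy S6' because the list has no 'Samsung Galaxy S6 active' entry; if that is a separate model for you, add it to phones and the length ordering takes care of the rest.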