chow chow - 7 months ago 15
Python Question

Find substrings that are embedded in certain pattern

I know there has to be a better/faster way to do all of this -- but I've brute forced a solution for now that works. Can this be done more efficiently?

Pseudo code:

zoneNAME='R::BBQ (ZP)_|_R::Family Room (ZP)_|_R::Firepit (ZP)_|_R::Kitchen (ZP)_|_R::Living Room_|_R::Media Room (ZP)_|_R::Portable (ZP)_|_R::Spa (ZP)_|_S::BBQ (ZP)_|_S::Family Room (ZP)_|_S::Firepit (ZP)_|_S::Kitchen (ZP)_|_S::Media Room (ZP)_|_S::Portable (ZP)_|_S::Spa (ZP)_|_'
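import re
a = re.sub('_|_', '', zoneNAME)  # unescaped '|' is alternation, so this only strips the underscores
a = a.split('S::', 1)[0]         # keep everything before the first 'S::' entry
a = re.sub('R::', '', a)         # drop the 'R::' prefixes
a = re.split(r'\|', a)           # split on the remaining literal '|'
a = list(filter(None, a))        # drop empty strings; list() is needed on Python 3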

Final output:

['BBQ (ZP)', 'Family Room (ZP)', 'Firepit (ZP)', 'Kitchen (ZP)', 'Living Room', 'Media Room (ZP)', 'Portable (ZP)', 'Spa (ZP)']


You can use a positive lookbehind and a positive lookahead (see the `re` module documentation):

> a = re.findall(r'(?<=R::).*?(?=_\|_)', zoneNAME)

# '(?<=R::)x' -- positive lookbehind: matches 'x' that is preceded by 'R::'
# 'x(?=_\|_)' -- positive lookahead: matches 'x' that is followed by '_|_'
# .*? matches a sequence of any characters non-greedily

> a
> ['BBQ (ZP)', 'Family Room (ZP)', 'Firepit (ZP)', 'Kitchen (ZP)', 'Living Room', 
'Media Room (ZP)', 'Portable (ZP)', 'Spa (ZP)']

This returns a list of all substrings that are preceded by 'R::' and followed by '_|_'. The match is non-greedy, so each result stops at the nearest '_|_' instead of running from the first 'R::' to the last '_|_'.
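
To make the greedy versus non-greedy difference concrete, here is a minimal sketch on a shortened sample string. The sample and the helper name `r_zones` are assumptions made for illustration, not part of the original question. The sketch also shows a regex-free alternative that splits on the delimiter. That alternative only works if every entry starts with its 'R::' or 'S::' prefix, as it does in the data above.

```python
import re

# Shortened sample in the same format as zoneNAME (assumed for illustration).
sample = 'R::BBQ (ZP)_|_R::Living Room_|_S::BBQ (ZP)_|_'

# Non-greedy: each match stops at the nearest following '_|_'.
lazy = re.findall(r'(?<=R::).*?(?=_\|_)', sample)

# Greedy: a single match runs from the first 'R::' to the last '_|_'.
greedy = re.findall(r'(?<=R::).*(?=_\|_)', sample)

# Regex-free alternative: split on the delimiter and keep the 'R::' entries.
def r_zones(text):  # hypothetical helper name
    return [part[3:] for part in text.split('_|_') if part.startswith('R::')]

print(lazy)             # ['BBQ (ZP)', 'Living Room']
print(greedy)           # ['BBQ (ZP)_|_R::Living Room_|_S::BBQ (ZP)']
print(r_zones(sample))  # ['BBQ (ZP)', 'Living Room']
```

Whether the split version is actually faster depends on the input size, so it is worth timing both on real data before switching.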