Jonathan Itakpe - 5 months ago
Python Question

Extracting substrings from a string using regex in Python

Say I have this string:

"Input:Can we book an hotel in Lagos ? Parse: book VB ROOT +-- Can MD aux +-- we PRP nsubj +-- hotel NN dobj | +-- an DT det | +-- in IN prep | +-- Lagos NNP pobj +-- ? . punct "

and I want to get a list like this:

['book VB ROOT', 'Can MD aux',..., '? . punct']


using a regular expression.

I have tried:

result = re.findall('\||\+-- (.*?)\+--|\| ', result, re.DOTALL)


Any help would be appreciated.

Answer

Without regex, using only built-in functions and string methods (with the string stored in s):

>>> list(filter(bool, map(str.strip, s.replace('+--', '|').split('Parse:')[1].split('|'))))
['book VB ROOT', 'Can MD aux', 'we PRP nsubj', 'hotel NN dobj', 'an DT det', 'in IN prep', 'Lagos NNP pobj', '? . punct']
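
Since the question asks for a regular expression, here is one possible sketch using re.split. It assumes that the tokens always follow "Parse:" and that "+--" and "|" appear only as tree-drawing markers, never inside a token. The names parse and tokens are just illustrative.

```python
import re

s = ("Input:Can we book an hotel in Lagos ? Parse: book VB ROOT +-- Can MD aux "
     "+-- we PRP nsubj +-- hotel NN dobj | +-- an DT det | +-- in IN prep "
     "| +-- Lagos NNP pobj +-- ? . punct ")

# Keep only the text after "Parse:" (assumed to occur once in the input).
parse = s.split('Parse:', 1)[1]

# Split on either tree marker ("+--" or "|"), absorbing surrounding whitespace.
# Adjacent markers such as "| +--" produce empty pieces, which are dropped.
pieces = re.split(r'\s*(?:\+--|\|)\s*', parse)
tokens = [p.strip() for p in pieces if p.strip()]

print(tokens)
```

For the sample string this should give the same eight items as the list above. If a token could itself contain "|" or "+--", the pattern would need to be tightened.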