Chris - 10 months ago
Python Question

Matching multiple patterns in a string

I have a string that looks like this:

s = "[A] text [B] more text [C] something ... [A] hello"

Basically, it consists of repeated "[X] chars" segments, and I am trying to get the text after every [X] tag.

I would like to end up with this dict (I don't care about order):

mydict = {"A":"text, hello", "B":"more text", "C":"something"}

I was thinking about a regex but I was not sure if that is the right choice because in my case the order of [A], [B] and [C] can change, so this string is valid too:

s = "[A] hello, [C] text [A] more text [B] something"

I don't know how to properly extract the strings. Can anyone point me in the right direction? Thanks.

Answer Source

Not sure if this is quite what you're looking for, but here is a first attempt. Note that it fails with duplicates:

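import re

s = "[A] hello, [C] text [A] more text [B] something"

# Text between the tags; the empty string before the first tag is dropped
results = [text.strip() for text in re.split(r'\[.\]', s) if text]

# The letter inside each tag, in order of appearance
letters = re.findall(r'\[(.)\]', s)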

dict(zip(letters, results))

{'A': 'more text', 'B': 'something', 'C': 'text'}

It fails because 'A' appears twice in the intermediate lists, and dict() keeps only the last value for a repeated key:

In [49]: results
Out[49]: ['hello,', 'text', 'more text', 'something']

In [50]: letters
Out[50]: ['A', 'C', 'A', 'B']

To handle duplicates, you could do something like this:

mappings = {}

for pos, letter in enumerate(letters):
    try:
        # Letter seen before: append to its existing text
        mappings[letter] += ' ' + results[pos]
    except KeyError:
        # First occurrence of this letter
        mappings[letter] = results[pos]

which gives: {'A': 'hello, more text', 'B': 'something', 'C': 'text'}
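
If you'd rather do it in one pass, here is a rough sketch that captures each tag together with the text that follows it and groups the pieces per letter. The helper name group_by_tag and its sep parameter are made up for illustration, and the pattern assumes single-character tags and that the text itself never contains a "[" character.

```python
import re
from collections import defaultdict

# Assumption: tags are a single character in brackets, e.g. [A],
# and the text after a tag never contains "[".
TAG_PATTERN = re.compile(r"\[(.)\]([^\[]*)")


def group_by_tag(s, sep=", "):
    """Hypothetical helper: map each tag letter to its joined texts."""
    grouped = defaultdict(list)
    # findall returns one (letter, text) tuple per tag
    for letter, text in TAG_PATTERN.findall(s):
        text = text.strip()
        if text:  # skip tags that have no text after them
            grouped[letter].append(text)
    # Join the pieces collected for each letter into a single string
    return {letter: sep.join(parts) for letter, parts in grouped.items()}


mydict = group_by_tag("[A] text [B] more text [C] something [A] hello")
print(mydict)
```

With the default separator this should produce the dict asked for in the question; passing sep=" " gives the same joining behaviour as the loop above.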