Yusr Safour - 1 month ago
Python Question

Python regex findall

As far as I know, register should be a list, but when I try to print element 0, I always get an error saying the index is out of range. I can't explain why.
Here is my code:

s = 'Atom = 1
1213 123 23 23 23
4455 5 6 5 5 5 5
Atom = 2
458 84 864 684
Atom = 3
4555 4 5 5 5 54'
register = re.findall(r'Atom(.*?)Atom',s)
print (register[0])

Answer

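You're missing the re.DOTALL flag. Without it, "." does not match a newline, so the pattern can never span several lines, findall returns an empty list, and register[0] raises an IndexError. (Note: your string literal is also invalid as written. A string that spans several lines must be enclosed in triple quotes. I suppose that's a typo.)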

import re

s = '''Atom = 1
 1213 123 23 23 23
 4455 5 6 5 5 5 5
 Atom = 2
 458 84 864  684
 Atom = 3
 4555 4 5 5 5 54'''
register = re.findall(r'Atom(.*?)(?=Atom|$)',s,re.DOTALL)
print (register[0])

Result:

 = 1
 1213 123 23 23 23
 4455 5 6 5 5 5 5
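
To see the effect of the flag in isolation, here is a minimal sketch that runs the original pattern from the question with and without re.DOTALL. The shortened sample string and the names without_flag and with_flag are my own, chosen only for illustration.

```python
import re

# Shortened sample data (an assumption, not the exact text from the question).
s = '''Atom = 1
1213 123 23 23 23
Atom = 2
458 84 864 684'''

# Without re.DOTALL, '.' stops at a newline, so the pattern can never
# reach from one "Atom" to the next one on a later line.
without_flag = re.findall(r'Atom(.*?)Atom', s)
print(without_flag)  # empty list, so without_flag[0] would raise IndexError

# With re.DOTALL, '.' also matches '\n', so the match can cross lines.
with_flag = re.findall(r'Atom(.*?)Atom', s, re.DOTALL)
print(with_flag)
```

The empty list in the first case is exactly what produces the "index out of range" error when element 0 is accessed.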

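Going further, note that I have also fixed your regex: the trailing Atom is replaced by a lookahead, (?=Atom|$). A lookahead checks that the next Atom keyword follows without consuming it, so that keyword is still available as the start of the next match, and the $ alternative handles the last block at the end of the text. With the original pattern, every second block would be skipped and the last one would never match. The final list is now correct: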

print(register)

[' = 1\n 1213 123 23 23 23\n 4455 5 6 5 5 5 5\n ', ' = 2\n 458 84 864  684\n ', ' = 3\n 4555 4 5 5 5 54']
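
As a rough comparison of the two patterns, the sketch below applies both to the same text with re.DOTALL set. The sample data and the names consuming and lookahead are hypothetical and only serve the illustration.

```python
import re

# Small sample with three blocks (made-up data for illustration).
s = '''Atom = 1
11 12
Atom = 2
21 22
Atom = 3
31 32'''

# Trailing "Atom" is part of the match, so it is consumed and cannot
# start the next match; the last block has no following "Atom" at all.
consuming = re.findall(r'Atom(.*?)Atom', s, re.DOTALL)

# The lookahead only checks for "Atom" (or the end of the string)
# without consuming it, so every block is found.
lookahead = re.findall(r'Atom(.*?)(?=Atom|$)', s, re.DOTALL)

print(len(consuming))  # 1
print(len(lookahead))  # 3
```

If the captured blocks are going to be processed further, calling strip() on each element removes the surrounding whitespace and newlines.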