Palomar - 2 months ago
Python Question

Python - combine regex patterns

I have a large block of text, and I want to select every 10-character string whose first character is a letter and whose last character is a digit.

I am a Python rookie, and so far I have only managed to find all 10-character strings:

ten_char = re.findall(r"\D(\w{10})\D", pdfdoc)


The question is how I can add my other conditions to this pattern: besides being 10 characters long, the string must start with a letter and end with a digit.

Suggestions appreciated!

Answer

If I understand correctly, use:

r'\b([a-zA-Z]\S{8}\d)\b'
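To make the pieces of that pattern easier to follow, here is a minimal sketch that spells it out with re.VERBOSE so each part can carry a comment. The names PATTERN and find_codes are only illustrative and are not part of the original answer.

```python
import re

# Same pattern as above, written with re.VERBOSE so each part can be annotated.
PATTERN = re.compile(r"""
    \b          # word boundary: the match must start at the edge of a word
    (           # capture group, which is what findall returns
      [a-zA-Z]  # first character: an ASCII letter
      \S{8}     # eight middle characters: anything except whitespace
      \d        # last character: a digit
    )
    \b          # word boundary: no word character may follow the digit
""", re.VERBOSE)


def find_codes(text):
    """Return every 10-character token that starts with a letter and ends with a digit."""
    return PATTERN.findall(text)


print(find_codes("id a123456789 and 1123456789"))
```

One letter, eight middle characters, and one digit add up to the required ten, and the two \b anchors keep the pattern from matching a 10-character slice of a longer word.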


Python demo:

>>> import re
>>> txt="""\
... Should match:
... a123456789 aA34567s89 zzzzzzzer9
... 
... Not match:
... 1123456789 aA34567s8a zzzzzzer9 zzzxzzzze99"""
>>> re.findall(r'\b([a-zA-Z]\S{8}\d)\b', txt)
['a123456789', 'aA34567s89', 'zzzzzzzer9']
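One thing to be aware of: \S matches any non-whitespace character, so punctuation is allowed in the middle of a match. If the strings should contain only letters and digits (an assumption, since the question does not say), a stricter variant might look like the sketch below. The names strict and sample are made up for illustration.

```python
import re

# Assumption: only ASCII letters and digits are allowed between the first and last character.
strict = re.compile(r'\b([a-zA-Z][a-zA-Z0-9]{8}\d)\b')

sample = "a123456789 a1234-6789 zzzzzzzer9"

# The original pattern accepts the hyphenated token because \S matches '-'.
print(re.findall(r'\b([a-zA-Z]\S{8}\d)\b', sample))

# The stricter pattern rejects it.
print(strict.findall(sample))
```

Which variant fits depends on what counts as a valid string in your text.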