soldiershin soldiershin - 10 months ago 44
Python Question

Regex taking too long in python

I have tested my regex on regex101 and it works fine. What I am trying to do is detect these patterns:

  1. section 1.2 random 2

  2. 1.2 random 2

  3. 1.2. random 2

  4. random 2

  5. random 2.

But if the string is just "random" on its own, it shouldn't match, as in a string like this:

  1. random

My regex is this.

m = re.match(r"^(((section)\s*|(\d+\.)|\d+|(\d+\.\d+)|[a-zA-z\s]|[a-zA-z\.\s])+((\d+\.$)|\d+$|(\d+\.\d+$)))","random random random random random",flags = re.I)

If I give it a long string, it gets stuck. Any ideas?

sal sal
Answer Source

After some simplification, this regular expression meets the requirements stated above and reproduced in the test cases below.

import re

regex = r'(?:section)*\s*(?:[0-9.])*\s*random\s+(?!random)(?:[0-9.])*'

strings = [
    "random random random random random",
    "section 1.2 random 2",
    "1.2 random 2",
    "1.2. random 2",
    "random 2",
    "random 2.",
    "random",
]

for string in strings:
    # re.match only anchors at the start of the string
    m = re.match(regex, string, flags=re.I)
    if m:
        print("match on " + string)
    else:
        print("non match on " + string)

which gives an output of:

non match on random random random random random
match on section 1.2 random 2
match on 1.2 random 2
match on 1.2. random 2
match on random 2
match on random 2.
non match on random
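
The original pattern stalls because `[a-zA-z\s]` and `[a-zA-z\.\s]` both match every letter and space inside a repeated group, so on a string with no trailing number the engine retries an exponential number of ways to split the same text. The pattern above only handles the literal word "random". If "random" in the question was meant as a placeholder for arbitrary words (an assumption, the question does not say so), a rough sketch of a non-overlapping pattern could look like this. The names `NUMBER`, `WORDS`, `HEADING` and `is_heading` are made up for illustration, and `fullmatch` needs Python 3.4 or later.

```python
import re

# Assumption: "random" stands for any run of alphabetic words.
NUMBER = r"\d+(?:\.\d+)*\.?"       # 2, 1.2, 1.2., 2.
WORDS = r"[a-z]+(?:\s+[a-z]+)*"    # words separated by mandatory whitespace

HEADING = re.compile(
    r"(?:section\s+)?"             # optional "section" prefix
    r"(?:" + NUMBER + r"\s+)?"     # optional leading number
    + WORDS
    + r"\s+" + NUMBER,             # required trailing number
    flags=re.I,
)


def is_heading(text):
    """Return True if the whole string looks like one of the listed patterns."""
    return HEADING.fullmatch(text) is not None


if __name__ == "__main__":
    samples = [
        "section 1.2 random 2",
        "1.2. random 2",
        "random 2.",
        "random",
        " ".join(["random"] * 2000),  # long input with no trailing number
    ]
    for sample in samples:
        print(is_heading(sample), sample[:40])
```

Because each character can be consumed by only one part of the pattern, a failed match on a long string should return quickly instead of hanging.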

See it in action at: