soldiershin soldiershin - 8 months ago 39
Python Question

Regex taking too long in python

I have used regex101 to test my regex and it works fine. What I am trying to do is detect these patterns:

  1. section 1.2 random 2

  2. 1.2 random 2

  3. 1.2. random 2

  4. random 2

  5. random 2.

But if it's just random, it shouldn't match, i.e. if the string is like this:

  1. random

My regex is this.

m = re.match(r"^(((section)\s*|(\d+\.)|\d+|(\d+\.\d+)|[a-zA-z\s]|[a-zA-z\.\s])+((\d+\.$)|\d+$|(\d+\.\d+$)))","random random random random random",flags = re.I)

If I give it a long string, it gets stuck. Any ideas?

sal sal

After some simplification, this regular expression meets the requirements stated above and reproduced in the test cases below.

import re

regex = r'(?:section)*\s*(?:[0-9.])*\s*random\s+(?!random)(?:[0-9.])*'

strings = [
    "random random random random random",
    "section 1.2 random 2",
    "1.2 random 2",
    "1.2. random 2",
    "random 2",
    "random 2.",
    "random",
]

for string in strings:
    # re.match only anchors at the start of the string
    m = re.match(regex, string, flags=re.I)
    if m:
        print("match on " + string)
    else:
        print("non match on " + string)

which gives an output of:

non match on random random random random random
match on section 1.2 random 2
match on 1.2 random 2
match on 1.2. random 2
match on random 2
match on random 2.
non match on random
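
As for why the original pattern gets stuck: several alternatives inside the repeated group can match the same text (the two letter classes `[a-zA-z\s]` and `[a-zA-z\.\s]` overlap, as do `\d+\.`, `\d+` and `\d+\.\d+`), so on an input that ultimately fails the engine tries a very large number of ways to split the string before giving up. The regex above avoids that, but it hard-codes the word "random". If "random" was only meant as a placeholder for arbitrary words (an assumption on my part, based on the letter classes in the question's regex), a rough sketch without overlapping alternatives could look like this. The names `PATTERN` and `is_heading` are made up for illustration.

```python
import re

# Sketch only: assumes "random" in the question stands for any alphabetic words.
PATTERN = re.compile(
    r"^(?:section\s+)?"           # optional "section" prefix
    r"(?:\d+(?:\.\d+)*\.?\s+)?"   # optional leading number: 1, 1.2 or 1.2.
    r"[a-z]+(?:\s+[a-z]+)*"       # one or more words separated by whitespace
    r"\s+\d+(?:\.\d+)?\.?$",      # required trailing number: 2, 2. or 2.1
    flags=re.I,
)


def is_heading(text):
    """Return True if text looks like one of the patterns from the question."""
    return PATTERN.match(text) is not None


samples = [
    "section 1.2 random 2",
    "1.2. random 2",
    "random 2.",
    "random",
    "random " * 2000,  # long non-matching input, should fail quickly
]

for s in samples:
    label = "match" if is_heading(s) else "non match"
    # Truncate long inputs so the output stays readable
    print(label + " on " + s[:30])
```

Each part of the pattern can only consume one kind of text (words cannot contain whitespace or digits, and the separators are mandatory), so there is only one way to split a given input and a failing match is rejected without the blow-up.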
