lordzuko lordzuko - 2 months ago 7
Python Question

Generating regex string to be used in re.match()

I am trying to generate a string to be used as a regex.

In the following code, _pattern is a pattern like abba, and I am trying to check whether _string follows the _pattern (e.g. catdogdogcat).


rxp in the following code is the regular expression that I am trying to create to match against _string (e.g. for the above example it will be (.+)(.+)\\2\\1). It is being generated successfully, but re.match() is returning None.


I want to understand why it is not working and how to correct it.

import re

_pattern = "abba" #raw_input().strip()
_string = "catdogdogcat" #raw_input().strip()
hm = {}
rxp = ""
c = 1
for x in _pattern:
    if hm.has_key(x):
        rxp += hm[x]
        continue
    else:
        rxp += "(.+)"
        hm[x]="\\\\"+str(c)
        c+=1

print rxp
#print re.match(rxp,_string) -> (Tried) Not working
#print re.match(r'rxp', _string) -> (Tried) Not working

print re.match(r'%s' %rxp, _string) # (Tried) Not working


Output

(.+)(.+)\\2\\1
None


Expected Output

(.+)(.+)\\2\\1
<_sre.SRE_Match object at 0x000000000278FE88>

Answer

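The problem is that your regex string contains a double \\ before each group number instead of a single \. The regex engine reads \\ as an escaped literal backslash, so \\2 matches a backslash followed by the character 2 rather than referring back to group 2, and your string contains no backslashes.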

You can pass rxp.replace("\\\\", "\\") to re.match() like this:

>>> print re.match(rxp.replace("\\\\", "\\"), _string)
<_sre.SRE_Match object at 0x10bf87c68>

>>> print re.match(rxp.replace("\\\\", "\\"), _string).groups()
('cat', 'dog')
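
To see the difference between the two strings side by side, here is a minimal sketch. It is written for Python 3 (print as a function), whereas the question uses Python 2, and the variable names text, doubled and single are illustrative only, not part of the original code.

```python
import re

text = "catdogdogcat"

# Two backslash characters before each digit: the regex engine reads each
# pair as "match one literal backslash", followed by a plain digit.
doubled = "(.+)(.+)" + "\\\\2" + "\\\\1"

# One backslash character before each digit: a real backreference.
single = "(.+)(.+)" + "\\2" + "\\1"

print(doubled)                          # (.+)(.+)\\2\\1
print(single)                           # (.+)(.+)\2\1
print(re.match(doubled, text))          # None
print(re.match(single, text).groups())  # ('cat', 'dog')
```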

EDIT:

You can also avoid getting the double \\ in the first place, like this:

import re

_pattern = "abba" #raw_input().strip()
_string = "catdogdogcat" #raw_input().strip()
hm = {}
rxp = ""
c = 1
for x in _pattern:
    if x in hm:
        rxp += hm[x]
        continue
    else:
        rxp += "(.+)"
        hm[x]="\\" + str(c)
        c+=1

print rxp
print re.match(rxp,_string)
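
The same idea can be wrapped in a small function. This is only a sketch for Python 3, not the asker's code: the name pattern_to_regex is hypothetical, and it assumes you want the whole string to follow the pattern, which is why it uses re.fullmatch (re.match only anchors at the start and would accept trailing text).

```python
import re

def pattern_to_regex(pattern):
    """Build a backreference regex from a letter pattern such as "abba"."""
    group_of = {}   # letter -> backreference text, e.g. "a" -> "\1"
    parts = []
    for letter in pattern:
        if letter in group_of:
            # Letter seen before: refer back to its capture group.
            parts.append(group_of[letter])
        else:
            # New letter: open a new capture group and remember its number.
            parts.append("(.+)")
            # "\\" in a normal string literal is ONE backslash character.
            group_of[letter] = "\\" + str(len(group_of) + 1)
    return "".join(parts)

rxp = pattern_to_regex("abba")
print(rxp)                                         # (.+)(.+)\2\1
print(re.fullmatch(rxp, "catdogdogcat").groups())  # ('cat', 'dog')
print(re.fullmatch(rxp, "catdogdogcatx"))          # None
```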