Panfeng Li - 3 years ago
Python Question

Confused about the backslash in Python

I understand that to match a literal backslash, it must be escaped in the regular expression. With raw string notation, this means r"\\". Without raw string notation, one must use "\\\\".

When I saw the code

string = re.sub(r"[^A-Za-z0-9(),!?\'\`]", " ", string)

I wondered what the backslashes in \' and \` mean, since the pattern works just as well with a plain ' and `:

string = re.sub(r"[^A-Za-z0-9(),!?'`]", " ", string)

Is there any need to add the backslashes?

Then I tried some examples in Python.

1) str1 = "\'s"
print(str1)
str2 = "'s"
print(str2)


Both print 's. I think this might be why the earlier code uses \'\` in string = re.sub(r"[^A-Za-z0-9(),!?\'\`]", " ", string). Is there any difference between "\'s" and "'s"?

2) string = 'adequately describe co-writer/director peter jackson\'s expanded vision of j . r . r . tolkien\'s middle-earth .'
re.match(r"\\", string)


re.match returns None, which seems to show that there is no backslash in the string. However, I do see backslashes in the source. Is the backslash in \' not actually a backslash?

Thanks for your help!

Answer Source

Check out https://docs.python.org/2.0/ref/strings.html for a better explanation.
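On the first part of the question, a small sketch (written for Python 3, which is an assumption, with a made-up sample string) suggests the backslashes inside the character class are redundant. The regex engine treats a backslash before a punctuation character as that literal character, so the two patterns should behave the same, and neither one adds the backslash itself to the class:

```python
import re

# Pattern from the question, with escaped quote and backtick.
escaped = r"[^A-Za-z0-9(),!?\'\`]"
# The same character class without the backslashes.
plain = r"[^A-Za-z0-9(),!?'`]"

# Hypothetical sample text containing ', `, # and one real backslash.
sample = "jackson's `vision` #1 a\\b"

cleaned_escaped = re.sub(escaped, " ", sample)
cleaned_plain = re.sub(plain, " ", sample)

print(cleaned_escaped)
print(cleaned_escaped == cleaned_plain)
```

The quote and backtick survive in both results, while the # and the real backslash are replaced by spaces.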

The problem with your second example is that string isn't a raw string, so the \' is interpreted as ' and no backslash ends up in the string. Compare a normal literal with a raw one:

>>> import re
>>> not_raw = 'adequately describe co-writer/director peter jackson\'s expanded vision of j . r . r . tolkien\'s middle-earth .'
>>> res1 = re.search(r'\\',not_raw)
>>> type(res1)
<type 'NoneType'>
>>> raw = r'adequately describe co-writer/director peter jackson\'s expanded vision of j . r . r . tolkien\'s middle-earth .'
>>> res2 = re.search(r'\\',raw)
>>> type(res2)
<type '_sre.SRE_Match'>
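The same effect can be checked without any regex. A minimal Python 3 sketch (the variable names are only illustrative) comparing the two literals from the question with a raw literal:

```python
# In a normal string literal, \' is an escape sequence for a single quote,
# so both literals below produce the same two-character string.
escaped = "\'s"
plain = "'s"
print(escaped == plain, len(escaped))

# In a raw literal the backslash is kept as a real character,
# so the string is one character longer.
raw = r"\'s"
print(len(raw), "\\" in raw)
```

This is why print shows 's for both str1 and str2: the backslash exists only in the source code, not in the string object.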

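The session above uses re.search rather than re.match because re.match only matches at the beginning of the string. For a fuller explanation, see "What is the difference between Python's re.search and re.match?"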
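A short Python 3 sketch, using a made-up string, of how the two calls differ when the backslash is not at position 0:

```python
import re

# Raw literal, so the backslash before the quote is a real character.
text = r"peter jackson\'s vision"

at_start = re.match(r"\\", text)   # only tries the pattern at position 0
anywhere = re.search(r"\\", text)  # scans forward through the whole string

print(at_start)          # no match object, since the string starts with "p"
print(anywhere.start())  # index of the first backslash in text
```

So even with a raw string, the re.match call from the question would still return None for this input.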
