mp252 - 1 year ago
Python Question

Searching regex expression, to return string with spaces

I am trying to search a string in Python using a regex for a particular word that has a space before it and a space after it. The string I want to search is:

JAKARTA, INDONESIA (1 February 2017)


and I want to get back the ", INDONESIA (" part so I can apply rtrim and ltrim to it, since the value returned could also be something like United Kingdom.

I have attempted this in my Python code:

import re
text = "JAKARTA, INDONESIA (1 February 2017)"
countryRegex = re.compile(r'^(,)(\s)([a-zA-Z]+)(\s)(\()$')
mo = countryRegex.search(text)
print(mo.group())


However, this raises the following error:

AttributeError: 'NoneType' object has no attribute 'group'


This indicates to me that no match object is being returned.

I then tried my regex on regex101, but it fails there too, with the message "Your regular expression does not match the subject string."

I assumed this would work, as I test for a literal comma (,), then a space (\s), then one or more letters ([a-zA-Z]+), then another space (\s), and finally an opening bracket, which I have made sure to escape (\(). Is there something wrong with my regex?

Answer Source

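Once you remove the anchors (^ matches the start-of-string position and $ matches the end-of-string position), the regex will match the string. With the anchors in place, the pattern has to match the entire string from start to end, which it cannot, because the string starts with JAKARTA rather than a comma. However, you may get just INDONESIA with a capturing group using: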

,\s*([a-zA-Z]+)\s*\(

See the regex demo. match.group(1) will contain the value.

Details:

  • ,\s* - a comma and zero or more whitespaces (replace * with + if you want at least 1 whitespace to be present)
  • ([a-zA-Z]+) - capturing group 1 matching one or more ASCII letters
  • \s* - zero or more whitespaces
  • \( - a ( literal symbol.

Sample Python code:

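import re

text = "JAKARTA, INDONESIA (1 February 2017)"
# No ^ or $ anchors, so the pattern may match anywhere in the string
countryRegex = re.compile(r',\s*([a-zA-Z]+)\s*\(')
mo = countryRegex.search(text)
# search() returns None when nothing matches, so check before calling group()
if mo:
    print(mo.group(1))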
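
To make the point about anchors concrete, here is a small sketch that compares the pattern from the question with the same pattern minus ^ and $. The names anchored and unanchored are made up for this illustration and do not appear in the question.

```python
import re

text = "JAKARTA, INDONESIA (1 February 2017)"

# Pattern from the question: ^ and $ require the WHOLE string to be
# comma, space, letters, space, "(" and nothing else.
anchored = re.compile(r'^(,)(\s)([a-zA-Z]+)(\s)(\()$')

# Same pattern with the anchors removed (illustrative name).
unanchored = re.compile(r'(,)(\s)([a-zA-Z]+)(\s)(\()')

print(anchored.search(text))   # None, hence the AttributeError on .group()

mo = unanchored.search(text)
print(repr(mo.group()))        # ', INDONESIA ('
print(mo.group(3))             # INDONESIA
```

Note that with every element wrapped in its own group, the letters end up in group 3; the pattern above keeps a single group so that the value is in group 1.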

An alternative regex that captures anything between a comma followed by whitespace and whitespace followed by ( is:

,\s*([^)]+?)\s*\(

See this regex demo. Here, [^)]+? matches one or more characters other than ), as few as possible.
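
As a rough sketch of how this alternative behaves on a multi-word name such as the United Kingdom case from the question, the pattern can be wrapped in a small helper. The function name extract_country and the LONDON sample string are assumptions made up for this example.

```python
import re

# Lazy pattern from the answer: capture everything between
# ",<whitespace>" and "<whitespace>(".
country_regex = re.compile(r',\s*([^)]+?)\s*\(')


def extract_country(text):
    """Return the captured text, or None when the pattern does not match."""
    mo = country_regex.search(text)
    return mo.group(1) if mo else None


print(extract_country("JAKARTA, INDONESIA (1 February 2017)"))
# Made-up sample input with a two-word country name:
print(extract_country("LONDON, United Kingdom (1 February 2017)"))
print(extract_country("no comma or bracket here"))
```

Because the surrounding \s* parts sit outside the capturing group, the captured value should come back without leading or trailing whitespace, so no separate trimming step is needed.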
