MohitC MohitC - 5 months ago 16
Python Question

Python regex not matching all patterns in string

I want to match a curl command's header pattern like:


-H 'key: value'


or


-H "key: value"


This switch can appear in the middle somewhere or at the end of string.

My pattern:

>>> header_pattern = re.compile(' \-H (?:\'|\").+?:.+?(?:\'|\")(?:\s+|$)')


My string:

>>> a = " -H 'Authorization: Bearer xxx' -H 'Content-Type: text/plain' "


Now I attempted to find all instances of this pattern but its matching just the first pattern.

>>> headers = header_pattern.findall(a)
>>> headers
[" -H 'Authorization: Bearer xxx' "]

Answer

Why don't use argparse module instead of regular expressions:

import argparse
import shlex


parser = argparse.ArgumentParser()
parser.add_argument('command')
parser.add_argument('url')
parser.add_argument('-d', '--data')
parser.add_argument('-b', '--data-binary', default=None)
parser.add_argument('-H', '--header', action='append', default=[])
parser.add_argument('--compressed', action='store_true')


curl_command = "curl https://google.com  -H 'Authorization: Bearer xxx' -H 'Content-Type: text/plain'"

tokens = shlex.split(curl_command)
parsed_args = parser.parse_args(tokens)
print(parsed_args.header)

Prints ['Authorization: Bearer xxx', 'Content-Type: text/plain'].

(Inspired by uncurl package).