Remove All Commas Between Quotes

I'm trying to remove all commas that are inside double quotes (") with Python:

'please,remove all the commas between quotes,"like in here, here, here!"'
                                                          ^     ^

I tried this, but it only removes the first comma inside the quotes:

re.sub(r'(".*?),(.*?")',r'\1\2','please,remove all the commas between quotes,"like in here, here, here!"')


'please,remove all the commas between quotes,"like in here here, here!"'

How can I make it remove all the commas inside the quotes?

Answer

Assuming you don't have unbalanced or escaped quotes, you can use this regex based on negative lookahead:

>>> import re
>>> s = r'foo,bar,"foobar, barfoo, foobarfoobar"'
>>> re.sub(r'(?!(([^"]*"){2})*[^"]*$),', '', s)
'foo,bar,"foobar barfoo foobarfoobar"'

This regex matches a comma only when it is inside double quotes. It does so with a negative lookahead that asserts the number of quotes after the comma is NOT even; an odd number of quotes ahead means the comma sits inside a quoted string.
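As a quick check, here is a minimal sketch that applies the same pattern to the string from the question. The helper name strip_quoted_commas is only for illustration, and the sketch still assumes balanced, unescaped double quotes:

```python
import re

def strip_quoted_commas(text):
    # Hypothetical helper name; the pattern is the one shown above.
    # Assumes the double quotes in `text` are balanced and unescaped.
    return re.sub(r'(?!(([^"]*"){2})*[^"]*$),', '', text)

question_input = 'please,remove all the commas between quotes,"like in here, here, here!"'
print(strip_quoted_commas(question_input))
# please,remove all the commas between quotes,"like in here here here!"
```

Both commas inside the quotes are removed, while the two commas outside the quotes are left alone.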

Notes about the lookahead (?!...):

  • ([^"]*"){2} matches a pair of quotes (any non-quote text followed by a quote, twice)
  • (([^"]*"){2})* matches 0 or more pairs of quotes
  • [^"]*$ makes sure there are no more quotes between the last matched quote and the end of the string
  • So (?!...) asserts that the number of quotes ahead is not even, and the regex therefore matches only commas inside a quoted string.
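The breakdown above can also be written directly into the pattern with re.VERBOSE, which lets the regex carry its own comments. This is only a sketch: the name QUOTED_COMMA is made up for illustration, and the groups are written as non-capturing (?:...), which does not change what the pattern matches:

```python
import re

# Hypothetical name; same logic as the one-line pattern, spelled out.
QUOTED_COMMA = re.compile(r'''
    (?!                      # negative lookahead: reject the comma if what follows is
        (?:
            (?:[^"]*"){2}    #   a pair of quotes
        )*                   #   repeated zero or more times
        [^"]*$               #   and then no more quotes up to the end of the string
    )
    ,                        # the comma itself
''', re.VERBOSE)

print(QUOTED_COMMA.sub('', 'foo,bar,"foobar, barfoo, foobarfoobar"'))
# foo,bar,"foobar barfoo foobarfoobar"
```

Compiling the pattern once is also handy if you need to run it over many lines.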
