Paul Wicks - 8 months ago
Java Question

Regex to match only commas not in parentheses?

I have a string that looks something like the following:


I'd like to create a regex that matches the commas, but only the commas that are not inside of parentheses (in the example above, all of the commas except for the two after 23 and 45). How would I do this (Java regular expressions, if that makes a difference)?


Assuming that there are no nested parentheses (with nesting, a Java regex can't do this, because java.util.regex does not support recursive matching):

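import java.util.regex.Pattern;

Pattern regex = Pattern.compile(
    ",         # Match a comma\n" +
    "(?!       # only if it's not followed by...\n" +
    " [^(]*    #   any number of characters except opening parens\n" +
    " \\)      #   followed by a closing parens\n" +
    ")         # End of lookahead",
    Pattern.COMMENTS);  // required: allows whitespace and # comments inside the pattern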

This regex uses a negative lookahead assertion to ensure that the next parenthesis after the comma (if any) is not a closing parenthesis. Only then is the comma allowed to match.
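
As a minimal sketch of how the pattern might be used, the class below splits a string on the commas that the regex matches. The class name, the method name, and the sample input are all made up for illustration (the sample is not the asker's original string), and the pattern is the compact single-line form of the commented one above.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are illustrative only.
class SplitOutsideParens {
    // Compact form of the pattern: a comma that is NOT followed by
    // any run of non-"(" characters and then a ")".
    static final Pattern COMMA_OUTSIDE_PARENS = Pattern.compile(",(?![^(]*\\))");

    // Splits the input on every comma the pattern matches.
    static String[] splitOutsideParens(String input) {
        return COMMA_OUTSIDE_PARENS.split(input);
    }

    public static void main(String[] args) {
        // Assumed sample input with one non-nested parenthesized group.
        String sample = "12,44,foo,bar,(23,45,200),6";
        System.out.println(Arrays.toString(splitOutsideParens(sample)));
        // prints: [12, 44, foo, bar, (23,45,200), 6]
    }
}
```

The no-nesting assumption matters here: for an input such as "a,(b,(c),d),e", the comma after b is followed directly by an opening parenthesis, so the lookahead fails to see the enclosing group and that comma is matched even though it sits inside parentheses.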