Graeme Perrow Graeme Perrow - 4 months ago 8x
Perl Question

How can I match a quote-delimited string with a regex?

If I'm trying to match a quote-delimited string with a regex, which of the following is "better" (where "better" means both more efficient and less likely to do something unexpected):

/"[^"]+"/ # match quote, then everything that's not a quote, then a quote


/".+?"/ # match quote, then *anything* (non-greedy), then a quote

Assume for this question that empty strings (i.e. "") are not an issue. It seems to me (no regex newbie, but certainly no expert) that these will be equivalent.

Update: Upon reflection, I think changing the
characters to
will handle empty strings correctly anyway.


You should use number one, because number two is bad practice. Consider that the developer who comes after you wants to match strings that are followed by an exclamation point. Should he use:




The difference appears when you have the subject:

"one" "two"!

The first regex matches:


while the second regex matches:

"one" "two"!

Always be as specific as you can. Use the negated character class when you can.

Another difference is that [^"]* can span across lines, while .* doesn't unless you use single line mode. [^"\n]* excludes the line breaks too.

As for backtracking, the second regex backtracks for each and every character in every string that it matches. If the closing quote is missing, both regexes will backtrack through the entire file. Only the order in which then backtrack is different. Thus, in theory, the first regex is faster. In practice, you won't notice the difference.