Gops AB - 2 months ago
Java Question

Why does adding white space break my regex?

(^\s*\d+\)(.*) | ) | (^\s*Q\d+\.\s*(.*))


The above regex does not match this string:
Q1. qeqwewqeqeq qerqer


But if I remove the white space before and after the last |:


(^\s*\d+\)(.*) | )|(^\s*Q\d+\.\s*(.*))


It matches my string.

What does white space mean in a regex? Is it equivalent to \s? Removing it hurts the readability of my pattern.

Answer

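Inside a regex pattern, a space is a literal character that matches a space. In your first pattern, the last alternative therefore starts with a literal space followed by ^, and a space can never sit immediately before the start of the input (or of a line), so that alternative cannot match. If you want to format your pattern with spaces, tabs, and newlines that the regex engine ignores, use the (?x) inline modifier or the Pattern.COMMENTS flag.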
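As a rough illustration, the sketch below compiles one spaced pattern twice: once as written and once with Pattern.COMMENTS. It assumes a simplified two-alternative version of the pattern from the question (the extra "| )" branch is left out), and the class and method names are made up for this example.

```java
import java.util.regex.Pattern;

class FreeSpacingDemo {
    // Simplified version of the question's pattern (an assumption for this sketch):
    // two alternatives separated by " | ", with the spaces kept for readability.
    static final String SPACED = "(^\\s*\\d+\\)(.*)) | (^\\s*Q\\d+\\.\\s*(.*))";

    // Default mode: the spaces around | are literal and must appear in the input.
    static boolean findsWithLiteralSpaces(String input) {
        return Pattern.compile(SPACED).matcher(input).find();
    }

    // Pattern.COMMENTS (same effect as a leading (?x)): unescaped whitespace is ignored.
    static boolean findsWithComments(String input) {
        return Pattern.compile(SPACED, Pattern.COMMENTS).matcher(input).find();
    }

    public static void main(String[] args) {
        String input = "Q1. qeqwewqeqeq qerqer";
        System.out.println(findsWithLiteralSpaces(input)); // false
        System.out.println(findsWithComments(input));      // true
    }
}
```

Passing the flag to Pattern.compile and prefixing the pattern with (?x) should behave the same way; the flag is used here only so that the pattern string stays identical in both calls.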

In a pattern that uses the (?x) option, an unescaped space is ignored, so you must escape a space (a backslash followed by the space) to match it literally. Alternatively, match any whitespace character with \s:

\s  A whitespace character: [ \t\n\x0B\f\r]
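To make the difference concrete, here is a minimal sketch comparing three (?x) patterns: one with layout-only spaces, one with an escaped space, and one with \s. The patterns, class name, and helper method are hypothetical and chosen only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapedSpaceDemo {
    // Unescaped spaces are layout only under (?x); none of them matches anything.
    static final String LAYOUT_ONLY = "(?x) ^ Q\\d+ \\.     (.*) $";
    // "\\ " is an escaped space: it matches exactly one literal space.
    static final String ESCAPED_SPACE = "(?x) ^ Q\\d+ \\. \\  (.*) $";
    // \s matches any single whitespace character, for example a tab.
    static final String ANY_WHITESPACE = "(?x) ^ Q\\d+ \\. \\s (.*) $";

    // Returns capture group 1, or null when the pattern does not match the whole input.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println("[" + firstGroup(LAYOUT_ONLY, "Q1. text") + "]");      // [ text]
        System.out.println("[" + firstGroup(ESCAPED_SPACE, "Q1. text") + "]");    // [text]
        System.out.println("[" + firstGroup(ESCAPED_SPACE, "Q1.\ttext") + "]");   // [null]
        System.out.println("[" + firstGroup(ANY_WHITESPACE, "Q1.\ttext") + "]");  // [text]
    }
}
```

With layout-only spaces, the space after the dot ends up inside the captured group; the escaped space and \s consume it before the group starts.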

Note that if you add the (?U) modifier or the Pattern.UNICODE_CHARACTER_CLASS flag, \s will match all Unicode whitespace (like [\p{Zs}\t\r\n]).
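A small sketch of that difference, using a no-break space (U+00A0) and an em space (U+2003) as sample characters; the class and method names are invented for this example.

```java
import java.util.regex.Pattern;

class UnicodeWhitespaceDemo {
    // Default \s covers only [ \t\n\x0B\f\r].
    static boolean isWhitespaceDefault(String s) {
        return Pattern.compile("\\s").matcher(s).matches();
    }

    // With UNICODE_CHARACTER_CLASS (inline form: (?U)), \s follows the Unicode White_Space property.
    static boolean isWhitespaceUnicode(String s) {
        return Pattern.compile("\\s", Pattern.UNICODE_CHARACTER_CLASS).matcher(s).matches();
    }

    public static void main(String[] args) {
        String noBreakSpace = "\u00A0";
        String emSpace = "\u2003";
        System.out.println(isWhitespaceDefault(noBreakSpace)); // false
        System.out.println(isWhitespaceUnicode(noBreakSpace)); // true
        System.out.println(isWhitespaceDefault(emSpace));      // false
        System.out.println(isWhitespaceUnicode(emSpace));      // true
    }
}
```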
