aravind aravind - 4 months ago 8
Java Question

Regex pattern for repeated words

I am very new to regex and am learning it now. I have a requirement like this:

The string starts with #newline# and also ends with #newline#. Between these two tokens, there can be zero or more spaces or zero or more occurrences of #newline#.

Below is an example:

#newline# #newline# #newline##newline# #newline##newline##newline#.


How can I write a regex for this?

I have tried this, but it is not working:

^#newline#|(\s+#newline#)|#newline#|#newline#$

Answer

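Your pattern ^#newline#|(\s+#newline#)|#newline#|#newline#$ is an alternation of four independent branches. It matches either a #newline# at the start of the string (^#newline#), or one or more whitespace characters followed by #newline# ((\s+#newline#)), or a #newline# anywhere, or a #newline# at the end of the string (#newline#$). The last branch never matches, because the third branch already catches every occurrence of #newline#. As a result, the pattern finds individual tokens instead of validating the string as a whole.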
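To illustrate the problem, here is a minimal sketch that runs the pattern from the question with Matcher.find(). The class and method names are made up for this example. Because each branch is tried on its own, a string that merely contains the token somewhere still produces a match.

```java
import java.util.regex.Pattern;

public class AlternationDemo {
    // The pattern from the question, unchanged.
    static final Pattern ORIGINAL =
        Pattern.compile("^#newline#|(\\s+#newline#)|#newline#|#newline#$");

    // Returns true if any branch of the alternation matches anywhere in the input.
    static boolean originalFinds(String input) {
        return ORIGINAL.matcher(input).find();
    }

    public static void main(String[] args) {
        // Invalid input: other text surrounds the token, yet a match is found.
        System.out.println(originalFinds("foo #newline# bar")); // true
        // No token at all: nothing matches.
        System.out.println(originalFinds("foo bar")); // false
    }
}
```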

You may match these strings with

^#newline#(?:\s*#newline#)*$

or (if there should be at least 2 occurrences of #newline# in the string)

^#newline#(?:\s*#newline#)+$
                          ^

See the regex demo.

  • ^ - start of string
  • #newline# - literal string
  • (?:\s*#newline#)* - a non-capturing group repeated zero or more times (replace * with + to require at least one repetition), matching:
    • \s* - zero or more whitespace characters
    • #newline# - a literal substring
  • $ - end of string.
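The difference between the * and + variants can be checked with a short sketch like the one below. The class, constant, and method names are illustrative only, and the test strings are invented for the example.

```java
import java.util.regex.Pattern;

public class NewlineTokens {
    // Zero or more further tokens: a single #newline# is accepted.
    static final Pattern ONE_OR_MORE =
        Pattern.compile("^#newline#(?:\\s*#newline#)*$");
    // One or more further tokens: at least two #newline# are required.
    static final Pattern TWO_OR_MORE =
        Pattern.compile("^#newline#(?:\\s*#newline#)+$");

    static boolean atLeastOne(String s) {
        return ONE_OR_MORE.matcher(s).find();
    }

    static boolean atLeastTwo(String s) {
        return TWO_OR_MORE.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(atLeastOne("#newline#"));             // true
        System.out.println(atLeastTwo("#newline#"));             // false
        System.out.println(atLeastTwo("#newline# #newline#"));   // true
        // Any character other than whitespace between tokens is rejected.
        System.out.println(atLeastTwo("#newline# x #newline#")); // false
    }
}
```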

Java demo:

String s = "#newline#  #newline# #newline##newline# #newline##newline##newline#";
System.out.println(s.matches("#newline#(?:\\s*#newline#)+"));
// => true

Note: String.matches() requires the entire string to match, so the pattern is implicitly anchored and ^ and $ can be omitted.
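The anchoring behaviour can be seen by comparing String.matches() with Matcher.find() on the same unanchored pattern. This is a small sketch with hypothetical class and method names; the input string is made up for the example.

```java
import java.util.regex.Pattern;

public class AnchoringDemo {
    // Same pattern as in the demo above, without ^ and $.
    static final String UNANCHORED = "#newline#(?:\\s*#newline#)+";

    // String.matches() succeeds only if the entire string matches.
    static boolean wholeMatch(String s) {
        return s.matches(UNANCHORED);
    }

    // Matcher.find() succeeds if any substring matches.
    static boolean partialMatch(String s) {
        return Pattern.compile(UNANCHORED).matcher(s).find();
    }

    public static void main(String[] args) {
        String s = "x #newline##newline# y";
        System.out.println(wholeMatch(s));   // false
        System.out.println(partialMatch(s)); // true
    }
}
```

If you switch from matches() to find(), add ^ and $ back to keep the whole-string check.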
