Vishal Afre - 4 months ago
Java Question

Java regular expression to find occurrence of a particular substring n times

I have a string of letters, each followed by a digit (1-4). For example:

String input = "A1 B1 P3 D1 D2 D3 D4 F3 F3 Z1 Z2 Z3 Z4 P2";


I want to find the substrings in which the same letter appears with all four digits (1 to 4) consecutively. In the example above, these are D1 D2 D3 D4 and Z1 Z2 Z3 Z4.

Is there a way to match this with a regular expression in Java?

Answer

Yes, you can do it with backreferences.

Here is a fully executable program:

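import java.util.regex.*;
public class Four {
    static String input = "A1 B1 P3 D1 D2 D3 D4 F3 F3 Z1 Z2 Z3 Z4 P2";
    // Group 1 captures the letter; each \\1 (regex \1) requires that same letter again.
    static Pattern four = Pattern.compile("([A-Z])1 \\12 \\13 \\14");
    public static void main(String[] args) {
        Matcher m = four.matcher(input);
        // find() advances to the next non-overlapping match each time it is called.
        while (m.find()) {
            // group(1) is the captured letter only.
            System.out.println(m.group(1));
        }
    }
}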

It outputs

D
Z

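The regex you are looking for, written as a Java string literal, is

"([A-Z])1 \\12 \\13 \\14"

The doubled backslashes are Java string escaping; the regex itself is ([A-Z])1 \12 \13 \14.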

It matches

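  • First, a capital letter A through Z, captured as group 1
  • Followed by a literal 1
  • Then a space
  • Then \1, a backreference, which matches the same text that was captured by the first set of parentheses. So if D1 was matched earlier, \1 matches only D.
  • Next, a literal 2. Because the pattern has only one capturing group, Java reads \12 as the backreference \1 followed by the digit 2, not as a reference to group 12.
  • The same sequence (space, \1, digit) then repeats for the 3 and the 4.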
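To see the effect of the backreference in isolation, here is a minimal sketch. The class name BackrefCheck and the helper hasRun are made up for illustration; only the pattern comes from the answer. It checks one run where the letter stays the same and one where it changes.

```java
import java.util.regex.Pattern;

class BackrefCheck {
    // Same pattern as in the answer: one capturing group, then \1 plus a literal digit.
    static final Pattern FOUR = Pattern.compile("([A-Z])1 \\12 \\13 \\14");

    // Hypothetical helper: true if text contains "X1 X2 X3 X4" for a single letter X.
    static boolean hasRun(String text) {
        return FOUR.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(hasRun("D1 D2 D3 D4")); // same letter throughout
        System.out.println(hasRun("D1 E2 D3 D4")); // letter changes, so \1 cannot match
    }
}
```

The first call should print true and the second false, since \1 only accepts the letter that group 1 captured.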

You may wish to replace each single space with \\s+, which matches one or more whitespace characters, if your use case allows it.
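As a rough sketch of that variation, the version below swaps every space in the pattern for \\s+. The class name FlexibleSpaces, the helper letters, and the sample input with a double space and a tab are assumptions made for the example, not part of the original question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FlexibleSpaces {
    // Each literal space from the original pattern is replaced by \s+ (one or more whitespace chars).
    static final Pattern FOUR_WS = Pattern.compile("([A-Z])1\\s+\\12\\s+\\13\\s+\\14");

    // Hypothetical helper: collects the letter of every complete 1-to-4 run found in text.
    static List<String> letters(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = FOUR_WS.matcher(text);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        // Irregular separators: two spaces after D1 and a tab after D2.
        System.out.println(letters("D1  D2\tD3 D4 Z1 Z2 Z3 Z4"));
    }
}
```

Note that \\s also matches tabs and newlines, so a run split across lines would count as a match; stay with the literal space if that is not wanted.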

EDIT

group(1) returns the text captured by the parentheses. If you change this to group(0), you get the entire matched string instead, and the program then outputs

D1 D2 D3 D4
Z1 Z2 Z3 Z4
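If both values are useful, they can be read from the same match. The following is only a sketch: the class GroupsDemo and the method runsByLetter are invented names, and storing results in a map is one possible choice among several.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupsDemo {
    static final Pattern FOUR = Pattern.compile("([A-Z])1 \\12 \\13 \\14");

    // Hypothetical helper: maps each captured letter (group 1) to the full run (group 0).
    static Map<String, String> runsByLetter(String text) {
        Map<String, String> runs = new LinkedHashMap<>(); // preserves the order of discovery
        Matcher m = FOUR.matcher(text);
        while (m.find()) {
            runs.put(m.group(1), m.group(0));
        }
        return runs;
    }

    public static void main(String[] args) {
        String input = "A1 B1 P3 D1 D2 D3 D4 F3 F3 Z1 Z2 Z3 Z4 P2";
        System.out.println(runsByLetter(input));
    }
}
```

With the input from the question, the map should hold D mapped to D1 D2 D3 D4 and Z mapped to Z1 Z2 Z3 Z4.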