luke luke - 1 month ago 6
Java Question

Writing a single regular expression for a string in Java

I am trying to write a regular expression for a string. Say the string is RBY_YBR, where _ represents an empty position. Letters can be moved into empty positions repeatedly, so the string can be rearranged into RRBBYY_ . The result may contain two or more groups of letters, or just a single group such as RRR .


Conditions

1). For every letter, its left or right neighbor should be the same letter.

2). If there is no _ then the letters must already be grouped, like RRBBYY, not RBRBYY or RBYRBY etc.

3). There can be more than one underscore _ .

With the regular expression I am trying to find out whether a given string qualifies, that is, whether its characters can be moved into the _ positions to form a pattern of consecutive identical letters.

The regular expression which I wrote is


String regEx = "[A-ZA-Z_]";


But this regular expression fails for RBRB, since there is no empty position to move the characters into and RBRB is not already grouped.

How can I write an effective regular expression to solve this?

Answer

OK, as I understand it, a matching string must either consist only of identical characters grouped together or contain at least one underscore.

So, RRRBBR would be invalid, while RRRRBB, RRRBBR_, and RRRBB_R_ would all be valid.

Following a comment from the asker, there is an additional condition: every character must occur either 0 times or at least 2 times.

As far as I know, this is not possible with regular expressions, because a regular expression in the classical sense is a finite-state machine without "storage". To verify such a string, you would have to "store" each character already found so that you can check it does not appear again later.
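
To make the "storage" point concrete, here is a minimal sketch. The class name RunPatternDemo, the method onlyRuns, and the pattern itself are illustrative assumptions, not something from the question. A backreference can check that letters come in runs of two or more, but this particular pattern keeps no record of runs that have already ended:

```java
import java.util.regex.Pattern;

class RunPatternDemo {
    // Hypothetical pattern: zero or more runs, each one a letter followed by
    // one or more repetitions of that same letter (via the backreference \1).
    static final Pattern RUNS = Pattern.compile("(?:([A-Z])\\1+)*");

    // True if the whole string consists of such runs.
    static boolean onlyRuns(String s) {
        return RUNS.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(onlyRuns("RRBBYY")); // letters already grouped
        System.out.println(onlyRuns("RBRB"));   // no letter is repeated in place
        System.out.println(onlyRuns("RRBBRR")); // R starts a second run
    }
}
```

The three calls print true, false and true. The last result shows the gap: the pattern accepts RRBBRR because nothing in it remembers that a run of R has already ended.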

I would suggest a very simple method for verifying such strings:

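/**
 * Returns true if s contains only 'A'..'Z' and '_', no letter occurs
 * exactly once, and (when there is no '_') equal letters are already grouped.
 */
public static boolean matchesMyPattern(String s) {
    // Without an underscore nothing can be moved.
    boolean withUnderscore = s.contains("_");

    // Number of occurrences of each letter 'A'..'Z' seen so far.
    int[] found = new int[26];

    for (int i = 0; i < s.length(); i++) {
        char ch = s.charAt(i);
        // Reject anything that is neither an uppercase letter nor '_'.
        if (ch != '_' && (ch < 'A' || ch > 'Z')) {
            return false;
        }

        // No underscore: a letter seen earlier must directly continue its run.
        if (ch != '_' && i > 0 && s.charAt(i - 1) != ch && found[ch - 'A'] > 0
                && !withUnderscore) {
            return false;
        }
        if (ch != '_') {
            found[ch - 'A']++;
        }
    }

    // A letter that occurs exactly once can never get an equal neighbor.
    for (int i = 0; i < found.length; i++) {
        if (found[i] == 1) {
            return false;
        }
    }

    return true;
}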
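
As a usage sketch, the same interpretation can be written as two separate rules and run against the strings mentioned above. The class GroupingCheckDemo and its method names are hypothetical, chosen only for this illustration. It is a restatement under the same assumptions, not a different algorithm:

```java
class GroupingCheckDemo {
    // Rule A: only 'A'..'Z' and '_' are allowed, and no letter occurs exactly once.
    static boolean noSingleLetters(String s) {
        int[] count = new int[26];
        for (char ch : s.toCharArray()) {
            if (ch == '_') {
                continue;
            }
            if (ch < 'A' || ch > 'Z') {
                return false;
            }
            count[ch - 'A']++;
        }
        for (int c : count) {
            if (c == 1) {
                return false;
            }
        }
        return true;
    }

    // Rule B: no letter starts a second run.
    // Assumes s contains only 'A'..'Z' (it is called only when there is no '_').
    static boolean alreadyGrouped(String s) {
        boolean[] seen = new boolean[26];
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            boolean startsNewRun = (i == 0) || s.charAt(i - 1) != ch;
            if (startsNewRun && seen[ch - 'A']) {
                return false;
            }
            seen[ch - 'A'] = true;
        }
        return true;
    }

    // Rule A always applies; Rule B only when nothing can be moved.
    static boolean canGroup(String s) {
        if (!noSingleLetters(s)) {
            return false;
        }
        return s.contains("_") || alreadyGrouped(s);
    }

    public static void main(String[] args) {
        String[] samples = {"RBY_YBR", "RBRB", "RRRBBR", "RRRRBB", "RRRBBR_", "RRRBB_R_"};
        for (String sample : samples) {
            System.out.println(sample + " -> " + canGroup(sample));
        }
    }
}
```

Under this reading, RBY_YBR, RRRRBB, RRRBBR_ and RRRBB_R_ come out as true, while RBRB and RRRBBR come out as false.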