iDemigod - 2 months ago
Java Question

Regex for detecting repeating symbols

I'm looking for a regular expression that will detect repeating symbols in a String. So far I haven't found a solution that fits all my requirements.

Requirements are pretty simple:


  • detect any repeating symbol in a String;

  • be able to set the repeat count (e.g. more than twice)



Examples of the required detection (for symbol 'a' occurring more than 2 times; true if detected, false otherwise):

"Abcdefg" - false

"AbcdaBCD" - false

"abcd_ab_ab" - true (symbol 'a' used three times)

"aabbaabb" - true (symbol 'a' used four times)

Since I'm not a pro with regexes and their usage, a code snippet and an explanation would be appreciated!

Thanks!

Answer

I think that

(.).*\1

would work:

  • (.) match a single character and capture
  • .* match any intervening characters
  • \1 match the captured group again.

(You'd need to compile with the DOTALL flag, or replace . with [\s\S] or similar, if the string contains characters not ordinarily matched by ., such as line terminators.)
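As a minimal sketch of how that pattern could be used from Java (the class and method names here are illustrative, not part of the question), note that the backslash has to be doubled inside a Java string literal and that find() is used rather than matches():

```java
import java.util.regex.Pattern;

// Illustrative helper; the class and method names are assumptions.
class RepeatedSymbolDemo {
    // DOTALL lets '.' match line terminators as well.
    private static final Pattern REPEATED =
            Pattern.compile("(.).*\\1", Pattern.DOTALL);

    // True if some character occurs at least twice anywhere in the input.
    static boolean hasRepeatedSymbol(String input) {
        // find() looks for the pattern anywhere in the input; matches()
        // would require the whole string to start and end with the same character.
        return REPEATED.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(hasRepeatedSymbol("Abcdefg"));    // false
        System.out.println(hasRepeatedSymbol("abcd_ab_ab")); // true
    }
}
```

The comparison is case-sensitive, so 'A' and 'a' count as different symbols, which agrees with the "AbcdaBCD" example in the question.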

And if you want to require that it is found at least 3 times, wrap the last two parts in a group and add a quantifier:

(.)(.*\1){2}

etc.
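To make the repeat count configurable, one option is to build the quantifier from a parameter. The sketch below assumes a hypothetical minCount argument meaning "at least this many occurrences", so the question's "more than twice" corresponds to minCount = 3:

```java
import java.util.regex.Pattern;

// Illustrative helper; names and the minCount parameter are assumptions.
class RepeatCountRegex {
    // Builds (.)(.*\1){minCount-1}: one captured character followed by
    // minCount - 1 later occurrences of that same character.
    static Pattern atLeast(int minCount) {
        if (minCount < 2) {
            throw new IllegalArgumentException("minCount must be at least 2");
        }
        return Pattern.compile("(.)(.*\\1){" + (minCount - 1) + "}", Pattern.DOTALL);
    }

    // True if some character occurs at least minCount times in the input.
    static boolean hasSymbolRepeated(String input, int minCount) {
        return atLeast(minCount).matcher(input).find();
    }

    public static void main(String[] args) {
        // The four examples from the question, with "more than 2 times".
        for (String s : new String[] {"Abcdefg", "AbcdaBCD", "abcd_ab_ab", "aabbaabb"}) {
            System.out.println(s + " -> " + hasSymbolRepeated(s, 3));
        }
    }
}
```

If the same count is checked many times, it is probably worth compiling the Pattern once and reusing it instead of rebuilding it on every call.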

This is going to be pretty inefficient, though, because it has to do the "search for the next matching character" scan for every character in the string, making it at least quadratic.

You might be as well off not using regular expressions, e.g.

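import java.util.Arrays;

// Returns true if any character occurs at least twice in str.
// (The method name is illustrative.)
static boolean hasRepeatedChar(String str) {
  char[] cs = str.toCharArray();
  Arrays.sort(cs);  // sorting puts equal characters next to each other
  for (int i = 1; i < cs.length; ++i) {
    if (cs[i] == cs[i - 1]) {
      return true;  // two adjacent equal characters: a repeat
    }
  }
  return false;
}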

This sorts all of the same characters together, so a single pass over the sorted array looking for adjacent equal characters is enough. As written, it detects a character that appears at least twice.
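The same sort-and-scan idea can be extended to a configurable count by tracking the length of the current run of equal characters. This is a sketch under the assumption that a hypothetical minCount parameter means "at least this many occurrences"; the class and method names are illustrative:

```java
import java.util.Arrays;

// Illustrative helper; names and the minCount parameter are assumptions.
class RepeatCountSort {
    // True if some character occurs at least minCount times in str.
    static boolean hasSymbolRepeated(String str, int minCount) {
        if (minCount <= 1) {
            return !str.isEmpty(); // any character occurs at least once
        }
        char[] cs = str.toCharArray();
        Arrays.sort(cs); // equal characters become adjacent
        int run = 1;     // length of the current run of equal characters
        for (int i = 1; i < cs.length; ++i) {
            run = (cs[i] == cs[i - 1]) ? run + 1 : 1;
            if (run >= minCount) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasSymbolRepeated("aabbaabb", 3)); // true
        System.out.println(hasSymbolRepeated("AbcdaBCD", 3)); // false
    }
}
```

Because the array is a copy made by toCharArray(), sorting it leaves the original String untouched.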