Xavier Dass - 7 months ago
Java Question

Regex to break non-whitespace string into individual characters and digit chunks in Java

I've been reading and searching for a while now, but can't find anything that quite answers my case.

Currently, I have a string (str) such as "a1bc23def456" being split using the following regex:

String[] stuff = str.split("(?<=\\D)(?=\\d)|(?<=\\d)(?=\\D)");


which gives me a string array that looks like

["a","1","bc","23","def","456"]


but what I am trying to get is a split around every letter and before each number begins, so that my array will look like:

["a","1","b","c","23","d","e","f","456"]


so numbers are split from letters, but not from themselves, and letters are split from everything.

I am quite fresh to using regex with Java, so please go easy.

Edit:
This is not quite like the linked "duplicate" question, because the regex answers provided there also result in the same splitting pattern.

I am trying to split groupings of letters as well. As I said above: "so numbers are split from letters, but not from themselves, and letters are split from everything [including other letters]."

Answer

The simplest regex that works is:

(?<=\D)|(?=\D)

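This splits before or after every non-digit character (\D means non-digit, which in this context is a letter). Because no split point ever falls between two digits, runs of digits stay together.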

Demo:

System.out.println(Arrays.toString("a1bc23def456".split("(?<=\\D)|(?=\\D)")));

Output:

[a, 1, b, c, 23, d, e, f, 456]
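An alternative worth considering is to match the tokens directly instead of splitting between them. The sketch below is not part of the original answer: the class name DigitChunkTokenizer and the method tokenize are hypothetical names chosen for illustration. It uses the pattern \d+|\D, which reads as "a run of digits, or one non-digit character". It also does not depend on how split treats a zero-width match at the start of the string, which changed in Java 8.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption, not from the answer above.
class DigitChunkTokenizer {
    // A token is either a run of digits (\d+) or a single non-digit character (\D).
    private static final Pattern TOKEN = Pattern.compile("\\d+|\\D");

    // Collects every token in order of appearance.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(input);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [a, 1, b, c, 23, d, e, f, 456]
        System.out.println(tokenize("a1bc23def456"));
    }
}
```

If an array is needed, as in the question, tokenize(str).toArray(new String[0]) converts the list.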