Alex M. - 10 days ago
Java Question

How does split method work when using a non-whitespace character in Java?

I do not understand how the split() method of the String class works when the regex matches a non-whitespace character. I have found some partial answers on the internet, but I still don't understand. Here is my code:

public class Test {
    public static void main(String[] args) {
        String myX = "x xx ";
        // "\\S" is the regex \S: any single non-whitespace character
        String[] x = myX.split("\\S");

        for (String s : x) {
            System.out.print("\"" + s + "\", ");
        }
        System.out.println(x.length);
    }
}


My logic is as follows:

  1. Is the first 'x' a non-whitespace character? Yes, so the array should contain ""
  2. Is the ' ' a non-whitespace character? No, so the array should contain " "
  3. Is the second 'x' a non-whitespace character? Yes, so the array should contain ""
  4. Is the third 'x' a non-whitespace character? Yes, so the array should contain ""
  5. Is the last ' ' a non-whitespace character? No, so the array should contain " "


In my opinion the array should look like this:
["", " ", "", "", " "]


Why does the array look like ["", " ", "", " "] and have length 4 instead of 5? There are two x characters in the middle of the string, but the array seems to account for only one of them.

Thanks!

Answer

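The question you should ask yourself instead is: what do we have between two consecutive separators? Every character that matches \S (here, every x) is a separator, and split() returns the text found between separators, not one entry per character.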

  1. When it finds the first x, the extracted value is "" because nothing precedes it: the beginning of the String acts as the previous separator.
  2. When it finds the second x, the extracted value is " " because the previous separator was the first x and " " is what lies in between.
  3. When it finds the third x, the extracted value is "" because the previous separator was the character immediately before it, so there is nothing to extract.
  4. When it reaches the end of the String, the extracted value is " " because the previous separator was the third x and a single space lies between it and the end.

So the result is indeed "", " ", "", " "
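
The four steps above can be reproduced by hand. The following is only a sketch: the class SplitTrace and the method piecesBetweenSeparators are made-up names for illustration, not part of the JDK, and the loop is a simplified model of split() rather than its actual implementation. It uses java.util.regex.Matcher to locate each separator and collects whatever lies between one separator and the next. Unlike split(regex), it does not discard trailing empty strings, which makes no difference for this input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SplitTrace {

    // Hypothetical helper: collects the text between consecutive matches of regex.
    // Simplified model of String.split(); trailing empty strings are kept here.
    static List<String> piecesBetweenSeparators(String input, String regex) {
        List<String> pieces = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        int previousEnd = 0; // the start of the String acts as the first "separator"
        while (matcher.find()) {
            // Everything from the end of the previous separator up to this one
            pieces.add(input.substring(previousEnd, matcher.start()));
            previousEnd = matcher.end();
        }
        // Whatever remains between the last separator and the end of the String
        pieces.add(input.substring(previousEnd));
        return pieces;
    }

    public static void main(String[] args) {
        String myX = "x xx ";
        List<String> manual = piecesBetweenSeparators(myX, "\\S");
        String[] builtIn = myX.split("\\S");

        for (String s : manual) {
            System.out.print("\"" + s + "\", ");
        }
        System.out.println(manual.size());

        // Compare the hand-rolled result with the real split()
        System.out.println(manual.equals(Arrays.asList(builtIn)));
    }
}
```

For "x xx " both approaches give four pieces: "", " ", "", " ". Three separators (the three x characters) always delimit four pieces, which is why the length is 4 and not 5.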
