lakesh lakesh - 11 months ago 74
iOS Question

Difference between \b and \s in Regular Expression

I was learning regular expressions in iOS and saw this tutorial:

It reads like this for \b:

\b matches word boundary characters such as spaces and punctuation. to\b will match the "to" in "to the moon" and "to!", but it will not match "tomorrow". \b is handy for "whole word" type matching.

and \s:

\s matches whitespace characters such as spaces, tabs, and newlines. hello\s will match "hello " in "Well, hello there!".

I have two questions on this:

What is the difference between \s and \b? When should I use which?

\b is handy for "whole word" type matching -> I don't understand what this means.

Need some guidance on these two.


\b Boundary characters

\b matches the boundary itself, not the boundary character (such as a comma or period). It has zero length, but it can be used to find, for example, an e at the end of a word.
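To make the zero-length point concrete, here is a minimal sketch using the tutorial's to\b examples from the question. It uses Python's re module purely for illustration (an assumption, since the question is about iOS); for plain ASCII text like this, NSRegularExpression should treat \b the same way. The "potato" string is a made-up extra case, not from the tutorial.

```python
import re

# The tutorial's examples: to\b matches "to" only when a word boundary follows.
moon = re.search(r"to\b", "to the moon")
bang = re.search(r"to\b", "to!")
tomorrow = re.search(r"to\b", "tomorrow")

# The boundary is zero-width: the matched text is just "to",
# without the space or the "!" that follows it.
moon_text = moon.group()
bang_text = bang.group()

# Hypothetical extra case: to\b alone still matches the end of "potato".
# Putting \b on both sides gives true whole-word matching.
potato_loose = re.search(r"to\b", "potato")
potato_whole = re.search(r"\bto\b", "potato")

print(moon_text, bang_text, tomorrow, potato_loose.group(), potato_whole)
```

This is what "whole word" matching refers to: with \b on both sides, the pattern only matches when "to" stands alone as a word.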

For example in the sentence: "Hello there, this is one test. Testing"

The regex e\b will match an e only if it is at the end of a word (followed by a word boundary). Notice in the image below that the e in "test" and "Testing" does not match, since that e is not followed by a boundary.

[Image 1: the e\b matches in the example sentence]
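The same result can be checked in code. This is a sketch in Python's re module (chosen only for illustration; the iOS API would be NSRegularExpression), run against the example sentence above.

```python
import re

sentence = "Hello there, this is one test. Testing"

# Collect (position, matched text) for every "e" followed by a word boundary.
boundary_matches = [(m.start(), m.group()) for m in re.finditer(r"e\b", sentence)]

print(boundary_matches)  # [(10, 'e'), (23, 'e')]
```

The two matches are the final e in "there" (followed by a comma) and the e in "one" (followed by a space). In both cases the matched text is just "e".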

\s Whitespace

\s, on the other hand, matches the actual whitespace characters (like spaces and tabs). In the same sentence it matches all the spaces between the words.

[Image 2: the \s matches in the example sentence]
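As a quick check, again sketched in Python's re module for illustration, \s finds each of the six spaces in the example sentence, and every match is one real character long.

```python
import re

sentence = "Hello there, this is one test. Testing"

# Positions of every whitespace character in the sentence.
space_positions = [m.start() for m in re.finditer(r"\s", sentence)]

# Unlike \b, each \s match consumes one actual character.
match_lengths = {len(m.group()) for m in re.finditer(r"\s", sentence)}

print(space_positions)  # [5, 12, 17, 20, 24, 30]
print(match_lengths)    # {1}
```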


Since \b doesn't make much sense on its own, I showed how to use it as e\b (above). The OP asked (in a comment) what e\s would match compared to e\b, to better explain the difference between \b and \s.

In the same string there is only one match for e\s, whereas there were two matches for e\b, because the comma after "there" is a word boundary but not whitespace. Note that the e\s match (image 3) includes the whitespace character, whereas the e\b match (image 1) does not.

[Image 3: the e\s match in the example sentence]
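The comparison can be reproduced side by side. As before, this is only a sketch in Python's re module, under the assumption that the iOS regex engine behaves the same for this ASCII sentence.

```python
import re

sentence = "Hello there, this is one test. Testing"

# e\b: "e" followed by a zero-width word boundary.
e_boundary = [m.group() for m in re.finditer(r"e\b", sentence)]

# e\s: "e" followed by an actual whitespace character, which is consumed.
e_space = [m.group() for m in re.finditer(r"e\s", sentence)]

print(e_boundary)  # ['e', 'e']
print(e_space)     # ['e ']
```

So use \b when you only care that a word ends (or starts) at that point, whatever comes next, and \s when you specifically need a whitespace character and want it to be part of the match.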