peter.murray.rust - 1 year ago
Java Question

What is a word boundary in regexes?

I am using Java regexes in Java 1.6 (inter alia to parse numeric output) and cannot find a precise definition of \b ("word boundary"). I had assumed that -12 would be an "integer word" (matched by \b\-?\d+\b), but it appears that this does not work. I'd be grateful to know of ways of matching space-separated numbers.


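import java.util.regex.Pattern;
Pattern pattern = Pattern.compile("\\s*\\b\\-?\\d+\\s*");
String plus = " 12 ";
System.out.println(pattern.matcher(plus).matches());
String minus = " -12 ";
System.out.println(pattern.matcher(minus).matches());
pattern = Pattern.compile("\\s*\\-?\\d+\\s*");
System.out.println(pattern.matcher(minus).matches());

This returns:

true
false
true
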
Answer Source

A word boundary, in most regex dialects, is a position between \w and \W (non-word char), or at the beginning or end of a string if it begins or ends (respectively) with a word character ([0-9A-Za-z_]).

So, in the string "-12", it would match before the 1 or after the 2. The dash is not a word character.
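
To make this concrete, here is a minimal sketch rather than a definitive solution. It lists the positions where \b matches in "-12", then shows one possible way to pick out space-separated signed integers by using whitespace lookarounds instead of \b. The class name WordBoundaryDemo, the helper names boundaryPositions and signedIntegers, and the lookaround pattern are illustrative assumptions, not part of the original question or answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordBoundaryDemo {

    // Collects every index in the input at which \b matches (zero-width).
    static List<Integer> boundaryPositions(String input) {
        List<Integer> positions = new ArrayList<Integer>();
        Matcher m = Pattern.compile("\\b").matcher(input);
        while (m.find()) {
            positions.add(m.start());
        }
        return positions;
    }

    // Hypothetical helper: finds integers delimited by whitespace or the
    // ends of the string. (?<!\S) means "not preceded by a non-space" and
    // (?!\S) means "not followed by a non-space", so a leading '-' is fine.
    static List<Integer> signedIntegers(String input) {
        List<Integer> numbers = new ArrayList<Integer>();
        Matcher m = Pattern.compile("(?<!\\S)-?\\d+(?!\\S)").matcher(input);
        while (m.find()) {
            numbers.add(Integer.parseInt(m.group()));
        }
        return numbers;
    }

    public static void main(String[] args) {
        // Boundaries fall before the 1 and after the 2, not before the dash.
        System.out.println(boundaryPositions("-12"));          // [1, 3]
        // "7a" is skipped because the digit is followed by a non-space.
        System.out.println(signedIntegers(" 12  -12 7a -3 ")); // [12, -12, -3]
    }
}
```

The lookarounds test for surrounding whitespace rather than for word characters, so the dash no longer prevents a match the way it does with \b.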
