Aman Tugnawat - 2 months ago
Java Question

Regex: Does \w mean [a-zA-Z] or [a-zA-Z0-9_], given that most tutorials just say "\w matches the word characters"?

I have just started with regular expressions and was solving a problem in which the task is to check whether a username is valid. A valid username has the following properties:


  1. The username can contain alphanumeric characters and/or underscores (_).

  2. The username must start with an alphabetic character.

  3. 8<=(Username Length)<=30.



I am using this as my reference, which says:


\w Matches the word characters.


and I came up with a solution like this:
String pattern = "^\\w(\\d|\\w|_){7,29}$";
which is not the correct solution.
After searching for a while, I found that the correct solution is

String pattern = "^[a-zA-Z][a-zA-Z0-9_]{7,29}$";
which is easy to understand.

What I want to confirm is whether
(\\w|\\d|_)
is equivalent to
[a-zA-Z0-9_]
or not.

I think they are because
String pattern = "^[a-zA-Z](\\w|\\d|_){7,29}$";
is accepted for all test cases.

Also, this Stack Overflow post has two answers, each with one upvote, that give different equivalent expressions for
\\w
and I want to know which one is correct:
[A-Za-z\s]
or
[A-Za-z0-9_]
?

Answer

Yes, according to the summary of regular-expression constructs in the Java Pattern documentation (https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html):

\d  A digit: [0-9]
\w  A word character: [a-zA-Z_0-9]

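So (\w|\d|_) is equivalent to ([a-zA-Z_0-9]|[0-9]|_), where both \d and the extra underscore are redundant since \w already includes them. Of the two expressions from the other post, [A-Za-z0-9_] is the correct one; \w does not match whitespace.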
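
The points above can be checked with a short program. This is only a minimal sketch, assuming default flags (Pattern.UNICODE_CHARACTER_CLASS is not set). The class name WordCharDemo, the helper method names, and the sample usernames are made up for illustration. It compares \w against [a-zA-Z0-9_] for every ASCII character, and shows why the first attempt from the question fails: a leading \w also accepts a digit or an underscore as the first character.

```java
import java.util.regex.Pattern;

class WordCharDemo {
    // A letter first, then 7 to 29 word characters (8 to 30 characters in total).
    static final Pattern USERNAME = Pattern.compile("^[a-zA-Z]\\w{7,29}$");

    // The first attempt from the question: the leading \w also allows a digit or '_'.
    static final Pattern FIRST_ATTEMPT = Pattern.compile("^\\w(\\d|\\w|_){7,29}$");

    static boolean isValidUsername(String s) {
        return USERNAME.matcher(s).matches();
    }

    // Returns true if \w and [a-zA-Z0-9_] agree on every ASCII character.
    static boolean wordClassMatchesAscii() {
        Pattern shorthand = Pattern.compile("\\w");
        Pattern explicit = Pattern.compile("[a-zA-Z0-9_]");
        for (char c = 0; c < 128; c++) {
            String s = String.valueOf(c);
            if (shorthand.matcher(s).matches() != explicit.matcher(s).matches()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(wordClassMatchesAscii());                      // true
        System.out.println(isValidUsername("Samantha_21"));               // true
        System.out.println(isValidUsername("1Samantha"));                 // false
        System.out.println(FIRST_ATTEMPT.matcher("1Samantha").matches()); // true
        System.out.println(Pattern.matches("\\w", " "));                  // false
    }
}
```

The last line shows that a space is not a word character, which is why [A-Za-z\s] cannot be equivalent to \w.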
