Hubro - 5 months ago
Python Question

Can I mix character classes in Python RegEx?

Special sequences (character classes) in Python RegEx are escapes like \w or \d that match a set of characters.

In my case, I need to be able to match all alphanumeric characters except digits. That is, \w minus \d.

I need to use the special sequence \w because I'm dealing with non-ASCII characters and need to match symbols like "Æ" and "Ø".

One would think I could use the expression [\w^\d], but it doesn't seem to match anything and I'm not sure why.

So in short, how can I mix (add/subtract) special sequences in Python Regular Expressions?




EDIT: I accidentally used [\W^\d] instead of [\w^\d]. The latter does indeed match something, including parentheses and commas which are not alphanumeric characters as far as I'm concerned.

Answer

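You can use r"[^\W\d]", i.e. invert the union of non-alphanumerics and digits. Anything that is neither a non-word character nor a digit is a word character that is not a digit, which is exactly \w minus \d.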
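
Here is a quick sketch of how that plays out, assuming Python 3, where str patterns are Unicode-aware by default (on Python 2 you would probably need unicode strings and the re.UNICODE flag). The sample string and the variable names are made up for illustration. It also shows the likely reason the original attempt misbehaves: inside a character class, ^ only negates when it is the first character, so anywhere else it is just a literal caret.

```python
import re

# [^\W\d] reads as "not (non-word character or digit)",
# which leaves the word characters that are not digits.
letters = re.compile(r"[^\W\d]+")

# Hypothetical sample text with non-ASCII letters, digits and punctuation.
sample = "Æble 123 Øl_42, (test)"
print(letters.findall(sample))  # ['Æble', 'Øl_', 'test']

# A ^ that is not the first character of the class is a literal caret,
# so this class is the union of \w, '^' and \d, not a subtraction.
literal_caret = re.compile(r"[\w^\d]")
print(literal_caret.findall("a1^("))  # ['a', '1', '^']
```

Note that \w also covers the underscore, which is why "Øl_" keeps its trailing underscore above. If that is unwanted, adding it to the negated class, as in r"[^\W\d_]", should exclude it as well.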