Ahatius Ahatius - 3 months ago 7
Java Question

Regex: Match wildcard followed by variable length of digits

I'm trying to extract the personal number from a string like

Personal number: 123456
with the following regex:

(Personal number|Personalnummer).*(\d{2,10})


When trying to get the second group, it will only contain the last 2 digits of the personal number. If I change the digit range to
{3,10}
it will match the last 3 digits of the personal number.

I can't simply add the whitespace as an additional group, because I can't be sure there will always be whitespace: there might be none, or there might be some other characters. The personal number will always be at the end, though.

Is there any way I could instruct the regex engine to capture the whole digit string?

Answer

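.* is a greedy quantifier. It first consumes everything up to the end of the string, then backtracks one character at a time until the rest of the pattern can match. Since \d{2,10} needs only 2 digits, the backtracking stops as soon as the last 2 digits are free, and those are all the second group gets.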
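
To see the greedy behaviour described above, here is a minimal sketch using java.util.regex. The class name GreedyDemo and the helper secondGroup are made up for illustration; the patterns and the input string are the ones from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyDemo {
    // Returns group 2 of the first match, or null if the regex does not match.
    static String secondGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String input = "Personal number: 123456";

        // Greedy .* leaves only the minimum the digit group requires.
        System.out.println(secondGroup("(Personal number|Personalnummer).*(\\d{2,10})", input)); // 56
        System.out.println(secondGroup("(Personal number|Personalnummer).*(\\d{3,10})", input)); // 456
    }
}
```

Raising the lower bound of the digit range only changes how far .* has to backtrack, which is why {3,10} yields the last 3 digits.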

You have to make it reluctant by appending ? to it, like below:

(Personal number|Personalnummer).*?(\d{2,10})

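Now .*? consumes as few characters as possible, so \d{2,10} starts at the first digit and the second group contains the whole number (up to 10 digits).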

You can also convert the first group into a non-capturing group. The number you want is then the only captured group (group 1), like below:

(?:Personal number|Personalnummer).*?(\d{2,10})
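
A short sketch of how the non-capturing, reluctant pattern could be used from Java. The class PersonalNumberExtractor and the method extract are hypothetical names chosen for this example; it assumes the number should be returned as a String, or null when the label is not found.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PersonalNumberExtractor {
    // Non-capturing label, reluctant filler, then the digits as the only captured group.
    private static final Pattern PATTERN =
            Pattern.compile("(?:Personal number|Personalnummer).*?(\\d{2,10})");

    // Returns the captured digits of the first match, or null if nothing matches.
    static String extract(String input) {
        Matcher m = PATTERN.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("Personal number: 123456")); // 123456
        System.out.println(extract("Personalnummer123456"));    // 123456 (no separator at all)
    }
}
```

Because .*? may match zero characters, the same pattern covers the case where the digits follow the label directly, with no whitespace in between.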