Chinovski Chinovski - 4 months ago 14
Java Question

How to write a regex for a string representing a city name (with accents) or its postal code?

I'm trying to write Java code that shows a list of cities matching either the name of the city or its postal code.

I have tried many expressions, but none of them works in every case.
This is my latest expression:

([A-Z_]+)(:)([0-9]+)


The expression should match a city name, for example:
Lonéy' ed
or its postal code:
57000


Does anyone have an idea how to improve my expression?

Thanks.

Answer

Since Java 7 you can do the following:

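Pattern.compile("([\\p{Alpha} '_-]+):(\\d{5})", Pattern.UNICODE_CHARACTER_CLASS)

Keep adding connecting characters (here [ '_-]) to cater for all your needs. Keep the hyphen last inside the brackets: written as '-_ it would be read as a range from ' to _ and would also match digits and most punctuation.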

The pattern makes no assumption about the letter case of a place name, because some non-Latin scripts have no upper/lower case distinction.

EDIT: added 5-digit postal code detection and a space for name detection
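
To try the pattern out, here is a minimal sketch, not a definitive implementation. The class name CityQueryMatcher and the helpers parse and isNameOrCode are made up for illustration. The first pattern expects the "name:code" form with a 5-digit code, as in the answer; the second one is an assumption about what the question asks for, namely an input that is either a city name or a postal code on its own. In both, the hyphen is placed last in the character class so it is taken literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class CityQueryMatcher {

    // Letters of any script plus space, apostrophe, underscore and hyphen,
    // then a colon and a 5-digit code. The hyphen is last so it is a literal.
    private static final Pattern NAME_COLON_CODE = Pattern.compile(
            "([\\p{Alpha} '_-]+):(\\d{5})",
            Pattern.UNICODE_CHARACTER_CLASS);

    // Assumption: the search box receives EITHER a name OR a 5-digit code.
    private static final Pattern NAME_OR_CODE = Pattern.compile(
            "[\\p{Alpha} '_-]+|\\d{5}",
            Pattern.UNICODE_CHARACTER_CLASS);

    /** Returns {name, code} if the whole input has the form "name:code", else null. */
    static String[] parse(String input) {
        Matcher m = NAME_COLON_CODE.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    /** Returns true if the whole input is a city name or a 5-digit postal code. */
    static boolean isNameOrCode(String input) {
        return NAME_OR_CODE.matcher(input).matches();
    }

    public static void main(String[] args) {
        // \u00e9 is the accented letter e, escaped to avoid source-encoding issues.
        String[] parts = parse("Lon\u00e9y' ed:57000");
        System.out.println(parts == null ? "no match" : parts[0] + " / " + parts[1]);
        System.out.println(isNameOrCode("57000"));
        System.out.println(isNameOrCode("Lon\u00e9y' ed"));
    }
}
```

matches() requires the whole input to fit the pattern, so a name containing a digit or a 4-digit code is rejected. Use find() instead if the city or code may appear inside a longer string.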