tvelykyy - 21 days ago
Java Question

Regex to extract cents value from arbitrary currency formatting

I need to extract the cents value from the following possible inputs using a Java regex (the thousands separator could be either a dot or a comma):

$123,456.78
123,456.78 dollars
123,456.78


I have a partially working solution:

[\.,]\d\d\D


The problem with my solution is that it doesn't work for "123,456.78", where the last digit is at the end of the string. How can I handle this case?

http://java-regex-tester.appspot.com/regex/6af08221-63cb-4c5b-a865-c86fe5e825ff

Answer

Note that \D in your pattern requires an actual non-digit character after the separator (, or .) and the 2 digits, so the match fails when the string ends right after the digits. If you want to make sure no digit follows without consuming (or requiring) a character, use a negative lookahead:

[.,](\d{2})(?!\d)
           ^^^^^^ 

See the regex demo.

Details:

  • [.,] - a dot or comma (to support decimal separators in different countries, not just the U.S.)
  • (\d{2}) - Group 1: exactly 2 digits (since the \d{2} is inside a capturing group (...), you can access its value with Matcher.group(1))
  • (?!\d) - a negative lookahead requiring the absence of a digit right after the previous 2 digits.
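
To show how the pieces above fit together in Java, here is a minimal sketch. The class name CentsExtractor and the helper methods extractCents and originalMatches are made up for illustration; only Pattern, Matcher and Optional come from the standard library. The original pattern from the question is included purely for comparison.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question or answer.
class CentsExtractor {
    // Pattern from the answer: separator, two captured digits, then no further digit.
    private static final Pattern CENTS = Pattern.compile("[.,](\\d{2})(?!\\d)");

    // Pattern from the question, kept only to compare behaviour.
    private static final Pattern ORIGINAL = Pattern.compile("[\\.,]\\d\\d\\D");

    // Returns the two cents digits (group 1), or empty if nothing matches.
    static Optional<String> extractCents(String input) {
        Matcher m = CENTS.matcher(input);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // True if the question's pattern finds a match anywhere in the input.
    static boolean originalMatches(String input) {
        return ORIGINAL.matcher(input).find();
    }

    public static void main(String[] args) {
        String[] samples = {"$123,456.78", "123,456.78 dollars", "123,456.78"};
        for (String s : samples) {
            System.out.println(s + " -> cents: " + extractCents(s).orElse("(none)")
                    + ", original pattern matches: " + originalMatches(s));
        }
    }
}
```

With these three inputs, extractCents should return 78 each time, while the original pattern matches only "123,456.78 dollars", because that is the only input with a non-digit character after the cents. Because the separator class is [.,], a value written as 123.456,78 should be handled the same way.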

See more about how negative lookahead works.