Mr Billy - 1 month ago
C# Question

Optional group in Regex returns too many matches

I want to match only values (e.g. 2.699.230,20) from an input using the C# Regex class.

I use

"(\\.?[0-9]){2,}\\,[0-9]{2}"

and it matches desired values 5.000,00, 2.699.230,20, 1.000.000,00, etc. The {2,} is to only match values above 999,99.

But there are also other values in the same input that I want to match. They are always 1.000 or above, but the difference is that they don't have the decimal ,00 part. Examples: 4.541.087, 8.997.434.

So I made the last part of the regex optional (present 0 or 1 times) by adding (...)? around the decimal part:

"(\\.?[0-9]){2,}(\\,[0-9]{2})?"

but now this matches hundreds of numbers, including 18, 1.0, 1.5.2, 8854, etc.

So, how can I make the decimal part optional, so that it matches both 1.000,00 and 1.000?

Answer

It seems you only want numbers that use . as the thousands separator, with an optional 2-digit fractional part.

Use

@"\b\d{1,3}(?:\.\d{3})+(?:,\d{2})?\b"

See the regex demo.

Details:

  • \b - leading word boundary (may be replaced with the (?<!\d) negative lookbehind to disallow only a digit before the number)
  • \d{1,3} - 1 to 3 digits
  • (?:\.\d{3})+ - 1 or more sequences of a dot and 3 digits (NOTE: if you replace + with *, it will also match values below 1.000)
  • (?:,\d{2})? - an optional sequence of a , and 2 digits.
  • \b - trailing word boundary (may be replaced with the (?!\d) negative lookahead to disallow only a digit after the number).
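
The two variations mentioned in the notes above can be tried side by side. This is only a minimal sketch: the sample inputs and the Find helper are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Illustrative helper (hypothetical name): collect all matched substrings
string[] Find(string input, string pattern) =>
    Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

// Hypothetical input: the first number is glued to a letter
var glued = "item1.000 and 2.500";

// \b needs a word/non-word transition, so "1.000" right after a letter is skipped
var withWordBoundaries = Find(glued, @"\b\d{1,3}(?:\.\d{3})+(?:,\d{2})?\b");

// The lookarounds only forbid an adjacent digit, so "1.000" is found too
var withLookarounds = Find(glued, @"(?<!\d)\d{1,3}(?:\.\d{3})+(?:,\d{2})?(?!\d)");

// Hypothetical input mixing small and large values
var mixed = "999,99 18 2.699.230,20";

// With +, at least one ".ddd" group is required
var withPlus = Find(mixed, @"\b\d{1,3}(?:\.\d{3})+(?:,\d{2})?\b");

// With *, the group may be absent, so values below 1.000 match as well
var withStar = Find(mixed, @"\b\d{1,3}(?:\.\d{3})*(?:,\d{2})?\b");

Console.WriteLine(string.Join(" | ", withWordBoundaries)); // 2.500
Console.WriteLine(string.Join(" | ", withLookarounds));    // 1.000 | 2.500
Console.WriteLine(string.Join(" | ", withPlus));           // 2.699.230,20
Console.WriteLine(string.Join(" | ", withStar));           // 999,99 | 18 | 2.699.230,20
```

Which boundary style fits depends on whether numbers attached to letters should count as values in your input.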

C# demo:

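using System;
using System.Linq;
using System.Text.RegularExpressions;

// Thousands-grouped number with an optional two-digit fractional part
var re = @"\b\d{1,3}(?:\.\d{3})+(?:,\d{2})?\b";
var str = "values 5.000,00, 2.699.230,20, 1.000.000,00, etc.  999,99 including 18, 1.0, 1.5.2, 8854, etc";
var res = Regex.Matches(str, re)
    .Cast<Match>()
    .Select(p => p.Value)
    .ToList();
// Prints 5.000,00, 2.699.230,20 and 1.000.000,00, one per line
Console.WriteLine(string.Join("\n", res));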
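
For comparison, here is a rough sketch of why the pattern from the question over-matched once the decimal part became optional. (\.?[0-9]){2,} accepts any run of two or more digits, each optionally preceded by a dot, so nothing forces groups of exactly three digits. The sample input and variable names are assumptions chosen for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical input built from values mentioned in the question
var sample = "5.000,00 18 1.0 1.5.2 8854 4.541.087";

// Pattern from the question, written as a verbatim string
var loosePattern = @"(\.?[0-9]){2,}(\,[0-9]{2})?";

// Pattern from the answer
var strictPattern = @"\b\d{1,3}(?:\.\d{3})+(?:,\d{2})?\b";

var loose = Regex.Matches(sample, loosePattern)
    .Cast<Match>().Select(m => m.Value).ToArray();
var strict = Regex.Matches(sample, strictPattern)
    .Cast<Match>().Select(m => m.Value).ToArray();

// The loose pattern accepts every token in the sample
Console.WriteLine(string.Join(" | ", loose));
// The strict pattern keeps only 5.000,00 and 4.541.087
Console.WriteLine(string.Join(" | ", strict));
```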