Andoni Zubizarreta - 1 month ago
ASP.NET (C#) Question

Regex match account number in PDF until new line

I'm working on a PDF scraper in C# and I'm stuck on a regex problem. I want to match only the account number, but my regex matches both the incorrect line and the correct line. I think I have to match everything up to a newline, but I can't find a way to do it.

This is my regex: ([A-Z0-9\-]{5,30})-[0-9]{1,10}-[0-9]{3}

XXX-XX-914026-1558513 // I don't want to match this line

130600298-110-528 // I want to match this line


Thanks in advance!

Answer

You have to add anchors:

^([A-Z0-9\-]{5,30})-[0-9]{1,10}-[0-9]{3}$
^                                       ^

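These mean start of line (^) and end of line ($). In .NET they match at line boundaries only when RegexOptions.Multiline is set; without it, ^ matches only at the start of the input and $ only at its end (or before a final newline).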

If you don't, the match will be:

XXX-XX-914026-1558513
^^^^^^^^^^^^^^^^^ 

Also, you don't have to escape the hyphen at the end of a character class, and you can use \d instead of [0-9] (note: in .NET, \d matches decimal digits from any Unicode script unless RegexOptions.ECMAScript is set), which gives:

^([A-Z0-9-]{5,30})-\d{1,10}-\d{3}$
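
For text extracted from a PDF, which usually contains many lines, the anchored pattern could be applied in C# roughly as follows. This is only a sketch: the class and method names (AccountNumberFinder, FindAccountNumbers) are made up for illustration, and it assumes each account number sits alone on its own line. It keeps [0-9] so that only ASCII digits match, and it uses a lookahead for an optional \r because $ in .NET matches before \n but not before \r.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class AccountNumberFinder
{
    // RegexOptions.Multiline makes ^ and $ match at every line boundary
    // instead of only at the start and end of the whole input.
    // (?=\r?$) requires the line to end right after the number while
    // tolerating Windows line endings (\r\n) without capturing the \r.
    private static readonly Regex AccountNumber = new Regex(
        @"^([A-Z0-9-]{5,30})-[0-9]{1,10}-[0-9]{3}(?=\r?$)",
        RegexOptions.Multiline);

    // Returns every line of the text that consists solely of an account number.
    public static List<string> FindAccountNumbers(string text)
    {
        var results = new List<string>();
        foreach (Match match in AccountNumber.Matches(text))
        {
            results.Add(match.Value);
        }
        return results;
    }
}
```

Calling AccountNumberFinder.FindAccountNumbers on text containing the two sample lines from the question should return only 130600298-110-528, because the first line ends in seven digits rather than three.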