I am trying to extract numbers from a string (email) based on keywords.
There are a couple of difficulties here:
- The numbers we are looking for in our system are always 8 digits long, but senders may drop the leading "0", so instead of 01234567 they send us 1234567.
- Other numbers, such as phone numbers, could also be matched as valid numbers and are known in our system. We have therefore decided to detect preceding keywords like "casenumber: " and other variants.
- Last but not least, the sender could write "casenumber: 1234567", but they could also write "casenumbers: 1234567, 7654321" or any variant of that (with a divider such as ; or , or . or : etc.).
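The first point is probably the easiest part: I assume a number can simply be padded back to 8 characters once it has been found. A rough sketch of what I mean, where `NormalizeCaseNumber` is just a name made up for this example and not something from our system:

```csharp
using System;

public static class CaseNumberFormat
{
    // Hypothetical helper: pads a found number with leading zeros
    // so that "1234567" becomes "01234567". Values that are already
    // 8 characters long are returned unchanged.
    public static string NormalizeCaseNumber(string raw)
    {
        return raw.Trim().PadLeft(8, '0');
    }

    public static void Main()
    {
        Console.WriteLine(NormalizeCaseNumber("1234567"));
        Console.WriteLine(NormalizeCaseNumber("01234567"));
    }
}
```

So the missing zero is not the real problem; finding every number after the keyword is.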
An example text:
Hereby I would like to confirm that I will be present at the meeting about casenumber: 1234567 and 7654321.
Can you confirm that you have received this email?
What I have tried is a regex match that searches for a list of keywords, including "casenumber:", and then appends a pattern for everything that may follow the keyword. This only works for one case number, though: the second, third and so on are not found.
Code language used: C#
Regex.Matches(checkString, keyword + @"[ +;:,.\r\n\t]*[BL0123456789][0-9]+", RegexOptions.IgnoreCase )
This is my current regex. It uses Regex.Matches, so it searches the whole text for all matches. It does match when the text has "casenumber: 12345678 and casenumber: 87654321", but not when several numbers follow a single keyword, for example comma-separated.
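To make the problem reproducible, here is a minimal console sketch built around the same pattern. The class name `CaseNumberRepro` and the method `FindMatches` exist only for this example; the keyword is hard-coded instead of coming from our keyword list.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CaseNumberRepro
{
    // Runs the pattern from the question for a single keyword and
    // returns the full text of every match.
    public static List<string> FindMatches(string checkString, string keyword)
    {
        var results = new List<string>();
        var matches = Regex.Matches(
            checkString,
            keyword + @"[ +;:,.\r\n\t]*[BL0123456789][0-9]+",
            RegexOptions.IgnoreCase);
        foreach (Match m in matches)
        {
            results.Add(m.Value);
        }
        return results;
    }

    public static void Main()
    {
        // Keyword repeated before each number: both numbers are found.
        var repeated = FindMatches(
            "casenumber: 12345678 and casenumber: 87654321", "casenumber:");
        Console.WriteLine(repeated.Count);

        // One keyword followed by two numbers: only the first is found.
        var single = FindMatches(
            "the meeting about casenumber: 1234567 and 7654321.", "casenumber:");
        Console.WriteLine(single.Count);
    }
}
```

In the second call the number 7654321 is never returned, because the pattern requires the keyword directly in front of every number.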