codingcoding - 5 months ago
VB.NET Question

Why is Regex.Match not matching the same strings as Regex.Matches?

I made a find and replace dialog with a regex option. There is a button to test a regex, which highlights all matches, and a button to find individual matches. With some regular expressions both methods find the same matches. With others, Regex.Match yields no match, while the collection returned by Regex.Matches behaves as expected. I have tried different RegexOptions when constructing the Regex, but haven't found any option that makes it behave as desired.

The goal here is to be able to test a regex with ButtonTestRegex, then to be able to select each match with a Find or Replace button.

Public rtb As RichTextBox

Private Sub ButtonTestRegex_Click(sender As Object, e As EventArgs)
    rtb.Select(0, rtb.TextLength)
    rtb.SelectionColor = Color.Black

    Dim rgx As New Regex("(duplicate of )*([0-9]:+)*")

    Dim matches As MatchCollection = rgx.Matches(rtb.Text)
    For Each match As Match In matches
        rtb.Select(match.Index, match.Length)
        rtb.SelectionColor = Color.Red
    Next
End Sub

Private Sub ButtonFind_Click(ByVal sender As Object, ByVal e As EventArgs)
    rtb.Focus()
    rtb.SelectionStart = 0
    rtb.SelectionLength = 0
    Dim rgx = New Regex("(duplicate of )*([0-9]:+)*")
    Dim match As Match = rgx.Match(rtb.Text)
    If match.Value <> "" Then
        rtb.SelectionStart = match.Index
        rtb.SelectionLength = match.Length
    End If
End Sub


With a RichTextBox containing the following:


1:remainder

duplicate of 1:remainder

duplicate of duplicate of 1:remainder


The code above matches all text except "remainder" with ButtonTestRegex_Click() (as expected). Nothing is matched with ButtonFind_Click(). The code is being executed, and it does work with some regexes, e.g. [0-9].

This code sample is abbreviated for clarity. My question is: why does Regex.Match not match anything in this case, when Regex.Matches does?

Answer

I suspect you have a space or some other non-matching character at the start of the text in your RichTextBox. If so, the behaviour makes total sense. Look at your regular expression:

(duplicate of )*([0-9]:+)*

That will match the empty string. So for example, if you find all the matches of that against "x", you'll find one match before the x, and one match after the x.
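As a quick check of that claim, here is a minimal sketch that runs the pattern from the question against "x" and prints where each match starts and how long it is. The class name EmptyMatchDemo is only illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmptyMatchDemo
{
    // The pattern from the question. Both groups are wrapped in *,
    // so zero repetitions of each is a valid (empty) match.
    public const string Pattern = "(duplicate of )*([0-9]:+)*";

    public static void Main()
    {
        foreach (Match m in Regex.Matches("x", Pattern))
        {
            Console.WriteLine($"Index={m.Index} Length={m.Length} Success={m.Success}");
        }
        // Prints:
        // Index=0 Length=0 Success=True
        // Index=1 Length=0 Success=True
    }
}
```

Both matches are successful as far as the regex engine is concerned; they just have no characters in them.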

When you call Match, that finds the first match - which it does successfully, but matches an empty string. When you call Matches, it will retrieve all the matches - and there are a lot of them. Here's a small C# console app to show them all, assuming a space at the start of the text:

using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main(string[] args)
    {
        var regex = new Regex("(duplicate of )*([0-9]:+)*");
        var input = @" 1:remainder
duplicate of 1:remainder
duplicate of duplicate of 1:remainder";
        foreach (Match match in regex.Matches(input))
        {
            Console.WriteLine(match.Length);
        }
    }
}

The output of that starts like this:

0
2
0
0
0
0
0
0
0
0
0
0
0
15
0

... but there's a lot of output.
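The same effect can be seen with Match on its own. The sketch below assumes, as above, that the text starts with a single space; the class name FirstMatchDemo is only illustrative. It shows why a check like match.Value <> "" in ButtonFind_Click never selects anything: the first match succeeds, but it is empty.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FirstMatchDemo
{
    public static void Main()
    {
        var regex = new Regex("(duplicate of )*([0-9]:+)*");

        // Assumption: one leading space before the first line of text.
        Match first = regex.Match(" 1:remainder");

        Console.WriteLine(first.Success);      // True
        Console.WriteLine(first.Index);        // 0
        Console.WriteLine(first.Length);       // 0
        Console.WriteLine(first.Value == "");  // True
    }
}
```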

It's not entirely clear what you were trying to match, but you probably want to make sure that empty strings don't match your regex.
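Two possible ways to do that are sketched below; both are assumptions about what you want rather than a definitive fix. The first keeps your pattern and skips empty matches with Match.NextMatch. The second uses a hypothetical tightened pattern, (?:duplicate of )*[0-9]:+, in which the digit-and-colon part is mandatory, so the expression can no longer match the empty string. The names NonEmptyMatchSketch, FirstNonEmpty and TightPattern are made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NonEmptyMatchSketch
{
    // Hypothetical tightened pattern: the "[0-9]:+" part is required,
    // so an empty match is impossible.
    public const string TightPattern = "(?:duplicate of )*[0-9]:+";

    // Returns the first match with Length > 0 starting the search at startAt.
    // If there is none, the returned Match has Success == false.
    public static Match FirstNonEmpty(Regex regex, string text, int startAt = 0)
    {
        Match m = regex.Match(text, startAt);
        while (m.Success && m.Length == 0)
        {
            m = m.NextMatch();
        }
        return m;
    }

    public static void Main()
    {
        string input = " 1:remainder\nduplicate of 1:remainder";

        // Option 1: original pattern, empty matches skipped.
        var original = new Regex("(duplicate of )*([0-9]:+)*");
        Match a = FirstNonEmpty(original, input);
        Console.WriteLine($"{a.Index} '{a.Value}'");   // 1 '1:'

        // Option 2: a pattern that cannot match the empty string.
        Match b = Regex.Match(input, TightPattern);
        Console.WriteLine($"{b.Index} '{b.Value}'");   // 1 '1:'
    }
}
```

For a Find Next button, the startAt argument could be set to the end of the current selection so that each click moves on to the following non-empty match. Note that the tightened pattern is not equivalent to the original: it no longer matches "duplicate of " when no digit and colon follow.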