Whitekang - 3 months ago
Vb.net Question

VB.NET: Searching for certain values in text

I have written a piece of code that reads a String and extracts certain parts of it.

In particular, I want to get the numbers contained in a custom textual tag of the form [propertyid=]. For example, [propertyid=541] should return 541.

The search runs over a whole text and has to return one value for every tag that occurs in it.

I have already written code that works:

Module Module1

    Sub Main()
        Dim properties As New List(Of String)
        'content of the string doesn't matter, only the ids are important
        Dim text As String = "Dit is de voorbeeld string. Eerst komt er gewoon tekst. Daarna een property als [propertyid=1155641] met nog wat tekst. Dan volgt nog een [propertyid=1596971418413399] en dan volgt het einde."
        Dim found As Integer = 1

        Do
            'InStr is 1-based and returns 0 when the tag is not found
            found = InStr(found, text, "[propertyid=")
            If found <> 0 Then
                'found + 11 is the 0-based index of the first character after "[propertyid="
                properties.Add(text.Substring(found + 11, text.IndexOf("]", found + 11) - found - 11).Trim())
                found = text.IndexOf("]", found + 11)
            End If
        Loop While found <> 0

        Console.WriteLine("lijst")
        For Each itemos As String In properties
            Console.WriteLine(itemos)
        Next
    End Sub

End Module


But I can't help feeling that this isn't optimal. I'm pretty sure it can be written more simply, or with tools other than Substring and IndexOf, especially because I have to juggle the indexes and the loop.

Any suggestions for improving this piece of code?

Answer

You can use regular expressions for this kind of task.

In this case, the pattern to match [propertyid=NNNN] is:

\[propertyid=(\d+)\]

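The square brackets are escaped because they are regex metacharacters. The parentheses form a capture group around \d+ (one or more digits), so the matching engine can return the number on its own, separately from the rest of the tag.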
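To make the difference between the whole match and the capture group concrete, here is a minimal sketch using the [propertyid=541] example from the question. The helper name CaptureId is made up for illustration; only Regex.Match and Match.Groups come from the framework.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module CaptureGroupDemo

    ' Hypothetical helper: returns the digits captured from the first tag,
    ' or Nothing when the input contains no well-formed tag.
    Function CaptureId(input As String) As String
        Dim m As Match = Regex.Match(input, "\[propertyid=(\d+)\]")
        If Not m.Success Then Return Nothing
        ' Groups(0) is the whole match, e.g. "[propertyid=541]";
        ' Groups(1) is the first parenthesised group, e.g. "541".
        Return m.Groups(1).Value
    End Function

    Sub Main()
        Console.WriteLine(CaptureId("[propertyid=541]"))          ' prints 541
        Console.WriteLine(CaptureId("[propertyid=]") Is Nothing)  ' prints True: \d+ needs at least one digit
    End Sub

End Module
```

Because \d+ requires at least one digit, an empty tag such as [propertyid=] does not match at all, so no extra check for an empty id is needed.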

Here's a code example:

Imports System.Text.RegularExpressions

Module Module1

    Sub Main()

        Dim properties As New List(Of String)
        'content of the string doesn't matter, only the ids are important
        Dim text As String = "Dit is de voorbeeld string. Eerst komt er gewoon tekst. Daarna een property als [propertyid=1155641] met nog wat tekst. Dan volgt nog een [propertyid=1596971418413399] en dan volgt het einde."
        Dim pattern As String = "\[propertyid=(\d+)\]"

        For Each m As Match In Regex.Matches(text, pattern)
            properties.Add(m.Groups(1).Value)
        Next

        For Each s As String In properties
            Console.WriteLine(s)
        Next

        Console.ReadKey()


    End Sub

End Module
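
If the ids are needed as numbers rather than strings, the same loop could be wrapped in a function. The sketch below is an assumption on my part, not something the question asked for: ExtractPropertyIds is a made-up name, and it returns Long because the second id in the sample text (1596971418413399) is larger than Integer.MaxValue.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module PropertyIdExtractor

    ' Hypothetical helper: collects every id in the text as a Long.
    Function ExtractPropertyIds(text As String) As List(Of Long)
        Dim ids As New List(Of Long)
        For Each m As Match In Regex.Matches(text, "\[propertyid=(\d+)\]")
            Dim id As Long
            ' TryParse skips digit runs that do not fit in a 64-bit integer
            ' instead of throwing an exception.
            If Long.TryParse(m.Groups(1).Value, id) Then ids.Add(id)
        Next
        Return ids
    End Function

    Sub Main()
        Dim sample As String = "Text [propertyid=1155641] more text [propertyid=1596971418413399] end."
        For Each id As Long In ExtractPropertyIds(sample)
            Console.WriteLine(id)
        Next
    End Sub

End Module
```

Keeping the ids as strings, as in the example above, avoids the overflow question entirely; converting only makes sense if the values are later used in arithmetic or comparisons.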