Simonetos - 6 months ago
VB.NET Question

VB.NET - How to grab specific part of text from a local html file and use it as variable?

I am making a small "home" application in VB.NET. As the title says, I want to grab a specific part of the text from a local HTML file and use it as a variable, or put it in a textbox.

I have tried something like this...

Private Sub Open_Button_Click(sender As Object, e As EventArgs) Handles Open_Button.Click
    Dim openFileDialog As New OpenFileDialog()
    openFileDialog.CheckFileExists = True
    openFileDialog.CheckPathExists = True
    openFileDialog.FileName = ""
    openFileDialog.Filter = "All|*.*"
    openFileDialog.Multiselect = False
    openFileDialog.Title = "Open"

    If openFileDialog.ShowDialog() = Windows.Forms.DialogResult.OK Then
        ' Read the selected file into a string and show it in the textbox
        Dim fileReader As String = My.Computer.FileSystem.ReadAllText(openFileDialog.FileName)
        TextBox.Text = fileReader
    End If
End Sub


This loads the entire HTML source into the textbox. How can I grab only a specific part of the HTML file's code? Let's say I want only the text inside this span...
<span id="something">This is a text!!!</span>

Answer

I'm making the following assumptions in this answer:

  1. Your HTML is valid, i.e. the id is unique in the document.
  2. The HTML tag you are looking for will always have an id.
  3. You'll always be using the same tag (e.g. span).

I'd do something like this:

' Read the whole HTML document into a string
Dim fileReader As String = My.Computer.FileSystem.ReadAllText(openFileDialog.FileName)

' Split the HTML text on the opening span tag
Dim fileSplit As String() = fileReader.Split(New String() {"<span id=""something"">"}, StringSplitOptions.None)

' Keep the last part, i.e. everything after the opening tag
fileReader = fileSplit.Last()

' Split again on the closing tag to trim everything after it
fileSplit = fileReader.Split(New String() {"</span>"}, StringSplitOptions.None)

' Keep the first part, i.e. everything before the closing tag
fileReader = fileSplit.First()

' fileReader now contains the contents of the span tag with id "something"

Note: this code is untested and I typed it on the Stack Exchange mobile app, so there may be some autocorrect typos in it.

You might want to add in some error validation such as making sure that the span element only occurs once, etc.
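As a rough illustration of that kind of validation, here is a minimal sketch that uses IndexOf instead of Split so it can report when the tag is missing or duplicated. The names SpanExtractor and ExtractSpanText are made up for this example and are not from the code above. It assumes the opening tag is written exactly as <span id="..."> with no other attributes, and that there are no nested spans inside it.

```vbnet
Imports System

Module SpanExtractor

    ' Hypothetical helper (not part of the original answer): returns the text
    ' inside <span id="spanId">...</span>, or Nothing when the opening tag is
    ' missing, occurs more than once, or has no closing </span> after it.
    ' Assumption: the opening tag matches the literal form <span id="...">.
    Public Function ExtractSpanText(html As String, spanId As String) As String
        If String.IsNullOrEmpty(html) Then Return Nothing

        Dim openTag As String = "<span id=""" & spanId & """>"
        Const closeTag As String = "</span>"

        ' Locate the opening tag; -1 means it is not in the document.
        Dim openIndex As Integer = html.IndexOf(openTag, StringComparison.OrdinalIgnoreCase)
        If openIndex < 0 Then Return Nothing

        ' Reject documents where the same opening tag occurs a second time.
        Dim contentStart As Integer = openIndex + openTag.Length
        If html.IndexOf(openTag, contentStart, StringComparison.OrdinalIgnoreCase) >= 0 Then Return Nothing

        ' Find the first closing tag after the opening tag.
        Dim closeIndex As Integer = html.IndexOf(closeTag, contentStart, StringComparison.OrdinalIgnoreCase)
        If closeIndex < 0 Then Return Nothing

        ' Everything between the two tags is the span's content.
        Return html.Substring(contentStart, closeIndex - contentStart)
    End Function

End Module
```

With something like this in the project, the button handler could call TextBox.Text = ExtractSpanText(fileReader, "something") and check for Nothing before using the result. For anything more complicated than a single, simple tag, a real HTML parser is probably a safer choice than string searching.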