Kibi Brat - 3 months ago
VB.NET Question

Download a web page HTML as a UTF-8 string

I want to download a web page's HTML, but when I do, characters such as šđčćž are replaced by garbage like ć¡.

Code I am using:

Dim sourceString As String = New System.Net.WebClient().DownloadString("SomeWebPage")
TextBox1.Text = sourceString

Answer

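DownloadString decodes the response with the client's Encoding property, which may not be UTF-8 by default. You probably have to download the raw bytes and then use the Encoding class to decode them as UTF-8: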

Async Function GetHtmlString(address As String) As Task(Of String)
    Using client As New WebClient
        ' Download the raw response bytes, then decode them explicitly as UTF-8.
        Dim bytes = Await client.DownloadDataTaskAsync(address)
        Dim s = Encoding.UTF8.GetString(bytes)
        Return s
    End Using
End Function

An even simpler way, thanks to @dave's comment, is to set the client's Encoding property and download the string directly:

Async Function GetHtmlString(address As String) As Task(Of String)
    Using client As New WebClient
        ' Tell the client to decode the response as UTF-8.
        client.Encoding = Encoding.UTF8
        Dim s = Await client.DownloadStringTaskAsync(address)
        Return s
    End Using
End Function

Usage example:

Imports System.Net
Imports System.Text

Public Class Form1
    Private Async Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
        Dim s = Await GetHtmlString("http://www.radiomerkury.pl/")
    End Sub

    Async Function GetHtmlString(address As String) As Task(Of String)
        Using client As New WebClient
            client.Encoding = Encoding.UTF8
            Dim s = Await client.DownloadStringTaskAsync(address)
            Return s
        End Using
    End Function
End Class
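
To see why the encoding matters, here is a minimal sketch that reproduces the garbling without any network access. It assumes the page was served as UTF-8 and uses ISO-8859-1 purely as a stand-in for whatever single-byte default the client fell back to; the module name MojibakeDemo and the variable names are made up for illustration. The source file needs to be saved as UTF-8 for the string literal to survive.

```vbnet
Imports System
Imports System.Text

Module MojibakeDemo
    Sub Main()
        Dim original As String = "šđčćž"

        ' Encode the text the way a UTF-8 page would send it over the wire.
        Dim bytes As Byte() = Encoding.UTF8.GetBytes(original)

        ' Wrong: decode the UTF-8 bytes with a single-byte encoding.
        ' ISO-8859-1 is an assumption here, not necessarily what WebClient used.
        Dim decodedWrong As String = Encoding.GetEncoding("iso-8859-1").GetString(bytes)

        ' Right: decode with the same encoding the bytes were written in.
        Dim decodedOk As String = Encoding.UTF8.GetString(bytes)

        Console.WriteLine(bytes.Length)            ' 10: each character takes 2 bytes in UTF-8
        Console.WriteLine(decodedWrong.Length)     ' 10: every byte became its own character
        Console.WriteLine(decodedOk = original)    ' True
    End Sub
End Module
```

Each of the five characters takes two bytes in UTF-8, so decoding byte by byte yields ten unrelated characters instead of five. Setting client.Encoding = Encoding.UTF8 avoids that mismatch.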