Kallol Kallol - 2 years ago 125
Vb.net Question

Downloading webpage contents in parallel using async

I am using the example from Microsoft that downloads data from multiple URLs in parallel using async.


My requirement is to complete the download of 200 links within 1 minute, so that in the 2nd minute the same set of 200 URLs starts downloading again. I am aware that this depends largely on network speed and to a lesser extent on CPU power, since this is an I/O-bound rather than a CPU-bound process.

Assuming the network and CPU can support this operation and are not a bottleneck, I am actually seeing timeout and cancellation exceptions from the tasks after some time.

Question is, therefore, in the same example, can I change this to long running tasks so that the tasks don't timeout? I am aware of usage of the TaskCreationOptions enum and using LongRunning. However, the problems are:
1) How do I provide this parameter while creating the tasks in the below example and the link provided?
2) What is the definition of LongRunning? Does this mean each task will not time out anymore?
3) Can I explicitly set an infinite timeout by some other means?

Basically, my requirement is that once the download of a particular URL completes, it triggers the download of the same URL again. The same URL is therefore downloaded over and over, and the task should never complete. (The URLs in the MSDN example are not the ones I will use; mine are other URLs whose contents change every minute, so I need to download each one continuously, at least once every minute.)

Pasting the code here too from the above example link:

Dim cts As CancellationTokenSource
Dim countProcessed As Integer

Private Async Sub startButton_Click(sender As Object, e As RoutedEventArgs)

' Instantiate the CancellationTokenSource.
cts = New CancellationTokenSource()


Try
Await AccessTheWebAsync(cts.Token)
resultsTextBox.Text &= vbCrLf & "Downloads complete."

Catch ex As OperationCanceledException
resultsTextBox.Text &= vbCrLf & "Downloads canceled." & vbCrLf

Catch ex As Exception
resultsTextBox.Text &= vbCrLf & "Downloads failed." & vbCrLf
End Try

' Set the CancellationTokenSource to Nothing when the download is complete.
cts = Nothing
End Sub

Private Sub cancelButton_Click(sender As Object, e As RoutedEventArgs)
If cts IsNot Nothing Then
' Request cancellation of all downloads that share this token.
cts.Cancel()
End If
End Sub

Async Function AccessTheWebAsync(ct As CancellationToken) As Task

Dim client As HttpClient = New HttpClient()

' Call SetUpURLList to make a list of web addresses.
Dim urlList As List(Of String) = SetUpURLList()

' ***Create a query that, when executed, returns a collection of tasks.
Dim downloadTasksQuery As IEnumerable(Of Task(Of Integer)) =
From url In urlList Select ProcessURLAsync(url, client, ct)

' ***Use ToList to execute the query and start the download tasks.
Dim downloadTasks As List(Of Task(Of Integer)) = downloadTasksQuery.ToList()

Await Task.WhenAll(downloadTasks)
'Ideally, this line should never be reached

End Function

Async Function ProcessURLAsync(url As String, client As HttpClient, ct As CancellationToken) As Task(Of Integer)
Console.WriteLine("URL=" & url)
' GetAsync returns a Task(Of HttpResponseMessage).
Dim response As HttpResponseMessage = Await client.GetAsync(url, ct)

' Retrieve the web site contents from the HttpResponseMessage.
Dim urlContents As Byte() = Await response.Content.ReadAsByteArrayAsync()
Return urlContents.Length
End Function

Private Function SetUpURLList() As List(Of String)

' The 200 URLs are omitted here for space; please assume the list below contains them.
Dim urls = New List(Of String)()

Return urls
End Function

Answer Source

Question is, therefore, in the same example, can I change this to long running tasks so that the tasks don't timeout?

Tasks themselves do not time out. What you're probably seeing is the HTTP requests timing out. Long-running tasks don't have any different timeout semantics.
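HttpClient reports its own timeout as a TaskCanceledException, which derives from OperationCanceledException. That would explain why timeouts and cancellations appear together in the question's code. A minimal sketch of telling the two apart per request, so that one slow URL does not fault the whole Task.WhenAll, might look like this. The module name, the function name TryProcessURLAsync and the -1 return value are illustrative assumptions, not part of the MSDN example.

```vb
Imports System
Imports System.Net.Http
Imports System.Threading
Imports System.Threading.Tasks

Module SafeDownload
    ' Hypothetical variant of ProcessURLAsync: returns -1 instead of
    ' throwing when a single request times out or fails.
    Async Function TryProcessURLAsync(url As String, client As HttpClient,
                                      ct As CancellationToken) As Task(Of Integer)
        Try
            Using response As HttpResponseMessage = Await client.GetAsync(url, ct)
                Dim urlContents As Byte() = Await response.Content.ReadAsByteArrayAsync()
                Return urlContents.Length
            End Using
        Catch ex As OperationCanceledException When Not ct.IsCancellationRequested
            ' The caller's token was not cancelled, so this is HttpClient's own timeout.
            Console.WriteLine("Timed out: " & url)
            Return -1
        Catch ex As HttpRequestException
            ' Connection or DNS failure for this one URL.
            Console.WriteLine("Request failed: " & url & " (" & ex.Message & ")")
            Return -1
        End Try
    End Function
End Module
```

A cancellation requested through the token still propagates as OperationCanceledException, so the existing "Downloads canceled." handler keeps working for the Cancel button.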

I am aware of usage of the TaskCreationOptions enum and using LongRunning.

You should also be aware that LongRunning should almost never be used.
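For the "download the same URL again every minute" requirement, an ordinary async loop per URL is usually enough, with no TaskCreationOptions involved. The following is only a sketch under assumptions: PollURLAsync, its interval parameter and the module name are made up for illustration.

```vb
Imports System
Imports System.Diagnostics
Imports System.Net.Http
Imports System.Threading
Imports System.Threading.Tasks

Module RepeatingDownload
    ' Hypothetical helper: downloads one URL, waits out the rest of the
    ' interval, then repeats until the token is cancelled.
    Async Function PollURLAsync(url As String, client As HttpClient,
                                interval As TimeSpan, ct As CancellationToken) As Task
        While True
            Dim timer As Stopwatch = Stopwatch.StartNew()

            Using response As HttpResponseMessage = Await client.GetAsync(url, ct)
                Dim contents As Byte() = Await response.Content.ReadAsByteArrayAsync()
                Console.WriteLine(url & ": " & contents.Length.ToString() & " bytes")
            End Using

            ' Sleep without blocking a thread until the next round is due.
            Dim remaining As TimeSpan = interval - timer.Elapsed
            If remaining > TimeSpan.Zero Then
                Await Task.Delay(remaining, ct)
            End If
        End While
    End Function
End Module
```

In AccessTheWebAsync the query would then select PollURLAsync(url, client, TimeSpan.FromMinutes(1), ct) for each URL, and Task.WhenAll would only finish on cancellation or on an unhandled exception. An exception from GetAsync ends that URL's loop, so wrap the request in Try/Catch if the loop should survive failures.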

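You're probably getting timeouts because all your requests are hitting the same website: a .NET Framework client application opens only two concurrent connections per host by default, so the remaining requests queue up until they time out. Try setting ServicePointManager.DefaultConnectionLimit to Integer.MaxValue, and possibly also increase HttpClient.Timeout.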
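As a rough sketch of those two settings, assuming a .NET Framework desktop app where HttpClient honours ServicePointManager: the helper name CreateClient and the three-minute value are illustrative only.

```vb
Imports System
Imports System.Net
Imports System.Net.Http

Module HttpSetup
    ' Hypothetical helper: call once at startup, before any request is sent.
    Function CreateClient() As HttpClient
        ' Lift the per-host connection cap so requests to one site are not queued.
        ServicePointManager.DefaultConnectionLimit = Integer.MaxValue

        ' HttpClient.Timeout defaults to 100 seconds; 3 minutes is just an example.
        ' System.Threading.Timeout.InfiniteTimeSpan disables the timeout entirely.
        Dim client As New HttpClient()
        client.Timeout = TimeSpan.FromMinutes(3)
        Return client
    End Function
End Module
```

The returned client can be shared by all 200 downloads, as the single HttpClient in AccessTheWebAsync already is.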
