SpK - 6 months ago
Vb.net Question

VB.net - Faster text manipulation/storing to array

I have to write a program that reads .txt files into a string and manipulates the data to return various results.
The problem I have is the execution time.

Dim OpenAnswerFile As New OpenFileDialog
OpenAnswerFile.Multiselect = True
Dim strFileName() As String '// String Array.
Dim tempStr As String = "" '// temp String for result.
Dim FileName As String
Dim Watcher As New Stopwatch
If OpenAnswerFile.ShowDialog = DialogResult.OK Then
    Watcher.Restart()
    For Each FileName In OpenAnswerFile.FileNames

        strFileName = IO.File.ReadAllLines(FileName)
        For Each myLine In strFileName
            tempStr &= myLine & vbNewLine
        Next

    Next
    Watcher.Stop()
    Dim TimeToArrays = Watcher.Elapsed
    Watcher.Reset()
    '========================WRITE TO FILE
    Dim file As System.IO.StreamWriter
    file = My.Computer.FileSystem.OpenTextFileWriter("c:\test.txt", True)
    Watcher.Restart()
    file.Write(tempStr)
    file.WriteLine(TimeToArrays)
    file.WriteLine(Watcher.Elapsed)
    Watcher.Stop()
    file.Close()

    '========================WRITE TO FILE

End If


Running this on dictionary-style .txt files, from A to Z, takes about a minute, which seems like a lot for just over 1 MB of files in total.

Is there any way to speed up the whole process?

Answer

Try this code - it runs in 6.5 seconds on nearly 90 MB of files.

Changes to your code can be summarised as:

  • Check the DialogResult first, and don't iterate over the files until that part of the routine is complete.

  • During the read operations, use Using...End Using so that each reader is closed and disposed as soon as its file has been read.

  • Use a List(Of String) to hold the data rather than repeatedly appending to a String. Strings are immutable, so every &= allocates a new string and copies everything accumulated so far, which gets slower as the result grows. There may be quicker ways using other collections.

  • During the write operation, use Using...End Using so that the writer is flushed, closed and disposed even if an error occurs.
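
As a small illustration of the third point (a sketch only, not part of the routine below), the two approaches can be compared side by side. The module and function names here (ConcatSketch, JoinWithBuilder, JoinWithConcat) are made up for the example; StringBuilder is used as one alternative to the List(Of String) when a single combined string is really needed.

```vbnet
Imports Microsoft.VisualBasic
Imports System.Collections.Generic
Imports System.Text

Module ConcatSketch

    ' Hypothetical helper: appends into one growable buffer,
    ' so earlier text is not recopied on every iteration.
    Function JoinWithBuilder(lines As IEnumerable(Of String)) As String
        Dim sb As New StringBuilder()
        For Each line As String In lines
            sb.Append(line).Append(vbNewLine)
        Next
        Return sb.ToString()
    End Function

    ' The pattern from the question, kept only for comparison:
    ' each &= builds a brand-new string from the old one plus the new line.
    Function JoinWithConcat(lines As IEnumerable(Of String)) As String
        Dim result As String = ""
        For Each line As String In lines
            result &= line & vbNewLine
        Next
        Return result
    End Function

End Module
```

Both functions return the same text; the difference should only show up in running time as the number of lines grows.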

Here's the code:

Sub DoItQuicker()

    Dim OpenAnswerFile As New OpenFileDialog
    OpenAnswerFile.Multiselect = True
    Dim strFileName() As String '// String Array.
    Dim strSingleFileContent As String
    Dim strFileData As New List(Of String)
    Dim FileName As String
    Dim Watcher As New Stopwatch

    'get user feedback
    If OpenAnswerFile.ShowDialog <> DialogResult.OK Then
        MessageBox.Show("Not OK")
        Exit Sub
    Else
        strFileName = OpenAnswerFile.FileNames
    End If
    OpenAnswerFile.Dispose()

    'read data
    Watcher.Restart()
    For Each FileName In strFileName 'use the saved names; the dialog has already been disposed

        Using sr As New IO.StreamReader(FileName)
            strSingleFileContent = sr.ReadToEnd
            'split on LF and trim the trailing CR so CRLF files don't keep a stray CR
            For Each line As String In strSingleFileContent.Split(vbLf)
                strFileData.Add(line.TrimEnd(ControlChars.Cr))
            Next
        End Using

    Next
    Watcher.Stop()
    Dim TimeToArrays = Watcher.Elapsed
    Debug.Print(Watcher.ElapsedMilliseconds)

    'write data
    Watcher.Restart()
    Using sw As New IO.StreamWriter("D:\temp\out.txt")
        For Each line As String In strFileData
            sw.WriteLine(line)
        Next
        sw.WriteLine(TimeToArrays)
        sw.WriteLine(Watcher.Elapsed)
    End Using
    Watcher.Stop()
    Debug.Write(Watcher.ElapsedMilliseconds)


End Sub
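
If the lines don't need to be kept in memory between reading and writing, a shorter variant is possible. This is only a sketch under that assumption, and it has not been timed against the routine above; MergeSketch and MergeFiles are hypothetical names, and the dialog and Stopwatch code is left out. It relies on IO.File.ReadLines (available since .NET Framework 4), which yields lines one at a time and strips the line terminators.

```vbnet
Imports System.Collections.Generic
Imports System.IO

Module MergeSketch

    ' Hypothetical helper: copies every line of each input file to outputPath.
    ' The output file is overwritten if it already exists.
    Sub MergeFiles(inputPaths As IEnumerable(Of String), outputPath As String)
        Using sw As New StreamWriter(outputPath)
            For Each inputPath As String In inputPaths
                ' ReadLines streams the file lazily instead of loading it whole.
                For Each line As String In File.ReadLines(inputPath)
                    sw.WriteLine(line)
                Next
            Next
        End Using
    End Sub

End Module
```

It could be called with the array saved from the dialog, for example MergeFiles(strFileName, "D:\temp\out.txt").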