gulmaily - 1 year ago - Question

Remove Repeated Words from Text file

I have a text file containing nearly 45,000 words, one word per line. Thousands of these words appear more than 10 times. I want to create a new file in which no word is repeated. I used a StreamReader, but it reads the file only once. How can I get rid of the repeated words? Please help me. Thanks.
My code was like this:

Catch ex As Exception
Exit Sub
End Try

Dim line As String = String.Empty
Dim OldLine As String = String.Empty
Dim sr = File.OpenText(TextBox1.Text)

line = sr.ReadLine
OldLine = line

Do While sr.Peek <> -1
    line = sr.ReadLine
    ' Only compares against the previous line, so non-adjacent repeats get through
    If OldLine <> line Then
        My.Computer.FileSystem.WriteAllText(My.Computer.FileSystem.SpecialDirectories.Desktop & "\Splitted File without Repeats.txt", line & vbCrLf, True)
    End If

    OldLine = line
Loop
sr.Close()

System.Diagnostics.Process.Start(My.Computer.FileSystem.SpecialDirectories.Desktop & "\Splitted File without Repeats.txt")
MsgBox("Loop terminated. Stream Reader Closed." & vbCrLf)

Answer Source

You can use LINQ's Distinct() method for this.

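This will work for smaller files, since File.ReadAllLines loads the whole file into memory (note that it overwrites the original file; pass a different path to WriteAllLines to create a new one):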

' Needs Imports System.IO and Imports System.Linq at the top of the file
Dim lines As String() = File.ReadAllLines("yourfile.txt")
File.WriteAllLines("yourfile.txt", lines.Distinct().ToArray())
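For larger files, one possible alternative is to stream the input line by line and remember which words have already been written. The following is only a sketch, not part of the original answer: the module name, the WriteDistinctLines helper, and the output file name "yourfile_distinct.txt" are made up for illustration. It relies on HashSet(Of String).Add returning False for a value that is already in the set, and on File.ReadLines (available from .NET Framework 4) reading lazily instead of loading everything at once.

```vb
Imports System.Collections.Generic
Imports System.IO

Module RemoveRepeatedWords
    ' Hypothetical helper: copies inputPath to outputPath,
    ' keeping only the first occurrence of each line.
    Sub WriteDistinctLines(inputPath As String, outputPath As String)
        Dim seen As New HashSet(Of String)()
        Using writer As New StreamWriter(outputPath)
            ' File.ReadLines yields one line at a time instead of a full array
            For Each line As String In File.ReadLines(inputPath)
                ' Add returns False when the line is already in the set
                If seen.Add(line) Then
                    writer.WriteLine(line)
                End If
            Next
        End Using
    End Sub

    Sub Main()
        ' Both file names are placeholders
        WriteDistinctLines("yourfile.txt", "yourfile_distinct.txt")
    End Sub
End Module
```

The set still holds every distinct word in memory, so this mainly helps when the file is large but contains many repeats. The comparison is case-sensitive by default; if "Word" and "word" should count as the same entry, passing StringComparer.OrdinalIgnoreCase to the HashSet constructor is one way to handle that.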