Andrew - 1 month ago
VB.NET Question

VB.NET FileStream: getting spaces after decoding bytes

I'm not sure what is happening. I don't think I changed the code at all, but for some reason I am getting spaces in between the returned characters after using the FileStream object to read the bytes of a file:

'Turn off Raise Events until after change is checked
fsw.EnableRaisingEvents = False

'read from current seek position to end of file
Dim bytesRead(_maxBytes) As Byte


Dim fs As New FileStream(_filename, FileMode.Open, FileAccess.Read, FileShare.ReadWrite)

If (fs.Length > _maxBytes) Then
    previousSeekPosition = fs.Length - _maxBytes
End If

previousSeekPosition = fs.Seek(previousSeekPosition, SeekOrigin.Begin)

Dim numBytes = fs.Read(bytesRead, 0, _maxBytes)

fs.Close()

previousSeekPosition += numBytes

Dim sb As New StringBuilder()
For i = 0 To numBytes - 1
    sb.Append(bytesRead(i))
Next

'Raise the event to show data
If Not blnFirstRun Then
    'Decode only the bytes actually read, not the whole buffer
    RaiseEvent MoreData(Me, Encoding.ASCII.GetString(bytesRead, 0, numBytes), _filename, _fileDescription)
Else
    blnFirstRun = False
End If

'Check the changes against the alerts
AlertChange(Encoding.ASCII.GetString(bytesRead, 0, numBytes))

'Turn Raise Events back on
fsw.EnableRaisingEvents = True


I have _maxBytes set to 16384. I'm basically reading the file from the last known read position any time there is a file change (similar to what the Linux tail command does).

I tested it on a file and it appeared to work great. For some reason, though, it doesn't work anymore. I don't think I changed anything, but it now returns the changes with spaces in them.

For example:

I have a file that I have appended '9999' to. When I run the Encoding.ASCII.GetString routine, it shows up as: '9 9 9 9'.

I feel like I'm beating my head against a wall over something that is probably really simple. Hopefully someone knows the answer.

Answer

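The fact that you are getting '9 9 9 9' when '9999' was written to the file suggests that whatever wrote to the file was using UTF-16 encoding, which uses a minimum of two bytes per character (ref: Wikipedia: Comparison of Unicode encodings). For characters in the ASCII range, the second byte of each UTF-16 character is zero, and Encoding.ASCII.GetString decodes each of those zero bytes as a separate NUL character, which can show up as a gap between the digits.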
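To make the effect concrete, here is a minimal sketch that reproduces it in memory. It assumes the writer used UTF-16 little-endian (Encoding.Unicode), which is only a guess about the file in the question; the module and function names are made up for illustration.

```vbnet
Imports System
Imports System.Text

Module Utf16ReadAsAscii
    ' Encodes text as UTF-16 little-endian, standing in for whatever
    ' wrote the file (an assumption, not something the question confirms).
    Function Utf16BytesOf(text As String) As Byte()
        Return Encoding.Unicode.GetBytes(text)
    End Function

    Sub Main()
        Dim raw As Byte() = Utf16BytesOf("9999")

        ' Each digit (0x39) is followed by a zero byte:
        ' 39-00-39-00-39-00-39-00
        Console.WriteLine(BitConverter.ToString(raw))

        ' Decoding as ASCII turns every zero byte into a NUL character,
        ' so the string has eight characters instead of four.
        Dim misread As String = Encoding.ASCII.GetString(raw)
        Console.WriteLine(misread.Length)

        ' Replace the NULs with a visible marker to see where they fall.
        Console.WriteLine(misread.Replace(Convert.ToChar(0), "."c))
    End Sub
End Module
```

The last line prints 9.9.9.9. with a dot wherever a NUL character sits; a control that renders NUL as a blank would show the same string as '9 9 9 9'.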

Examining the file with a hex editor should reveal if that is in fact the case.
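If a hex editor is not at hand, a few lines of code can do the same check. The sketch below is only one way to do it: PeekHex is a hypothetical helper name, and the path is taken from the command line.

```vbnet
Imports System
Imports System.IO

Module HexPeek
    ' Hypothetical helper: returns up to maxBytes of the file's leading
    ' bytes as a hex string such as "FF-FE-39-00".
    Function PeekHex(path As String, maxBytes As Integer) As String
        Dim buffer(maxBytes - 1) As Byte
        ' FileShare.ReadWrite lets us look at a file another process is still writing.
        Using fs As New FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite)
            Dim n As Integer = fs.Read(buffer, 0, buffer.Length)
            Return BitConverter.ToString(buffer, 0, n)
        End Using
    End Function

    Sub Main(args As String())
        If args.Length > 0 Then
            Console.WriteLine(PeekHex(args(0), 32))
        End If
    End Sub
End Module
```

A leading FF-FE is the UTF-16 little-endian byte order mark, and a 00 after every ASCII character (for example 39-00-39-00) points the same way even when no byte order mark is present.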

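If the file does turn out to be UTF-16, decoding with Encoding.Unicode (UTF-16 little-endian) instead of Encoding.ASCII should return the text without the extra characters. Please take note of the remarks in the documentation for the Encoding.Unicode property, just in case something there could cause a problem.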
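As a rough sketch of what the decoding step might look like if the file is confirmed to be UTF-16 little-endian: DecodeUtf16 is a hypothetical helper, not part of the code in the question, and the buffer in Main only simulates a partially filled read buffer.

```vbnet
Imports System
Imports System.Text

Module DecodeChunkSketch
    ' Hypothetical helper: decodes only the bytes actually read,
    ' treating them as UTF-16 little-endian.
    Function DecodeUtf16(buffer As Byte(), count As Integer) As String
        ' UTF-16 code units are two bytes wide, so ignore a trailing
        ' odd byte rather than decode half a character.
        Dim usable As Integer = count - (count Mod 2)
        Return Encoding.Unicode.GetString(buffer, 0, usable)
    End Function

    Sub Main()
        ' Simulate a 16-byte read buffer that was only partly filled.
        Dim buffer(15) As Byte
        Dim data As Byte() = Encoding.Unicode.GetBytes("9999")
        Array.Copy(data, buffer, data.Length)

        Console.WriteLine(DecodeUtf16(buffer, data.Length))
    End Sub
End Module
```

One thing to watch with tail-style reads: if the seek position lands on an odd byte offset, every character after it will be decoded from the wrong pair of bytes, so it is probably worth keeping the seek position even for a UTF-16 file.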