Emanuele Vissani - 1 month ago
C# Question

BinaryReader 30x faster after first run. Byte array still in memory?

My program reads x bytes from a file, checks whether they are all zeros, repeats the process for 20,000 files, and keeps a list of the files that contain non-zero bytes.
To monitor performance, I made the number of bytes it checks per file configurable (byteSize).

The problem is that the first run of the program takes ~5 minutes to complete (byteSize = 8192), but if I run it again it takes only 10 seconds, even if I close and restart the program. The only cause that comes to mind is that the byte array remains in memory.

BinaryReader is inside a "using" statement, so as far as I know it should close the stream when the block exits. So why does the byte array remain? How can I delete it? I need to do that to measure actual performance each time I run the program.

byte[] readByte = new byte[byteSize];

for (int i = 0; i < readCycles; i++)
{
    using (BinaryReader reader = new BinaryReader(new FileStream(file, FileMode.Open, FileAccess.Read)))
    {
        reader.BaseStream.Seek(8192 + i * byteSize, SeekOrigin.Begin);
        reader.Read(readByte, 0, byteSize);
    }

    foreach (byte b in readByte)
    {
        if (b != 0)
        {
            allZeros = false;
            break;
        }
        else
            allZeros = true;
    }

    if (allZeros == false) break;
}

Answer

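This almost certainly has nothing to do with anything .NET is doing - it'll be the operating system's file system cache transparently keeping recently read file data in memory for you. The byte array cannot outlive the process, so it cannot explain a speed-up that survives closing and restarting the program.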

To test this, change your code to use just a FileStream and simply loop over the file, reading it into a buffer and ignoring the data:

using (var stream = File.OpenRead(...))
{
    var buffer = new byte[16384];
    while (stream.Read(buffer, 0, buffer.Length) > 0)
    {
    }
}

I'm sure you'll see the same result - the first read will be relatively slow, and subsequent reads of the same file will be very fast.
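If you want numbers rather than an impression, the same loop can be wrapped in a Stopwatch and run twice against one file. This is only a rough sketch: the class name CacheTimingSketch, the helper TimeFullRead, and the convention of passing the file path as the first command-line argument are all made up for illustration, and it assumes the file has not been read recently, so the first pass actually has to go to disk.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class CacheTimingSketch
{
    // Hypothetical helper: reads the whole file into a scratch buffer,
    // discards the data, and returns how long the read took.
    public static TimeSpan TimeFullRead(string path)
    {
        var buffer = new byte[16384];
        var stopwatch = Stopwatch.StartNew();
        using (var stream = File.OpenRead(path))
        {
            while (stream.Read(buffer, 0, buffer.Length) > 0)
            {
                // Intentionally empty: only the cost of reading matters here.
            }
        }
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    public static void Main(string[] args)
    {
        if (args.Length == 0)
        {
            Console.WriteLine("Pass the path of a file to read as the first argument.");
            return;
        }

        string path = args[0]; // assumption: a large file that has not been read recently

        // Read the same file twice in a row and print both timings.
        Console.WriteLine("First read:  " + TimeFullRead(path));
        Console.WriteLine("Second read: " + TimeFullRead(path));
    }
}
```

If the second timing is much smaller than the first, with no byte array of yours shared between the two calls, that points at caching below .NET rather than at anything in your program.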