taras.roshko - 11 months ago
C# Question

C# SkipWhile leaks memory if predicate is false

// C:\logs\AzureSDK.log is ~2.5GB file
IEnumerable<string> lines = File.ReadLines(@"C:\logs\AzureSDK.log").SkipWhile(line => false);

Console.WriteLine(string.Join("\n", lines));

This clearly does not return an iterator and allocates memory internally until I get OOM. Returning true from the
predicate does not lead to this and completes as expected (minimal
memory usage during the execution).

As per the docs, the method signature, and common sense, SkipWhile
must return an iterator and not load all the data into memory.

Machine info

Microsoft Windows [Version 10.0.14393]
Target 4.5.2, AnyCPU, Release
VS 2015 Update 3
.NET 4.6.01586

Thoughts? I must be doing something stupid but unsure what

UPD: well, the stupid thing was the string.Join I forgot about: it appends every line to a single StringBuilder, which loads all the lines into memory.

I also checked SkipWhile sources and it's obviously perfectly fine:

public static IEnumerable<TSource> SkipWhile<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate) {
    if (source == null) throw Error.ArgumentNull("source");
    if (predicate == null) throw Error.ArgumentNull("predicate");
    return SkipWhileIterator<TSource>(source, predicate);
}

static IEnumerable<TSource> SkipWhileIterator<TSource>(IEnumerable<TSource> source, Func<TSource, bool> predicate) {
    bool yielding = false;
    foreach (TSource element in source) {
        // Once the predicate fails for the first time, every remaining element is yielded.
        if (!yielding && !predicate(element)) yielding = true;
        if (yielding) yield return element;
    }
}
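
To double-check the laziness without a 2.5GB file, here is a minimal sketch that counts how many items SkipWhile pulls from its source. The names LazinessCheck, CountingSource and PulledAfterTaking are made up for illustration and are not part of the framework; the only assumption is that Take stops enumerating once it has produced the requested number of items.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class LazinessCheck
{
    // Hypothetical helper: yields 1..count and calls onPull each time
    // the consumer asks for the next item.
    public static IEnumerable<int> CountingSource(int count, Action onPull)
    {
        for (int i = 1; i <= count; i++)
        {
            onPull();
            yield return i;
        }
    }

    // Returns how many source items were pulled while consuming only
    // the first `take` items of SkipWhile(_ => false).
    public static int PulledAfterTaking(int take)
    {
        int pulled = 0;
        var query = CountingSource(1000000, () => pulled++).SkipWhile(_ => false);

        foreach (var item in query.Take(take))
        {
            // Consume the item and discard it; nothing is buffered here.
        }

        return pulled;
    }
}
```

If SkipWhile buffered its input, PulledAfterTaking(3) would report the full million pulls; with a lazy iterator it should report only as many pulls as items consumed.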

Answer

SkipWhile does return a lazily evaluated sequence. But then you use string.Join to concatenate everything, and therefore end up loading the whole file into memory.

If you change your code to process each line independently, you'll see that you use much less memory:

foreach (var line in File.ReadLines(@"C:\logs\AzureSDK.log").SkipWhile(_ => false))
    Console.WriteLine(line);
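
If you want to keep the "write everything out" shape of the original code, the same idea can be wrapped in a small helper that writes each line to a TextWriter as soon as it is produced. This is only a sketch: StreamingOutput and WriteLines are hypothetical names, not framework APIs.

```csharp
using System.Collections.Generic;
using System.IO;

static class StreamingOutput
{
    // Hypothetical helper: writes each line as it arrives, so only the
    // current line has to be held in memory (plus the writer's own buffer).
    public static void WriteLines(TextWriter writer, IEnumerable<string> lines)
    {
        foreach (var line in lines)
        {
            writer.WriteLine(line);
        }
    }
}
```

You would call it as StreamingOutput.WriteLines(Console.Out, File.ReadLines(@"C:\logs\AzureSDK.log").SkipWhile(_ => false)). Unlike string.Join("\n", lines), this also writes a line terminator after the last line.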