taras.roshko - 1 month ago
C# Question

C# SkipWhile leaks memory if predicate is false

// C:\logs\AzureSDK.log is ~2.5GB file
IEnumerable<string> lines = File.ReadLines(@"C:\logs\AzureSDK.log").SkipWhile(line => false);

Console.WriteLine(string.Join("\n", lines));
return;


This clearly does not return an iterator and allocates memory internally until I get an OOM. Returning true in the SkipWhile predicate does not lead to this and completes as expected (a couple of MB of memory usage during execution).

As per the docs, the method signature, and common sense, SkipWhile must return an iterator and not load all the data into memory.

Machine info

Microsoft Windows [Version 10.0.14393]
Target 4.5.2, AnyCPU, Release
VS 2015 Update 3
.NET 4.6.01586


Thoughts? I must be doing something stupid, but I'm unsure what.

UPD: Well, the stupid thing was the string.Join call I forgot about: it appends every line to a single StringBuilder, which loads all the lines into memory.

I also checked the SkipWhile source, and it's obviously perfectly fine:

public static IEnumerable<TSource> SkipWhile<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate) {
    if (source == null) throw Error.ArgumentNull("source");
    if (predicate == null) throw Error.ArgumentNull("predicate");
    return SkipWhileIterator<TSource>(source, predicate);
}

static IEnumerable<TSource> SkipWhileIterator<TSource>(IEnumerable<TSource> source, Func<TSource, bool> predicate) {
    bool yielding = false;
    foreach (TSource element in source) {
        if (!yielding && !predicate(element)) yielding = true;
        if (yielding) yield return element;
    }
}
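
A quick way to convince yourself that this iterator is lazy is to run it over a sequence that never ends. The sketch below is only an illustration: the class name, the Endless helper, and the Pulled counter are made up for this example and are not part of LINQ. If SkipWhile buffered its input, the call would never return.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SkipWhileLazinessDemo
{
    // Hypothetical counter: how many items the source has handed out so far.
    public static int Pulled;

    // Hypothetical helper: an endless sequence 0, 1, 2, ...
    public static IEnumerable<int> Endless()
    {
        for (int i = 0; ; i++)
        {
            Pulled++;
            yield return i;
        }
    }

    public static List<int> FirstThreeAfterSkip()
    {
        Pulled = 0;
        // The predicate is always false, so nothing is skipped.
        // Take(3) stops enumerating after three items, so the endless
        // source is only ever asked for three elements.
        return Endless().SkipWhile(_ => false).Take(3).ToList();
    }

    public static void Main()
    {
        List<int> items = FirstThreeAfterSkip();
        Console.WriteLine(string.Join(", ", items)); // 0, 1, 2
        Console.WriteLine(Pulled);                   // 3
    }
}
```

Only the three requested items are ever produced, so the memory growth in the question has to come from whatever consumes the sequence, not from SkipWhile.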

Answer

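SkipWhile does return a lazy iterator. But you then pass it to string.Join, which concatenates everything into one string, so you end up loading the whole file into memory. (With a predicate that always returns true, every line is skipped, so string.Join has nothing to accumulate, which is why that case stays at a couple of MB.)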

If you change your code to process each line independently, you'll see that you use much less memory:

foreach (var line in File.ReadLines(@"C:\logs\AzureSDK.log").SkipWhile(_ => false))
{
    Console.WriteLine(line);
}
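
If the output has to match the original string.Join("\n", ...) exactly (a newline between lines, and only one at the very end), a small streaming helper can do the same job without building the whole string. This is a minimal sketch, not a framework API: StreamingJoin and WriteJoined are hypothetical names introduced here, and the file path is the one from the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class StreamingJoin
{
    // Hypothetical helper (not part of the BCL): writes the items to the
    // writer with the separator between them, holding one item at a time.
    public static void WriteJoined(TextWriter writer, string separator, IEnumerable<string> items)
    {
        bool first = true;
        foreach (string item in items)
        {
            // Emit the separator before every item except the first.
            if (!first) writer.Write(separator);
            writer.Write(item);
            first = false;
        }
    }

    public static void Main()
    {
        IEnumerable<string> lines = File.ReadLines(@"C:\logs\AzureSDK.log").SkipWhile(line => false);
        WriteJoined(Console.Out, "\n", lines);
        Console.WriteLine(); // final newline, as Console.WriteLine(string.Join(...)) would add
    }
}
```

Because each line is written as soon as it is read, memory usage should stay roughly proportional to the longest line rather than to the size of the file.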