kayjtea - 1 month ago
C# Question

C# local arrays and IReadOnlyCollection/IReadOnlyList: optimizing away object creation

Do the following three methods have equivalent garbage collection behavior? #1 would be a stretch, but is today's C# compiler smart enough to optimize away the object creation in #2 on every invocation of the method? I specifically do not want to hoist the array initialization outside of the method.

public Boolean IsRgb1(string color)
{
    string[] colors = { "red", "green", "blue" };
    return colors.Contains(color);
}

public Boolean IsRgb2(string color)
{
    IReadOnlyCollection<string> colors = new[] { "red", "green", "blue" };
    return colors.Contains(color);
}

public Boolean IsRgb3(string color)
{
    switch (color)
    {
        case "red":
        case "green":
        case "blue":
            return true;
        default:
            return false;
    }
}

Answer

No compiler magic applies to these types. In both IsRgb1 and IsRgb2 a new array is created on every invocation.

The array declaration shorthand syntax

string[] colors = { "red", "green", "blue" };

is the same as (showing the 'deduced syntax')

string[] colors = new string[3] { "red", "green", "blue" };
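One way to observe this is to compare the arrays produced by two separate calls. The following is a minimal sketch, not part of the original question: MakeColors is a hypothetical helper that returns the local array so its identity can be checked.

```csharp
using System;

public static class ArrayIdentityDemo
{
    // Hypothetical helper (not from the question): returns the local array
    // so that two invocations can be compared by reference.
    public static string[] MakeColors()
    {
        string[] colors = { "red", "green", "blue" };
        return colors;
    }

    public static void Main()
    {
        string[] first = MakeColors();
        string[] second = MakeColors();

        // Two calls produce two distinct array objects.
        Console.WriteLine(ReferenceEquals(first, second)); // False
    }
}
```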

The basic rule is: new always creates a new object/instance. If the array should exist only once, the code must create it only once, for example by storing a single instance in a static member/field and sharing it. This 'hoisting' has to be done manually; the compiler will not do it for you.

// One array created and stored for later use ..
// (readonly prevents the field from being reassigned; the elements are still mutable)
private static readonly string[] Colors = { "red", "green", "blue" };
// .. independent of number of times this method is called
public Boolean IsRgb(string color)
{
    return Colors.Contains(color);
}

In both cases Contains is the LINQ extension method defined for IEnumerable<T> (Enumerable.Contains), because both T[] and IReadOnlyCollection<T> are subtypes of IEnumerable<T> (see footnote 1) and are therefore eligible for it. The same LINQ to Objects implementation is used in both methods, so any specialization it applies to arrays applies to both cases.

The IsRgb3 case avoids the array creation entirely, avoids some method calls, and avoids the overhead of the generalized collection Contains 'looping' logic. It will be the fastest, if/where that matters, simply because it has the least to do.
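If you want to check the allocation behavior on your own runtime rather than take it on faith, a rough measurement is possible. The sketch below assumes a runtime that provides GC.GetAllocatedBytesForCurrentThread (recent .NET versions do); AllocatedBytes is a hypothetical helper, and the numbers it reports depend on the runtime and JIT, so no particular output is claimed here.

```csharp
using System;
using System.Linq;

public static class AllocationDemo
{
    public static bool IsRgb1(string color)
    {
        string[] colors = { "red", "green", "blue" };
        return colors.Contains(color);
    }

    public static bool IsRgb3(string color)
    {
        switch (color)
        {
            case "red":
            case "green":
            case "blue":
                return true;
            default:
                return false;
        }
    }

    // Hypothetical helper: bytes allocated on the current thread while
    // calling 'check' the given number of times.
    public static long AllocatedBytes(Func<string, bool> check, int iterations)
    {
        check("red"); // warm-up call, so one-time costs are less likely to be counted

        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++)
        {
            check("blue");
        }
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    public static void Main()
    {
        Console.WriteLine("IsRgb1: " + AllocatedBytes(IsRgb1, 1000) + " bytes");
        Console.WriteLine("IsRgb3: " + AllocatedBytes(IsRgb3, 1000) + " bytes");
    }
}
```

Treat the result as an indication only: a proper benchmark harness would give more trustworthy figures.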

A simple switch statement over strings can be read as another way of writing a series of if..else if.. comparisons against the same value. No new object is created per method call: the string literals are interned, and there is no array at all.

Alternatively, consider simply using a single expression:

 return color == "red" || color == "green" || color == "blue";
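If the project can use C# 9 or later (an assumption; the question does not state a language version), the same single expression can also be written with a constant pattern. This is only a sketch of an equivalent spelling, not something the original answer proposed.

```csharp
using System;

public static class PatternDemo
{
    // Requires C# 9 or later for the 'or' pattern combinator.
    // Like the == chain, this creates no array.
    public static bool IsRgb(string color)
    {
        return color is "red" or "green" or "blue";
    }

    public static void Main()
    {
        Console.WriteLine(IsRgb("green"));  // True
        Console.WriteLine(IsRgb("purple")); // False
        Console.WriteLine(IsRgb(null));     // False
    }
}
```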

1. Because the type inheritance is confusing, here is a small extract:

T[] -> IEnumerable<T> (Contains as Extension Method)
    -> IList<T> -> ICollection<T> (Contains in Interface) -> IEnumerable<T>
    -> IReadOnlyList<T> -> IEnumerable<T>
                        -> IReadOnlyCollection<T> -> IEnumerable<T>

Since T[] is a subtype of IReadOnlyList<T>, and therefore of IReadOnlyCollection<T>, the assignment in IsRgb2 is an implicit upcast: the variable still refers to the newly created array object. The choice of the LINQ Contains extension method (Enumerable.Contains, defined for IEnumerable<T>) happens at compile time, so both IsRgb1 and IsRgb2 call that extension method on the array they just created. Calling ICollection<T>.Contains directly would require a cast such as ((IList<string>)colors).Contains(..).
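The points in this footnote can be sketched as follows. The class and variable names are illustrative only. The comments describe the overload binding as explained in this answer; newer compiler versions may resolve the array call differently, but the results printed are the same.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ContainsBindingDemo
{
    public static void Main()
    {
        string[] array = { "red", "green", "blue" };

        // Implicit upcast: no copy is made, the variable refers to the same array.
        IReadOnlyCollection<string> readOnly = array;
        Console.WriteLine(ReferenceEquals(array, readOnly)); // True

        // Extension-method syntax, as used in IsRgb1 and IsRgb2.
        Console.WriteLine(array.Contains("green"));    // True
        Console.WriteLine(readOnly.Contains("green")); // True

        // The same LINQ method, written as an ordinary static call.
        Console.WriteLine(Enumerable.Contains(readOnly, "green")); // True

        // The interface method is reachable only through a cast.
        Console.WriteLine(((ICollection<string>)array).Contains("green")); // True
    }
}
```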