Johannes Johannes - 23 days ago 6
C# Question

Performance difference between RemoveAll and Where in C#

I have a list of items and I want to iterate through a subset of them. I am wondering whether there is a performance difference between removing the unwanted items from the list and then looping through it, and simply filtering the list in the foreach loop.

Here is an example.

The RemoveAll approach:

list.RemoveAll(o => !someOtherList.Contains(o.Property));

foreach (var i in list)
{
}


The Where approach:

foreach (var i in list.Where(o => someOtherList.Contains(o.Property)))
{
}


I understand that the first approach actually modifies the list, whereas the second one does not. That doesn't really concern me. I am more concerned about whether the filter in the second approach is applied on each iteration, or whether C# is smart enough to create a subset and loop through only that subset (almost like the first approach with a temp variable).

Answer

I am more concerned as to whether the filter in the second approach is applied for each iteration or whether C# is smart enough to create a subset and only loop through that subset (almost like the first approach with a temp variable)

LINQ's Where is lazily evaluated: like an iterator written with yield, it returns matching elements one at a time, and only when they are requested.

So what the second approach actually does is:

1- Iterate through the list.

2- Check whether the current element matches the condition (this loops through someOtherList unless it is a lookup data structure such as a HashSet).

3- Once the first matching element is found, return it to the foreach loop.

4- Execute the foreach body.

5- Continue searching from where step 3 stopped.
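The steps above can be sketched with a hand-written iterator. This is only an illustration of the behaviour, not the actual source of Enumerable.Where; MyWhere, LazyFilterSketch and the log list are hypothetical names introduced here to make the order of operations visible.

```csharp
using System;
using System.Collections.Generic;

public static class LazyFilterSketch
{
    // Hypothetical stand-in for Where, written with yield.
    // The log records when each element is tested.
    public static IEnumerable<T> MyWhere<T>(
        IEnumerable<T> source, Func<T, bool> predicate, List<string> log)
    {
        foreach (var item in source)          // step 1: iterate through the list
        {
            log.Add($"test {item}");
            if (predicate(item))              // step 2: check the condition
            {
                yield return item;            // step 3: hand the match to the caller
            }                                 // step 5: resume here on the next request
        }
    }

    public static List<string> Run()
    {
        var log = new List<string>();
        var list = new List<int> { 1, 2, 3, 4 };

        foreach (var i in MyWhere(list, x => x % 2 == 0, log))
        {
            log.Add($"body {i}");             // step 4: the foreach body
        }
        return log;
    }
}
```

Calling LazyFilterSketch.Run() returns the entries "test 1", "test 2", "body 2", "test 3", "test 4", "body 4", in that order. The filtering and the loop body are interleaved, and no intermediate subset is built first.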

This means that if you break out of the foreach block on some condition, the rest of the list is never scanned, which in some cases can give a performance boost on large lists.
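A small sketch of that early exit, counting how often the predicate runs. The names EarlyExitSketch, CountPredicateCalls and allowed, as well as the data, are made up for illustration. The HashSet is used only because its Contains is an average O(1) lookup rather than a linear scan as with a List.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EarlyExitSketch
{
    // Returns how many times the Where predicate ran before the loop ended.
    public static int CountPredicateCalls(bool breakOnFirstMatch)
    {
        var list = Enumerable.Range(1, 1000).ToList();   // 1, 2, ..., 1000
        var allowed = new HashSet<int> { 10, 20, 30 };   // plays the role of someOtherList
        int calls = 0;

        foreach (var i in list.Where(o => { calls++; return allowed.Contains(o); }))
        {
            if (breakOnFirstMatch)
            {
                break;   // stop after the first match; the rest is never tested
            }
        }
        return calls;
    }
}
```

With this data, CountPredicateCalls(true) returns 10, because elements 1 to 10 are tested and the loop stops at the first match. CountPredicateCalls(false) returns 1000. RemoveAll, by contrast, always has to test every element before the loop can start.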