Mrinal Kamboj -4 years ago 175
C# Question

Improve the time complexity of current Linq queries

I have the following lists:

`RakeSnapshots, ProductMovements`

Aim is to process the both and get the count of elements that match a condition, as follows:

• Consider
`RakeSnapshots`
with
`StatusCode == "Dumping"`

• Consider
`ProductMovement`
with
`Status == "InProgress"`

• Fetch the
`count`
of all elements both lists, which meet the condition
`RakeSnapshots.RakeCode`
equal to
`ProductMovements.ProductCode`

Following are my current options:

// Code 1:

`````` var resultCount =  ProductMovements.Where(x => RakeSnapshots
.Where(r => r.StatusCode == "Dumping")
.Any(y => y.RakeCode == x.ProductCode  &&
x.Status == "InProgress"))
.Count();
``````

// Code 2:

``````var productMovementsInprogress = ProductMovements.Where(x => x.Status == "InProgress");

var rakeSnapShotsDumping = RakeSnapshots.Where(r => r.StatusCode == "Dumping");

var resultCount = productMovementsInprogress.Zip(rakeSnapShotsDumping,(x,y) => (y.RakeCode == x.ProductCode) ?  true : false)
.Where(x => x).Count();
``````

Challenge is both the codes are
`O(n^2)`
complexity, is there a way to improve it, this will hurt if the data is very large

You can use an `inner join` to do this:

``````var dumpingRakeSnapshots       = rakeSnapshots.Where(r => r.StatusCode == "Dumping");
var inProgressProductMovements = productMovements.Where(p => p.Status == "InProgress");

var matches =
from r in dumpingRakeSnapshots
join p in inProgressProductMovements on r.RakeCode equals p.ProductCode
select r;

int count = matches.Count(); // Here's the answer.
``````

Note that (as Ivan Stoev points out) this only works if RakeCode is the primary key of `RakeSnapshots`.

If it is not, you will have to use a `grouped join`.

Here's the Linq query syntax version that you should use in that case, but note that this is exactly the same as Ivan's answer (only in Linq query form):

``````var matches =
from r in dumpingRakeSnapshots
join p in inProgressProductMovements on r.RakeCode equals p.ProductCode into gj
select gj;
``````

For completeness, here's a compilable console app that demonstrates the different results you'll get if `RakeCode` and `ProductCode` are not primary keys:

``````using System;
using System.Collections.Generic;
using System.Linq;

namespace ConsoleApp1
{
class RakeSnapshot
{
public string StatusCode;
public string RakeCode;
}

class ProductMovement
{
public string Status;
public string ProductCode;
}

sealed class Program
{
void run()
{
var rakeSnapshots = new List<RakeSnapshot>
{
new RakeSnapshot {StatusCode = "Dumping", RakeCode = "1"},
new RakeSnapshot {StatusCode = "Dumping", RakeCode = "1"},
new RakeSnapshot {StatusCode = "Dumping", RakeCode = "2"}
};

var productMovements = new List<ProductMovement>
{
new ProductMovement {Status = "InProgress", ProductCode = "1"},
new ProductMovement {Status = "InProgress", ProductCode = "2"},
new ProductMovement {Status = "InProgress", ProductCode = "2"}
};

var dumpingRakeSnapshots       = rakeSnapshots.Where(r => r.StatusCode == "Dumping");
var inProgressProductMovements = productMovements.Where(p => p.Status == "InProgress");

// Inner join.

var matches1 =
from r in dumpingRakeSnapshots
join p in inProgressProductMovements on r.RakeCode equals p.ProductCode
select r;

Console.WriteLine(matches1.Count());

// Grouped join.

var matches2 =
from r in dumpingRakeSnapshots
join p in inProgressProductMovements on r.RakeCode equals p.ProductCode into gj
select gj;

Console.WriteLine(matches2.Count());

// OP's code.

var resultCount =
productMovements
.Count(x => rakeSnapshots
.Where(r => r.StatusCode == "Dumping")
.Any(y => y.RakeCode == x.ProductCode && x.Status == "InProgress"));

Console.WriteLine(resultCount);
}

static void Main(string[] args)
{
new Program().run();
}
}
}
``````
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download