Arnab Arnab - 1 month ago 15
C# Question

Adding rows from one IEnumerable to another based on conditions

I have two arrays:

var data1 = new[] {
new { Product = "Product 1", Year = 2009, Sales = 1212 },
new { Product = "Product 2", Year = 2009, Sales = 522 },
new { Product = "Product 1", Year = 2010, Sales = 1337 },
new { Product = "Product 2", Year = 2011, Sales = 711 },
new { Product = "Product 2", Year = 2012, Sales = 2245 },
new { Product = "Product 3", Year = 2012, Sales = 1000 }
};

var data2 = new[] {
new { Product = "Product 1", Year = 2009, Sales = 1212 },
new { Product = "Product 1", Year = 2010, Sales = 1337 },
new { Product = "Product 2", Year = 2011, Sales = 711 },
new { Product = "Product 2", Year = 2012, Sales = 2245 }
};


What I want to do is check each distinct Product and Year in data2, and if a row exists for any combination of such Product and Year in data1 but not in data2, then add that row to data2.

Example: in data2, the distinct products are Product 1 and Product 2, and the distinct years are 2009, 2010, 2011 and 2012.

In data1 there exists a row { Product = "Product 2", Year = 2009, Sales = 522 }, which is not present in data2, so I wish to add it to data2.

What I can do is get distinct products and years in two variables.

Then I can run a foreach loop over both and check whether each combination exists in data1 but not in data2, and if so add it to data2.

What I would like is a single LINQ query that does this job for me, rather than computing the two distinct sets separately and then running a couple of foreach loops.

Thanks

Answer

You can get this to work in a single query. However, it is going to be sub-optimal, because for each item in data1 you would need to check three conditions, each of which may require going through the entire data2, for an O(m*n) time complexity with m rows in data1 and n rows in data2 (the extra space remains O(1), though).
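For illustration, here is a minimal sketch of what that single query might look like. It assumes the rows are value tuples rather than the anonymous types from the question, purely so the query can sit in a method with a signature; the class and method names are hypothetical.

```csharp
using System;
using System.Linq;

public static class SingleQuerySketch
{
    // Hypothetical helper: returns data2 plus every row of data1 whose
    // Product and Year both occur somewhere in data2, but not together.
    public static (string Product, int Year, int Sales)[] Merge(
        (string Product, int Year, int Sales)[] data1,
        (string Product, int Year, int Sales)[] data2)
    {
        return data2.Concat(
            data1.Where(d =>
                data2.Any(x => x.Product == d.Product)      // product is known
             && data2.Any(x => x.Year == d.Year)            // year is known
             && !data2.Any(x => x.Product == d.Product
                             && x.Year == d.Year))          // pair is not known
        ).ToArray();
    }
}
```

Each Any call scans data2 from the start, which is where the O(m*n) cost comes from.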

You can avoid the repeated loops over data2, though, by building hash sets up front:

var uniqueProd = new HashSet<string>(data2.Select(d=>d.Product));
var uniqueYear = new HashSet<int>(data2.Select(d=>d.Year));
var knownPairs = new HashSet<Tuple<string,int>>(
    data2.Select(d=>Tuple.Create(d.Product, d.Year))
);
var newData2 = data2.Concat(
    data1.Where(d =>
        uniqueProd.Contains(d.Product)                       // The product is there
    &&  uniqueYear.Contains(d.Year)                          // The year is there
    && !knownPairs.Contains(Tuple.Create(d.Product, d.Year)) // Combination is not there
    )
).ToArray();

This solution is O(m+n) in time and also O(n) in extra space, for the three hash sets built from data2.
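To try this against the sample data from the question, the same idea can be wrapped in a method. This is only a sketch: it swaps the anonymous types for value tuples (and Tuple.Create for tuple literals, which needs C# 7 or later), and the class and method names are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HashSetMergeSketch
{
    // Hypothetical helper: same filter as above, one pass over each array.
    public static (string Product, int Year, int Sales)[] Merge(
        (string Product, int Year, int Sales)[] data1,
        (string Product, int Year, int Sales)[] data2)
    {
        // Value tuples compare by value, so they work as hash set keys.
        var uniqueProd = new HashSet<string>(data2.Select(d => d.Product));
        var uniqueYear = new HashSet<int>(data2.Select(d => d.Year));
        var knownPairs = new HashSet<(string, int)>(
            data2.Select(d => (d.Product, d.Year)));

        return data2.Concat(
            data1.Where(d =>
                uniqueProd.Contains(d.Product)               // product is known
             && uniqueYear.Contains(d.Year)                  // year is known
             && !knownPairs.Contains((d.Product, d.Year)))   // pair is not known
        ).ToArray();
    }
}
```

With data1 and data2 from the question, the result should be the four rows of data2 followed by the ("Product 2", 2009, 522) row. The "Product 3" row is left out because that product never appears in data2.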
