Arnab - 1 month ago
C# Question

Top 5 of a column and top 5 of another column for each of first column

I have a lot of data in the format below:

var data1 = new[] {
new { Product = "Product 1", Year = 2009, Sales = 1212 },
new { Product = "Product 2", Year = 2009, Sales = 522 },
new { Product = "Product 1", Year = 2010, Sales = 1337 },
new { Product = "Product 2", Year = 2011, Sales = 711 },
new { Product = "Product 2", Year = 2012, Sales = 2245 },
new { Product = "Product 3", Year = 2012, Sales = 1000 }
};


If I wanted to get the top 20 rows by sales, I could do something like this:

data1.OrderByDescending(o=>o.Sales).Take(20);


But what I want is the top 5 products and, for each of those products, the top 5 years along with their sales.

So, the output would be something like below:

var outputdata = new[] {
new { Product = "Product 1", Year = 2012, Sales = 2245 },
new { Product = "Product 1", Year = 2010, Sales = 1337 },
new { Product = "Product 1", Year = 2009, Sales = 1212 },
new { Product = "Product 1", Year = 2011, Sales = 711 },
new { Product = "Product 1", Year = 2013, Sales = 522 },
new { Product = "Product 2", Year = 2012, Sales = 1000 }
};


This might be a similar question for SQL, but unfortunately I could not work out how to convert it to LINQ.

Answer

OK, if I understood correctly: first group by product, so that you can order the groups by each product's total sales. Then take only the number of products you want. Finally, use SelectMany to order and trim the rows inside each group and flatten the groups back into a single sequence:

var data = new[] {
          new { Product = "Product 1", Year = 2009, Sales = 1212 },
          new { Product = "Product 2", Year = 2009, Sales = 522 },
          new { Product = "Product 1", Year = 2010, Sales = 1337 },
          new { Product = "Product 2", Year = 2011, Sales = 711 },
          new { Product = "Product 2", Year = 2012, Sales = 2245 },
          new { Product = "Product 3", Year = 2012, Sales = 1000 }
      };
int numberOfProducts = 2;
int numberOfYearsForEachProduct = 3;

var result = data.GroupBy(x => x.Product)
    .OrderByDescending(x => x.Sum(y => y.Sales)) //Order products by their total sum of `Sales`
    .Take(numberOfProducts)
    .SelectMany(x => x.OrderByDescending(y => y.Year).Take(numberOfYearsForEachProduct)) // Take only the N top years for each product
    .ToList();

  1. I used smaller numbers in the Take calls so that the effect is visible on this small data set.
  2. From the question I understood that you want to order by year within each product. If, as your example output suggests, you want to order by sales instead, change the inner OrderByDescending to use y.Sales rather than y.Year.
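
For reference, here is a minimal sketch of the sales-ordered variant described in the second note, run against the same sample data. It assumes C# 9 or later (top-level statements); the variable name `bySales` and the printing loop are only there for illustration and are not part of the original answer.

```csharp
using System;
using System.Linq;

var data = new[] {
    new { Product = "Product 1", Year = 2009, Sales = 1212 },
    new { Product = "Product 2", Year = 2009, Sales = 522 },
    new { Product = "Product 1", Year = 2010, Sales = 1337 },
    new { Product = "Product 2", Year = 2011, Sales = 711 },
    new { Product = "Product 2", Year = 2012, Sales = 2245 },
    new { Product = "Product 3", Year = 2012, Sales = 1000 }
};
int numberOfProducts = 2;
int numberOfYearsForEachProduct = 3;

var bySales = data.GroupBy(x => x.Product)
    // Rank products by their total sales across all years.
    .OrderByDescending(g => g.Sum(y => y.Sales))
    .Take(numberOfProducts)
    // Within each kept product, keep the rows with the highest sales
    // (this is the only change from the year-ordered version).
    .SelectMany(g => g.OrderByDescending(y => y.Sales).Take(numberOfYearsForEachProduct))
    .ToList();

foreach (var row in bySales)
    Console.WriteLine($"{row.Product} {row.Year} {row.Sales}");

// Prints:
// Product 2 2012 2245
// Product 2 2011 711
// Product 2 2009 522
// Product 1 2010 1337
// Product 1 2009 1212
```

Product 2 comes first because its total (3478) is higher than Product 1's (2549), and Product 1 contributes only two rows because it has only two years in the sample. For the numbers in the question, set both limits to 5.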