Daniele Daniele - 3 months ago 18
C# Question

Linq query and Foreach on large number of records from SQL database

I'm working with Entity Framework and Linq. I need to do a query on 2 properties of my object.

I have this object on the database (about 200000 records):

public class DeviceState
{
public int ID { get; set; }
public DateTime TimeStamp { get; set; }
public string StatusCode { get; set; }
public int Device_ID { get; set; }
}


I need to do a query like this:

List<DeviceState> listState = systemDB.DeviceStates.Where(s => s.Device_ID == DeviceID).Where(l => l.TimeStamp > startDate).Where(l => l.TimeStamp < endDate).Where(s => s.StatusCode == "xx").ToList();
foreach (DeviceState status in listState)
{
// here I need to save in an object the status code and the time stamp:
object.StatusCode= status.StatusCode;
object.TimeStamp = status.TimeStamp;
}


This query takes much time (15 minutes approx). I think that is due to the creation of the list. So I tried this:

foreach (DeviceState status in systemDB.DeviceStates.Where(s => s.Device_ID == DeviceID).Where(l => l.TimeStamp > startDate).Where(l => l.TimeStamp < endDate).Where(s => s.StatusCode == "xx"))
{
// here I need to save in an object the status code and the time stamp:
object.StatusCode= status.StatusCode;
object.TimeStamp = status.TimeStamp;
}


This was much faster in creating the list. But I still have performance problems due to the foreach cycle. It takes 5ms for each element.

I need to find a solution that takes seconds to execute that.

Answer

You can do these things to help the generated query.

  1. Return only what you need. Now you are returning everything BUT you only ever use TimeStamp so just return that. You are setting StatusCode below however you are already filtering this in the Where clause so you know that all returned items have a StatusCode of "xx" so no need to retrieve that too. This is less data coming back across the network and less cycles taken to map that data to objects and less memory that the data takes up.
  2. You should look at the generated query by EF. You can do this using a sql profiler tool (Sql Server has one called Sql Profiler). Then look at the query plan and see if there is anything that could benifit you like adding a missing index. This profiling should be done on the database server, not in c#.
  3. Consolidate your where clauses because it's easier to read.

Code

// list of timestamps from database
var timeStamps = systemDB.DeviceStates.Where(s => s.Device_ID == DeviceID && 
                                                  s.TimeStamp > startDate && 
                                                  s.TimeStamp < endDate &&
                                                  s.StatusCode == "xx")
                                      .Select(x => x.TimeStamp).ToList();

If you still wanted your status code because you removed the filter you could do it like this

var timeStamps = systemDB.DeviceStates.Where(s => s.Device_ID == DeviceID &&
                                                  s.TimeStamp > startDate && 
                                                  s.TimeStamp < endDate && 
                                                  s.StatusCode == "xx")
                                      .Select(x => new {x.TimeStamp, x.StatusCode})
                                      .ToList();