Guerrilla - 1 month ago
C# Question

Managing memory for long running IO bound process

I have a method that runs against each record and makes third-party API calls, so it is IO bound rather than CPU bound and runs slowly in the background. Because the data set is so large, it is causing memory issues. I cannot load the entire list in one go (it throws an exception), so I am paging through it in batches. This works, but each batch adds to the RAM usage, which I assume is because the entities are still being tracked by the context.

I think I could fix this by detaching each batch after it has finished processing. I tried this:

using (var db = new PlaceDBContext())
{
    int count = 0;

    while(count < total)
    {
        var toCheck = db.Companies
            .Include(x => x.CompaniesHouseRecords)
            .Where(x => x.CheckedCompaniesHouse == false)
            .OrderBy(x => x.ID)
            .Skip(count)
            .Take(1000)
            .ToList();

        foreach (var company in toCheck)
        {
            // do all the stuff that needs to be done
            // removed for brevity but it makes API calls
            // and creates/updates records

            company.CheckedCompaniesHouse = true;
            db.SaveChanges();
            count++;
        }
        // attempted to detach to free up ram but doesn't work
        db.Entry(toCheck).State = EntityState.Detached;
    }
}


It throws this exception after the batch is complete:


The entity type List`1 is not part of the model for the current context


I guess this is because I materialized the query into a list, and the context is tracking the records inside the list rather than the list itself.

What is the proper way to detach the records and nested records so the RAM doesn't fill up? Should I be approaching this in a different way?

edit:

I also tried detaching each company record as I loop over it, but the RAM usage still goes up:

foreach (var company in toCheck)
{
    // do all the stuff that needs to be done
    // removed for brevity but it makes API calls
    // and creates/updates records

    company.CheckedCompaniesHouse = true;
    db.SaveChanges();
    count++;
    foreach(var chr in company.CompaniesHouseRecords.ToList())
    {
        db.Entry(chr).State = EntityState.Detached;
    }
    db.Entry(company).State = EntityState.Detached;
}

Answer

Contexts are meant to be short-lived and cheap to create. If you're worried that the context is hogging memory because your while loop is long-running, you could try creating a new context for each batch:

int count = 0;

while(count < total)
{
    using (var db = new PlaceDBContext()) // create a new context each time
    {
        // No Skip(count) here: companies processed in earlier batches are
        // already marked as checked, so the Where filter excludes them.
        var toCheck = db.Companies
            .Include(x => x.CompaniesHouseRecords)
            .Where(x => x.CheckedCompaniesHouse == false)
            .OrderBy(x => x.ID)
            .Take(1000)
            .ToList();

        if (toCheck.Count == 0)
            break; // nothing left to check, avoid looping forever

        foreach (var company in toCheck)
        {
            // do all the stuff that needs to be done
            // removed for brevity but it makes API calls
            // and creates/updates records

            company.CheckedCompaniesHouse = true;
            db.SaveChanges();
            count++;
        }
    }
}
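
One thing worth checking in the question's loop, separate from the memory problem: it filters on CheckedCompaniesHouse == false and also pages with Skip(count), while the loop body sets that flag to true. Processed rows therefore drop out of the filtered set on their own, and skipping count on top of that appears to jump over rows that have not been processed yet. Here is a minimal sketch that reproduces the effect with plain LINQ to Objects. The Company class, PagingDemo and Process are hypothetical stand-ins made up for illustration (not the real entities or context), and it assumes every processed record ends up flagged.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for a row in the Companies table.
public class Company
{
    public int ID { get; set; }
    public bool CheckedCompaniesHouse { get; set; }
}

public static class PagingDemo
{
    // Simulates the batch loop and returns how many records were processed.
    // useSkip = true pages like the question (filter on the flag AND Skip(count));
    // useSkip = false always takes the first batchSize unchecked records.
    public static int Process(int total, int batchSize, bool useSkip)
    {
        List<Company> companies = Enumerable.Range(1, total)
            .Select(i => new Company { ID = i })
            .ToList();

        int count = 0;
        while (count < total)
        {
            List<Company> batch = companies
                .Where(x => !x.CheckedCompaniesHouse)
                .OrderBy(x => x.ID)
                .Skip(useSkip ? count : 0)
                .Take(batchSize)
                .ToList();

            // Guard added for the demo: with Skip, an empty batch can come back
            // before count reaches total, and the loop would never finish.
            if (batch.Count == 0)
                break;

            foreach (Company company in batch)
            {
                company.CheckedCompaniesHouse = true; // the "work" is just the flag
                count++;
            }
        }
        return count;
    }

    public static void Main()
    {
        Console.WriteLine(Process(3000, 1000, useSkip: true));  // prints 2000
        Console.WriteLine(Process(3000, 1000, useSkip: false)); // prints 3000
    }
}
```

With 3000 unchecked records and a batch size of 1000, the Skip version stops after 2000 records (IDs 1001 to 2000 are never reached), while the version without Skip gets through all 3000.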