I have a data-processing MVC application that works with uploaded files ranging from 100MB to 2GB and contains a couple of long-running operations. Users upload the files, the data in those files is processed, and finally some analysis of the data is sent to the related users/clients.
It will take at least a couple of hours to process the data, so to make sure the user doesn't have to wait the whole time, I've spun up a separate task to do this long-running operation. That way, once the files are received by the server and stored on disk, the user gets a response back with a ReferenceID and can close the browser.
So far, it's been working as intended, but after reading up on the problems with the fire-and-forget pattern in MVC and worker threads being thrown away during IIS recycling, I have concerns about this approach.

Is this approach still safe? If not, how can I ensure (in a relatively simple way) that the thread processing the data doesn't die until it finishes and sends the results to the clients?
The app runs on .NET 4.5, so I don't think I will be able to use HostingEnvironment.QueueBackgroundWorkItem at the moment.
public ActionResult ProcessFiles()
{
    HttpFileCollectionBase uploadedFiles = Request.Files;
    var isValid = ValidateService.ValidateFiles(uploadedFiles);
    var referenceId = DataProcessor.ProcessFiles(uploadedFiles);
    // ... return referenceId to the user ...
}

public class DataProcessor
{
    public static int ProcessFiles(HttpFileCollectionBase uploadedFiles)
    {
        var referenceId = GetUniqueReferenceIdForCurrentSession();
        var location = SaveIncomingFilesToDisk(referenceId, uploadedFiles);

        // ProcessData makes a DB call and takes a few hours to complete.
        Task.Factory.StartNew(() => ProcessData(referenceId, location));
        Log.Info("Completed Processing. Carrying on with other work");
        return referenceId;
    }

    // Below method takes about 30 mins to an hour
    // ...
}
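To make the concern concrete, here is a minimal sketch (with a hypothetical exception) of why this pattern worries me: nothing awaits or observes the task returned by Task.Factory.StartNew, so a failure inside it disappears silently.

```csharp
// Sketch of the failure mode: the returned Task is discarded, so a
// fault inside it is never observed by any caller.
Task.Factory.StartNew(() =>
{
    throw new InvalidOperationException("processing failed");
});
// The controller action returns normally and nothing is logged; likewise,
// an IIS app-pool recycle can abort the task mid-run with no notification.
```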
Is this approach still safe?
It was never safe.
Does using Async/Await at controller help?

No. async/await releases the request thread back to the thread pool, but the work still runs inside the ASP.NET process, so an app pool recycle can still tear it down.
The app runs on .NET 4.5, so don't think I will be able to use HostingEnvironment.QueueBackgroundWorkItem at the moment.
I have an AspNetBackgroundTasks library that essentially does the same thing as QueueBackgroundWorkItem (with minor differences). However...
I've also thought of using a message queue on the app server: store a message once the files are written to disk, make the DataProcessor a separate service/process, and have it listen to the queue. If the queue is durable, it assures me that the messages will always get processed eventually, even if the server crashes or the thread gets thrown away before it finishes processing the data. Is this a better approach?
Yes. This is the only reliable approach. It's what I call the "proper distributed architecture" in my blog post.
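As a sketch of that hand-off using MSMQ (System.Messaging, available on .NET 4.5) — the queue path, message format, and ProcessData call are illustrative assumptions, not a prescription:

```csharp
using System.Messaging;

const string QueuePath = @".\private$\dataprocessing";

// Web app side: enqueue a durable message after saving the files to disk.
public static void EnqueueJob(int referenceId, string location)
{
    if (!MessageQueue.Exists(QueuePath))
        MessageQueue.Create(QueuePath, true);   // true = transactional queue

    using (var queue = new MessageQueue(QueuePath))
    using (var tx = new MessageQueueTransaction())
    {
        tx.Begin();
        queue.Send(new Message
        {
            Body = referenceId + "|" + location,
            Recoverable = true   // written to disk, survives restarts
        }, tx);
        tx.Commit();
    }
}

// Worker side, in a separate Windows Service: receive and process.
public static void Listen()
{
    using (var queue = new MessageQueue(QueuePath))
    {
        queue.Formatter = new XmlMessageFormatter(new[] { typeof(string) });
        while (true)
        {
            using (var tx = new MessageQueueTransaction())
            {
                tx.Begin();
                var message = queue.Receive(tx);   // blocks until a message arrives
                var parts = ((string)message.Body).Split('|');
                ProcessData(int.Parse(parts[0]), parts[1]);
                tx.Commit();   // message is removed only after processing succeeds
            }
        }
    }
}
```

Committing after processing means a crash mid-job puts the message back on the queue; for multi-hour jobs you'd likely commit the receive immediately and track job status in the database instead, since holding a transaction open that long is its own problem.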