Rob Rob - 9 months ago 38
MySQL Question

What is the best practice for querying related tables using LINQ given the following example?

Let's say I am generating a View Model from several related tables. What are the advantages/differences/disadvantages of querying either in this way:

var enrollment = db.enrollment
.Include(d => d.cohort.OldCourses.OldCourseSections.Select(f => f.OldCoursePages))
.Include(d => d.OldProgress)
.FirstOrDefault(b => b.UserID == currentuser && b.cohort.OldCourses.Titleabbrev == courseabbrev);
var viewModel = new OldSectionViewModel();

viewModel.OldCourseTitle = enrollment.cohort.OldCourses.OldCourseTitle;
viewModel.OldCourseSec_title = enrollment.cohort.OldCourses.OldCourseSections.FirstOrDefault(f => f.OldCourseSec_abbrev == secabbrev).OldCourseSec_title;
viewModel.Meta = enrollment.cohort.OldCourses.Meta;
viewModel.Titleabbrev = enrollment.cohort.OldCourses.Titleabbrev;
viewModel.OldCourseSec_abbrev = enrollment.cohort.OldCourses.OldCourseSections.FirstOrDefault(f => f.OldCourseSec_abbrev == secabbrev).OldCourseSec_abbrev;
viewModel.progress = currentprogress;
viewModel.pages = enrollment.cohort.OldCourses.OldCourseSections.FirstOrDefault(f => f.OldCourseSec_abbrev == secabbrev).OldCourse_Page_Total;
viewModel.EnrollmentID = enrollment.EnrollmentID;
viewModel.OldCourseSectionID = enrollment.cohort.OldCourses.OldCourseSections.FirstOrDefault(f => f.OldCourseSec_abbrev == secabbrev).OldCourseSectionID;

viewModel.OldCoursePage_title = enrollment.cohort.OldCourses.OldCourseSections.FirstOrDefault(f => f.OldCourseSec_abbrev == secabbrev).OldCoursePages.FirstOrDefault(g => g.OldCoursePage_Order ==

viewModel.OldCoursePage_HTML = enrollment.cohort.OldCourses.OldCourseSections.FirstOrDefault(f => f.OldCourseSec_abbrev == secabbrev).OldCoursePages.FirstOrDefault(g => g.OldCoursePage_Order ==

viewModel.OldCoursePage_Order = enrollment.cohort.OldCourses.OldCourseSections.FirstOrDefault(f => f.OldCourseSec_abbrev == secabbrev).OldCoursePages.FirstOrDefault(g => g.OldCoursePage_Order ==

Or alternatively this way:

var viewModel = db.enrollment
.Where(b => b.UserID == currentuser && b.cohort.OldCourses.Titleabbrev == courseabbrev)

.Select(x => new OldSectionViewModel
OldCourseTitle = x.cohort.OldCourses.OldCourseTitle,
OldCourseSec_title = x.cohort.OldCourses.OldCourseSections.FirstOrDefault(f => f.OldCourseSec_abbrev == secabbrev).OldCourseSec_title,
Meta = x.cohort.OldCourses.Meta,
Titleabbrev = x.cohort.OldCourses.Titleabbrev,
OldCourseSec_abbrev = x.cohort.OldCourses.OldCourseSections.FirstOrDefault(f => f.OldCourseSec_abbrev == secabbrev).OldCourseSec_abbrev,
progress = currentprogress,
pages = x.cohort.OldCourses.OldCourseSections.FirstOrDefault(f => f.OldCourseSec_abbrev == secabbrev).OldCourse_Page_Total,
EnrollmentID = x.EnrollmentID,
OldCourseSectionID = x.cohort.OldCourses.OldCourseSections.FirstOrDefault(f => f.OldCourseSec_abbrev == secabbrev).OldCourseSectionID,

OldCoursePage_title = x.cohort.OldCourses.OldCourseSections.FirstOrDefault(f => f.OldCourseSec_abbrev == secabbrev).OldCoursePages.FirstOrDefault(g => g.OldCoursePage_Order ==

OldCoursePage_HTML = x.cohort.OldCourses.OldCourseSections.FirstOrDefault(f => f.OldCourseSec_abbrev == secabbrev).OldCoursePages.FirstOrDefault(g => g.OldCoursePage_Order ==

OldCoursePage_Order = x.cohort.OldCourses.OldCourseSections.FirstOrDefault(f => f.OldCourseSec_abbrev == secabbrev).OldCoursePages.FirstOrDefault(g => g.OldCoursePage_Order ==


Assuming 'currentprogress' is an integer set elsewhere. It appears to me that the second query would be slightly more effective as I am loading on the specific fields I require, whereas with the first query I am loading the entire tables. However my understanding of these things is extremely basic and I would be interested to hear exactly which of these is more efficient and why from someone who genuinely knows what they are talking about?

Do both of these queries qualify as eager loading?

Answer Source

Simply, in the first method, you're including all the related tables in a single query. In the second method, you're generating N+1 queries.

While generally speaking you should avoid N+1 query situations, it also depends a lot on what's happening with the data. Querying this many related tables all at once is going to bring down a ton of data, and the nature of doing these joins means that you're going to be bring down all related items. If you actually only need like the first related item from each, for example, then by doing it one big query, you're returning a lot of data that you will never utilize.

Doing N+1 queries could mean sending a lot of queries to your server, but the definition of "a lot" is variable. If it ends up being like 10-20 relatively small queries and you've got a good DB server instance with plenty of resources, it will likely not matter much, and you're better off, then, selecting just the data you actually need.

Additionally, you can possibly employ memory caching so that you may only need to run these queries just once an hour or something at the maximum. You just need to study what's happening with either option and determine which is ultimately the most efficient / makes the most sense.