chapskev chapskev - 5 months ago 16
MySQL Question

SQL query time difference when adding an inner join and group by gives wrong calculations

Am trying to come up with an sql query to calculate, the time difference from a start time and an end time, then sum up the duration for each school.

I have the following table :


  1. -Class /*Contains all the classes tied to a school using school_id */

  2. -School /*contains all the schools in tied to a hub using hub_id */

  3. -Hub /*contains all the hubs details (one hub can have many schools in it) */

  4. -student /*contains student profile, which is tied to a class using class_id */

  5. -session_details /contains the session details which include the session start time, end time, session_id,/

  6. -student_attendance_register /* used to mark the register for all the students in a class for a particular session using session_id
    and student_id as in_attendance=0 or in_attendance =1 /*



I have a mini query which works perfect, the query calculates the duration of the session and finds the total duration for all the classes in a school.

select
c.class_name,c.school_id,d.school_name,e.session_date,e.time_finish,e.time_start,
CAST(sum(TIMEDIFF(e.time_finish, e.time_start)) as TIME) as duration
From session_details as e
Inner join class as c
on c.id = e.class_id
Inner join school as d
on c.school_id = d.id
group by c.school_id;


My problem pops up when i add inner joins to get the student profile and their attendance. See below my query

select
a.session_details_id,c.class_name,c.school_id,d.school_name,
e.class_id,e.session_date,e.time_finish,e.time_start,
SUM((TIME_TO_SEC(e.time_finish) - TIME_TO_SEC(e.time_start))/60) as session_duration,
COUNT(CASE WHEN a.in_attendance = 1 and b.gender = 1 then 1 END) as boys_present,
COUNT(CASE WHEN a.in_attendance = 1 and b.gender = 2 then 1 END) as girls_present,
COUNT(CASE WHEN a.in_attendance = 0 and b.gender = 1 then 1 END) as boys_absent,
COUNT(CASE WHEN a.in_attendance = 0 and b.gender = 2 then 1 END) as girls_absent,
COUNT(CASE WHEN b.gender = 2 then 1 END) as girls_total,
COUNT(CASE WHEN b.gender = 1 then 1 END) as boys_total
from session_details as e
inner join class as c
on c.id = e.class_id
inner join school as d
on c.school_id = d.id
inner join attendance_student as a
on e.id = a.session_details_id
inner join student as b
on a.student_id = b.id
group by c.school_id;


Image 1 shows the results set without with group for all session that were done for each class in a school. This returns correct time duration.

Image 1 shows the results set without with group for all session that were done for each class in a school. This returns correct time duration.

Image 2 shows the result set with group for all session that we done for each class in a school.
Image 2 shows the result set with group for all session that we done for each class in a school.

When running query 2, whereby which is an extension of query one for adding the student profile and the total no of students who were in attendance.

When running query 2, whereby which is an extension of query one for adding the student profile and the total no of students who were in attendance.

When running query 2,without grouping by shool_id i get the following results
When running query 2,without grouping by shool_id i get the following results

Note
-I have tried to change the time to seconds then do the division then convert to minutes which return the wrong total.

sum((TIME_TO_SEC(e.time_finish) - TIME_TO_SEC(e.time_start)))/60 as duration


-I have also tried to do a sum for the time finish in second, subtract the finish and start time then divide the total

-Am also in doubt of how am doing my inner joins, and maybe the problem lies in there, since once i introduce the inner joins to students i get the errors. Other totals are working ok except for my finish and start time which are time data type on the database.


Here is a link to the related tables.

Kindly assit on how best to go about doing a correct calculation for the duration once i inner join and group the tables. :)

http://sqlfiddle.com/#!9/36bb8/2

Answer

If I understood all the details, the problem is in the group by. You must group by every field with no aggregation function (like SUM or COUNT). Using your second SQL query you change the group by like this :

select 
    a.session_details_id,c.class_name,c.school_id,d.school_name,
    e.class_id,e.session_date,e.time_finish,e.time_start,
    SUM((TIME_TO_SEC(e.time_finish) - TIME_TO_SEC(e.time_start))/60) as session_duration,
    COUNT(CASE WHEN a.in_attendance = 1 and b.gender = 1 then 1 END) as boys_present,
    COUNT(CASE WHEN a.in_attendance = 1 and b.gender = 2 then 1 END) as girls_present,
    COUNT(CASE WHEN a.in_attendance = 0 and b.gender = 1 then 1 END) as boys_absent,
    COUNT(CASE WHEN a.in_attendance = 0 and b.gender = 2 then 1 END) as girls_absent,
    COUNT(CASE WHEN  b.gender = 2 then 1 END) as girls_total,
    COUNT(CASE WHEN  b.gender = 1 then 1 END) as boys_total
from session_details as e
    inner join class as c
      on c.id = e.class_id
    inner join school as d 
      on c.school_id = d.id
    inner join attendance_student as a
      on e.id = a.session_details_id
    inner join student as b
      on a.student_id = b.id
group by a.session_details_id,c.class_name,c.school_id,d.school_name,
    e.class_id,e.session_date,e.time_finish,e.time_start 

But, then, the SUM in the session_duration calculation, add the total duration for each student, so you need to make calculation of session_duration without aggregation function:

select 
    a.session_details_id,c.class_name,c.school_id,d.school_name,
    e.class_id,e.session_date,e.time_finish,e.time_start,
    (TIME_TO_SEC(e.time_finish) - TIME_TO_SEC(e.time_start))/60 as     session_duration,
    COUNT(CASE WHEN a.in_attendance = 1 and b.gender = 1 then 1 END) as     boys_present,
    COUNT(CASE WHEN a.in_attendance = 1 and b.gender = 2 then 1 END) as girls_present,
    COUNT(CASE WHEN a.in_attendance = 0 and b.gender = 1 then 1 END) as boys_absent,
    COUNT(CASE WHEN a.in_attendance = 0 and b.gender = 2 then 1 END) as girls_absent,
    COUNT(CASE WHEN  b.gender = 2 then 1 END) as girls_total,
    COUNT(CASE WHEN  b.gender = 1 then 1 END) as boys_total
from session_details as e
    inner join class as c
      on c.id = e.class_id
    inner join school as d 
      on c.school_id = d.id
    inner join attendance_student as a
      on e.id = a.session_details_id
    inner join student as b
      on a.student_id = b.id
group by a.session_details_id,c.class_name,c.school_id,d.school_name,
    e.class_id,e.session_date,e.time_finish,e.time_start

If you run this query on SQLFiddle you provide, your results are:

Results in SQLFiddle

-- Edited --

After your comments I'll try again ;)

If you strip off class_id, session_date, time_start, time_finish to only get totals by school, this query will do the work:

select 
  c.school_id, sc.school_name,
  sum((TIME_TO_SEC(sd.time_finish) - TIME_TO_SEC(sd.time_start))/60) as session_duration,
  sum(stc.boys_present) boys_present, sum(stc.girls_present) girls_present,
  sum(stc.boys_absent) boys_absent, sum(stc.girls_absent) girls_absent,
  sum(stc.boys_total) boys_total, sum(stc.girls_total) girls_total
  from session_details sd
       inner join (
          select a.session_details_id,
              COUNT(CASE WHEN a.in_attendance = 1 and s.gender = 1 then 1 END) as boys_present,
              COUNT(CASE WHEN a.in_attendance = 1 and s.gender = 2 then 1 END) as girls_present,
              COUNT(CASE WHEN a.in_attendance = 0 and s.gender = 1 then 1 END) as boys_absent,
              COUNT(CASE WHEN a.in_attendance = 0 and s.gender = 2 then 1 END) as girls_absent,
              COUNT(CASE WHEN s.gender = 2 then 1 END) as girls_total,
              COUNT(CASE WHEN s.gender = 1 then 1 END) as boys_total
            from attendance_student a
                 inner join student s
                       on a.student_id = s.id
           group by a.session_details_id
         ) stc
        on sd.id = stc.session_details_id
      inner join class c
        on sd.class_id = c.id
      inner join school sc
        on c.school_id = sc.id
 group by c.school_id, sc.school_name

Here I use a different approach making a subquery (maybe you prefer save as a view) to calculate student assistance totals.

First I count the students for each session using only two tables, attendance_student and student. Results are grouped by th ID os the session, so I have students by session. This query can be saved as a view for readability.

select a.session_details_id,
    COUNT(CASE WHEN a.in_attendance = 1 and s.gender = 1 then 1 END) as boys_present,
    COUNT(CASE WHEN a.in_attendance = 1 and s.gender = 2 then 1 END) as girls_present,
    COUNT(CASE WHEN a.in_attendance = 0 and s.gender = 1 then 1 END) as boys_absent,
    COUNT(CASE WHEN a.in_attendance = 0 and s.gender = 2 then 1 END) as girls_absent,
    COUNT(CASE WHEN s.gender = 2 then 1 END) as girls_total,
    COUNT(CASE WHEN s.gender = 1 then 1 END) as boys_total
  from attendance_student a
       inner join student s
             on a.student_id = s.id
 group by a.session_details_id

) stc

After that I join this results with session_details to calculate session_duration for each session. Our resultset now has a session_details_id, session_duration and the count of students (one row for each session).

select 
  sd.id as session_id,
  (TIME_TO_SEC(sd.time_finish) - TIME_TO_SEC(sd.time_start))/60 as session_duration,
  sum(stc.boys_present) boys_present, sum(stc.girls_present) girls_present,
  sum(stc.boys_absent) boys_absent, sum(stc.girls_absent) girls_absent,
  sum(stc.boys_total) boys_total, sum(stc.girls_total) girls_total
  from session_details sd
       inner join (
          select a.session_details_id,
              COUNT(CASE WHEN a.in_attendance = 1 and s.gender = 1 then 1 END) as boys_present,
              COUNT(CASE WHEN a.in_attendance = 1 and s.gender = 2 then 1 END) as girls_present,
              COUNT(CASE WHEN a.in_attendance = 0 and s.gender = 1 then 1 END) as boys_absent,
              COUNT(CASE WHEN a.in_attendance = 0 and s.gender = 2 then 1 END) as girls_absent,
              COUNT(CASE WHEN s.gender = 2 then 1 END) as girls_total,
              COUNT(CASE WHEN s.gender = 1 then 1 END) as boys_total
            from attendance_student a
                 inner join student s
                       on a.student_id = s.id
           group by a.session_details_id
         ) stc
        on sd.id = stc.session_details_id

We want to sum results by school so we need add to more tables, class and school. We don't really need class table to show any data but an intermediate table to get school data.

Comments