NWT_Bob NWT_Bob - 25 days ago 14
MySQL Question

MySQL CrossTab and remove duplicates

I have a table in MySQL that has students entered for every year they are registered. I want to produce a query that will show nnn students born in yyyy for academic year ending aaaa. I got that query working pretty quickly but the part that is stumping me is that I have some duplicate entries in the database that I need to reject from my totals.

Here is my query so far. And, as I say, I know that it's allowing duplicate values in for some years.

SELECT COUNT(YEAR(birthdate)) as YearCount, YEAR(birthdate) as BirthYear, AcadYearEnding
FROM Enrollment
GROUP BY AcadYearEnding
ORDER BY AcadYearEnding, BirthYear


The results end up looking like (if I put in a WHERE clause to restrict to a specific student):

YearCount BirthYear AcadYearEnding
1 2007 2012
2 2007 2013
1 2007 2014
1 2007 2015


This is the query with the WHERE clause that generated the data above ... but I eventually will remove the WHERE clause to show all students but without duplicate entries for any of the students.

SELECT COUNT(YEAR(birthdate)) as YearCount, YEAR(birthdate) as BirthYear, AcadYearEnding
FROM Enrollment
WHERE StudentId=16183
GROUP BY AcadYearEnding
ORDER BY AcadYearEnding, BirthYear


Given my where clause is restricting to a single student, I know the 2013 entry is the result of an extra row in the database (and doing clean-up on the data isn't an option, we need to restrict it in the query).

I tried doing a sub-select to give me a list of unique student IDs ... but in 2013, it found that student ID twice so it still counted it twice.

Here is the raw data that is generating the incorrect results because there are 2 records for 2013:

RecId StudentId Birthdate AcadYearEnding
39885 16183 11/29/2005 2012
33768 16183 11/29/2005 2013
31020 16183 11/29/2005 2013
59508 16183 11/29/2005 2014
64054 16183 11/29/2005 2015


RecId is the unique identifier for the table.

It's possible that what I need is the following (but I would appreciate someone confirming):

SELECT COUNT(YEAR(birthdate)) as YearCount, YEAR(birthdate) as BirthYear, AcadYearEnding
FROM (SELECT DISTINCT(StudentId), birthdate, AcadYearEnding From Test) As Ed
GROUP BY AcadYearEnding
ORDER BY AcadYearEnding, BirthYear

Answer

So the proposed (possible) answer I had at the bottom of my question was close ... but not quite right. The actual answer turned out to be:

SELECT (COUNT(YEAR(Birthdate)) As YearCount, Year(Birthdate) As BirthYear, AcadYearEnding
From (SELECT DISTINCT(StudentId), AcadYearEnding, Birthdate From Enrollment) As Ed
GROUP BY BirthYear, AcadYearEnding
ORDER BY AcadYearEnding, BirthYear

I've done a count of Birthdate ... which could, and perhaps should have been StudentId but the count is the same either way.

A little more testing, but I appreciate the help getting it this close.

Comments