Patrick O'Hara Patrick O'Hara - 2 months ago 4x
SQL Question

Is there a way to filter the results of a query by comparing the value of an aggregate function to a column that is not included in the GROUP BY?

I am assuming the answer is no, but always interested in learning something new. Here is the setup; I have three tables, container_location, container_location_size and container. Yes there is a location table also, but I do not believe that I need it here. The container_location is a join table that lists the containers that are at a given location. The container table describes the characteristics for a given container. The container_location_size describes how containers can be put into a location. I can get the number of containers for a given location with this SQL:

SELECT location_code,
COUNT(*) AS container_count
FROM container_location
GROUP BY location_code

The question I am trying to answer is which locations are full (i.e. have a number of containers equal to, or greater than, the max for that location. The container_location_size table has a column max_containers that specifies how many containers can be at that location. I have come up with this SQL to answer that question:

SELECT container_location_size.location_code
FROM container_location_size
JOIN (SELECT location_code,
COUNT(*) AS container_count
FROM container_location
GROUP BY location_code) AS contcount
ON contcount.location_code = container_location_size.location_code
WHERE contcount.container_count >= container_location_size.max_containers

This works but I am wondering if this can be done as a single select. The issue I ran into is that
requires that any column being compared be in the
. I obviously do not want to group by the max number of containers. Like I said at the beginning I expect this is not a solvable problems, but I am interested in either being proven wrong, or alternate approaches.


You can write this as:

SELECT l.location_code, COUNT(*) AS container_count
FROM container_location cl JOIN
     location l
     ON l.location_code = cl.location_code
GROUP BY l.location_code
HAVING COUNT(*) >= MAX(l.max_containers);


  • This assumes that location(location_code) is unique. That seems reasonable.
  • The HAVING clause uses MAX(l.max_containers). The column needs to be in an aggregate function.
  • An alternative would be to include l.max_containers in the GROUP BY.
  • This may not be much faster than your version.