Th120 Th120 - 19 days ago 9
MySQL Question

mysql filter duplicates where other column matches

I am currently stuck with a homework exercise...

We are using the sakila db (https://dev.mysql.com/doc/sakila/en/)

The exercise is to find and select every film in which more than one actor has the same first name and select these actors

e.g.

title vn nn
AFRICAN EGG MATTHEW TAUTOU
AFRICAN EGG MATTHEW LEIGH
ANALYZE HOOSIERS TOM MCKELLEN
ANALYZE HOOSIERS TOM MIRANDA


I currently using 3 tables for that (tables -> https://dev.mysql.com/doc/sakila/en/sakila-structure-tables.html):


  • film_actor

  • actor

  • film



I currently select every actor who is in a film which has more than 1 actor with the same first name with this:

SELECT f.title title, a.actor_id, a.first_name vn, a.last_name nn FROM film_actor fa
LEFT JOIN actor a ON a.actor_id = fa.actor_id
LEFT JOIN film f ON f.film_id = fa.film_id
WHERE fa.film_id IN
(
SELECT fa.film_id FROM film_actor fa
LEFT JOIN actor a ON a.actor_id = fa.actor_id
LEFT JOIN film f ON f.film_id = fa.film_id
GROUP BY a.first_name, f.title HAVING count(a.first_name) > 1
)
ORDER BY title;


I get this as a result:

title actor_id vn nn
DESPERATE TRAINSPOTTING 117 RENEE TRACY
DESPERATE TRAINSPOTTING 137 MORGAN WILLIAMS
DESPERATE TRAINSPOTTING 187 RENEE BALL
DESPERATE TRAINSPOTTING 77 CARY MCCONAUGHEY


I have no idea how to get only the names of the duplicate actors...
Any suggestions?

Answer

Use COUNT(DISTINCT firstname):

SELECT f.title title, a.actor_id, a.first_name vn, a.last_name nn
FROM film_actor fa
LEFT JOIN actor a
    ON a.actor_id = fa.actor_id
LEFT JOIN film f
    ON f.film_id = fa.film_id
INNER JOIN
(
    SELECT a.first_name, fa.film_id
    FROM film_actor fa
    LEFT JOIN actor a
        ON a.actor_id = fa.actor_id
    WHERE fa.film_id IN
    (
        SELECT fa.film_id
        FROM film_actor fa
        LEFT JOIN actor a
            ON a.actor_id = fa.actor_id
        LEFT JOIN film f
            ON f.film_id = fa.film_id
        GROUP BY fa.film_id
        HAVING COUNT(DISTINCT a.first_name) > 1
    )
    GROUP BY a.first_name, fa.film_id
    HAVING COUNT(*) > 1
) t
    ON a.first_name = t.first_name AND
       fa.film_id   = t.film_id

In your original attempt, you were checking the count of all first names. This would of course then be true for any movie having more than one actor. Instead, you want to check the count of distinct actors.

Comments