ihoaxed ihoaxed - 18 days ago 7
MySQL Question

Can someone explain these three SQL queries to me?

I'm a database class practicing some SQL problems to sharpen my skills from my book. Here is the database schema I am using for reference to the problems (the solutions posted are correct, I just need some explanation to better understand them):

enter image description here

Q1: Retrieve the name of each employee who has a dependent with the same first name and is the same sex as the employee.

SELECT E.FNAME, E.LNAME
FROM EMPLOYEE AS E
WHERE E.SSN IN (SELECT D.ESSN, FROM DEPENDENT AS D, WHERE E.FNAME = D.DEPENDENT_NAME AND D.SEX = E.SEX);


I’m a bit confused about this query. Shouldn’t the first FROM clause include “DEPENDENT AS D”? Why are we nesting this? I don’t see a reason why. Just seems to make things more complicated.

Q2: Retrieve the name of the employees who don’t have any dependents.

SELECT E.FNAME, E.LNAME
FROM EMPLOYEE AS E
WHERE NOT EXISTS (SELECT *, FROM DEPENDENT AS D, WHERE D.ESSN = E.SSN);


Instead of using WHERE NOT EXISTS, can't we just use NULL?

Why use the “WHERE NOT EXISTS” clause? Can’t we just use NULL?

Q3: List the names of managers who have at least one dependent.

SELECT FNAME, LNAME
FROM EMPLOYEE
WHERE EXISTS(SELECT * FROM DEPARTMENT WHERE SSN = MGR_SSN)
WHERE EXISTS (SELECT * FROM DEPENDENT WHERE ESSN = SSN);


I honestly just don’t understand the rationale behind this query. I get it somewhat, but am having a hard time even explaining it to myself. Shouldn’t the FROM clause include DEPARTMENT and DEPENDENT? Why does it include just EMPLOYEE? If someone can walk me through this query step by step and explain me why we use those WHERE EXISTS clauses the way it’s being used, that would be really helpful.



Answer

Someone needs to teach that class about appropriate comma use.


Q1: No, you wouldn't want the first FROM to be on DEPENDENT, you are trying to select employee information, not dependents'. The subquery is used to find the essn (parent ssn) values for the dependents who have the same name and gender as their parent.

Edit: Upon further inspection of the Q1 query, it does seems rather convoluted; and almost definitely not the best way to go about getting that information. Leaving the subquery as it is, using an EXISTS instead of an IN would not suffice, as it is possible there is a dependent with the same name and gender as the correlated employee that is not actually the employee's dependent. If AND D.ESSN = E.SSN were added to the subquery's WHERE criteria, then EXISTS would be a valid choice. However, there is really no way to "un-correlate" the subquery that does not make it completely obvious that an INNER JOIN on the three fields involved would have been a better choice (outside of the rare cases where some nincompoop names more than one of their same-gendered dependents after themselves; in which case an INNER JOIN is still probably a better solution but now DISTINCT should be used in the SELECT.).


Q2: I'm not sure how you would plan to use NULL.


Q3: Again, you wouldn't (or at least don't need to) select FROM the other tables in the outer query because you are not needing data from them in the results. The two EXISTS conditions are (likely) using correlated subqueries (probably with poor efficiency/performance, JOINs are usually much faster); perhaps it would be more obvious written with aliases like this:

SELECT e.FNAME, e,LNAME
FROM EMPLOYEE AS e
WHERE EXISTS(SELECT * FROM DEPARTMENT AS d WHERE e.SSN = d.MGR_SSN)
   AND EXISTS (SELECT * FROM DEPENDENT AS d WHERE d.ESSN = e.SSN);

(Edit: As another answerer noted; you can't have multiple WHERE's in (the same "level" of) a query.

or to translate it to English...

"Give me the first and last name of employees where there is a department whose manager's ssn is the same as employee's ssn, and there is a dependent whose parent's ssn is the same as the employee's."

Comments