danny-v danny-v - 5 months ago 8
SQL Question

SQL query using EXIST operator - unexpected records in result

Using the AdventureWorks2014 database, I was experimenting with the EXIST keyword. Please note the following query:

select p.color, p.productid, p.name, th.Quantity
from production.product p, production.TransactionHistory th
where p.ProductID=th.ProductID and EXISTS(
select *
from Production.TransactionHistory t
where t.Quantity = 1000
and t.ProductID=p.ProductID
)


I was expecting to see only products that were ordered 1000 at a time (there is only one transaction that meets this condition), but instead I get hundreds of rows where th.Quantity is < 1000.

Removing the joined TransactionHistory table from the outer query solves the problem, but I just want to know why the original query returns the rows I am seeing.

Thanks

Edit:

For clarification, I understand how to solve the question that I want. I just wanted to understand the behavior of EXISTS and why I'm not getting the results I expected.

The following subquery (which is part of the EXISTS subquery), only returns a single result.

select *
from Production.TransactionHistory t
where t.Quantity = 1000


Therefore, if this is inside EXISTS it will return true every time. The caveat is that I am linking t.ProductID with p.ProductID in the subquery. So, for every row in the outer query, the product ID should be matching the product ID in the inner query. EXISTS should only return true when the product ID matches and the quantity is exactly 1000. To be precise, EXIST should only return true when the product ID is 994, because there is only one transaction in the entire table (with that product ID) that satisfies both the product ID requirement and the 1000 quantity requirement.

Notice the rest of the EXISTS subquery...

where t.Quantity = 1000 and t.ProductID=p.ProductID


The product ID has to match the outer record's product ID AND the quantity must be 1000.

To me, this query says "Give me the color, product id and name of all products, join in transactions, and then only include each row where there is at least one record in the transaction table whose product id matches the id of the CURRENT outer row, AND the order quantity is 1000". But this is not how it behaves. Just trying to understand why.

Answer

Your query sounds like this:

Get all transaction history entries of product if any of history entry have Quantity equal to 1000.

EXISTS return true or false, so

EXISTS(
    select *
    from Production.TransactionHistory t
    where t.Quantity = 1000
    and t.ProductID=p.ProductID
)

will return true for all TransactionHistory rows of product which have Transaction with Quantity = 1000

In addition:
Query above will be executed for every row of "Main" query and will return True on every row in your case. Thats why you get all rows

Comments