Paul Neale Paul Neale - 7 months ago 14
SQL Question

MYSQL Sub Query with two columns

I have been trying to get a sub query to work but I'm thinking it needs to be join instead but I'm new to databases and having a hard time wrapping my head around just how it works.

I need to collect a rows from userprofile based on three columns in connect, what I have that doesn't work is this.

SELECT * FROM userprofile WHERE user_id IN
(SELECT connect.user_id, connect.connect_to_id FROM connect WHERE
(user_id = '1' OR connect_to_id = '1') AND is_connected = '1' );


Because the second SELECT has two columns listed it errors with

#1241 - Operand should contain 1 column(s)


Having just one column listed in the second join doesn't return all entries, only the ones that the first test meets.

Does this have to be done with JOIN and if so how? Every join that I try just results in all entries from the userprofile table being returned.

Thanks for your help.

*UPDATE

I have two queries working now.

SELECT u.* FROM userprofile u
JOIN connect c ON (u.user_id = c.user_id OR u.user_id = c.connect_to_id) WHERE
((c.user_id = '1' OR c.connect_to_id = '1') AND c.is_connected = '1') AND u.user_id != '1'
GROUP BY u.id;


And

SELECT u.* FROM userprofile u WHERE u.user_id IN
(SELECT connect.user_id FROM connect WHERE connect.connect_to_id = '1' AND connect.is_connected = '1'
UNION ALL
SELECT connect.connect_to_id FROM connect WHERE connect.user_id = '1' AND connect.is_connected = '1')


Over 10000 calls using microtime I have the second using the Sub Query being a tad faster.

Join Query:4.6199560165405
Sub Query:4.0528790950775


I then decided to do a test to see why and discovered that the GROUP BY command is slowing down the JOIN considerably. So with the GROUP BY removed I get these numbers.

Join Query:2.6410460472107
Sub Query:4.1510140895844


As it turns out I don't really need the GROUP BY as I'm checking that u.user_id != '1' and this stops the same problem I was having.

So the question, is using GROUP BY frowned on?

Answer

It can be done either via a join or a sub-query. Typically with MySQL joins are more efficient because in many cases MySQL is unable to use indexes with sub-queries. So, in your case the query with a join would be:

SELECT u.*
FROM
    userprofile u
    JOIN connect c ON (u.user_id = c.user_id OR u.user_id = c.connect_to_id)
WHERE
    (c.user_id = '1' OR c.connect_to_id = '1') AND
    c.is_connected = '1'