Francisco Trillo Garcia Francisco Trillo Garcia - 5 months ago 15
SQL Question

mysql querying for matches in many to many relations

I need to build a complicated query and I don't even know where to start. Any help would be welcome.

I've got an users table and a games table with 2 Many to Many relationships between them, the tables and relevant columns are:


  • users: id

  • games: id

  • game_users_A: id, user_id, game_id. (A relationship)

  • game_users_B: id, user_id, game_id. (B relationship)



Let's say I chose an user and one of the relationships, for example user1, relationship A.
I want to get every other user and order them by how many games in their B relationships match the games in user1's A relationship.

Thanks. I've been trying to find similar questions but I don't even know how to call what I'm trying to do.

Edit.

Thanks to Hayden's help I was able to build the query I needed.
I just had to adapt his a little:

His query compares every user with each other. That is a lot more than I need to do, with around 3000 users it will return more than 900000 rows.
I only need to compare the user that's logged into my application with every other user in the database, so with N users I'll get N - 1 rows.

So I just added a couple of WHERE clouses and changed the ORDER BY.

Assuming the user making the request has an Id of 3 and wants to see the users matching the games in his A relationship with the games in their B relationship (and not the other way around), this is the query:

SELECT U1.id, U2.id, SUM(GCol) AS matches
FROM users U1 JOIN users U2 ON U1.id <> U2.id
LEFT JOIN (SELECT A.user_id AS A_id, B.user_id as B_id, (A.id IS NOT NULL) AS GCol
FROM game_users_A A JOIN game_users_B B
ON A.game_id = B.game_id AND A.user_id <> B.user_id WHERE A.user_id = 3) G
ON U1.id = G.A_id AND U2.id = G.B_id
WHERE U1.id = 3
GROUP BY GCol, U1.id, U2.id
ORDER BY matches desc


Thanks again.

Answer

I played around with a little test database with those tables, and here's what I have.

I broke it down into two queries. The first one was to match all users against all other users, which is simply:

SELECT U1.id, U2.id 
FROM users U1 JOIN users U2 ON U1.id <> U2.id
ORDER BY U1.id, U2.id

The second query was to match up records in the B group to records in the A group, where the game is the same. This gave me:

SELECT A.user_id AS A_id, B.user_id as B_id
FROM game_users_A A JOIN game_users_B B
ON A.game_id = B.game_id AND A.user_id <> B.user_id

To combine these two, I did a left join of the first query against the second query, since for any pair of users, it's possible that there isn't a matching set of games. In these cases, the results of the second query will show up as NULL values. What we want is a value of 1 when those appear, so if you look in the query below, there's a value of

(A.id IS NOT NULL) AS GCol

which is used so we can find the sum. If there is a record, there's a value of 1, or NULL otherwise. When there are no values there, the total is still null, but when there are matches, the total is summed up and displayed. This gives a final query of:

SELECT U1.id, U2.id, SUM(GCol) 
FROM users U1 JOIN users U2 ON U1.id <> U2.id
LEFT JOIN (SELECT A.user_id AS A_id, B.user_id as B_id, (A.id IS NOT NULL) AS GCol
FROM game_users_A A JOIN game_users_B B
ON A.game_id = B.game_id AND A.user_id <> B.user_id) G
ON U1.id = G.A_id AND U2.id = G.B_id
GROUP BY GCol, U1.id, U2.id
ORDER BY U1.id, U2.id

I think that's what you're looking for.