lonely_luckily - 2 months ago
MySQL Question

SQL: aggregating pairs without JOIN (challenging)

Could you please help me with a very difficult question?

I have a table 'itemslog' in MySQL with two columns, 'userid' and 'itemname', that looks like this:

| user1 | item1 |
| user1 | item2 |
| user2 | item1 |
| user2 | item2 |
| user2 | item3 |


I need to count how many users have each pair of items, e.g. an answer like this:

| item1 | item2 | 2 |
| item1 | item3 | 1 |
| item2 | item3 | 1 |


Usually we can use a query based on JOIN operations, like this:

SELECT
    t1.itemname,
    t2.itemname,
    count(*)
FROM
    itemslog AS t1
    CROSS JOIN itemslog AS t2 ON t1.userid = t2.userid
WHERE
    t1.itemname < t2.itemname
GROUP BY
    t1.itemname, t2.itemname;


But it takes a lot of computation, and in my situation it is unusable (I have about 200k rows). Can you give me some advice: is there another way to do this? Thank you in advance!

Answer

This is your query:

SELECT t1.itemname, t2.itemname, count(*)
FROM itemslog t1 JOIN
     itemslog t2
     ON t1.userid = t2.userid AND t1.itemname < t2.itemname
GROUP BY t1.itemname, t2.itemname;

For this query you want an index on itemslog(userid, itemname):

create index itemslog_userid_itemname on itemslog(userid, itemname);

Assuming you have only a handful of items for each userid, this should have reasonable performance.
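
To make the "handful of items" assumption concrete, here is a minimal sketch that runs the same query against the sample rows from the question and estimates how many rows the self-join produces before grouping. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example runs without a server), and it assumes the table has no duplicate (userid, itemname) rows. A user with k items contributes k*(k-1)/2 joined rows, so the work grows quadratically with items per user.

```python
import sqlite3

# Sample rows from the question. SQLite is only a stand-in for MySQL here.
rows = [
    ("user1", "item1"),
    ("user1", "item2"),
    ("user2", "item1"),
    ("user2", "item2"),
    ("user2", "item3"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE itemslog (userid TEXT, itemname TEXT)")
conn.executemany("INSERT INTO itemslog VALUES (?, ?)", rows)

# The index suggested above: the join can look up each user's items directly.
conn.execute(
    "CREATE INDEX itemslog_userid_itemname ON itemslog(userid, itemname)"
)

# The pair-counting query, with ORDER BY added only for stable output.
pair_counts = conn.execute(
    """
    SELECT t1.itemname, t2.itemname, COUNT(*)
    FROM itemslog t1
    JOIN itemslog t2
      ON t1.userid = t2.userid AND t1.itemname < t2.itemname
    GROUP BY t1.itemname, t2.itemname
    ORDER BY t1.itemname, t2.itemname
    """
).fetchall()

# Rows the join produces before grouping: k*(k-1)/2 per user with k items.
# k*(k-1) is always even, so the division is exact.
join_rows = conn.execute(
    """
    SELECT SUM(k * (k - 1) / 2)
    FROM (SELECT COUNT(*) AS k FROM itemslog GROUP BY userid) AS per_user
    """
).fetchone()[0]

print(pair_counts)
print(join_rows)
```

Running the second query on the real table first gives a rough idea of how much work the join will do: if a few users have thousands of items, those users dominate the cost no matter which index exists.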
