datatista datatista - 2 months ago 6
SQL Question

Oracle: How to find overlaps in rows

Suppose I have the following table:

User_ID Activity_ID
123 222
123 333
124 222
124 224
124 333
125 224
125 333


I want to return a count users by the different combinations of overlaps such as the following:

Activity_ID_1 Activity_ID_2 Count_of_Users
222 333 2
222 224 2


In the above example, there are 2 users who completed both 223 AND 333.

I do not want to define each combination manually since there are 93 different activity_ids I am working with. Is there a way to do this purely in Oracle SQL?

Answer

Assuming you have an activity table with activity id's, and you want to count only DISTINCT users who had the same two activities (the same user having both activities twice wouldn't count):

select a1.activity_id, a2.activity_id, count(distinct f.user_id)
from   activity a1 inner join facts    f  on a1.activity_id = f.activity_id
                   inner join activity a2 on a2.activity_id = f.activity_id
where  a1.activity_id < a2.activity_id
group by a1.activity_id, a2.activity_id
having count(distinct f.user_id) >= 2
;

facts is the name of your facts table (the one you show in your question).

EDIT: If the facts table (or view or subquery or whatever) is already "distinct"-ed by user_id, then delete "distinct" from my solution; this will make it more efficient. NOTE: "distinct" appears twice, once in SELECT and again in HAVING.