ltt ltt - 1 year ago 170
SQL Question

How to select all rows from n random groups in postgres

This question pertains to formulating a query in

PostgreSQL

Suppose I have the following table:

record entity docid sentencid
1 A 123 1231
1 A 123 1232
1 A 100 1001
1 C 100 1002
1 B 121 1212
1 B 123 1234
2 B 102 1021
2 D 111 1111
2 D 102 1022
2 E 101 1011
3 C 111 1115
3 C 111 1113
3 C 111 1114


Is there a
PostgresSQL
query that I can use to select all rows for n (or less) random groups of entities for each record in this table? Lets say n is 2. So the query should select all rows for record 3 and all rows for any 2 random entity groups of record 1 and 2. The final result should be ordered by
accession
,
entity
,
docid
,
sentenceid
.

Here is an example result with n=2:

record entity docid sentencid
1 A 100 1001
1 A 123 1231
1 A 123 1232
1 B 121 1212
1 B 123 1234
2 D 102 1022
2 D 111 1111
2 E 101 1011
3 C 111 1113
3 C 111 1114
3 C 111 1115


assuming that the entities A and B were randomly selected from the set of entities (A,B,C) for record 1 and the entities D and E were randomly selected from the set of entities (B,D,E) for record 2. The n entities should be selected randomly for each record.

I have searched extensively for answers to this question but did not find any working query. Thank you for looking into this!

Answer Source

You can use row_number with a random() order to randomly select n entities per record group. Then join this to your main table

select * from Table1 t1
join (
    select * from (
        select record, entity,
            row_number() over (partition by record order by random()) rn
        from Table1
        group by record, entity
    ) t where rn <= 2
) t2 on t1.record = t2.record and t1.entity = t2.entity

Demo

Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download