Donna Panzarotti Donna Panzarotti - 2 months ago 14
SQL Question

Sample of data within groups - Teradata

Found a similar question but without an answer that worked succuessfully.

I need to select a sample of 50 of each status type within a single table.

TABLE1



MEMBER STATUS
1234 A
1324 A
3424 R
3432 S
3232 R
2783 A
2413 S
4144 R
2387 S


I tried:


SEL Member, status
FROM TABLE1 Qualify Row_Number ( ) OVER (PARTITION
BY status ORDER BY random (1,10000)) <=50


As suggested in the previous question/answer but Teradata does not like RANDOM in an Aggregate or Ordered Analytical Function.

Answer

Since, as you say in the comments, you do not need to use random you can do this:

SEL     Member, status 
FROM    TABLE1 
QUALIFY ROW_NUMBER() OVER (
            PARTITION BY status 
            ORDER BY NULL) <= 50;

Edit: based on a question in the comments related to a spool-space error, an approach we can try when the table is large is to create a temporary table with a portion of the records from the original table.

CREATE MULTISET VOLATILE TABLE tmp_tbl AS (
    SEL     Member, status
    FROM    TABLE1
    WHERE   somefield = 'somecriterion'
) WITH DATA 
ON COMMIT PRESERVE ROWS;

And then try again:

SEL     Member, status 
FROM    tmp_tbl /* now from temporary table */
QUALIFY ROW_NUMBER() OVER (
            PARTITION BY status 
            ORDER BY NULL) <= 50;