pehrs pehrs - 1 month ago 22
SQL Question

Generate unique random strings in plpgsql

I am trying to write a function to create unique random tokens of variable length. However, I am stumped by the plpgsql syntax. My intention is to create a function which


  • Takes a table and column as input

  • Generates a random string of a given length, with a given set of characters

  • Checks if the string is already in the colum

  • If so (and this is expected to be rare), simply generate a new random string.

  • Otherwise, return the random string



My current attempt looks like this:

CREATE FUNCTION random_token(_table TEXT, _column TEXT, _length INTEGER) RETURNS text AS $$
DECLARE
alphanum CONSTANT text := 'abcdefghijkmnopqrstuvwxyz23456789';
range_head CONSTANT integer := 25;
range_tail CONSTANT integer := 33;
random_string text;
BEGIN
REPEAT
SELECT substring(alphanum from trunc(random() * range_head + 1)::integer for 1) ||
array_to_string(array_agg(substring(alphanum from trunc(random() * range_tail + 1)::integer for 1)), '')
INTO random_string FROM generate_series(1, _length - 1);
UNTIL random_string NOT IN FORMAT('SELECT %I FROM %I WHERE %I = random_string;', _column, _table, _column)
END REPEAT;
RETURN random_string;
END
$$ LANGUAGE plpgsql;


However, this doesn't work, and gives me a not very helpful error:


DatabaseError: error 'ERROR: syntax error at or near "REPEAT"


I have tried a number of variations, but without knowing what the error in the syntax is I am stumped. Any idea how to fix this function?

Answer

There is no repeat statement in plpgsql. Use simple loop.

CREATE OR REPLACE FUNCTION random_token(_table TEXT, _column TEXT, _length INTEGER) RETURNS text AS $$
DECLARE
  alphanum CONSTANT text := 'abcdefghijkmnopqrstuvwxyz23456789';
  range_head CONSTANT integer := 25;
  range_tail CONSTANT integer := 33;
  random_string text;
  ct int;
BEGIN
  LOOP
    SELECT substring(alphanum from trunc(random() * range_head + 1)::integer for 1) ||
      array_to_string(array_agg(substring(alphanum from trunc(random() * range_tail + 1)::integer for 1)), '')
      INTO random_string FROM generate_series(1, _length - 1);
    EXECUTE FORMAT('SELECT count(*) FROM %I WHERE %I = %L', _table, _column, random_string) INTO ct;
    EXIT WHEN ct = 0;
  END LOOP;
  RETURN random_string;
END
$$ LANGUAGE plpgsql;

Note, random_string should be a parameter to format().

Update. According to the accurate hint from Abelisto, this should be faster for a large table:

DECLARE
  dup boolean;
...
    EXECUTE FORMAT('SELECT EXISTS(SELECT 1 FROM %I WHERE %I = %L)', _table, _column, random_string) INTO dup;
    EXIT WHEN NOT dup;
...