Justin Justin - 3 months ago 9
SQL Question

Procedure to apply formatting to all rows in a table

I had a SQL procedure that increments through each row and and pads some trailing zeros on values depending on the length of the value after a decimal point. Trying to carry this over to a PSQL environment I realized there was a lot of syntax differences between SQL and PSQL. I managed to make the conversion over time but I am still getting a syntax error and cant figure out why. Can someone help me figure out why this wont run? I am currently running it in PGadmin if that makes any difference.

DO $$

DECLARE
counter integer;
before decimal;
after decimal;

BEGIN
counter := 1;
WHILE counter <> 2 LOOP


before = (select code from table where ID = counter);

after = (SELECT SUBSTRING(code, CHARINDEX('.', code) + 1, LEN(code)) as Afterward from table where ID = counter);

IF before = after
THEN
update table set code = before + '.0000' where ID = counter;

ELSE
IF length(after) = 1 THEN
update table set code = before + '000' where ID = counter;
ELSE IF length(after) = 2 THEN
update table set code = before + '00' where ID = counter;
ELSE IF length(after) = 3 THEN
update table set code = before + '0' where ID = counter;
ELSE
select before;
END IF;
END IF;

counter := counter + 1;

END LOOP
END $$;


Some examples of the input/output of the intended result:

Input 55.5 > Output 55.5000
Input 55 > Output 55.0000


Thanks for your help,
Justin

Answer

There is no need for a function or even an update on the table to format values when displaying them.

Assuming the values are in fact numbers stored in a decimal or float column, all you need to do is to apply the to_char() function when retrieving them:

select to_char(code, 'FM999999990.0000')
from data;

This will output 55.5000 or 55.0000

The drawback of the to_char() function is that you need to anticipate the maximum number of digits of that can occur. If you have not enough 9 in the format mask, the output will be something like #.###. But as too many digits in the format mask don't hurt, I usually throw a lot into the format mask.

For more information on formatting functions, please see the manual:
https://www.postgresql.org/docs/current/static/functions-formatting.html#FUNCTIONS-FORMATTING-NUMERIC-TABLE


If you insist on storing formatted data, you can use to_char() to update the table:

update the_table
  set code = to_char(code::numeric, 'FM999999990.0000');

Casting the value to a number will of course fail if there a non-numeric values in the column.

But again: I strong recommend to store numbers as numbers, not as strings.

If you want to compare this to a user input, it's better to convert the user input to a proper number and compare that to the (number) values stored in the database.


The string matching that you are after doesn't actually require a function either. Using substring() with a regex will do that:

update the_table 
   set code = code || case length(coalesce(substring(code from '\.[0-9]*$'), ''))
         when 4 then '0'
         when 3 then '00'
         when 2 then '000'
         when 1 then '0000'
         when 0 then '.0000'
         else ''
       end
 where length(coalesce(substring(code from '\.[0-9]*$'), '')) < 5;

substring(code from '\.[0-9]*$') extracts everything the . followed by numbers that is at the end of the string. So for 55.0 it returns .0 for 55.50 it returns .50 if there is no . in the value, then it returns null that's why the coalesce is needed.

The length of that substring tells us how many digits are present. Depending on that we can then append the necessary number of zeros. The case can be shortened so that not all possible length have to be listed (but it's not simpler):

update the_table
    set code = code || case length(coalesce(substring(code from '\.[0-9]*$'), ''))
         when 0 then '.0000'
         else lpad('0', 5- length(coalesce(substring(code from '\.[0-9]*$'), '')), '0')
       end
 where length(coalesce(substring(code from '\.[0-9]*$'), '')) < 5;

Another option is to use the position of the . inside the string to calculate the number of 0 that need to be added:

update the_table
     set code = 
       code || case 
         when strpos(code, '.') = 0 then '0000'
         else rpad('0', 4 - (length(code) - strpos(code, '.')), '0')
       end
where length(code) - strpos(code, '.') < 4;

Regular expressions are quite expensive not using them will make this faster. The above will however only work if there is always at most one . in the value.

But if you can be sure that every value can be cast to a number, the to_char() method with a cast is definitely the most robust one.


To only process rows where the code columns contains correct numbers, you can use a where clause in the SQL statement:

where code ~ '^[0-9]+(\.[0-9][0-9]?)?$'