Jordan Davis Jordan Davis - 3 months ago 16
SQL Question

How do you calculate data completeness for multiple tables based on null values within columns?

The query below calculates what we need but for only one specific column. How can we do this for all the columns within that table, without having to duplicate the case statement multiple times. This needs to be done for hundreds of tables, so duplicating the case statement is not ideal.

Select SUM(cast(case when column is null then 0 else 1 end as float))/count(*) from [Table]


So the output would be something like

Column Name: Data completeness

Customer Name: 88%

Answer

Solution by Jens Suessmeyer from Finding the percentage of NULL values for each column in a table

SET NOCOUNT ON
DECLARE @Statement NVARCHAR(MAX) = ''
DECLARE @Statement2 NVARCHAR(MAX) = ''
DECLARE @FinalStatement NVARCHAR(MAX) = ''

DECLARE @TABLE_SCHEMA SYSNAME = <SCHEMA_NAME>
DECLARE @TABLE_NAME SYSNAME = <TABLE_NAME>

SELECT
        @Statement = @Statement + 'SUM(CASE WHEN ' + COLUMN_NAME + ' IS NULL THEN 1 ELSE 0 END) AS ' + COLUMN_NAME + ',' + CHAR(13) ,
        @Statement2 = @Statement2 + COLUMN_NAME + '*100 / OverallCount AS ' + COLUMN_NAME + ',' + CHAR(13)
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = @TABLE_NAME 
    AND TABLE_SCHEMA = @TABLE_SCHEMA

IF @@ROWCOUNT = 0
    RAISERROR('TABLE OR VIEW with schema "%s" and name "%s" does not exists or you do not have appropriate permissions.',16,1, @TABLE_SCHEMA, @TABLE_NAME)
ELSE
BEGIN
    SELECT @FinalStatement =
            'SELECT ' + LEFT(@Statement2, LEN(@Statement2) -2) + ' FROM (SELECT ' + LEFT(@Statement, LEN(@Statement) -2) +
            ', COUNT(*) AS OverallCount FROM ' + @TABLE_SCHEMA + '.' + @TABLE_NAME + ') SubQuery'
    EXEC(@FinalStatement)
END