user2205916 user2205916 - 4 months ago 11
SQL Question

COUNT non-null values for multiple columns are the same?

I am trying to count non-null values for three columns. However, the same COUNT is returned for each of the three columns despite having differing number of non-null values. I'm not sure why this is the case (do I need to recast the column data type? wrong SQL clause?). Below is what I've tried and the results.

EDIT: replaced images with text. made shorter aliases.

SELECT CAST(users.install_ts AS Date) AS inst_date,
COUNT(users.install_ts) AS total_inst,
COUNT(users.firstlogin_ts) AS total_first_logins,
COUNT(users.firstpurchase_ts) AS conv_cust,
SUM(CASE WHEN users.firstpurchase_ts IS NULL THEN 1 ELSE 0 END) AS conv_cust
FROM users
GROUP BY CAST(users.install_ts AS Date)

inst_date total_inst total_first_logins conv_cust conv_cust
---------- ---------- ------------------ --------- ---------
2015-01-01 17191 17191 17191 0
2015-01-02 41038 41038 41038 0
2015-01-03 41176 41176 41176 0
2015-01-04 41072 41072 41072 0
2015-01-05 41115 41115 41115 0
2015-01-06 8417 8417 8417 0
2015-05-16 9991 9991 9991 0


Here is the
users
table:

uid device_id install_ts firstlogin_ts firstpurchase_ts
------ ---------- ------------------- ------------------- -------------------
121045 GalaxyS3 2015-01-01 14:00:01 2015-01-01 14:00:01 (null)
121046 GalaxyS1 2015-01-01 14:00:03 2015-01-01 14:00:07 2015-01-02 06:00:07
121047 iPhone3 2015-01-01 14:00:03 2015-01-01 14:00:03 (null)
121048 GalaxyS1 2015-01-01 14:00:04 (null) (null)
121049 iPhone5 2015-01-01 14:00:07 2015-01-01 14:00:08 (null)
121050 iPad4 2015-01-01 14:00:07 2015-01-01 14:00:09 (null)
121051 iPhone4s 2015-01-01 14:00:11 (null) (null)
121052 iPhone4s 2015-01-01 14:00:13 (null) (null)
121053 GalaxyTab3 2015-01-01 14:00:16 (null) (null)
121054 iPhone4 2015-01-01 14:00:19 2015-01-01 14:00:22 (null)
121055 iPad1 2015-01-01 14:00:22 2015-01-01 14:00:26 (null)
121056 iPad2 2015-01-01 14:00:26 2015-01-01 14:00:29 (null)
121057 GalaxyTab2 2015-01-01 14:00:30 2015-01-01 14:00:31 (null)
121058 iPhone5s 2015-01-01 14:00:34 2015-01-01 14:00:38 (null)
121059 GalaxyS5 2015-01-01 14:00:34 (null) (null)
121060 GalaxyS5 2015-01-01 14:00:34 2015-01-01 14:00:38 (null)
121061 GalaxyS5 2015-01-01 14:00:37 (null) (null)
121062 iPhone3 2015-01-01 14:00:39 (null) (null)


Description of the table:

DESCRIBE users
Field Type Null Key Default Extra
---------------- ----------- ---- --- ------------------- ---------------------------
uid int(11) YES MUL (null)
device_id varchar(64) YES (null)
install_ts timestamp NO CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP
firstlogin_ts timestamp NO 0000-00-00 00:00:00
firstpurchase_ts timestamp NO 0000-00-00 00:00:00


EDIT: Doing them one at a time gives the same results. So maybe it is a grouping problem. If that's the case, how do I go about achieving my intended results?

One at a time:

SELECT COUNT(users.firstlogin_ts) AS total_first_logins
FROM users
GROUP BY CAST(users.install_ts AS Date)

total_first_logins
17191
41038
41176
41072
41115
8417
9991


Another column:

SELECT COUNT(users.install_ts) AS total_inst
FROM users
GROUP BY CAST(users.install_ts AS Date)

total_inst
17191
41038
41176
41072
41115
8417
9991

Answer

I am guessing there might be a display issue at the root of your problem. The data you present shows (null), but the table description indicates the fields are not nullable in the first place. However, the fields default to '0000-00-00 00:00:00', which is not quite a valid value for the type, but allowed (it's weird). Some APIs convert such values to null, even though they are not actually NULL in the database. Try this instead:

SELECT  CAST(users.install_ts AS Date) AS inst_date, 
        COUNT(1) AS total_inst, 
        COUNT(NULLIF(users.firstlogin_ts, '0000-00-00 00:00:00')) AS total_first_logins,
        COUNT(NULLIF(users.firstpurchase_ts, '0000-00-00 00:00:00')) AS conv_cust
FROM users
GROUP BY CAST(users.install_ts AS Date)

NULLIF will convert such values to true NULL values.