Edamame Edamame - 3 months ago 45
SQL Question

Impala: all DISTINCT aggregate functions need to have the same set of parameters

I got the following error in my Impala query:

select
upload_key,
max(my_timestamp) as upload_time,
max(color_key) as max_color_fk,
count(distinct color_key) as color_count,
count(distinct id) as toy_count
from upload_table
group by upload_key





and got the error:


AnalysisException: all DISTINCT aggregate functions need to have the
same set of parameters as count(DISTINCT color_key); deviating
function: count(DISTINCT id)


I am not sure why I got this error. What I did was for each group (grouped by
upload_key
), I tried to compute how many
distinct id
as well as how many
distinct color_key
.

Does any one have any idea

Answer

The error message indicates that DISTINCT is only allowed on one column [combination], but you try two, color_key& id. A workaround would be two Selects and then a join:

select
   t1.upload_key,
   t1.upload_time,
   t1.max_color_fk,
   t1.color_count,
   t2.toy_count
from
 (
   select 
      upload_key, 
      max(my_timestamp) as upload_time, 
      max(color_key) as max_color_fk, 
      count(distinct color_key) as color_count
   from upload_table 
   group by upload_key
 ) as t1
join
 (
   select 
      upload_key
      count(distinct id) as toy_count 
   from upload_table 
   group by upload_key
 ) as t2
on t1. upload_key = t2.upload_key