Jivan Jivan - 8 days ago 9
SQL Question

UNNEST expression references column which is neither grouped nor aggregated

Google Analytics BigQuery tables are structured like this (Legacy SQL notations - only relevant fields are shown):

visitId: INTEGER
hits: RECORD/REPEATED
hits.hour: INTEGER


On one such table, the following query works well:

SELECT
visitId,
MIN(h.hour) AS firstHitHour
FROM
`my-table.ga_sessions_20161122`, UNNEST(hits) AS h
GROUP BY
visitId


But using this alternative syntax:

SELECT
visitId,
(SELECT MIN(hour) FROM UNNEST(hits)) as firstHitHour
FROM
`my-table.ga_sessions_20161122`
GROUP BY
visitId


Triggers the following error:


Error: UNNEST expression references column hits which is neither grouped nor aggregated


I understand that
UNNEST(hits)
must be somehow grouped or aggregated, but since this column is an
array (repeated)
, what does it mean exactly?

If I try to "group the column
hits
", as requested, like this:

(SELECT MIN(hour) FROM UNNEST(hits) as h GROUP BY h) as firstHitHour


Then I get a
Grouping by expressions of type STRUCT is not allowed
error.

How can this alternative syntax be corrected to produce the same result as the first one?

Answer

My first Answer is for original version of this question.
When I answered, I realized you have changed it to quite different one :o)

So below answer is for most recent version of your question:

I think that in "alternative" version you just do not need GROUP BY at all, because you operate on original (un-flattened) row by row and for each row (visitId) you calculate firstHitHour

SELECT
  visitId,
  (SELECT MIN(hour) FROM UNNEST(hits)) as firstHitHour
FROM
  `my-table.ga_sessions_20161122`

In your initial query - you kind of flattening all records for each row - so that's why you need then to group them back

Comments