Samuel Markson Samuel Markson - 2 months ago 10
SQL Question

SQL: Mapping first (or random) result of element from specific column matching certain conditions in another column to variable

The problem is as follows: I want to query a database, and return the number of hits that have particular values in two different columns, grouped by timestamp window. I am currently doing this successfully with the following query:

SELECT shortdate,
sum(column1 like '%interestingthing1%') thing1count,
sum(column1 like '%interestingthing2%') thing2count,


FROM (
select LEFT(string(DATE), 8) shortdate,column1
from [database]
where column3 like '%thingthatcolumn3shouldbe%'
)
group by shortdate

ORDER BY 1 DESC
LIMIT 1000


What I want to also do is to also return a variable which is a random element of column2, corresponding to a row that satisfies the thingthatcolumn3shouldbe, one for each interestingthingn (n=1,2,...).

I have some intuition that this can be done either with some user-defined function (I am doing this in google bigquery, which allows javascript functions), or a whole mess of UNIONs within the FROM (...) statement above. But since I am far from a SQL expert, this question is partially an effort to start using some best/better practices early on. Naked opinions welcome.

Thanks in advance,

Samuel

Answer

Perhaps something like the query below. It will pick the "first" value for column2 when column1 satisfies your condition, but since order of values going into aggregation function is not deterministic - it is as good as random. And since FIRST aggregate function ignores NULLs, we just need to make sure to convert column2 to NULL when condition isn't met.

SELECT shortdate, 
sum(column1 like '%interestingthing1%') thing1count, 
sum(column1 like '%interestingthing2%') thing2count,
first(if(column1 like '%interestingthing1%', column2, NULL)) column2forthing1,
first(if(column1 like '%interestingthing2%', column2, NULL)) column2forthing1

FROM (
select LEFT(string(DATE), 8) shortdate, column1, column2
from [database]
where column3 like '%thingthatcolumn3shouldbe%'
)
group by shortdate

ORDER BY 1 DESC
LIMIT 1000
Comments