valenzio valenzio - 20 days ago 6
SQL Question

SQL: Get Frequency counter, similar to pivot table

I have the following Table in a Vertica Database

+---------+-------+
| ReadOut | Event |
+---------+-------+
| 1 | A |
| 1 | A |
| 1 | B |
| 1 | B |
| 2 | A |
| 2 | B |
| 2 | B |
+---------+-------+


I am trying to get the frequency of each event per read out. Maybe I am missusing the term 'frequency' in this case, but ideally at the end I would have a table like this:

+-----------+---------+---------+--+
| Frequency | Event_A | Event_B | |
+-----------+---------+---------+--+
| 1 | 1 | 0 | |
| 2 | 1 | 2 | |
| 3 | 0 | 0 | |
| 4 | 0 | 0 | |
| . | 0 | 0 | |
| . | 0 | 0 | |
| . | 0 | 0 | |
+-----------+---------+---------+--+


So far I implemented a for loop in python that looks like this (simplified version):

# get all the available event names
Eventlist=SELECT DISTINCT Event FROM table
# loop over each event to get the frequency
for ii in Eventlist:
SELECT count(Readout) FROM table WHERE Event = ii group by Readout


But the disadvante is that I need to know the names of the Events in advance which requires a seperate query and I have to run a bunch of loops.
Is there a more elegant way of doing this.
Thanks Mates

Answer

Vertica does not have the ability to pivot. You'll need to generate the sql to pivot or just pivot in python. I would just do it in python, keep your query simple.

As for getting counts per event, you can do it all at one time:

with f as (
    SELECT readout, event, COUNT(*) frequency
    FROM   mytable
    GROUP BY 1, 2 
)
select frequency, event, count(*) cnt
from f
group by 1, 2
order by 1, 2

Then use python to pivot and fill in the frequency gaps if you need to. (If you use pandas, pivoting is probably dead simple).

Here is an example pivoting to a dict (depends on which lib you are using and settings for your connection, so you'll probably need to change it):

from collections import defaultdict

myresult = defaultdict(dict)
for row in rows: 
    myresult[row['frequency'][row['event']] = row['cnt']

I'm sure there is some more clever way to do it with dict comprehension but this seems straightforward.

Hope it helps.

Comments