I have a MySQL table "page" that contains fields "page_name" and "page_timestamp". The table stores page and times requests that were made on a website. I'm trying to write a query that gives me the most hit page for every 5 min period.
Time Page Hits
12:00 index.html 34
12:05 page1.html 11
12:10 index.html 44
(select count( pages_name)
from pages t2
where UNIX_TIMESTAMP(t2.pages_timestamp) DIV 300 =
group by pages_name
order by count(pages_name) desc
limit 1 )
group by UNIX_TIMESTAMP(pages_timestamp ) DIV 300
page_timestamp is a
Here is a sample with some data I created, it helps to visualize the output.
So first off, lets create a function for time rounding, this will declutter our query:
drop function if exists rtime; create function rtime (time DATETIME) returns DATETIME BEGIN DECLARE newtime DATETIME; set newtime = from_unixtime(floor(unix_timestamp(time)/300)*300); return newtime; END;
Next let's explore our data a little bit. I want to pull all
Count(Page_Name), grouping by interval and page.
SELECT Time, Page, Hits from ( select rtime(pages_timestamp) as time , page_name as page , count(page_name) as hits from pages group by page_name, rtime(pages_timestamp) order by rtime(pages_timestamp), hits desc) g ;
This orders our aggregated table by time interval then the number of hits. Since our groups are ordered by most hits desc we can pull the first row. In MySQL we can non-aggregate non-group by columns (src. This gives us the first row per group, the one with the most hits. We just
SELECT * from the above table grouping only by time:
select * from (select time, page, hits from ( select rtime(pages_timestamp) as time , page_name as page , count(page_name) as hits from pages group by page_name, rtime(pages_timestamp) order by rtime(pages_timestamp), hits desc ) g ) h group by time;
Disclaimer!: If there is a tie, if two pages both have the most hits, this will only pull one record.