Kein-Wai Kein-Wai - 1 month ago 7
Linux Question

Stored procedure that Automatically delete rows older than 7 days in MYSQL

I would like to know if is possible to create a stored procedure that automatically, every day at 00:00, deletes every row of every table that is over 7 days.

I have seen few solutions but not sure if its what I am looking for, and would be nice if someone has any good example. I know this could be done with simple scripts in python and php, but I would like something more automated by MySQL.

Any help would be really appreciate.

Thanks!

Answer

Mysql has its EVENT functionality for avoiding complicated cron interactions when much of what you are scheduling is sql related, and less file related. See the Manual page here. Hopefully the below reads as a quick overview of the important steps and things to consider, and verifiable testing too.

show variables where variable_name='event_scheduler';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| event_scheduler | OFF   |
+-----------------+-------+

ooops, the event scheduler is not turned on. Nothing will trigger.

SET GLOBAL event_scheduler = ON; -- turn her on and confirm below

show variables where variable_name='event_scheduler';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| event_scheduler | ON    |
+-----------------+-------+

Schema for testing

create table theMessages
(   id int auto_increment primary key,
    userId int not null,
    message varchar(255) not null,
    updateDt datetime not null,
    key(updateDt)
    -- FK's not shown
);
-- it is currently 2015-09-10 13:12:00
-- truncate table theMessages;
insert theMessages(userId,message,updateDt) values (1,'I need to go now, no followup questions','2015-08-24 11:10:09');
insert theMessages(userId,message,updateDt) values (7,'You always say that ... just hiding','2015-08-29');
insert theMessages(userId,message,updateDt) values (1,'7 day test1','2015-09-03 12:00:00');
insert theMessages(userId,message,updateDt) values (1,'7 day test2','2015-09-03 14:00:00');

Create 2 events, 1st runs daily, 2nd runs every 10 minutes

Ignore what they are actually doing (playing against one another). The point is on time difference approaches and scheduling.

DELIMITER $$
CREATE EVENT `delete7DayOldMessages`
  ON SCHEDULE EVERY 1 DAY STARTS '2015-09-01 00:00:00'
  ON COMPLETION PRESERVE
DO BEGIN
   delete from theMessages 
   where datediff(now(),updateDt)>6; -- not terribly exact, yesterday but <24hrs is still 1 day
   -- etc etc all your stuff in here
END;$$
DELIMITER ;

...

DELIMITER $$
CREATE EVENT `Every_10_Minutes_Cleanup`
  ON SCHEDULE EVERY 10 MINUTE STARTS '2015-09-01 00:00:00'
  ON COMPLETION PRESERVE
DO BEGIN
   delete from theMessages 
   where TIMESTAMPDIFF(HOUR, updateDt, now())>168; -- messages over 1 week old (168 hours)
   -- etc etc all your stuff in here
END;$$
DELIMITER ;

Show event statuses (different approaches)

show events from so_gibberish; -- list all events by schema name (db name)
show events; -- <--------- from workbench / sqlyog
show events\G;` -- <--------- I like this one from mysql> prompt

*************************** 1. row ***************************
                  Db: so_gibberish
                Name: delete7DayOldMessages
             Definer: root@localhost
           Time zone: SYSTEM
                Type: RECURRING
          Execute at: NULL
      Interval value: 1
      Interval field: DAY
              Starts: 2015-09-01 00:00:00
                Ends: NULL
              Status: ENABLED
          Originator: 1
character_set_client: utf8
collation_connection: utf8_general_ci
  Database Collation: utf8_general_ci
*************************** 2. row ***************************
                  Db: so_gibberish
                Name: Every_10_Minutes_Cleanup
             Definer: root@localhost
           Time zone: SYSTEM
                Type: RECURRING
          Execute at: NULL
      Interval value: 10
      Interval field: MINUTE
              Starts: 2015-09-01 00:00:00
                Ends: NULL
              Status: ENABLED
          Originator: 1
character_set_client: utf8
collation_connection: utf8_general_ci
  Database Collation: utf8_general_ci
2 rows in set (0.06 sec)

Random stuff to consider

drop event someEventName; -- <----- a good thing to know about

can't alias datediff and use in where clause in 1 line, so

select id,DATEDIFF(now(),updateDt) from theMessages where datediff(now(),updateDt)>6;

get more exact, 168 hours for 1 week old

select id,TIMESTAMPDIFF(HOUR, updateDt, now()) as `difference` FROM theMessages;
+----+------------+
| id | difference |
+----+------------+
|  1 |        410 |
|  2 |        301 |
|  3 |        169 |
|  4 |        167 |
+----+------------+

The link to the Manual Page shows quite a bit of flexibilty with interval choices, shown below:

interval:

quantity {YEAR | QUARTER | MONTH | DAY | HOUR | MINUTE |
          WEEK | SECOND | YEAR_MONTH | DAY_HOUR | DAY_MINUTE |
          DAY_SECOND | HOUR_MINUTE | HOUR_SECOND | MINUTE_SECOND}

Concurrency

Embed any concurrency measures necessary that multiple events (or multiple firings of the same event) don't cause data to run amok.

Set and Forget

Remember, for now, because you are going to forget it, that these events just keep firing. So build in solid code that will just keep running, even when you forget. Which you most likely will.

Your particular requirements

You need to determine which rows need to be deleted first by table, such that it honors Primary Key constraints. Just lump them all in proper order inside of the obvious area via the CREATE EVENT statement, which can be massive.