I have a redshift table and it is storing a lot of data. Every weekend I go and manually using Workbench TRUNCATE last week of data that I no longer need.
I manually have to run
DELETE FROM tableName WHERE created_date BETWEEN timeStamp1 AND timeStamp2;
No, there is no in-built ability to run commands on a regular basis on Amazon Redshift. You could, however, run a script on another system that connects to Redshift and runs the command.
For example, a
cron job that calls
psql to connect to Redshift and execute the command. This could be done in a one-line script.
Alternatively, you could configure an AWS Lambda function to connect to Redshift and execute the command. (You would need to write the function yourself, but there are libraries that make this easier.) Then, you would configure Amazon CloudWatch Events to trigger the Lambda function on a desired schedule (eg once a week).
A common strategy is to actually store data in separate tables per time period (eg a month, but in your case it would be a week). Then, define a view that combines several tables. To delete a week of data, simply drop the table that contains that week of data, create a new table for this week's data, then update the view to point to the new table but not the old table.
By the way...
Your example uses the
DELETE command, which is not the same as the
TRUNCATE removes all data from a table. It is an efficient way to completely empty a table.
DELETE is good for removing part of a table but it simply marks rows as deleted. The data still occupies space on disk. Therefore, it is recommended that you
VACUUM the table after deleting a significant quantity of data.