I need to delete about 2 million rows from my PG database. I have a list of IDs that I need to delete. However, any way I try to do this is taking days.
I tried putting them in a table and doing it in batches of 100. 4 days later, this is still running with only 297268 rows deleted. (I had to select 100 id's from an ID table, delete where IN that list, delete from ids table the 100 I selected).
DELETE FROM tbl WHERE id IN (select * from ids)
It all depends ...
Delete all indexes (except the one on the ID which you need for the delete)
Recreate them afterwards (= much faster than incremental updates to indexes)
Check if you have triggers that can safely be deleted / disabled temporarily
Do foreign keys reference your table? Can they be deleted? Temporarily deleted?
Depending on your autovacuum settings it may help to run
VACUUM ANALYZE before the operation.
If you delete large portions of the table and the rest fits into RAM, the fastest and easiest way would be this:
SET temp_buffers = 1000MB -- or whatever you can spare temporarily CREATE TEMP TABLE tmp AS SELECT t.* FROM tbl t LEFT JOIN del_list d USING (id) WHERE d.id IS NULL; -- copy surviving rows into temporary table TRUNCATE tbl; -- empty table - truncate is very fast for big tables INSERT INTO tbl SELECT * FROM tmp; -- insert back surviving rows.
This way you don't have to recreate views, foreign keys or other depending objects.
Read about the
temp_buffers setting in the manual. This method is fast as long as the table fits into memory, or at least most of it. Be aware that you can lose data if your server crashes in the middle of this operation. You can wrap all of it into a transaction to make it safer.
Also, be advised:
TRUNCATEcannot be used on a table that has foreign-key references from other tables, unless all such tables are also truncated in the same command.
ANALYZE afterwards. Or
VACUUM ANALYZE if you did not go the truncate route, or
VACUUM FULL ANALYZE if you want to bring it to minimum size. For big tables consider the alternatives
For small tables, a simple
DELETE instead of
TRUNCATE is often faster:
DELETE FROM tbl t USING del_list d WHERE t.id = d.id;