Mindaugas Jakubauskas Mindaugas Jakubauskas - 2 months ago 4x
SQL Question

Delete related information all at once and avoid nested queries (PostgreSql)

To begin with, I have few tables with a nested structure and I need to delete information from all of them by parent table ID.

For example, four tables:

country_id | country_name

city_id | country_id | city_name

house_id | city_id | house_name

room_id | house_id | room_name

I need to delete all information based on country_id (the country itself either).

Now I use nested queries and I am afraid it will overload the system.

Example of what I am using now:

DELETE FROM Room WHERE house_id IN (SELECT house_id FROM House WHERE city_id IN (SELECT city_id FROM City WHERE country_id = :country_id))
DELETE FROM House WHERE city_id IN (SELECT city_id FROM City ...

Is there any more compact way to do this? (The performance is the most important here..)


If Room.house_id is a foreign key referencing House.house_id, and so forth, then you can use cascading deletes. That is, if you have

foreign key (house_id) references House(house_id)

then make it

foreign key (house_id) references House(house_id)
    on delete cascade

That instructs the database that whenever the House referenced by a given Room is deleted, the Room must be deleted, too. If you set that up all the way up the hierarchy, then deletes at higher levels will "cascade" down to all lower levels. (Similar syntax is available when you declare the constraint in the column definition.)

Be aware that that's a whole-database characteristic, however, not a query-specific behavior. Enabling cascading deletes is dangerous because it makes it possible to inadvertently delete large swaths of data. Also, it's main advantage is simplicity. It probably will perform a bit better than the sequence of queries that you presented, because only one query is needed and because you don't need subqueries at the lower levels, but the difference might not be enough for you to notice.