code_talker code_talker - 24 days ago 15
SQL Question

Job queue as SQL table with multiple consumers (PostgreSQL)

I have a typical producer-consumer problem:

Multiple producer applications write job requests to a job-table on a PostgreSQL database.

The job requests have a state field that starts contains QUEUED on creation.

There are multiple consumer applications that are notified by a rule when a producer inserts a new record:

CREATE OR REPLACE RULE "jobrecord.added" AS
ON INSERT TO jobrecord DO
NOTIFY "jobrecordAdded";


They will try to reserve a new record by setting its state to RESERVED. Of course, only on consumer should succeed. All other consumers should not be able to reserve the same record. They should instead reserve other records with state=QUEUED.

Example:
some producer added the following records to table jobrecord:

id state owner payload
------------------------
1 QUEUED null <data>
2 QUEUED null <data>
3 QUEUED null <data>
4 QUEUED null <data>


now, two consumers A, B want to process them. They start running at the same time.
One should reserve id 1, the other one should reserve id 2, then the first one who finishes should reserve id 3 and so on..

In a pure multithreaded world, I would use a mutex to control access to the job queue, but the consumers are different processes that may run on different machines. They only access the same database, so all synchronization must happen through the database.

I read a lot of documentation about concurrent access and locking in PostgreSQL, e.g. http://www.postgresql.org/docs/9.0/interactive/explicit-locking.html
Select unlocked row in Postgresql
http://stackoverflow.com/questions/5636006/postgresql-and-locking/5638753#5638753

From these topics, I learned, that the following SQL statement should do what I need:

UPDATE jobrecord
SET owner= :owner, state = :reserved
WHERE id = (
SELECT id from jobrecord WHERE state = :queued
ORDER BY id LIMIT 1
)
RETURNING id; // will only return an id when they reserved it successfully


Unfortunately, when I run this in multiple consumer processes, in about 50% of the time, they still reserve the same record, both processing it and one overwriting the changes of the other.

What am I missing? How do I have to write the SQL statement so that multiple consumers will not reserve the same record?

Answer

Read my post here:

Consistency in postgresql with locking and select for update.

If you use transaction and LOCK TABLE you will have no problems.

Comments