KartDev - 1 month ago
C Question

Alternative to sleep, semaphores

I have a simple C program (on Linux). The steps in the program are as follows:


  1. Within a while loop, it runs a query that returns at most one record. The query is essentially a view that selects rows whose "processed" column has the value "0" and uses "limit 1".

  2. I read the records in the result set and perform some calculations and upload the results back to the database. I also set the processed column to "1".

  3. If this query does not return any records, I exit the while loop.

  4. Once the while loop exits, the program exits.



Once it completes running, I do not want the program to exit. The reason is that the database might get more qualifying records in the next 30 minutes. I want this to be a long-running program that checks for new records and starts the while loop again to process them.

I am not doing any multithreading or anything fancy. I did some googling and found posts talking about semaphores.

Is this the right way to go about it? Are there any simple examples of semaphores with explanation?

Answer

First, I hope you're using a transaction. Otherwise there can be a race condition between steps 1 and 2: if more than one worker is running, two of them can select the same row before either has marked it as processed.

I think your question is "How does your program know when there is more information to be processed in a SQL table?" There are several ways to do this.

The simplest is polling. Your program just checks every so often if there's any work. If there isn't, it sleeps for a while. If checking is cheap, or you don't have to check very often, polling is fine. It's pretty robust because no coordination is necessary between the worker and the supplier. The worker just checks for work.
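A minimal sketch of that polling loop, under some assumptions: `poll_for_work`, `fake_drain` and the `max_idle_polls` bound are hypothetical names invented for illustration, and `fake_drain` stands in for the real "select unprocessed row, compute, mark processed" loop from the question. No database calls are shown.

```c
#include <unistd.h>

/* Hypothetical callback type: processes every pending row and
 * returns how many rows it handled (0 when the queue is empty). */
typedef int (*drain_fn)(void);

/* Stand-in for the real database work: pretends 3 rows are waiting. */
static int fake_pending = 3;

int fake_drain(void)
{
    int n = fake_pending;
    fake_pending = 0;
    return n;
}

/* Poll for work: drain the queue, and sleep whenever it is empty.
 * max_idle_polls only exists so this sketch can terminate; a
 * long-running worker would pass a negative value and loop forever. */
int poll_for_work(drain_fn drain, unsigned idle_seconds, int max_idle_polls)
{
    int total = 0;
    int idle = 0;

    while (max_idle_polls < 0 || idle < max_idle_polls) {
        int n = drain();
        if (n > 0) {
            total += n;          /* found work, check again right away */
            idle = 0;
        } else {
            idle++;
            sleep(idle_seconds); /* nothing to do, wait before re-checking */
        }
    }
    return total;
}
```

A real worker might call something like `poll_for_work(process_rows, 60, -1)`, where `process_rows` is your existing while loop wrapped in a function. The sleep interval is a trade-off between latency and load on the database.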

Another is to make the program block on some sort of I/O, such as waiting for a lock on a file. That's the idea behind semaphores. It goes like this.

  1. The queue is empty.
  2. The producer gets an exclusive lock on the semaphore file.
  3. Your worker tries to get a lock on the semaphore file and blocks.
  4. The producer adds to the queue and releases its lock.
  5. The worker immediately unblocks.
    1. Checks the queue.
    2. Does its work.

...but resetting the system is a problem. The producer doesn't know when the queue is empty again without polling. It also requires that everything adding to the SQL table knows about this procedure and runs on the same machine. Even if you get it working, it's very vulnerable to deadlocks and race conditions.
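For illustration, the lock-file steps above could be sketched with `flock()` on Linux. This is only one possible reading of the scheme, not a recommended design: the function names and the `nonblocking` flag are hypothetical, and the flag exists only so the behavior can be checked without hanging.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Producer: open the lock file and take an exclusive lock (step 2).
 * Returns the descriptor that holds the lock, or -1 on error. */
int producer_lock(const char *path)
{
    int fd = open(path, O_CREAT | O_RDWR, 0644);
    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_EX) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Producer: call after adding rows to the table (step 4). */
void producer_release(int fd)
{
    flock(fd, LOCK_UN);
    close(fd);
}

/* Worker: wait until the lock can be taken (steps 3 and 5), then drop
 * it again. Returns 0 when the lock was obtained. With nonblocking
 * set, returns -1 at once if someone else holds the lock. */
int worker_wait(const char *path, int nonblocking)
{
    int fd = open(path, O_CREAT | O_RDWR, 0644);
    if (fd < 0)
        return -1;

    int rc = flock(fd, LOCK_EX | (nonblocking ? LOCK_NB : 0));
    if (rc == 0)
        flock(fd, LOCK_UN);
    close(fd);
    return rc == 0 ? 0 : -1;
}
```

The worker would call `worker_wait(path, 0)` and then run its query loop. Note that nothing here re-arms the lock once the queue is empty, which is exactly the reset problem described above.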

Another way is via signals. The producer process sends a signal to the worker process to say "I added some work". As above, this requires coordination between the things adding to the SQL table and the workers.

A better solution is to not use SQL for a work queue. A SQL table is inherently something you have to poll. Instead use a named or network pipe. Pipes automatically act as a queue. Producers write to the pipe when they add work. The worker connects to the pipe and reads from it to get more work. If there's no work, it quietly blocks waiting for work. The pipe can contain all the information necessary to do the work, or it can just contain an indication that there is work elsewhere (like an ID for a row).
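As a sketch of the "pipe carries a row ID" variant, the helpers below create a named pipe and pass fixed-size row IDs through it. The function names and the choice of a `long` row ID are assumptions made for this example; a real system would pick its own message format.

```c
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create the named pipe if it does not exist yet. Returns 0 on success.
 * (EEXIST is accepted without checking that the path really is a FIFO.) */
int make_queue(const char *path)
{
    if (mkfifo(path, 0600) == 0 || errno == EEXIST)
        return 0;
    return -1;
}

/* Producer: announce one row. A write this small (well under PIPE_BUF)
 * is atomic, so concurrent producers do not interleave their IDs. */
int send_job(int fd, long row_id)
{
    ssize_t n = write(fd, &row_id, sizeof row_id);
    return n == (ssize_t)sizeof row_id ? 0 : -1;
}

/* Worker: block until a row ID arrives.
 * Returns 1 with *row_id set, 0 when every producer has closed the
 * pipe, or -1 on error. */
int next_job(int fd, long *row_id)
{
    ssize_t n = read(fd, row_id, sizeof *row_id);
    if (n == (ssize_t)sizeof *row_id)
        return 1;
    return n == 0 ? 0 : -1;
}
```

The worker would open the pipe with `open(path, O_RDONLY)`, which itself blocks until a producer opens it for writing, and then call `next_job()` in a loop. When `next_job()` returns 0, all producers have gone away, and the worker can reopen the pipe to wait for the next one.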

Finally, depending on how much processing needs to be done, you could try doing all that processing in a stored procedure triggered by a table update.