raf raf - 27 days ago 12
C Question

semaphore hogging by one process

I'm writing a program to test interprocess communication, in particular POSIX shared memory. I'm using POSIX semaphores to synchronize the processes' access to the shared memory. (I read that the POSIX sem_open function lets you use the same semaphore in several processes, as long as each one passes the same "name" identifier.)

The problem is that when I call sem_post and sem_wait in one process, the other process never catches the semaphore. Process 1 just hogs it: it releases the semaphore and then grabs it back itself, without ever giving the other process a chance to intervene.

Here is the code in process 1:

    if ((sem1 = sem_open(request->mem_group.sem_name, O_CREAT, 0644, 0)) ==
        SEM_FAILED) {
        perror("sem_open");
        goto finish;
    }

    cache = simplecache_get(request->file_path);
    *(int *)mem_shared = cache == -1 ? -1 : 1;
    sem_post(sem);
    sem_wait(sem);
    if (cache == -1) {
        /* Print before leaving; after the break this line would never run. */
        fprintf(stdout, "File was not found, going to finish\n");
        break;
    }

    file_len = lseek(cache, 0, SEEK_END);
    lseek(cache, 0, SEEK_SET);
    *(size_t *)mem_shared = file_len;

    sem_post(sem);
    sem_wait(sem1);

    if (!file_len) {
        goto finish;
    }

    bytes_transferred = 0;
    while (bytes_transferred < file_len) {
        // rest of while loop here which transfers file
    }


And here is the block of code in process 2 where it should be catching the semaphore but doesn't:

    sem_wait(sem1);

    file_size = *(size_t *)mem_shared;

    gfs_sendheader(ctx, GF_OK, file_size);

    sem_post(sem1);

    if (!file_size) {
        fprintf(stderr, "File is empty. Go to finish");
        break;
    }


So the idea is that process 2 should get the semaphore in between the post and the wait in the other process. At that point the shared memory segment has data in it and isn't empty.
Instead, it catches the semaphore at the very END of the other process, after that process has emptied the shared memory segment and deleted any data inside it.

I did a lot of troubleshooting and confirmed that:
a) the semaphore is the same semaphore in each process;
b) process 1 does at some point increment the semaphore, and then catches the same semaphore and decrements it (I checked this with sem_getvalue).

I am running this on an Ubuntu virtual machine in Oracle VM VirtualBox. The underlying laptop is a Microsoft Surface Book.

I have been stuck on this problem for 48 hours and feel extremely discouraged. Any tips or advice on how to debug it more strategically would also be appreciated.

Answer

This doesn't make sense:

    sem_post(sem);
    sem_wait(sem1);

You increment the semaphore, then immediately decrement it. That creates a race condition: after the post, either one of your processes could succeed with the wait, but since this process is already running on the CPU, it is probably winning every time.
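To see why the posting process can win, here is a minimal sketch that posts and then tries to wait on the same semaphore without any other process involved. The function name poster_reclaims_own_post is hypothetical, and the sketch assumes Linux and uses an unnamed semaphore (sem_init) instead of the sem_open call from the question, purely to keep it self-contained. It uses sem_trywait so it can never block.

```c
#include <assert.h>
#include <semaphore.h>

/* Hypothetical helper: returns 1 if a process can take back its own post,
 * 0 if it cannot, and -1 if the semaphore could not be set up. */
int poster_reclaims_own_post(void)
{
    sem_t sem;
    int value = -1;
    int reclaimed;

    /* Count starts at 0, like the initial value passed to sem_open
     * in the question. */
    if (sem_init(&sem, 0, 0) == -1)
        return -1;

    /* Count goes 0 -> 1. Nothing here hands the CPU to anyone else. */
    if (sem_post(&sem) == -1) {
        sem_destroy(&sem);
        return -1;
    }

    /* sem_trywait never blocks; it succeeds only if the count is > 0.
     * The count is 1 because of our own post, so we take it back. */
    reclaimed = (sem_trywait(&sem) == 0);

    /* After reclaiming, nothing is left for another waiter. */
    sem_getvalue(&sem, &value);
    if (reclaimed)
        assert(value == 0);

    sem_destroy(&sem);
    return reclaimed;
}
```

Calling poster_reclaims_own_post() from main returns 1 here: the post is consumed by the process that made it. With a second process blocked in sem_wait the outcome depends on scheduling, which is the race described above.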

Normally one process would post, and the other would wait. The first process then proceeds, and can post again if it has more work for the second process, which waits as needed. If the two processes need to coordinate their actions (i.e. the first process pauses until the second says it's fine to go), then you'd use a second semaphore, and on this one the first process always waits, and the second posts. So one particular process only ever waits or posts on a particular semaphore, never both.
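As a rough sketch of that two-semaphore pattern, assuming Linux: the struct channel layout, the names data_ready and data_taken, and the function handshake_demo are all made up for illustration. It uses fork with an anonymous shared mapping and sem_init(..., 1, ...) rather than your shm_open/sem_open setup, only so the example is self-contained; the same post/wait pairing should apply to two named semaphores.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <semaphore.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical shared layout: one semaphore per direction plus the data. */
struct channel {
    sem_t data_ready;  /* producer only posts, consumer only waits */
    sem_t data_taken;  /* consumer only posts, producer only waits */
    size_t file_size;  /* value handed from producer to consumer */
    size_t seen;       /* what the consumer read, kept for checking */
};

/* Returns the size the consumer observed, or (size_t)-1 on setup failure. */
size_t handshake_demo(size_t size_to_send)
{
    struct channel *ch = mmap(NULL, sizeof *ch, PROT_READ | PROT_WRITE,
                              MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (ch == MAP_FAILED)
        return (size_t)-1;

    /* pshared = 1 makes the semaphores usable across processes. */
    if (sem_init(&ch->data_ready, 1, 0) == -1 ||
        sem_init(&ch->data_taken, 1, 0) == -1) {
        munmap(ch, sizeof *ch);
        return (size_t)-1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        munmap(ch, sizeof *ch);
        return (size_t)-1;
    }

    if (pid == 0) {
        /* Consumer (process 2 in the question). */
        sem_wait(&ch->data_ready);   /* sleep until the data is there */
        ch->seen = ch->file_size;    /* read it */
        sem_post(&ch->data_taken);   /* tell the producer we are done */
        _exit(0);
    }

    /* Producer (process 1 in the question). */
    ch->file_size = size_to_send;
    sem_post(&ch->data_ready);       /* hand over the data */
    sem_wait(&ch->data_taken);       /* cannot be satisfied by our own post */

    /* The consumer has read the value, so clearing the segment is safe. */
    assert(ch->seen == size_to_send);
    ch->file_size = 0;

    waitpid(pid, NULL, 0);
    size_t seen = ch->seen;
    sem_destroy(&ch->data_ready);
    sem_destroy(&ch->data_taken);
    munmap(ch, sizeof *ch);
    return seen;
}
```

Because the producer waits on a semaphore that only the consumer posts, it cannot reclaim its own post, so handshake_demo(1234) should return 1234 regardless of which process the scheduler favors. A production version would also retry sem_wait on EINTR and handle a consumer that exits early; both are left out here to keep the sketch short.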