LubosD - 1 month ago
Linux Question

Linux kernel: how to wait in multiple wait queues?

I know how to wait in Linux kernel queues using wait_event and how to wake them up.

Now I need to figure out how to wait in multiple queues at once. I need to multiplex multiple event sources, basically in a way similar to poll or select, but since the sources of events don't have the form of a pollable file descriptor, I wasn't able to find inspiration in the implementation of these syscalls.

My initial idea was to take the code from the wait_event macro and use DEFINE_WAIT multiple times, as well as prepare_to_wait.

However, given how prepare_to_wait is implemented, I'm afraid the internal linked list of the queue would become corrupted if the same "waiter" is added multiple times (which could maybe happen if one queue causes a wakeup, but the wait condition isn't met and the wait is restarted).

Answer

One possible way to wait on several waitqueues:

int ret = 0; // Result of waiting: 0 on success, -errno on failure.

// Define wait objects, one object per waitqueue.
DEFINE_WAIT_FUNC(wait1, default_wake_function);
DEFINE_WAIT_FUNC(wait2, default_wake_function);

// Add ourselves to all waitqueues (wq1 and wq2 are pointers to wait_queue_head_t).
add_wait_queue(wq1, &wait1);
add_wait_queue(wq2, &wait2);

// Waiting loop
while (1) {
    // Change the task state for waiting.
    // NOTE: this must come **before** checking the condition, to avoid races.
    set_current_state(TASK_INTERRUPTIBLE);
    // Check the condition(s) we are waiting for.
    if (cond)
        break;
    // Need to wait.
    schedule();
    // Check whether the wait has been interrupted by a signal.
    if (signal_pending(current)) {
        ret = -ERESTARTSYS;
        break;
    }
}
// Remove ourselves from all waitqueues.
remove_wait_queue(wq1, &wait1);
remove_wait_queue(wq2, &wait2);
// Restore task state
__set_current_state(TASK_RUNNING);
// 'ret' contains result of waiting.

Note that this scenario differs slightly from that of wait_event:

wait_event uses autoremove_wake_function for its wait object (as created with DEFINE_WAIT). This function, called from wake_up(), removes the wait object from the queue, so the wait object has to be re-added to the queue on each iteration.

But with multiple waitqueues it is impossible to know which waitqueue has fired, so following this strategy would require re-adding every wait object on every iteration, which is inefficient.

Instead, our scenario uses default_wake_function for the wait objects, so an object is not removed from its waitqueue on a wake_up() call, and it is sufficient to add each wait object to its queue only once, before the loop.
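
The difference between the two wake functions can be illustrated with a small user-space model. This is only a sketch, not kernel code: every name in it (list_node, waiter, model_default_wake, model_autoremove_wake, model_enqueue_once, model_wake_all, model_is_queued) is hypothetical and merely mimics the behaviour described above. It also simplifies things: the real autoremove_wake_function unlinks the entry only when the task was actually woken, and no locking or task states are modelled here. The guard in model_enqueue_once imitates the list_empty() check that prepare_to_wait performs before adding an entry, which is why calling it again for an already queued waiter should not corrupt the list.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's circular doubly linked list. */
struct list_node {
    struct list_node *next, *prev;
};

void list_init(struct list_node *n) { n->next = n; n->prev = n; }

bool list_is_empty(const struct list_node *n) { return n->next == n; }

void list_add_head(struct list_node *head, struct list_node *n)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* Unlink and re-initialise, like the kernel's list_del_init(). */
void list_del_reinit(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    list_init(n);
}

struct waiter;
typedef void (*wake_fn)(struct waiter *w);

/* Hypothetical model of a wait object: list linkage plus a wake callback. */
struct waiter {
    struct list_node entry;
    wake_fn func;
    int wakeups; /* how many times the callback has run */
};

#define waiter_of(ptr) \
    ((struct waiter *)((char *)(ptr) - offsetof(struct waiter, entry)))

/* Models default_wake_function: wake, but stay on the queue. */
void model_default_wake(struct waiter *w) { w->wakeups++; }

/* Models autoremove_wake_function: wake, then unlink from the queue. */
void model_autoremove_wake(struct waiter *w)
{
    w->wakeups++;
    list_del_reinit(&w->entry);
}

void waiter_init(struct waiter *w, wake_fn func)
{
    list_init(&w->entry);
    w->func = func;
    w->wakeups = 0;
}

/* Queue the waiter only if it is not queued already. */
void model_enqueue_once(struct list_node *queue, struct waiter *w)
{
    if (list_is_empty(&w->entry))
        list_add_head(queue, &w->entry);
}

/* Models wake_up(): run every callback; returns the number of waiters called.
 * The next pointer is saved first because a callback may unlink its entry. */
int model_wake_all(struct list_node *queue)
{
    int called = 0;
    struct list_node *pos = queue->next;

    while (pos != queue) {
        struct list_node *next = pos->next;
        struct waiter *w = waiter_of(pos);

        w->func(w);
        called++;
        pos = next;
    }
    return called;
}

bool model_is_queued(const struct waiter *w)
{
    return !list_is_empty(&w->entry);
}
```

In this model, a waiter initialised with model_default_wake is still queued after model_wake_all() and is reached by every later wake-up, while one initialised with model_autoremove_wake drops off its queue after the first wake-up and has to be passed to model_enqueue_once() again before it can be woken a second time.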