bfrguci - 3 months ago
Linux Question

O_NONBLOCK does not raise exception in Python

I am trying to write a "cleaner" program to release a potential writer which is blocked at a named pipe (because no reader is reading from the pipe). However, the cleaner itself should not block when no writer is blocked writing to the pipe. In other words, the "cleaner" must return/terminate immediately, whether there is a blocked writer or not.

Therefore I searched for "Python non-blocking read from named pipe", and got these:


  1. How to read named FIFO non-blockingly?

  2. fifo - reading in a loop

  3. What conditions result in an opened, nonblocking named pipe (fifo) being "unavailable" for reads?

  4. Why does a read-only open of a named pipe block?



It seems that they suggest simply using os.open(file_name, os.O_RDONLY | os.O_NONBLOCK) should be fine, which didn't really work on my machine. I think I may have messed up somewhere or misunderstood some of their suggestions or situations. However, I really couldn't figure out what's wrong myself.

I found the Linux man page for open (http://man7.org/linux/man-pages/man2/open.2.html), and its explanation of O_NONBLOCK seems consistent with their suggestions but not with what I observe on my machine...

Just in case it is related, my OS is Ubuntu 14.04 LTS 64-bit.

Here is my code:

import os
import errno

BUFFER_SIZE = 65536

ph = None
try:
    ph = os.open("pipe.fifo", os.O_RDONLY | os.O_NONBLOCK)
    os.read(ph, BUFFER_SIZE)
except OSError as err:
    if err.errno == errno.EAGAIN or err.errno == errno.EWOULDBLOCK:
        raise err
    else:
        raise err
finally:
    if ph is not None:
        os.close(ph)


(Don't know how to do Python syntax highlighting...)

Originally there is only the second raise, but I found that os.open and os.read, though not blocking, don't raise any exception either... I don't really know how much the writer will write to the buffer! If the non blocking read does not raise exception, how should I know when to stop reading?




Updated on 8/8/2016:

This seems to be a workaround/solution that satisfied my need:

import os
import errno

BUFFER_SIZE = 65536

ph = None
try:
    ph = os.open("pipe.fifo", os.O_RDONLY | os.O_NONBLOCK)
    while True:
        buffer = os.read(ph, BUFFER_SIZE)
        if len(buffer) < BUFFER_SIZE:
            break
except OSError as err:
    if err.errno == errno.EAGAIN or err.errno == errno.EWOULDBLOCK:
        pass  # It is supposed to raise one of these exceptions
    else:
        raise err
finally:
    if ph is not None:
        os.close(ph)


It will loop on read. Every time it reads something, it compares the size of the content read with the specified BUFFER_SIZE, until it reaches EOF (the writer will then unblock and continue/exit).

I still want to know why no exception is raised in that read.




Updated on 8/10/2016:

To make it clear, my overall goal is like this.

My main program (Python) has a thread serving as the reader. It normally blocks on the named pipe, waiting for "commands". There is a writer program (Shell script) which will write a one-liner "command" to the same pipe in each run.

In some cases, a writer starts before my main program starts, or after my main program terminates. In this case, the writer will block on the pipe waiting for a reader. In this way, if later my main program starts, it will read immediately from the pipe to get that "command" from the blocked writer - this is NOT what I want. I want my program to disregard writers that started before it.

Therefore, my solution is to do a non-blocking read during initialization of my reader thread. This releases the blocked writers without actually executing the "command" they were trying to write to the pipe.

Answer

This solution is incorrect.

while True:
    buffer = os.read(ph, BUFFER_SIZE)
    if len(buffer) < BUFFER_SIZE:
        break

This will not actually read everything; it will only read until it gets a partial read. Remember: you are only guaranteed to fill the buffer with regular files. In all other cases it is possible to get a partial buffer before EOF. The correct way to do this is to loop until the actual end of file is reached, which shows up as a read of length 0. The end of file indicates that there are no writers (they have all exited or closed the fifo).

while True:
    buffer = os.read(ph, BUFFER_SIZE)
    if not buffer:
        break

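However, this will not work correctly in the face of non-blocking IO: if a writer has the fifo open but has not written anything yet, a non-blocking read fails with EAGAIN instead of waiting. It turns out non-blocking reads are unnecessary here; O_NONBLOCK is only needed to open the fifo without waiting for a writer.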

import os
import fcntl

h = os.open("pipe.fifo", os.O_RDONLY | os.O_NONBLOCK)
# Now that we have successfully opened it without blocking,
# we no longer want the handle to be non-blocking
flags = fcntl.fcntl(h, fcntl.F_GETFL)
flags &= ~os.O_NONBLOCK
fcntl.fcntl(h, fcntl.F_SETFL, flags)
try:
    while True:
        # Only blocks if there is a writer
        buf = os.read(h, 65536)
        if not buf:
            # This happens when there are no writers
            break
finally:
    os.close(h)

The only scenario which will cause this code to block is if there is an active writer which has opened the fifo but is not writing to it. From what you've described, it doesn't sound like this is the case.
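To illustrate the partial-read point made earlier, here is a minimal sketch. It uses an anonymous pipe from os.pipe() as a stand-in for the fifo (an assumption made for brevity), and the command strings are made up for the example.

```python
import os

BUFFER_SIZE = 65536

# An anonymous pipe stands in for the fifo here (illustrative assumption).
r, w = os.pipe()
try:
    os.write(w, b"first command\n")
    # The writer is still open, yet read() returns only what is available.
    chunk = os.read(r, BUFFER_SIZE)
    short_read = len(chunk) < BUFFER_SIZE  # True, but this is not EOF
    os.write(w, b"second command\n")       # more data can still arrive
    rest = os.read(r, BUFFER_SIZE)
finally:
    os.close(w)
    os.close(r)
```

A loop that breaks on len(buffer) < BUFFER_SIZE would have stopped after the first chunk and left the second one in the pipe.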

Non-blocking IO doesn't do that

Your program wants to do two things, depending on circumstance:

  1. If there are no writers, return immediately.

  2. If there are writers, read data from the FIFO until the writers are done.

Non-blocking read() has no effect whatsoever on task #1. Whether you use O_NONBLOCK or not, read() will return immediately in situation #1. So the only difference is in situation #2.
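As a rough check of situation #1, the sketch below reads from a fifo that has no writers, first in non-blocking mode and then in blocking mode. The scratch fifo path is hypothetical and created only for the demonstration, and the behavior shown is what I would expect on Linux.

```python
import fcntl
import os
import tempfile

# Hypothetical scratch fifo, created only for this demonstration.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "demo.fifo")
os.mkfifo(path)

# O_NONBLOCK matters for open(): a blocking open would wait for a writer.
fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
try:
    nonblocking_result = os.read(fd, 65536)  # no writers: EOF, no exception

    # Switch the descriptor back to blocking mode.
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags & ~os.O_NONBLOCK)
    blocking_result = os.read(fd, 65536)     # still returns at once with EOF
finally:
    os.close(fd)
    os.remove(path)
    os.rmdir(tmpdir)
```

Both reads return an empty bytes object, which is also why your original code saw no exception when no writer was connected.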

In situation #2, your program's goal is to read the entire block of data from the writers. That is exactly how blocking IO works: it waits for the writers to finish, and then read() returns. The whole point of non-blocking IO is to return early if the operation can't complete immediately, which is the opposite of your program's goal: to wait until the operation is complete.
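The "return early" behavior can be seen in a small sketch, assuming Python 3.3+ (where EAGAIN surfaces as BlockingIOError) and a hypothetical scratch fifo. A writer opens the fifo but has not written anything yet when the first read happens.

```python
import os
import tempfile

# Hypothetical scratch fifo, created only for this demonstration.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "demo.fifo")
os.mkfifo(path)

reader = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
# With a reader present, opening the write end does not block.
writer = os.open(path, os.O_WRONLY)
try:
    try:
        os.read(reader, 65536)
        raised = False
    except BlockingIOError:        # OSError with errno EAGAIN
        raised = True              # writer connected, but no data yet

    os.write(writer, b"cmd\n")
    data = os.read(reader, 65536)  # data available: returned immediately

    os.close(writer)
    writer = None
    eof = os.read(reader, 65536)   # last writer closed: b"" (EOF)
finally:
    if writer is not None:
        os.close(writer)
    os.close(reader)
    os.remove(path)
    os.rmdir(tmpdir)
```

So the exception you were expecting only appears while a writer is connected and idle; with no writer at all, read() reports EOF instead.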

If you use non-blocking read(), in situation #2, your program will sometimes return early, before the writers have finished their jobs. Or maybe your program will return after reading half of a command from the FIFO, leaving the other (now corrupted) half there. This concern is expressed in your question:

If the non blocking read does not raise exception, how should I know when to stop reading?

You know when to stop reading because read() returns zero bytes when all writers have closed the pipe. (Conveniently, this is also what happens if there were no writers in the first place.) This is unfortunately not what happens if the writers do not close their end of the pipe when they are done. It is far simpler and more straightforward if the writers close the pipe when done, so this is the recommended solution, even if you need to modify the writers a little bit. If the writers cannot close the pipe for whatever reason, the solution is more complicated.
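If the writers really cannot close the pipe, one possible approach (not the only one) is to keep the read non-blocking and bound the wait with select. The sketch below is only an illustration under that assumption: the function name drain_fifo and the idle_timeout parameter are hypothetical, and the right timeout depends on how your writers behave.

```python
import os
import select


def drain_fifo(path, idle_timeout=0.5, bufsize=65536):
    """Discard pending fifo data; stop at EOF or after idle_timeout of silence."""
    fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    discarded = 0
    try:
        while True:
            try:
                buf = os.read(fd, bufsize)
            except BlockingIOError:
                # A writer is connected but has nothing ready:
                # wait a bounded amount of time for more data.
                ready, _, _ = select.select([fd], [], [], idle_timeout)
                if not ready:
                    break          # still silent: give up
                continue
            if not buf:
                break              # EOF: no writers remain
            discarded += len(buf)
    finally:
        os.close(fd)
    return discarded
```

Calling drain_fifo("pipe.fifo") returns immediately when no writer is connected, and otherwise gives up once a connected writer has been silent for idle_timeout seconds. The trade-off is that a slow writer may be cut off mid-command, which is why closing the pipe on the writer side remains the simpler option.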

The main use case for non-blocking read() is if your program has some other task to complete while IO goes on in the background.
