You've come to this page because you've made a statement similar to the following:
I've encountered problems when setting stdin/stdout/stderr to non-blocking I/O mode using the O_NDELAY/O_NONBLOCK flag.
Someone is turning off non-blocking I/O on my shared file descriptor.
This is the frequently given answer to such statements.
Don't set shared (i.e. inheritable, inherited, or duplicated) file descriptors to use non-blocking I/O. Think of some other way to fix the actual problem that you wish to solve.
Setting file descriptors that you've inherited from your parent process, file descriptors that you intend child processes to inherit, or file descriptors that you've duplicated using dup() or dup2(), to non-blocking I/O mode causes the following problems:
The code that shares the shared file descriptor will not expect it to be in non-blocking I/O mode. This can cause a range of effects, ranging from parent or child processes exiting with an error to the non-blocking I/O mode being unexpectedly removed from the file descriptor.
Two independent processes cannot reliably coördinate the setting and resetting of the blocking I/O flag. It is possible for non-blocking I/O to be left on if two processes happen to execute their respective fcntl() calls in a particular sequence.
It is possible for your program to exit prematurely, perhaps due to a fault, before resetting non-blocking I/O mode.
Non-blocking I/O mode applies to (to use the POSIX nomenclature) a file description, i.e. the internal operating system structure that a file descriptor refers to. If you set non-blocking I/O mode on a file descriptor that you've inherited from your parent process, or on a file descriptor that you intend child processes to inherit from you, then the file description will be in non-blocking mode for those processes as well.
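The sharing described above can be observed within a single process. What follows is a minimal sketch, not a definitive test: the function name nonblock_is_shared_across_dup() is made up for illustration, and a pipe stands in for whatever file description is actually being shared. It sets O_NONBLOCK through one descriptor and then asks whether the flag is visible through the other.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper, for illustration only.
 * Returns 1 if O_NONBLOCK set via a dup()ed descriptor is visible via the
 * original descriptor, 0 if it is not, and -1 if a system call failed. */
int nonblock_is_shared_across_dup(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    int result = -1;
    int copy = dup(fds[0]);     /* second descriptor, same file description */
    if (copy != -1) {
        int flags = fcntl(copy, F_GETFL);
        if (flags != -1 && fcntl(copy, F_SETFL, flags | O_NONBLOCK) != -1) {
            /* Query the *original* descriptor, which was never touched. */
            int seen = fcntl(fds[0], F_GETFL);
            if (seen != -1)
                result = (seen & O_NONBLOCK) ? 1 : 0;
        }
        close(copy);
    }
    close(fds[0]);
    close(fds[1]);
    return result;
}
```

A descriptor inherited across fork() refers to the shared file description in the same way, so the same effect is to be expected between a parent process and its children.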
Programs are not normally coded with the expectation that the file descriptors that they are using are in non-blocking I/O mode. How they will behave when file descriptors are in non-blocking I/O mode depends on how they handle the EAGAIN error return from the read() and write() system calls. Some programs, most notably the Bourne Again Shell reading commands from its standard input, will simply turn off the non-blocking I/O mode and attempt to continue. Others will regard EAGAIN as an I/O error. If the file descriptor is their standard input and they are a filter, they may well terminate immediately in response to that error.
The same even applies to independent library code within a single program, sharing a file description via a file descriptor, private to the library, created with dup() or dup2().
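To see the error return that such programs are confronted with, consider the following sketch. The function name is hypothetical, and an empty pipe stands in for a standard input on which no data have yet arrived. A program that treats every -1 return from read() as a fatal I/O error would give up at this point.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper, for illustration only.
 * Performs one read() on an empty pipe whose read end is in non-blocking
 * mode.  Returns the errno value from the failed read(), 0 if the read()
 * unexpectedly did not fail, or -1 if setting up the pipe failed. */
int errno_from_read_on_empty_nonblocking_pipe(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    int result = -1;
    int flags = fcntl(fds[0], F_GETFL);
    if (flags != -1 && fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) != -1) {
        char c;
        /* Nothing has been written and the write end is still open, so a
         * blocking read() would wait here; a non-blocking one fails. */
        ssize_t n = read(fds[0], &c, 1);
        result = (n == -1) ? errno : 0;
    }
    close(fds[0]);
    close(fds[1]);
    return result;
}
```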
The consequences of the Bourne Again Shell's behaviour are, ironically, that your program will unexpectedly (to it) be dealing with blocking I/O on that file descriptor once again. This is a special case of the more general problem that it is quite possible for other programs to be altering the blocking I/O flag as your program is running, and there is no way for processes to coördinate the setting and resetting of the flag.
If, for example, two processes are executing code where they save the original flags, set non-blocking I/O mode, perform I/O, and then restore the original saved value of the flags, they can end up executing in parallel in such an order that the restores do not execute in the reverse order of the saves, resulting in non-blocking I/O mode being left on after both processes have executed their code. Run alone, each process saves and restores the flags properly. Run together, the entire system fails to work as intended.
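One such unlucky ordering can be replayed deterministically. The sketch below is an illustration under stated assumptions rather than a reproduction of any particular program: the function names are invented, the two "processes" A and B are simulated within one process, and dup() stands in for descriptor inheritance. A saves and sets, B saves and sets, A restores, and then B restores.

```c
#include <fcntl.h>
#include <unistd.h>

/* First half of the save/set/restore pattern: save the current flags and
 * turn O_NONBLOCK on.  Returns the saved flags, or -1 on error. */
static int save_and_set_nonblock(int fd)
{
    int saved = fcntl(fd, F_GETFL);
    if (saved == -1 || fcntl(fd, F_SETFL, saved | O_NONBLOCK) == -1)
        return -1;
    return saved;
}

/* Hypothetical helper, for illustration only.
 * Returns 1 if O_NONBLOCK is still on after both actors have "restored",
 * 0 if it is off, and -1 if a system call failed. */
int nonblock_left_on_after_interleaving(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    int result = -1;
    int a = fds[0];             /* actor A's descriptor */
    int b = dup(a);             /* actor B's descriptor, same description */
    if (b != -1) {
        int saved_a = save_and_set_nonblock(a);   /* A saves "off" */
        int saved_b = (saved_a == -1) ? -1
                    : save_and_set_nonblock(b);   /* B saves "on" */
        if (saved_b != -1
            && fcntl(a, F_SETFL, saved_a) != -1   /* A restores "off" */
            && fcntl(b, F_SETFL, saved_b) != -1)  /* B restores "on" */
        {
            int final = fcntl(a, F_GETFL);
            if (final != -1)
                result = (final & O_NONBLOCK) ? 1 : 0;
        }
        close(b);
    }
    close(fds[0]);
    close(fds[1]);
    return result;
}
```

Each actor has faithfully restored what it saved, yet the file description ends up in non-blocking mode, because B's saved value already included A's change.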
Many textbooks and web pages present this very design when they give examples of how to perform non-blocking I/O. Just as these textbooks and web pages normally forget to include any error checking in their examples, they forget to include multiprogramming in the overall design. This design does not work when two separate and independent programs are sharing a single file description via inheritance (or even when independent library code within one process is sharing a single file description via file descriptor duplication).
Another example of not bearing in mind the possibility of errors occurring is when one forgets that one's program may exit prematurely due to a fault elsewhere. The manual page for the reset command on Unix and Linux systems talks about processes dying and leaving TTYs in abnormal states. The exact same thing can happen with the blocking I/O flag as can happen with the I/O mode flags of a TTY.
Often, using non-blocking I/O on shared file descriptors is a chocolate-covered banana.
The problem that one actually wants to solve in many cases is insufficient
parallelism in one's code with respect to file I/O and other tasks.
The correct solution is to increase the parallelism.
Use polled I/O via kevent(), poll(), /dev/poll, or select(); use asynchronous I/O with aio_read(), aio_write(), and friends; or use multiple threads.
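As an illustration of the polled approach, here is a minimal sketch using poll(). The function name read_when_ready() and its -2 return value for a timeout are assumptions made for this example, not part of any standard API. The point is that the file description's flags are never touched.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper, for illustration only.
 * Waits up to timeout_ms milliseconds for fd to become readable, then
 * reads from it.  Returns the number of bytes read, 0 at end of file,
 * -1 on error (with errno set), or -2 if the wait timed out. */
ssize_t read_when_ready(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd p = { .fd = fd, .events = POLLIN };
    int n;

    /* Retry if a signal interrupts the wait.  For simplicity this
     * restarts the full timeout each time. */
    do {
        n = poll(&p, 1, timeout_ms);
    } while (n == -1 && errno == EINTR);

    if (n == -1)
        return -1;
    if (n == 0)
        return -2;              /* nothing arrived within the timeout */

    /* Readable, hung up, or in error: read() reports which. */
    return read(fd, buf, len);
}
```

Passing a timeout of zero makes the call return at once if no data are waiting. One caveat applies when the file description is shared: another process may consume the data between the poll() and the read(), in which case the read() can still block.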
nonblock_read() and nonblock_write()
Dan Bernstein would like non-blocking read and write functions to be added to the POSIX API. These allow the blocking mode to be specified in the system call, rather than have it be a property of the file description.
This idea isn't a new one. The I/O system API traps in QDOS for the Sinclair QL, including both IO.FBYTE and IO.SBYTE, all have a timeout parameter, that can be set to zero to indicate that the call should return without blocking.
Such system calls could be used (by applications wanting to perform non-blocking I/O) on shared file descriptors without having to change the blocking mode of the file description itself, and would avoid all of the aforementioned problems. However, whilst QDOS had this functionality in 1984, the POSIX API contains no such functions and as of 2007 neither Linux nor any Unix has any such system calls.
© Copyright 2007–2007
Jonathan de Boyne Pollard.
"Moral" rights asserted.
Permission is hereby granted to copy and to distribute this web page in its
original, unmodified form as long as its last modification datestamp is preserved.