Wrapped long lines, wrapped at sentence boundaries; stripped trailing white space.
Michael Kerrisk 2007-04-12 22:42:49 +00:00
parent 4174ff5658
commit c13182efa3
834 changed files with 13253 additions and 10447 deletions


@ -8,6 +8,8 @@ Contributors
The following people contributed notes, ideas, or patches that have
been incorporated in changes in this release:
Andi Kleen <andi@firstfloor.org>
John Heffner <jheffner@psc.edu>
Apologies if I missed anyone!
@ -15,9 +17,6 @@ Apologies if I missed anyone!
New pages
---------
Andi Kleen <andi@firstfloor.org>
John Heffner <jheffner@psc.edu>
Global changes
--------------


@ -5476,7 +5476,7 @@ Samuel Thibault <samuel.thibault@ens-lyon.org>
Serge E. Hallyn <serge@hallyn.com>
Thomas Huriaux <thomas.huriaux@gmail.com>
Timo Sirainen <tss@iki.fi>
Val Henson <val_henson@linux.interl.com>
Val Henson <val_henson@linux.intel.com>
Apologies if I missed anyone!


@ -40,7 +40,8 @@ _exit, _Exit \- terminate the current process
.SH DESCRIPTION
The function
.BR _exit ()
terminates the calling process "immediately". Any open file descriptors
terminates the calling process "immediately".
Any open file descriptors
belonging to the process are closed; any children of the process are
inherited by process 1,
.IR init ,
@ -83,7 +84,8 @@ is implementation dependent.
On the other hand,
.BR _exit ()
does close open file descriptors, and this may cause an unknown delay,
waiting for pending output to finish. If the delay is undesired,
waiting for pending output to finish.
If the delay is undesired,
it may be useful to call functions like \fItcflush\fP() before
calling \fB_exit\fP().
Whether any pending I/O is cancelled, and which pending I/O may be


@ -85,7 +85,8 @@ argument is a value-result argument: it should initially contain the
size of the structure pointed to by
.IR addr ;
on return it will contain the actual length (in bytes) of the address
returned. When
returned.
When
.I addr
is NULL nothing is filled in.
.PP
@ -93,7 +94,8 @@ If no pending
connections are present on the queue, and the socket is not marked as
non-blocking,
.BR accept ()
blocks the caller until a connection is present. If the socket is marked
blocks the caller until a connection is present.
If the socket is marked
non-blocking and no pending connections are present on the queue,
.BR accept ()
fails with the error EAGAIN.
@ -105,8 +107,8 @@ or
A readable event will be delivered when a new connection is attempted and you
may then call
.BR accept ()
to get a socket for that connection. Alternatively, you can set the socket
to deliver
to get a socket for that connection.
Alternatively, you can set the socket to deliver
.B SIGIO
when activity occurs on a socket; see
.BR socket (7)
@ -117,9 +119,11 @@ such as
DECNet,
.BR accept ()
can be thought of as merely dequeuing the next connection request and not
implying confirmation. Confirmation can be implied by
implying confirmation.
Confirmation can be implied by
a normal read or write on the new file descriptor, and rejection can be
implied by closing the new socket. Currently only
implied by closing the new socket.
Currently only
DECNet
has these semantics on Linux.
.SH NOTES
@ -158,13 +162,15 @@ passes already-pending network errors on the new socket
as an error code from
.BR accept ().
This behaviour differs from other BSD socket
implementations. For reliable operation the application should detect
implementations.
For reliable operation the application should detect
the network errors defined for the protocol after
.BR accept ()
and treat
them like
.BR EAGAIN
by retrying. In case of TCP/IP these are
by retrying.
In case of TCP/IP these are
.BR ENETDOWN ,
.BR EPROTO ,
.BR ENOPROTOOPT ,
@ -234,7 +240,8 @@ may fail if:
Firewall rules forbid connection.
.PP
In addition, network errors for the new socket and as defined
for the protocol may be returned. Various Linux kernels can
for the protocol may be returned.
Various Linux kernels can
return other errors such as
.BR ENOSR ,
.BR ESOCKTNOSUPPORT ,
@ -281,11 +288,13 @@ Quoting Linus Torvalds:
.\" .I fails: only italicizes a single line
"_Any_ sane library _must_ have "socklen_t" be the same size
as int. Anything else breaks any BSD socket layer stuff.
as int.
Anything else breaks any BSD socket layer stuff.
POSIX initially \fIdid\fP make it a size_t, and I (and hopefully others, but
obviously not too many) complained to them very loudly indeed. Making
it a size_t is completely broken, exactly because size_t very seldom is
the same size as "int" on 64-bit architectures, for example. And it
obviously not too many) complained to them very loudly indeed.
Making it a size_t is completely broken, exactly because size_t very
seldom is the same size as "int" on 64-bit architectures, for example.
And it
\fIhas\fP to be the same size as "int" because that's what the BSD socket
interface is.
Anyway, the POSIX people eventually got a clue, and created "socklen_t".


@ -81,10 +81,12 @@ actually attempting an operation.
This is to allow set-user-ID programs to
easily determine the invoking user's authority.
Only access bits are checked, not the file type or contents. Therefore, if
Only access bits are checked, not the file type or contents.
Therefore, if
a directory is found to be "writable," it probably means that files can be
created in the directory, and not that the directory can be written as a
file. Similarly, a DOS file may be found to be "executable," but the
file.
Similarly, a DOS file may be found to be "executable," but the
.BR execve (2)
call will still fail.


@ -43,10 +43,11 @@ acct \- switch process accounting on or off
.SH DESCRIPTION
When called with the name of an existing file as argument, accounting is
turned on, records for each terminating process are appended to
\fIfilename\fP as it terminates. An argument of NULL causes
accounting to be turned off.
\fIfilename\fP as it terminates.
An argument of NULL causes accounting to be turned off.
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS
@ -122,7 +123,7 @@ SVr4, 4.3BSD (but not POSIX).
.\" (attempt is made to enable accounting using the same file that is
.\" currently being used).
.SH NOTES
No accounting is produced for programs running when a crash occurs. In
particular, nonterminating processes are never accounted for.
No accounting is produced for programs running when a crash occurs.
In particular, nonterminating processes are never accounted for.
.SH "SEE ALSO"
.BR acct (5).


@ -45,14 +45,17 @@ They existed only on i386 and ia64 (when built with CONFIG_HUGETLB_PAGE).
In Linux 2.4.20 the syscall numbers exist, but the calls return ENOSYS.
.LP
On i386 the memory management hardware knows about ordinary pages (4 KiB)
and huge pages (2 or 4 MiB). Similarly ia64 knows about huge pages of
several sizes. These system calls serve to map huge pages into the
and huge pages (2 or 4 MiB).
Similarly ia64 knows about huge pages of
several sizes.
These system calls serve to map huge pages into the
process' memory or to free them again.
Huge pages are locked into memory, and are not swapped.
.LP
The
.I key
parameter is an identifier. When zero the pages are private, and
parameter is an identifier.
When zero the pages are private, and
not inherited by children.
When positive the pages are shared with other applications using the same
.IR key ,
@ -75,8 +78,8 @@ Addresses must be properly aligned.
.LP
The
.I len
parameter is the length of the required segment. It must be
a multiple of the huge page size.
parameter is the length of the required segment.
It must be a multiple of the huge page size.
.LP
The
.I prot
@ -87,10 +90,12 @@ The
.I flag
parameter is ignored, unless
.I key
is positive. In that case, if
is positive.
In that case, if
.I flag
is IPC_CREAT, then a new huge page segment is created when none
with the given key existed. If this flag is not set, then ENOENT
with the given key existed.
If this flag is not set, then ENOENT
is returned when no segment with the given key exists.
.IR
.SH "RETURN VALUE"
@ -98,7 +103,8 @@ On success,
.BR alloc_hugepages ()
returns the allocated virtual address, and
.BR free_hugepages ()
returns zero. On error, \-1 is returned, and
returns zero.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS
@ -108,7 +114,8 @@ The system call is not supported on this kernel.
.SH "CONFORMING TO"
These calls existed only in Linux 2.5.36 through to 2.5.54.
These calls are specific to Linux on Intel processors, and should not be
used in programs intended to be portable. Indeed, the system call numbers
used in programs intended to be portable.
Indeed, the system call numbers
are marked for reuse, so programs using these may do something random
on a future kernel.
.SH FILES
@ -120,7 +127,8 @@ This can be read and written.
Gives info on the number of configured hugetlb pages and on their size
in the three variables HugePages_Total, HugePages_Free, Hugepagesize.
.SH NOTES
The system calls are gone. Now the hugetlbfs filesystem can be used instead.
The system calls are gone.
Now the hugetlbfs filesystem can be used instead.
Memory backed by huge pages (if the CPU supports them) is obtained by
using
.BR mmap ()


@ -79,7 +79,8 @@ The 64bit base changes when a new 32bit segment selector is loaded.
.B ARCH_SET_GS
is disabled in some kernels.
Context switches for 64bit segment bases are rather expensive. It may be a
Context switches for 64bit segment bases are rather expensive.
It may be a
faster alternative to set a 32bit base using a segment selector by setting up
an LDT with
.BR modify_ldt (2)


@ -96,8 +96,9 @@ before a
socket may receive connections (see
.BR accept (2)).
The rules used in name binding vary between address families. Consult
the manual entries in Section 7 for detailed information. For
The rules used in name binding vary between address families.
Consult the manual entries in Section 7 for detailed information.
For
.B AF_INET
see
.BR ip (7),
@ -189,7 +190,8 @@ main(int argc, char *argv[])
.fi
.in -0.25in
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS


@ -33,7 +33,8 @@ cacheflush \- flush contents of instruction and/or data cache
.SH DESCRIPTION
.BR cacheflush ()
flushes contents of indicated cache(s) for user addresses in the range
addr to (addr+nbytes-1). Cache may be one of:
addr to (addr+nbytes-1).
Cache may be one of:
.TP
.B ICACHE
Flush the instruction cache.
@ -46,8 +47,10 @@ Same as
.BR (ICACHE|DCACHE) .
.SH "RETURN VALUE"
.BR cacheflush ()
returns 0 on success or \-1 on error. If errors are detected,
errno will indicate the error.
returns 0 on success or \-1 on error.
If errors are detected,
.I errno
will indicate the error.
.SH ERRORS
.TP
.B EFAULT
@ -63,5 +66,5 @@ and
arguments.
Therefore, the whole cache is always flushed.
.SH NOTE
This system call is only available on MIPS based systems. It should
not be used in programs intended to be portable.
This system call is only available on MIPS based systems.
It should not be used in programs intended to be portable.


@ -29,7 +29,8 @@ call, and a set of permitted capabilities
that it can make effective or inheritable.
.PP
These two functions are the raw kernel interface for getting and
setting capabilities. Not only are these system calls specific to Linux,
setting capabilities.
Not only are these system calls specific to Linux,
but the kernel API is likely to change and use of
these functions (in particular the format of the
.B cap_user_*_t
@ -99,13 +100,15 @@ to all members of the process group whose ID is \-\fIpid\fP.
For details on the data, see
.BR capabilities (7).
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS
.TP
.B EFAULT
Bad memory address. Neither of
Bad memory address.
Neither of
.I hdrp
and
.I datap


@ -50,11 +50,13 @@ is identical to
the only difference is that the directory is given as an
open file descriptor.
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS
Depending on the file system, other errors can be returned. The more
Depending on the file system, other errors can be returned.
The more
general errors for
.BR chdir ()
are listed below:


@ -114,15 +114,17 @@ directories, see
On NFS file systems, restricting the permissions will immediately influence
already open files, because the access control is done on the server, but
open files are maintained by the client. Widening the permissions may be
open files are maintained by the client.
Widening the permissions may be
delayed for other clients if attribute caching is enabled on them.
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS
Depending on the file system, other errors can be returned. The more
general errors for
Depending on the file system, other errors can be returned.
The more general errors for
.BR chmod ()
are listed below:
.TP


@ -64,7 +64,8 @@ or
is specified as \-1, then that ID is not changed.
When the owner or group of an executable file are changed by a non-superuser,
the S_ISUID and S_ISGID mode bits are cleared. POSIX does not specify whether
the S_ISUID and S_ISGID mode bits are cleared.
POSIX does not specify whether
this also should happen when root does the
.BR chown ();
the Linux behaviour depends on the kernel version.
@ -76,12 +77,13 @@ the S_ISGID bit indicates mandatory locking, and is not cleared
by a
.BR chown ().
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS
Depending on the file system, other errors can be returned. The more
general errors for
Depending on the file system, other errors can be returned.
The more general errors for
.BR chown ()
are listed below.
.TP
@ -170,9 +172,11 @@ used by the superuser (that is, ordinary users cannot give away files).
.\" error conditions.
.SH RESTRICTIONS
The \fBchown\fP() semantics are deliberately violated on NFS file systems
which have UID mapping enabled. Additionally, the semantics of all system
which have UID mapping enabled.
Additionally, the semantics of all system
calls which access the file contents are violated, because \fBchown\fP()
may cause immediate access revocation on already open files. Client side
may cause immediate access revocation on already open files.
Client side
caching may lead to a delay between the time where ownership have
been changed to allow access for a user and the time where the file can
actually be accessed by the user on other clients.


@ -41,8 +41,8 @@ chroot \- change root directory
.BR chroot ()
changes the root directory to that specified in
.IR path .
This directory will be used for pathnames beginning with /. The root
directory is inherited by all children of the current process.
This directory will be used for pathnames beginning with /.
The root directory is inherited by all children of the current process.
Only a privileged process (Linux: one with the
.B CAP_SYS_CHROOT
@ -60,12 +60,13 @@ by doing `mkdir foo; chroot foo; cd ..'.
This call does not close open file descriptors, and such file
descriptors may allow access to files outside the chroot tree.
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS
Depending on the file system, other errors can be returned. The more
general errors are listed below:
Depending on the file system, other errors can be returned.
The more general errors are listed below:
.TP
.B EACCES
Search permission is denied on a component of the path prefix.


@ -70,9 +70,10 @@ Unlike
these calls
allow the child process to share parts of its execution context with
the calling process, such as the memory space, the table of file
descriptors, and the table of signal handlers. (Note that on this manual
page, "calling process" normally corresponds to "parent process". But see
the description of
descriptors, and the table of signal handlers.
(Note that on this manual
page, "calling process" normally corresponds to "parent process".
But see the description of
.B CLONE_PARENT
below.)
@ -104,20 +105,21 @@ function.
When the
.IR fn ( arg )
function application returns, the child process terminates. The
integer returned by
function application returns, the child process terminates.
The integer returned by
.I fn
is the exit code for the child process. The child process may also
terminate explicitly by calling
is the exit code for the child process.
The child process may also terminate explicitly by calling
.BR exit (2)
or after receiving a fatal signal.
The
.I child_stack
argument specifies the location of the stack used by the child
process. Since the child and calling process may share memory,
argument specifies the location of the stack used by the child process.
Since the child and calling process may share memory,
it is not possible for the child process to execute in the
same stack as the calling process. The calling process must therefore
same stack as the calling process.
The calling process must therefore
set up memory space for the child stack and pass a pointer to this
space to
.BR clone ().
@ -173,8 +175,10 @@ calling process itself, will be signaled.
If
.B CLONE_FS
is set, the caller and the child processes share the same file system
information. This includes the root of the file system, the current
working directory, and the umask. Any call to
information.
This includes the root of the file system, the current
working directory, and the umask.
Any call to
.BR chroot (2),
.BR chdir (2),
or
@ -224,10 +228,12 @@ process or the child process do not affect the other process.
.BR CLONE_NEWNS " (since Linux 2.4.19)"
Start the child in a new namespace.
Every process lives in a namespace. The
Every process lives in a namespace.
The
.I namespace
of a process is the data (the set of mounts) describing the file hierarchy
as seen by that process. After a
as seen by that process.
After a
.BR fork (2)
or
.BR clone (2)
@ -265,12 +271,15 @@ call.
If
.B CLONE_SIGHAND
is set, the calling process and the child processes share the same table of
signal handlers. If the calling process or child process calls
signal handlers.
If the calling process or child process calls
.BR sigaction (2)
to change the behavior associated with a signal, the behavior is
changed in the other process as well. However, the calling process and child
changed in the other process as well.
However, the calling process and child
processes still have distinct signal masks and sets of pending
signals. So, one of them may block or unblock some signals using
signals.
So, one of them may block or unblock some signals using
.BR sigprocmask (2)
without affecting the other process.
@ -279,7 +288,8 @@ If
is not set, the child process inherits a copy of the signal handlers
of the calling process at the time
.BR clone ()
is called. Calls to
is called.
Calls to
.BR sigaction (2)
performed later by one of the processes have no effect on the other
process.
@ -337,7 +347,8 @@ in any particular order.
If
.B CLONE_VM
is set, the calling process and the child processes run in the same memory
space. In particular, memory writes performed by the calling process
space.
In particular, memory writes performed by the calling process
or by the child process are also visible in the other process.
Moreover, any memory mapping or unmapping performed with
.BR mmap (2)
@ -358,8 +369,10 @@ processes do not affect the other, as with
If
.B CLONE_PID
is set, the child process is created with the same process ID as
the calling process. This is good for hacking the system, but otherwise
of not much use. Since 2.3.21 this flag can be
the calling process.
This is good for hacking the system, but otherwise
of not much use.
Since 2.3.21 this flag can be
specified only by the system boot process (PID 0).
It disappeared in Linux 2.5.16.
.TP
@ -514,14 +527,16 @@ in child memory when the child exits, and do a wakeup on the futex
at that address.
The address involved may be changed by the
.BR set_tid_address (2)
system call. This is used by threading libraries.
system call.
This is used by threading libraries.
.SS "sys_clone"
The
.B sys_clone
system call corresponds more closely to
.BR fork (2)
in that execution in the child continues from the point of the
call. Thus,
call.
Thus,
.B sys_clone
only requires the
.I flags
@ -538,7 +553,8 @@ is that the
.I child_stack
argument may be zero, in which case copy-on-write semantics ensure that the
child gets separate copies of stack pages when either process modifies
the stack. In this case, for correct operation, the
the stack.
In this case, for correct operation, the
.B CLONE_VM
option should not be specified.
@ -555,7 +571,8 @@ will be written in case CLONE_CHILD_SETTID was specified.
.\" gettid() returns current->pid;
.\" getpid() returns current->tgid;
On success, the thread ID of the child process is returned
in the caller's thread of execution. On failure, a \-1 will be returned
in the caller's thread of execution.
On failure, a \-1 will be returned
in the caller's context, no child process will be created, and
.I errno
will be set appropriately.


@ -85,18 +85,22 @@ SVr4, 4.3BSD, POSIX.1-2001.
Not checking the return value of
.BR close ()
is a common but nevertheless
serious programming error. It is quite possible that errors on a
serious programming error.
It is quite possible that errors on a
previous
.BR write (2)
operation are first reported at the final
.BR close ().
Not checking the return value when closing the file may lead to
silent loss of data. This can especially be observed with NFS
silent loss of data.
This can especially be observed with NFS
and with disk quota.
.PP
A successful close does not guarantee that the data has been successfully
saved to disk, as the kernel defers writes. It is not common for a filesystem
to flush the buffers when the stream is closed. If you need to be sure that
saved to disk, as the kernel defers writes.
It is not common for a filesystem
to flush the buffers when the stream is closed.
If you need to be sure that
the data is physically stored use
.BR fsync (2).
(It will depend on the disk hardware at this point.)


@ -100,7 +100,8 @@ is of type
then
.I serv_addr
is the address to which datagrams are sent by default, and the only
address from which datagrams are received. If the socket is of type
address from which datagrams are received.
If the socket is of type
.B SOCK_STREAM
or
.BR SOCK_SEQPACKET ,
@ -112,7 +113,8 @@ Generally, connection-based protocol sockets may successfully
.BR connect ()
only once; connectionless protocol sockets may use
.BR connect ()
multiple times to change their association. Connectionless sockets may
multiple times to change their association.
Connectionless sockets may
dissolve the association by connecting to an address with the
.I sa_family
member of
@ -120,13 +122,13 @@ member of
set to
.BR AF_UNSPEC .
.SH "RETURN VALUE"
If the connection or binding succeeds, zero is returned. On error, \-1 is
returned, and
If the connection or binding succeeds, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS
The following are general socket errors only. There may be other
domain-specific error codes.
The following are general socket errors only.
There may be other domain-specific error codes.
.TP
.B EACCES
For Unix domain sockets, which are identified by pathname:
@ -150,7 +152,8 @@ The passed address didn't have the correct address family in its
field.
.TP
.B EAGAIN
No more free local ports or insufficient entries in the routing cache. For
No more free local ports or insufficient entries in the routing cache.
For
.B PF_INET
see the
.B net.ipv4.ip_local_port_range
@ -173,11 +176,13 @@ The socket structure address is outside the user's address space.
.TP
.B EINPROGRESS
The socket is non-blocking and the connection cannot be completed
immediately. It is possible to
immediately.
It is possible to
.BR select (2)
or
.BR poll (2)
for completion by selecting the socket for writing. After
for completion by selecting the socket for writing.
After
.BR select (2)
indicates writability, use
.BR getsockopt (2)
@ -209,8 +214,10 @@ Network is unreachable.
The file descriptor is not associated with a socket.
.TP
.B ETIMEDOUT
Timeout while attempting connection. The server may be too
busy to accept new connections. Note that for IP sockets the timeout may
Timeout while attempting connection.
The server may be too
busy to accept new connections.
Note that for IP sockets the timeout may
be very long when syncookies are enabled on the server.
.SH "CONFORMING TO"
SVr4, 4.4BSD, (the


@ -23,7 +23,8 @@ is NULL,
all unused modules marked auto-clean will be removed.
This system call requires privilege.
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned and
On success, zero is returned.
On error, \-1 is returned and
.I errno
is set appropriately.
.SH ERRORS


@ -107,7 +107,8 @@ is different from that returned by
.BR fcntl( "..., " F_DUPFD ", ..." )
when
.I newfd
is out of range. On some systems
is out of range.
On some systems
.BR dup2 ()
also sometimes returns
.B EINVAL
@ -118,7 +119,8 @@ If
.I newfd
was open, any errors that would have been reported at
.BR close ()
time, are lost. A careful programmer will not use
time, are lost.
A careful programmer will not use
.BR dup2 ()
without closing
.I newfd


@ -34,13 +34,15 @@ Open an
file descriptor by requesting the kernel allocate
an event backing store dimensioned for
.I size
descriptors. The
descriptors.
The
.I size
is not the maximum size of the backing store but
just a hint to the kernel about how to dimension internal structures.
The returned file descriptor will be used for all the subsequent calls to the
.B epoll
interface. The file descriptor returned by
interface.
The file descriptor returned by
.BR epoll_create (2)
must be closed by using
.BR close (2).


@ -100,7 +100,8 @@ will always wait for this event; it is not necessary to set it in
Sets the Edge Triggered behaviour for the associated file descriptor.
The default behaviour for
.B epoll
is Level Triggered. See
is Level Triggered.
See
.BR epoll (7)
for more detailed information about Edge and Level Triggered event
distribution architectures.
@ -112,7 +113,8 @@ This means that after an event is pulled out with
the associated file descriptor is internally disabled and no other events
will be reported by the
.B epoll
interface. The user must call
interface.
The user must call
.BR epoll_ctl (2)
with
.B EPOLL_CTL_MOD
@ -159,7 +161,8 @@ is ignored and can be NULL (but see BUGS below).
.SH "RETURN VALUE"
When successful,
.BR epoll_ctl (2)
returns zero. When an error occurs,
returns zero.
When an error occurs,
.BR epoll_ctl (2)
returns \-1 and
.I errno


@ -39,7 +39,8 @@ file descriptor
.I epfd
for a maximum time of
.I timeout
milliseconds. The memory area pointed to by
milliseconds.
The memory area pointed to by
.I events
will contain the events that will be available for the caller.
Up to
@ -48,7 +49,8 @@ are returned by
.BR epoll_wait (2).
The
.I maxevents
parameter must be greater than zero. Specifying a
parameter must be greater than zero.
Specifying a
.I timeout
of \-1 makes
.BR epoll_wait (2)
@ -92,7 +94,8 @@ When successful,
returns the number of file descriptors ready for the requested I/O, or zero
if no file descriptor became ready during the requested
.I timeout
milliseconds. When an error occurs,
milliseconds.
When an error occurs,
.BR epoll_wait (2)
returns \-1 and
.I errno


@ -51,9 +51,9 @@ executable which is not itself a script, which will be invoked as
\fIargv\fP is an array of argument strings passed to the new program.
\fIenvp\fP is an array of strings, conventionally of the form
\fBkey=value\fR, which are passed as environment to the new
program. Both \fIargv\fP and \fIenvp\fP must be terminated by a null
pointer. The argument vector and environment can be accessed by the
\fBkey=value\fR, which are passed as environment to the new program.
Both \fIargv\fP and \fIenvp\fP must be terminated by a null pointer.
The argument vector and environment can be accessed by the
called program's main function, when it is defined as \fBint main(int
argc, char *argv[], char *envp[])\fR.
@ -87,7 +87,8 @@ and link the executable with them.
If the executable is a dynamically-linked ELF executable, the
interpreter named in the PT_INTERP segment is used to load the needed
shared libraries. This interpreter is typically
shared libraries.
This interpreter is typically
\fI/lib/ld-linux.so.1\fR for binaries linked with the Linux libc
version 5, or \fI/lib/ld-linux.so.2\fR for binaries linked with the
GNU libc version 2.


@ -427,7 +427,8 @@ refers to a socket,
.B F_SETOWN
also selects
the recipient of SIGURG signals that are delivered when out-of-band
data arrives on that socket. (SIGURG is sent in any situation where
data arrives on that socket.
(SIGURG is sent in any situation where
.BR select (2)
would report the socket as having an "exceptional condition".)
.\" The following appears to be rubbish. It doesn't seem to
@ -489,14 +490,16 @@ process rather than to a specific thread.
.\" See fs/fcntl.c::send_sigio_to_task() (2.4/2.6) sources -- MTK, Apr 05
.TP
.B F_GETSIG
Get the signal sent when input or output becomes possible. A value of
zero means SIGIO is sent. Any other value (including SIGIO) is the
Get the signal sent when input or output becomes possible.
A value of zero means SIGIO is sent.
Any other value (including SIGIO) is the
signal sent instead, and in this case additional info is available to
the signal handler if installed with SA_SIGINFO.
.TP
.B F_SETSIG
Sets the signal sent when input or output becomes possible. A value of
zero means to send the default SIGIO signal. Any other value (including
Sets the signal sent when input or output becomes possible.
A value of zero means to send the default SIGIO signal.
Any other value (including
SIGIO) is the signal to send instead, and in this case additional info
is available to the signal handler if installed with SA_SIGINFO.
.sp
@ -521,7 +524,8 @@ If the
.I si_code
field indicates the source is SI_SIGIO, the
.I si_fd
field gives the file descriptor associated with the event. Otherwise,
field gives the file descriptor associated with the event.
Otherwise,
there is no indication which file descriptors are pending, and you
should use the usual mechanisms
.RB ( select (2),
@ -532,8 +536,9 @@ with
set etc.) to determine which file descriptors are available for I/O.
.sp
By selecting a real time signal (value >= SIGRTMIN), multiple
I/O events may be queued using the same signal numbers. (Queuing is
dependent on available memory). Extra information is available
I/O events may be queued using the same signal numbers.
(Queuing is dependent on available memory).
Extra information is available
if SA_SIGINFO is set for the signal handler, as above.
.PP
Using these mechanisms, a program can implement fully asynchronous I/O
@ -551,7 +556,8 @@ is specific to BSD and Linux.
.B F_GETSIG
and
.B F_SETSIG
are Linux specific. POSIX has asynchronous I/O and the
are Linux specific.
POSIX has asynchronous I/O and the
.I aio_sigevent
structure to achieve similar things; these are also available
in Linux as part of the GNU C Library (Glibc).
@ -769,7 +775,8 @@ New applications should consider using the
.I inotify
interface (available since kernel 2.6.13),
which provides a superior interface for obtaining notifications of
file system events. See
file system events.
See
.BR inotify (7).
.SH "RETURN VALUE"
For a successful call, the return value depends on the operation:
@ -830,14 +837,16 @@ the command was interrupted by a signal.
For
.BR F_GETLK " and " F_SETLK ,
the command was interrupted by a signal before the lock was checked or
acquired. Most likely when locking a remote file (e.g. locking over
acquired.
This is most likely when locking a remote file (e.g., locking over
NFS), but it can sometimes happen locally.
.TP
.B EINVAL
For
.BR F_DUPFD ,
.I arg
is negative or is greater than the maximum allowable value. For
is negative or is greater than the maximum allowable value.
For
.BR F_SETSIG ,
.I arg
is not an allowable signal number.
@ -869,7 +878,8 @@ and
POSIX.1-2001 allows
.I l_len
to be negative. (And if it is, the interval described by the lock
to be negative.
(And if it is, the interval described by the lock
covers bytes
.IR l_start + l_len
up to and including

View File

@ -38,7 +38,8 @@ fdatasync \- synchronize a file's in-core data with that on disk
.SH DESCRIPTION
.BR fdatasync ()
flushes all data buffers of a file to disk (before the system
call returns). It resembles
call returns).
It resembles
.BR fsync ()
but is not required to update the metadata such as access time.
@ -46,16 +47,19 @@ Applications that access databases or log files often write a tiny
data fragment (e.g., one line in a log file) and then call
.BR fsync ()
immediately in order to ensure that the written data is physically
stored on the harddisk. Unfortunately,
stored on the hard disk.
Unfortunately,
.BR fsync ()
will always initiate two write operations: one for the newly written
data and another one in order to update the modification time stored
in the inode. If the modification time is not a part of the transaction
in the inode.
If the modification time is not a part of the transaction
concept
.BR fdatasync ()
can be used to avoid unnecessary inode disk write operations.
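The log-file pattern described above might look like the following sketch. The function name append_log_line() and the choice to open the file on every call are assumptions made for illustration, and a short write is simply treated as an error.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: append one line to the log at `path` and force
 * the file data (but not necessarily all metadata) to disk.
 * Returns 0 on success, -1 on error. */
int append_log_line(const char *path, const char *line)
{
    size_t len;
    int fd;
    int rc = -1;

    assert(path != NULL && line != NULL);
    len = strlen(line);

    fd = open(path, O_WRONLY | O_APPEND | O_CREAT, 0600);
    if (fd == -1)
        return -1;

    /* fdatasync() returns only after the data buffers have been flushed. */
    if (write(fd, line, len) == (ssize_t)len && fdatasync(fd) == 0)
        rc = 0;

    if (close(fd) == -1)
        rc = -1;
    return rc;
}
```

Whether this saves a disk write compared with fsync() depends on the file system, so the sketch should be measured rather than assumed to be faster.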
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS

View File

@ -107,7 +107,8 @@ are preserved across an
A shared or exclusive lock can be placed on a file regardless of the
mode in which the file was opened.
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS
@ -142,7 +143,8 @@ possibly implemented in terms of
appears on most Unix systems.
.SH NOTES
.BR flock (2)
does not lock files over NFS. Use
does not lock files over NFS.
Use
.BR fcntl (2)
instead: that does work over NFS, given a sufficiently recent version of
Linux and a server which supports locking.
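As a rough sketch of the fcntl(2) alternative mentioned above, the following takes an exclusive advisory lock on a whole file without blocking. The helper name lock_whole_file() is hypothetical, and the descriptor is assumed to be open for writing, which a write lock requires.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: try to place an exclusive advisory lock on the
 * whole file referred to by `fd`.
 * Returns 0 on success, -1 (with errno set) if the lock cannot be taken. */
int lock_whole_file(int fd)
{
    struct flock fl;

    assert(fd >= 0);
    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;     /* exclusive (write) lock */
    fl.l_whence = SEEK_SET;  /* l_start is relative to the start of the file */
    fl.l_start = 0;
    fl.l_len = 0;            /* zero length means "through end of file" */

    return fcntl(fd, F_SETLK, &fl);
}
```

Using F_SETLKW instead of F_SETLK would make the call wait for a conflicting lock to be released.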

View File

@ -150,9 +150,9 @@ This means that the two descriptors share the same flags
.RI ( mq_flags ).
.SH "RETURN VALUE"
On success, the PID of the child process is returned in the parent's thread
of execution, and a 0 is returned in the child's thread of execution. On
failure, a \-1 will be returned in the parent's context, no child process
will be created, and
of execution, and a 0 is returned in the child's thread of execution.
On failure, a \-1 will be returned in the parent's context,
no child process will be created, and
.I errno
will be set appropriately.
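The three possible return values of fork() can be handled as in this minimal sketch. The function name fork_and_wait() is an assumption made for the example.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: create a child that exits with `code` and return
 * that exit status as observed by the parent, or -1 on failure. */
int fork_and_wait(int code)
{
    int status;
    pid_t pid;

    assert(code >= 0 && code <= 255);

    pid = fork();
    if (pid == -1)
        return -1;           /* failure: no child was created, errno is set */
    if (pid == 0)
        _exit(code);         /* child: fork() returned 0 */

    /* Parent: fork() returned the PID of the child. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```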
.SH ERRORS

View File

@ -89,7 +89,8 @@ The aim of
is to reduce disk activity for applications that do not
require all metadata to be synchronised with the disk.
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS

View File

@ -38,7 +38,8 @@ addresses for the same memory in separate processes may not be
equal, the kernel maps them internally so the same memory mapped in
different locations will correspond for
.BR futex ()
calls). It is typically used to
calls).
It is typically used to
implement the contended case of a lock in shared memory, as
described in
.BR futex (7).
@ -46,7 +47,8 @@ described in
When a
.BR futex (7)
operation did not finish uncontended in userspace, a call needs to be made
to the kernel to arbitrate. Arbitration can either mean putting the calling
to the kernel to arbitrate.
Arbitration can either mean putting the calling
process to sleep or, conversely, waking a waiting process.
.PP
Callers of this function are expected to adhere to the semantics as set out in
@ -71,10 +73,12 @@ This operation atomically verifies that the futex address
.I uaddr
still contains the value
.IR val ,
and sleeps awaiting FUTEX_WAKE on this futex address. If the
and sleeps awaiting FUTEX_WAKE on this futex address.
If the
.I timeout
argument is non-NULL, its contents describe the maximum
duration of the wait, which is infinite otherwise. The arguments
duration of the wait, which is infinite otherwise.
The arguments
.I uaddr2
and
.I val3
@ -127,7 +131,8 @@ in June 2007; any applications that use it should be fixed now.
.BR FUTEX_REQUEUE " (since Linux 2.5.70)"
This operation was introduced in order to avoid a "thundering herd" effect
when FUTEX_WAKE is used and all processes woken up need to acquire another
futex. This call wakes up
futex.
This call wakes up
.I val
processes, and requeues all other waiters on the futex at address
.IR uaddr2 .
@ -139,7 +144,8 @@ are ignored.
.TP
.BR FUTEX_CMP_REQUEUE " (since Linux 2.6.7)"
There was a race in the intended use of FUTEX_REQUEUE, so
FUTEX_CMP_REQUEUE was introduced. This is similar to FUTEX_REQUEUE,
FUTEX_CMP_REQUEUE was introduced.
This is similar to FUTEX_REQUEUE,
but first checks whether the location
.I uaddr
still contains the value
@ -154,9 +160,12 @@ Depending on which operation was executed, the returned value can have
differing meanings.
.TP
.B FUTEX_WAIT
Returns 0 if the process was woken by a FUTEX_WAKE call. In case of timeout,
ETIMEDOUT is returned. If the futex was not equal to the expected value,
the operation returns EWOULDBLOCK. Signals (or other spurious wakeups)
Returns 0 if the process was woken by a FUTEX_WAKE call.
In case of timeout,
ETIMEDOUT is returned.
If the futex was not equal to the expected value,
the operation returns EWOULDBLOCK.
Signals (or other spurious wakeups)
cause FUTEX_WAIT to return EINTR.
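The FUTEX_WAIT outcomes listed above can be observed with a small wrapper around the raw system call. This is a sketch under assumptions: the name futex_wait_ms() is hypothetical, the call is made through syscall(2) with SYS_futex, and failures are reported as the errno value rather than as \-1.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical wrapper: sleep on *uaddr as long as it still contains
 * `expected`, for at most `timeout_ms` milliseconds.
 * Returns 0 if woken by FUTEX_WAKE, otherwise the errno value
 * (EWOULDBLOCK, ETIMEDOUT, EINTR, ...). */
int futex_wait_ms(int *uaddr, int expected, long timeout_ms)
{
    struct timespec ts;

    assert(uaddr != NULL && timeout_ms >= 0);
    ts.tv_sec = timeout_ms / 1000;
    ts.tv_nsec = (timeout_ms % 1000) * 1000000L;

    /* uaddr2 and val3 are unused by FUTEX_WAIT. */
    if (syscall(SYS_futex, uaddr, FUTEX_WAIT, expected, &ts, NULL, 0) == -1)
        return errno;
    return 0;
}
```

A caller would normally re-check the futex word and retry after EINTR or EWOULDBLOCK, since neither means that the awaited condition holds.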
.TP
.B FUTEX_WAKE
@ -193,7 +202,8 @@ The system limit on the total number of open files has been reached.
.SH "NOTES"
.PP
To reiterate, bare futexes are not intended as an easy-to-use abstraction
for end-users. Implementors are expected to be assembly literate and to have
for end-users.
Implementors are expected to be assembly literate and to have
read the sources of the futex userspace library referenced below.
.\" .SH "AUTHORS"
.\" .PP
@ -205,9 +215,12 @@ read the sources of the futex userspace library referenced below.
.SH "VERSIONS"
.PP
Initial futex support was merged in Linux 2.5.7 but with different semantics
from what was described above. A 4-parameter system call with the semantics
given here was introduced in Linux 2.5.40. In Linux 2.5.70 one parameter
was added. In Linux 2.6.7 a sixth parameter was added \(em messy, especially
from what was described above.
A 4-parameter system call with the semantics
given here was introduced in Linux 2.5.40.
In Linux 2.5.70 one parameter
was added.
In Linux 2.6.7 a sixth parameter was added \(em messy, especially
on the s390 architecture.
.SH "CONFORMING TO"
This system call is Linux specific.

View File

@ -72,7 +72,8 @@ The function \fBgetcontext\fP() initializes the structure
pointed at by \fIucp\fP to the currently active context.
.LP
The function \fBsetcontext\fP() restores the user context
pointed at by \fIucp\fP. A successful call does not return.
pointed at by \fIucp\fP.
A successful call does not return.
The context should have been obtained by a call of \fBgetcontext\fP(),
or \fBmakecontext\fP(), or passed as third argument to a signal
handler.
@ -91,20 +92,24 @@ When this member is NULL, the thread exits.
If the context was obtained by a call to a signal handler,
then the old standard text says that "program execution continues with the
program instruction following the instruction interrupted
by the signal". However, this sentence was removed in SUSv2,
by the signal".
However, this sentence was removed in SUSv2,
and the present verdict is "the result is unspecified".
.SH "RETURN VALUE"
When successful, \fBgetcontext\fP() returns 0 and \fBsetcontext\fP()
does not return. On error, both return \-1 and set \fIerrno\fP
does not return.
On error, both return \-1 and set \fIerrno\fP
appropriately.
.SH ERRORS
None defined.
.SH NOTES
The earliest incarnation of this mechanism was the
\fIsetjmp\fP()/\fIlongjmp\fP() mechanism. Since that does not define
\fIsetjmp\fP()/\fIlongjmp\fP() mechanism.
Since that does not define
the handling of the signal context, the next stage was the
\fIsigsetjmp\fP()/\fIsiglongjmp\fP() pair.
The present mechanism gives much more control. On the other hand,
The present mechanism gives much more control.
On the other hand,
there is no easy way to detect whether a return from \fBgetcontext\fP()
is from the first call, or via a \fBsetcontext\fP() call.
The user has to invent her own bookkeeping device, and a register
@ -113,7 +118,8 @@ variable won't do since registers are restored.
When a signal occurs, the current user context is saved and
a new context is created by the kernel for the signal handler.
Do not leave the handler using \fIlongjmp\fP(): it is undefined
what would happen with contexts. Use \fIsiglongjmp\fP() or
what would happen with contexts.
Use \fIsiglongjmp\fP() or
\fIsetcontext\fP() instead.
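One possible form of the bookkeeping mentioned in these notes is a counter kept in memory rather than in a register. The sketch below assumes a C library that provides getcontext() and setcontext() (glibc does), and the function name resume_n_times() is hypothetical.

```c
#include <assert.h>
#include <ucontext.h>

/* Hypothetical demonstration: save a context once, then jump back to it
 * `n` times with setcontext(). Returns the number of jumps made,
 * or -1 on error. */
int resume_n_times(int n)
{
    /* volatile forces the counter into memory; a register copy would be
     * rolled back each time the saved registers are restored. */
    volatile int resumes = 0;
    ucontext_t ctx;

    assert(n >= 0);
    if (getcontext(&ctx) == -1)
        return -1;

    /* Execution reaches this point once after getcontext() returns and
     * once more after every successful setcontext(). */
    if (resumes < n) {
        resumes++;
        setcontext(&ctx);
        return -1;           /* reached only if setcontext() failed */
    }
    return resumes;
}
```

The context is only resumed while the function that saved it is still active; jumping back into a function that has already returned would be undefined.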
.SH "CONFORMING TO"
SUSv2, POSIX.1-2001.

View File

@ -41,7 +41,8 @@ If the null-terminated domain name requires more than \fIlen\fP bytes,
.BR getdomainname ()
returns the first \fIlen\fP bytes (glibc) or returns an error (libc).
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS

View File

@ -39,7 +39,8 @@ one more than the largest possible value for a file descriptor.
The current limit on the number of open files per process.
.SH NOTES
.BR getdtablesize ()
is implemented as a libc library function. The glibc version calls
is implemented as a libc library function.
The glibc version calls
.BR getrlimit (2)
and returns the current
.B RLIMIT_NOFILE

View File

@ -47,7 +47,8 @@ Up to
supplementary group IDs (of the calling process) are returned in
.IR list .
It is unspecified whether the effective group ID of the calling process
is included in the returned list. (Thus, an application should also call
is included in the returned list.
(Thus, an application should also call
.BR getegid (2)
and add or remove the resulting value.)
If
@ -71,7 +72,8 @@ On error, \-1 is returned, and
is set appropriately.
.TP
.BR setgroups ()
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS

View File

@ -36,9 +36,10 @@ gethostid, sethostid \- get or set the unique identifier of the current host
.br
.BI "int sethostid(long " hostid );
.SH DESCRIPTION
Get or set a unique 32-bit identifier for the current machine. The 32-bit
identifier is intended to be unique among all UNIX systems in
existence. This normally resembles the Internet address for the local
Get or set a unique 32-bit identifier for the current machine.
The 32-bit identifier is intended to be unique among all UNIX systems in
existence.
This normally resembles the Internet address for the local
machine, as returned by
.BR gethostbyname (3),
and thus usually never needs to be set.

View File

@ -46,10 +46,12 @@ system call returns a null-terminated hostname (set earlier by
.BR sethostname ())
in the array \fIname\fP that has a length of \fIlen\fP bytes.
In case the null-terminated hostname does not fit, no error is
returned, but the hostname is truncated. It is unspecified
returned, but the hostname is truncated.
It is unspecified
whether the truncated hostname will be null-terminated.
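Because a truncated hostname may lack a terminator, a defensive caller can add one itself. A minimal sketch, with the helper name get_hostname_terminated() chosen for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: fetch the hostname into `buf` and guarantee that
 * the result is null-terminated, even if it had to be truncated.
 * Returns 0 on success, -1 on error. */
int get_hostname_terminated(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);

    if (gethostname(buf, len) == -1)
        return -1;

    buf[len - 1] = '\0';     /* harmless if the name already fit */
    return 0;
}
```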
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS

View File

@ -17,8 +17,9 @@ getitimer, setitimer \- get or set value of an interval timer
.BI " struct itimerval *" ovalue );
.fi
.SH DESCRIPTION
The system provides each process with three interval timers, each decrementing
in a distinct time domain. When any timer expires, a signal is sent to the
The system provides each process with three interval timers,
each decrementing in a distinct time domain.
When any timer expires, a signal is sent to the
process, and the timer (potentially) restarts.
.TP 1.5i
.B ITIMER_REAL
@ -33,10 +34,11 @@ upon expiration.
.TP
.B ITIMER_PROF
decrements both when the process executes and when the system is executing
on behalf of the process. Coupled with
on behalf of the process.
Coupled with
.BR ITIMER_VIRTUAL ,
this timer is usually used to profile the time spent by the application in user
and kernel space.
this timer is usually used to profile the time spent by the
application in user and kernel space.
.B SIGPROF
is delivered upon expiration.
.LP
@ -71,7 +73,8 @@ or
The element
.I it_value
is set to the amount of time remaining on the timer, or zero if the timer
is disabled. Similarly,
is disabled.
Similarly,
.I it_interval
is set to the reset value.
The function
@ -105,10 +108,12 @@ on the system timer resolution and on the system load.
Upon expiration, a signal will be generated and the timer reset.
If the timer expires while the process is active (always true for
.BR ITIMER_VIRTUAL )
the signal will be delivered immediately when generated. Otherwise the
the signal will be delivered immediately when generated.
Otherwise the
delivery will be offset by a small time dependent on the system loading.
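The interplay of it_value and it_interval can be illustrated with a one-shot ITIMER_REAL timer that is armed, queried, and disarmed again. The function name arm_and_query_real_timer() is hypothetical. The timer is cancelled before it can expire, so no SIGALRM handler is installed in this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical demonstration: arm ITIMER_REAL to expire once after
 * `seconds`, read back the remaining time, then disable the timer.
 * Returns the whole seconds that were remaining, or -1 on error. */
long arm_and_query_real_timer(long seconds)
{
    struct itimerval tv = { {0, 0}, {0, 0} };
    struct itimerval cur;

    assert(seconds > 0);

    tv.it_value.tv_sec = seconds;    /* it_interval stays zero: no restart */
    if (setitimer(ITIMER_REAL, &tv, NULL) == -1)
        return -1;

    if (getitimer(ITIMER_REAL, &cur) == -1)
        return -1;

    tv.it_value.tv_sec = 0;          /* a zero it_value disables the timer */
    if (setitimer(ITIMER_REAL, &tv, NULL) == -1)
        return -1;

    return (long)cur.it_value.tv_sec;
}
```

Setting a nonzero it_interval instead would make the timer restart with that value after each expiration.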
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS

View File

@ -75,7 +75,8 @@ If it is, it returns the kernel symbol PAGE_SIZE,
which is architecture and machine model dependent.
Generally, one uses binaries that are architecture but not
machine model dependent, in order to have a single binary
distribution per architecture. This means that a user program
distribution per architecture.
This means that a user program
should not find PAGE_SIZE at compile time from a header file,
but use an actual system call, at least for those architectures
(like sun4) where this dependency exists.
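A sketch of querying the page size at run time rather than taking PAGE_SIZE from a header. It assumes sysconf(_SC_PAGESIZE) as the run-time query, which is a library call and not the getpagesize() interface described here, and the helper name round_up_to_page() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: round `n` up to a multiple of the page size
 * reported by the running system. */
size_t round_up_to_page(size_t n)
{
    long page = sysconf(_SC_PAGESIZE);   /* queried at run time */

    assert(page > 0);
    return (n + (size_t)page - 1) / (size_t)page * (size_t)page;
}
```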

View File

@ -53,10 +53,11 @@ The
parameter should be initialized to indicate the amount of space pointed to
by
.IR name .
On return it contains the actual size of the name returned (in bytes). The
name is truncated if the buffer provided is too small.
On return it contains the actual size of the name returned (in bytes).
The name is truncated if the buffer provided is too small.
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS

View File

@ -33,7 +33,8 @@ getpid, getppid \- get process identification
.B pid_t getppid(void);
.SH DESCRIPTION
.BR getpid ()
returns the process ID of the current process. (This is often used by
returns the process ID of the current process.
(This is often used by
routines that generate unique temporary filenames.)
.BR getppid ()

View File

@ -93,10 +93,12 @@ lower priorities cause more favorable scheduling.
The
.BR getpriority ()
call returns the highest priority (lowest numerical value)
enjoyed by any of the specified processes. The
enjoyed by any of the specified processes.
The
.BR setpriority ()
call sets the priorities of all of the specified processes
to the specified value. Only the superuser may lower priorities.
to the specified value.
Only the superuser may lower priorities.
.SH "RETURN VALUE"
Since
.BR getpriority ()

View File

@ -42,7 +42,8 @@ and
get the real UID, effective UID, and saved set-user-ID (respectively, group IDs)
of the current process.
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS

View File

@ -327,7 +327,8 @@ To handle this signal, a process must employ an alternate signal stack
is the BSD name for
.BR RLIMIT_NOFILE .
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS

View File

@ -75,7 +75,8 @@ struct rusage {
.fi
.in -0.5i
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS

View File

@ -48,7 +48,8 @@ getsockname \- get socket name
.BR getsockname ()
returns the current
.I name
for the specified socket. The
for the specified socket.
The
.I namelen
parameter should be initialized to indicate
the amount of space pointed to by
@ -56,7 +57,8 @@ the amount of space pointed to by
On return it contains the actual size of the name
returned (in bytes).
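The value-result use of namelen can be sketched as follows: a UDP socket is bound to port 0, so that the kernel picks a port, and getsockname() then reports which one was chosen. The function name bound_udp_port() and the choice of an IPv4 UDP socket are assumptions made for the example.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: bind a UDP socket to a kernel-chosen port and
 * return that port in host byte order, or -1 on error. */
int bound_udp_port(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);    /* in: buffer size; out: name size */
    int port = -1;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd == -1)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = 0;               /* let the kernel choose the port */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0 &&
        getsockname(fd, (struct sockaddr *)&addr, &len) == 0) {
        assert(len == sizeof(addr)); /* an AF_INET name fills the structure */
        port = ntohs(addr.sin_port);
    }

    close(fd);
    return port;
}
```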
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS

View File

@ -60,7 +60,8 @@ and
.BR setsockopt ()
manipulate the
.I options
associated with a socket. Options may exist at multiple
associated with a socket.
Options may exist at multiple
protocol levels; they are always present at the uppermost
.B socket
level.
@ -73,7 +74,8 @@ is specified as
.BR SOL_SOCKET .
To manipulate options at any
other level the protocol number of the appropriate protocol
controlling the option is supplied. For example,
controlling the option is supplied.
For example,
to indicate that an option is to be interpreted by the
.B TCP
protocol,
@ -92,23 +94,26 @@ are used to access option values for
For
.BR getsockopt ()
they identify a buffer in which the value for the
requested option(s) are to be returned. For
requested option(s) are to be returned.
For
.BR getsockopt (),
.I optlen
is a value-result parameter, initially containing the
size of the buffer pointed to by
.IR optval ,
and modified on return to indicate the actual size of
the value returned. If no option value is
to be supplied or returned,
the value returned.
If no option value is to be supplied or returned,
.I optval
may be NULL.
.I Optname
and any specified options are passed uninterpreted to the appropriate
protocol module for interpretation. The include file
protocol module for interpretation.
The include file
.I <sys/socket.h>
contains definitions for socket level options, described below. Options at
contains definitions for socket level options, described below.
Options at
other protocol levels vary in format and name; consult the appropriate
entries in section 4 of the manual.
@ -125,7 +130,8 @@ For a description of the available socket options see
.BR socket (7)
and the appropriate protocol man pages.
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS
@ -138,7 +144,8 @@ is not a valid descriptor.
.B EFAULT
The address pointed to by
.I optval
is not in a valid part of the process address space. For
is not in a valid part of the process address space.
For
.BR getsockopt (),
this error may also be returned if
.I optlen

View File

@ -38,13 +38,15 @@ gettid \- get thread identification
.fi
.B pid_t gettid(void);
.SH DESCRIPTION
\fBgettid\fP() returns the thread ID of the current process. This is equal
\fBgettid\fP() returns the thread ID of the current process.
This is equal
to the process ID (as returned by
.BR getpid (2)),
unless the process is part of a thread group (created by specifying
the CLONE_THREAD flag to the
.BR clone (2)
system call). All processes in the same thread group
system call).
All processes in the same thread group
have the same PID, but each one has a unique TID.
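A minimal sketch of obtaining the thread ID. It assumes the call is made through syscall(2) with SYS_gettid, since a C library wrapper may not be available, and the name current_tid() is hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return the thread ID of the caller. */
pid_t current_tid(void)
{
    pid_t tid = (pid_t)syscall(SYS_gettid);

    assert(tid > 0);         /* this system call is not expected to fail */
    return tid;
}
```

When called from the initial thread of a process, the result equals the value returned by getpid(); in other threads of the same thread group it differs.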
.SH "RETURN VALUE"
On success, returns the thread ID of the current process.

View File

@ -105,14 +105,16 @@ The
field has never been used under Linux; it has not
been and will not be supported by libc or glibc.
Each and every occurrence of this field in the kernel source
(other than the declaration) is a bug. Thus, the following
(other than the declaration) is a bug.
Thus, the following
is purely of historic interest.
The field
.I tz_dsttime
contains a symbolic constant (values are given below)
that indicates in which part of the year Daylight Saving Time
is in force. (Note: its value is constant throughout the year:
is in force.
(Note: its value is constant throughout the year:
it does not indicate that DST is in force, it just selects an
algorithm.)
The daylight saving time algorithms defined are as follows:
@ -143,8 +145,10 @@ Of course it turned out that the period in which
Daylight Saving Time is in force cannot be given
by a simple algorithm, one per country; indeed,
this period is determined by unpredictable political
decisions. So this method of representing time zones
has been abandoned. Under Linux, in a call to
decisions.
So this method of representing time zones
has been abandoned.
Under Linux, in a call to
.BR settimeofday ()
the
.I tz_dsttime
@ -160,7 +164,8 @@ argument, the
.I tv
argument is NULL and the
.I tz_minuteswest
field is non-zero. In such a case it is assumed that the CMOS clock
field is non-zero.
In such a case it is assumed that the CMOS clock
is on local time, and that it has to be incremented by this amount
to get UTC system time.
No doubt it is a bad idea to use this feature.

View File

@ -60,7 +60,8 @@ the rest of the module.
.PP
This system call requires privilege.
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned and
On success, zero is returned.
On error, \-1 is returned and
.I errno
is set appropriately.
.SH ERRORS

View File

@ -41,7 +41,8 @@ from the inotify instance associated with the file descriptor
Removing a watch causes an
.B IN_IGNORED
event to be generated for this watch descriptor. (See
event to be generated for this watch descriptor.
(See
.BR inotify (7).)
.SH "RETURN VALUE"
On success,

View File

@ -56,8 +56,9 @@ A _syscall macro
desired system call
.SS Setup
The important thing to know about a system call is its prototype. You
need to know how many arguments, their types, and the function return type.
The important thing to know about a system call is its prototype.
You need to know how many arguments, their types,
and the function return type.
There are six macros that make the actual call into the system easier.
They have the form:
.sp
@ -79,7 +80,8 @@ system call
.RE
.sp
These macros create a function called \fIname\fP with the arguments you
specify. Once you include the _syscall() in your source file,
specify.
Once you include the _syscall() in your source file,
you call the system call by \fIname\fP.
.SH EXAMPLE
.sp
@ -126,11 +128,13 @@ Swap: total 27881472 / free 24698880
Number of processes = 40
.fi
.SH NOTES
The _syscall() macros DO NOT produce a prototype. You may have to
The _syscall() macros DO NOT produce a prototype.
You may have to
create one, especially for C++ users.
.sp
System calls are not required to return only positive or negative error
codes. You need to read the source to be sure how it will return errors.
codes.
You need to read the source to be sure how it will return errors.
Usually, it is the negative of a standard error code, e.g., \-\fBEPERM\fP.
The _syscall() macros will return the result \fIr\fP of the system call
when \fIr\fP is nonnegative, but will return \-1 and set the variable
@ -141,7 +145,8 @@ For the error codes, see
.sp
Some system calls, such as
.BR mmap (),
require more than five arguments. These are handled by pushing the
require more than five arguments.
These are handled by pushing the
arguments on the stack and passing a pointer to the block of arguments.
.sp
When defining a system call, the argument types MUST be passed by-value

View File

@ -46,16 +46,18 @@ ioctl \- control device
.SH DESCRIPTION
The
.BR ioctl ()
function manipulates the underlying device parameters of special files. In
particular, many operating characteristics of character special files
function manipulates the underlying device parameters of special files.
In particular, many operating characteristics of character special files
(e.g. terminals) may be controlled with
.BR ioctl ()
requests. The argument
requests.
The argument
.I d
must be an open file descriptor.
.PP
The second argument is a device-dependent request code. The third
argument is an untyped pointer to memory. It's traditionally
The second argument is a device-dependent request code.
The third argument is an untyped pointer to memory.
It's traditionally
.BI "char *" argp
(from the days before
.B "void *"
@ -70,7 +72,8 @@ parameter or
.I out
parameter, and the size of the argument
.I argp
in bytes. Macros and defines used in specifying an
in bytes.
Macros and defines used in specifying an
.BR ioctl ()
.I request
are located in the file

View File

@ -26,7 +26,8 @@
ioctl_list \- list of ioctl calls in Linux/i386 kernel
.SH DESCRIPTION
This is Ioctl List 1.3.27, a list of ioctl calls in Linux/i386 kernel
1.3.27. It contains 421 ioctls from /usr/include/{asm,linux}/*.h.
1.3.27.
It contains 421 ioctls from /usr/include/{asm,linux}/*.h.
For each ioctl, its numerical value, its name, and its argument
type are given.
.PP
@ -36,7 +37,8 @@ If the kernel uses the argument for both input and output, this is
marked with // I-O.
.PP
Some ioctls take more arguments or return more values than a single
structure. These are marked // MORE and documented further in a
structure.
These are marked // MORE and documented further in a
separate section.
.PP
This list is very incomplete.
@ -49,9 +51,12 @@ tried to build some structure into them.
.LP
The old Linux situation was that of mostly 16-bit constants, where the
last byte is a serial number, and the preceding byte(s) give a type
indicating the driver. Sometimes the major number was used: 0x03
for the HDIO_* ioctls, 0x06 for the LP* ioctls. And sometimes
one or more ASCII letters were used. For example, TCGETS has value
indicating the driver.
Sometimes the major number was used: 0x03
for the HDIO_* ioctls, 0x06 for the LP* ioctls.
And sometimes
one or more ASCII letters were used.
For example, TCGETS has value
0x00005401, with 0x54 = 'T' indicating the terminal driver, and
CYGETTIMEOUT has value 0x00435906, with 0x43 0x59 = 'C' 'Y'
indicating the cyclades driver.
@ -78,7 +83,8 @@ it does not help in checking, but it causes varying values
for the various architectures.
.SH "RETURN VALUE"
Decent ioctls return 0 on success and \-1 on error, while
any output value is stored via the argument. However,
any output value is stored via the argument.
However,
quite a few ioctls in fact return an output value.
This is not yet indicated below.
.nf
@ -567,16 +573,18 @@ This is not yet indicated below.
// More arguments.
Some ioctls take a pointer to a structure which contains additional
pointers. These are documented here in alphabetical order.
pointers.
These are documented here in alphabetical order.
CDROMREADAUDIO takes an input pointer 'const struct cdrom_read_audio *'.
The 'buf' field points to an output buffer
of length 'nframes * CD_FRAMESIZE_RAW'.
CDROMREADCOOKED, CDROMREADMODE1, CDROMREADMODE2, and CDROMREADRAW take
an input pointer 'const struct cdrom_msf *'. They use the same pointer
as an output pointer to 'char []'. The length varies by request. For
CDROMREADMODE1, most drivers use 'CD_FRAMESIZE', but the Optics Storage
an input pointer 'const struct cdrom_msf *'.
They use the same pointer as an output pointer to 'char []'.
The length varies by request.
For CDROMREADMODE1, most drivers use 'CD_FRAMESIZE', but the Optics Storage
driver uses 'OPT_BLOCKSIZE' instead (both have the numerical value
2048).
@ -596,29 +604,35 @@ The 'ifr_data' field is a pointer to another structure as follows:
EQL_GETMASTERCFG struct master_config *
EQL_SETMASTERCFG const struct master_config *
FDRAWCMD takes a 'struct floppy_raw_cmd *'. If 'flags & FD_RAW_WRITE'
FDRAWCMD takes a 'struct floppy_raw_cmd *'.
If 'flags & FD_RAW_WRITE'
is non-zero, then 'data' points to an input buffer of length 'length'.
If 'flags & FD_RAW_READ' is non-zero, then 'data' points to an output
buffer of length 'length'.
GIO_FONTX and PIO_FONTX take a 'struct console_font_desc *' or
a 'const struct console_font_desc *', respectively. 'chardata' points to
a buffer of 'char [charcount]'. This is an output buffer for GIO_FONTX
a buffer of 'char [charcount]'.
This is an output buffer for GIO_FONTX
and an input buffer for PIO_FONTX.
GIO_UNIMAP and PIO_UNIMAP take a 'struct unimapdesc *' or
a 'const struct unimapdesc *', respectively. 'entries' points to a buffer
of 'struct unipair [entry_ct]'. This is an output buffer for GIO_UNIMAP
of 'struct unipair [entry_ct]'.
This is an output buffer for GIO_UNIMAP
and an input buffer for PIO_UNIMAP.
KDADDIO, KDDELIO, KDDISABIO, and KDENABIO enable or disable access to
I/O ports. They are essentially alternate interfaces to 'ioperm'.
I/O ports.
They are essentially alternate interfaces to 'ioperm'.
KDMAPDISP and KDUNMAPDISP enable or disable memory mappings or I/O port
access. They are not implemented in the kernel.
access.
They are not implemented in the kernel.
SCSI_IOCTL_PROBE_HOST takes an input pointer 'const int *', which is a
length. It uses the same pointer as an output pointer to a 'char []'
length.
It uses the same pointer as an output pointer to a 'char []'
buffer of this length.
SIOCADDRT and SIOCDELRT take an input pointer whose type depends on
@ -628,7 +642,8 @@ the protocol:
AX.25 const struct ax25_route *
NET/ROM const struct nr_route_struct *
SIOCGIFCONF takes a 'struct ifconf *'. The 'ifc_buf' field points to a
SIOCGIFCONF takes a 'struct ifconf *'.
The 'ifc_buf' field points to a
buffer of length 'ifc_len' bytes, into which the kernel writes a list of
type 'struct ifreq []'.
@ -637,8 +652,10 @@ SIOCSIFHWADDR takes an input pointer whose type depends on the protocol:
Most protocols const struct ifreq *
AX.25 const char [AX25_ADDR_LEN]
TIOCLINUX takes a 'const char *'. It uses this to distinguish several
independent sub-cases. In the table below, 'N + foo' means 'foo' after
TIOCLINUX takes a 'const char *'.
It uses this to distinguish several
independent sub-cases.
In the table below, 'N + foo' means 'foo' after
an N-byte pad. 'struct selection' is implicitly defined
in 'drivers/char/selection.c'

View File

@ -65,7 +65,8 @@ This call is mostly for the i386 architecture.
On many other architectures it does not exist or will always
return an error.
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS

View File

@ -43,13 +43,14 @@ changes the I/O privilege level of the current process, as specified in
.IR level .
This call is necessary to allow 8514-compatible X servers to run under
Linux. Since these X servers require access to all 65536 I/O ports, the
Linux.
Since these X servers require access to all 65536 I/O ports, the
.BR ioperm ()
call is not sufficient.
In addition to granting unrestricted I/O port access, running at a higher
I/O privilege level also allows the process to disable interrupts. This
will probably crash the system, and is not recommended.
I/O privilege level also allows the process to disable interrupts.
This will probably crash the system, and is not recommended.
Permissions are inherited by
.BR fork ()
@ -62,7 +63,8 @@ This call is mostly for the i386 architecture.
On many other architectures it does not exist or will always
return an error.
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS
@ -86,7 +88,8 @@ intended to be portable.
.SH NOTES
Libc5 treats it as a system call and has a prototype in
.IR <unistd.h> .
Glibc1 does not have a prototype. Glibc2 has a prototype both in
Glibc1 does not have a prototype.
Glibc2 has a prototype both in
.I <sys/io.h>
and in
.IR <sys/perm.h> .

View File

@ -138,7 +138,8 @@ See the NOTES section for more
information on scheduling classes and priorities.
I/O priorities are supported for reads and for synchronous (O_DIRECT,
O_SYNC) writes. I/O priorities are not supported for asynchronous
O_SYNC) writes.
I/O priorities are not supported for asynchronous
writes because they are issued outside the context of the program
dirtying the memory, and thus program-specific priorities do not apply.
.SH "RETURN VALUE"
@ -235,7 +236,8 @@ These nice levels are grouped in three scheduling classes
each one containing one or more priority levels:
.TP
.BR IOPRIO_CLASS_RT " (1)"
This is the real-time I/O class. This scheduling class is given
This is the real-time I/O class.
This scheduling class is given
higher priority than any other class:
processes from this class are
given first access to the disk every time.
@ -264,7 +266,8 @@ Priority levels range from 0 (highest) to 7 (lowest).
.BR IOPRIO_CLASS_IDLE " (3)"
This is the idle scheduling class.
Processes running at this level only get I/O
time when no one else needs the disk. The idle class has no class
time when no one else needs the disk.
The idle class has no class
data.
Attention is required when assigning this priority class to a process,
since it may become starved if higher priority processes are

View File

@ -72,7 +72,8 @@ saved set-user-ID of the target process.
In the case of SIGCONT it suffices when the sending and receiving
processes belong to the same session.
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS
@ -94,7 +95,8 @@ The process group was given as 0 but the sending process does not
have a process group.
.SH NOTES
There are various differences between the permission checking
in BSD-type systems and System V-type systems. See the POSIX rationale
in BSD-type systems and System V-type systems.
See the POSIX rationale
for
.BR kill ().
A difference not mentioned by POSIX concerns the return

View File

@ -50,7 +50,8 @@ both names refer to the same file (and so have the same permissions
and ownership) and it is impossible to tell which name was the
\`original'.
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS
@ -127,7 +128,8 @@ even if the same filesystem is mounted on both.)
.SH NOTES
Hard links, as created by
.BR link (),
cannot span filesystems. Use
cannot span filesystems.
Use
.BR symlink ()
if this is required.
@ -158,7 +160,8 @@ SVr4, 4.3BSD, POSIX.1-2001 (except as noted above).
.\" X/OPEN does not document EFAULT, ENOMEM or EIO.
.SH BUGS
On NFS file systems, the return code may be wrong in case the NFS server
performs the link creation and dies before it can say so. Use
performs the link creation and dies before it can say so.
Use
.BR stat (2)
to find out if the link got created.
.SH "SEE ALSO"

View File

@ -64,7 +64,8 @@ or
The
.I backlog
parameter defines the maximum length the queue of pending connections may
grow to. If a connection request arrives with the queue full the client
grow to.
If a connection request arrives with the queue full the client
may receive an error with an indication of
.B ECONNREFUSED
or, if the underlying protocol supports retransmission, the request may be
@ -76,7 +77,8 @@ parameter on TCP sockets changed with Linux 2.2.
Now it specifies the queue length for
.I completely
established sockets waiting to be accepted, instead of the number of incomplete
connection requests. The maximum length of the queue for incomplete sockets
connection requests.
The maximum length of the queue for incomplete sockets
can be set using the
.B tcp_max_syn_backlog
sysctl.
@ -86,7 +88,8 @@ See
.BR tcp (7)
for more information.
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS

View File

@ -34,9 +34,9 @@ lookup_dcookie \- return a directory entry's path
Look up the full path of the directory entry specified by the value
.I cookie
.
The cookie is an opaque identifier uniquely identifying a particular directory
entry. The buffer given is filled in with the full path of the directory
entry.
The cookie is an opaque identifier uniquely identifying a particular
directory entry.
The buffer given is filled in with the full path of the directory entry.
For
.BR lookup_dcookie ()

View File

@ -49,21 +49,24 @@ the address range beginning at address
.I start
and with size
.I length
bytes. It allows an application to tell the kernel how it expects to use
bytes.
It allows an application to tell the kernel how it expects to use
some mapped or shared memory areas, so that the kernel can choose
appropriate read-ahead and caching techniques.
This call does not influence the semantics of the application
(except in the case of
.BR MADV_DONTNEED ),
but
may influence its performance. The kernel is free to ignore the advice.
may influence its performance.
The kernel is free to ignore the advice.
.LP
The advice is indicated in the
.I advice
parameter which can be
.TP
.B MADV_NORMAL
No special treatment. This is the default.
No special treatment.
This is the default.
.TP
.B MADV_RANDOM
Expect page references in random order.
@ -134,7 +137,8 @@ restoring the default behaviour, whereby a mapping is inherited across
.SH "RETURN VALUE"
On success
.BR madvise ()
returns zero. On error, it returns \-1 and
returns zero.
On error, it returns \-1 and
.I errno
is set appropriately.
.SH ERRORS
@ -179,7 +183,8 @@ The Linux implementation requires that the address
.I start
be page-aligned, and allows
.I length
to be zero. If there are some parts of the specified address range
to be zero.
If there are some parts of the specified address range
that are not mapped, the Linux version of
.BR madvise ()
ignores them and applies the call to the rest (but returns

View File

@ -24,7 +24,8 @@ attempts to create a directory named
The parameter
.I mode
specifies the permissions to use. It is modified by the process's
specifies the permissions to use.
It is modified by the process's
.I umask
in the usual way: the permissions of the created directory are
.RI ( mode " & ~" umask " & 0777)."
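The permission formula above can be checked with a short C sketch. This is only an illustration: the helper names effective_dir_mode() and check_mkdir_mode() and the /tmp scratch location are assumptions made for the example, not part of the manual page.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: the permission bits the text says mkdir() applies. */
mode_t effective_dir_mode(mode_t mode, mode_t mask)
{
    return mode & ~mask & 0777;
}

/*
 * Hypothetical helper: create a directory in a scratch location and compare
 * its permission bits with the formula.
 * Returns 1 on a match, 0 on a mismatch, -1 if the setup failed.
 */
int check_mkdir_mode(mode_t mode)
{
    char base[] = "/tmp/mkdir_demo_XXXXXX";   /* assumed scratch location */
    char path[64];
    struct stat st;
    mode_t mask;
    int ok;

    if (mkdtemp(base) == NULL)
        return -1;
    snprintf(path, sizeof path, "%s/d", base);

    mask = umask(0);    /* umask() can only be read by setting it ... */
    umask(mask);        /* ... so restore the old value immediately */

    if (mkdir(path, mode) == -1) {
        rmdir(base);
        return -1;
    }
    if (stat(path, &st) == -1) {
        rmdir(path);
        rmdir(base);
        return -1;
    }
    ok = (st.st_mode & 0777) == effective_dir_mode(mode, mask);

    rmdir(path);
    rmdir(base);
    return ok;
}
```

Only the low nine permission bits are compared, because the text notes that the other mode bits depend on the operating system.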
@ -32,7 +33,8 @@ Other mode bits of the created directory depend on the operating system.
For Linux, see below.
The newly created directory will be owned by the effective user ID of the
process. If the directory containing the file has the set-group-ID
process.
If the directory containing the file has the set-group-ID
bit set, or if the filesystem is mounted with BSD group semantics, the
new directory will inherit the group ownership from its parent;
otherwise it will be owned by the effective group ID of the process.
@ -106,13 +108,14 @@ SVr4, BSD, POSIX.1-2001.
.\" SVr4 documents additional EIO, EMULTIHOP
.SH NOTES
Under Linux apart from the permission bits, only the S_ISVTX mode bit
is honored. That is, under Linux the created directory actually gets mode
is honored.
That is, under Linux the created directory actually gets mode
.RI ( mode " & ~" umask " & 01777)."
See also
.BR stat (2).
.PP
There are many infelicities in the protocol underlying NFS. Some
of these affect
There are many infelicities in the protocol underlying NFS.
Some of these affect
.BR mkdir ().
.SH "SEE ALSO"
.BR mkdir (1),

View File

@ -70,7 +70,8 @@ If
already exists, or is a symbolic link, this call fails with an EEXIST error.
The newly created node will be owned by the effective user ID of the
process. If the directory containing the node has the set-group-ID
process.
If the directory containing the node has the set-group-ID
bit set, or if the filesystem is mounted with BSD group semantics, the
new node will inherit the group ownership from its parent directory;
otherwise it will be owned by the effective group ID of the process.
@ -151,7 +152,8 @@ SVr4, 4.4BSD, POSIX.1-2001 (but see below).
.SH NOTES
POSIX.1-2001 says: "The only portable use of
.BR mknod ()
is to create a FIFO-special file. If
is to create a FIFO-special file.
If
.I mode
is not S_IFIFO or
.I dev
@ -166,8 +168,8 @@ and FIFOs with
.BR mkfifo (2).
.\" Unix domain sockets with .BR socket " (and " bind ),
There are many infelicities in the protocol underlying NFS. Some
of these affect
There are many infelicities in the protocol underlying NFS.
Some of these affect
.BR mknod ().
.SH "SEE ALSO"
.BR fcntl (2),

View File

@ -76,9 +76,11 @@ memory range can be moved to external swap space again by the kernel.
.SS "mlockall() and munlockall()"
.BR mlockall ()
locks all pages mapped into the address space of the
calling process. This includes the pages of the code, data and stack
calling process.
This includes the pages of the code, data and stack
segment, as well as shared libraries, user space kernel data, shared
memory, and memory\-mapped files. All mapped pages are guaranteed
memory, and memory\-mapped files.
All mapped pages are guaranteed
to be resident in RAM when the call returns successfully;
the pages are guaranteed to stay in RAM until later unlocked.
@ -93,7 +95,8 @@ the process.
.TP
.B MCL_FUTURE
Lock all pages which will become mapped into the address space of the
process in the future. These could be for instance new pages required
process in the future.
These could be for instance new pages required
by a growing heap and stack as well as new memory mapped files or
shared memory regions.
.PP
@ -115,13 +118,16 @@ unlocks all pages mapped into the address space of the
calling process.
.SH "NOTES"
Memory locking has two main applications: real-time algorithms and
high-security data processing. Real-time applications require
high-security data processing.
Real-time applications require
deterministic timing, and, like scheduling, paging is one major cause
of unexpected program execution delays. Real-time applications will
of unexpected program execution delays.
Real-time applications will
usually also switch to a real-time scheduler with
.BR sched_setscheduler (2).
Cryptographic security software often handles critical bytes like
passwords or secret keys as data structures. As a result of paging,
passwords or secret keys as data structures.
As a result of paging,
these secrets could be transferred onto a persistent swap store medium,
where they might be accessible to the enemy long after the security
software has erased the secrets in RAM and terminated.
@ -138,7 +144,8 @@ This can be achieved by calling a function that allocates a
sufficiently large automatic variable (an array) and writes to the
memory occupied by this array in order to touch these stack pages.
This way, enough pages will be mapped for the stack and can be
locked into RAM. The dummy writes ensure that not even copy-on-write
locked into RAM.
The dummy writes ensure that not even copy-on-write
page faults can occur in the critical section.
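The stack pre-faulting technique described above might be sketched as follows. The names prefault_stack() and lock_memory_for_realtime() and the 64 kB size are assumptions chosen for illustration; a real application would size the array to its own worst-case stack use.

```c
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical upper bound on the stack the critical section will need. */
#define STACK_PREFAULT_BYTES (64 * 1024)

/*
 * Touch every byte of a large automatic array so that the stack pages
 * behind it are mapped (and, after mlockall(), locked) before the
 * time-critical code runs. "volatile" keeps the compiler from removing
 * the dummy writes. Returns the number of bytes touched.
 */
size_t prefault_stack(void)
{
    volatile unsigned char dummy[STACK_PREFAULT_BYTES];
    size_t i;

    for (i = 0; i < sizeof dummy; i++)
        dummy[i] = 0;
    return sizeof dummy;
}

/*
 * Lock current and future pages, then pre-fault the stack.
 * Returns 0 on success, or -1 with errno set if mlockall() fails
 * (for example, when the caller lacks the required privilege).
 */
int lock_memory_for_realtime(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) == -1)
        return -1;
    prefault_stack();
    return 0;
}
```

A caller would typically invoke lock_memory_for_realtime() once during start-up, before entering the critical section.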
Memory locks are not inherited by a child created via

View File

@ -222,7 +222,8 @@ in conjunction with
is only supported on Linux since kernel 2.4.
.TP
.B MAP_FILE
Compatibility flag. Ignored.
Compatibility flag.
Ignored.
.TP
.B MAP_32BIT
Put the mapping into the first 2GB of the process address space.
@ -260,9 +261,11 @@ is preserved across
.BR fork (2),
with the same attributes.
.LP
A file is mapped in multiples of the page size. For a file that is not
A file is mapped in multiples of the page size.
For a file that is not
a multiple of the page size, the remaining memory is zeroed when mapped,
and writes to that region are not written out to the file. The effect of
and writes to that region are not written out to the file.
The effect of
changing the size of the underlying file of a mapping on the pages that
correspond to added or removed regions of the file is unspecified.
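The zero-filled tail of a partial page can be observed with a small C sketch. The helper name mapped_tail_is_zero() and the /tmp scratch file are assumptions made for this example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: map a 10-byte file and inspect the rest of the page.
 * Returns 1 if the file data is visible and every byte after it reads as
 * zero, 0 if not, -1 if the setup failed.
 */
int mapped_tail_is_zero(void)
{
    char path[] = "/tmp/mmap_demo_XXXXXX";    /* assumed scratch location */
    const char data[] = "0123456789";
    size_t len = sizeof data - 1;
    long page = sysconf(_SC_PAGESIZE);
    unsigned char *p;
    size_t i;
    int fd, result = 1;

    fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);                 /* the file lives on through the open fd */

    if (page <= 0 || write(fd, data, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }

    p = mmap(NULL, (size_t)page, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                    /* closing the descriptor does not unmap */
    if (p == MAP_FAILED)
        return -1;

    if (p[0] != '0')
        result = 0;
    for (i = len; i < (size_t)page; i++) {
        if (p[i] != 0) {
            result = 0;
            break;
        }
    }

    munmap(p, (size_t)page);
    return result;
}
```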
@ -270,15 +273,19 @@ The
.BR munmap ()
system call deletes the mappings for the specified address range, and
causes further references to addresses within the range to generate
invalid memory references. The region is also automatically unmapped
when the process is terminated. On the other hand, closing the file
invalid memory references.
The region is also automatically unmapped
when the process is terminated.
On the other hand, closing the file
descriptor does not unmap the region.
.LP
The address
.I start
must be a multiple of the page size. All pages containing a part
must be a multiple of the page size.
All pages containing a part
of the indicated range are unmapped, and subsequent references
to these pages will generate SIGSEGV. It is not an error if the
to these pages will generate SIGSEGV.
It is not an error if the
indicated range does not contain any mapped pages.
For file-backed mappings, the

View File

@ -58,7 +58,8 @@ larger files (typically up to 2^44 bytes).
.SH "RETURN VALUE"
On success,
.BR mmap2 ()
returns a pointer to the mapped area. On error \-1 is returned
returns a pointer to the mapped area.
On error \-1 is returned
and
.I errno
is set appropriately.

View File

@ -168,7 +168,8 @@ This option is useful for programs, such as
that need to know when a file has been read since it was last modified.
.TP
.B MS_REMOUNT
Remount an existing mount. This allows you to change the
Remount an existing mount.
This allows you to change the
.I mountflags
and
.I data
@ -238,7 +239,8 @@ unmounts a target, but allows additional
controlling the behaviour of the operation:
.TP
.BR MNT_FORCE " (since Linux 2.1.116)"
Force unmount even if busy. This can cause data loss.
Force unmount even if busy.
This can cause data loss.
(Only for NFS mounts.)
.\" FIXME Can MNT_FORCE result in data loss? According to
.\" the Solaris manual page it can cause data loss on Solaris.
@ -268,16 +270,20 @@ This flag cannot be specified with either
or
.BR MNT_DETACH .
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS
The error values given below result from filesystem type independent
errors. Each filesystem type may have its own special errors and its
own special behavior. See the kernel source code for details.
errors.
Each filesystem type may have its own special errors and its
own special behavior.
See the kernel source code for details.
.TP
.B EACCES
A component of a path was not searchable. (See also
A component of a path was not searchable.
(See also
.BR path_resolution (2).)
Or, mounting a read-only filesystem was attempted without giving the
.B MS_RDONLY
@ -299,7 +305,8 @@ successfully marked an unbusy file system as expired.
.TP
.B EBUSY
.I source
is already mounted. Or, it cannot be remounted read-only,
is already mounted.
Or, it cannot be remounted read-only,
because it still holds files open for writing.
Or, it cannot be mounted on
.I target

View File

@ -61,21 +61,23 @@ The memory can contain executing code.
.\" FIXME
.\" Document MAP_GROWSUP and MAP_GROWSDOWN
.PP
The new protection replaces any existing protection. For example, if the
The new protection replaces any existing protection.
For example, if the
memory had previously been marked \fBPROT_READ\fR, and \fBmprotect\fR()
is then called with \fIprot\fR \fBPROT_WRITE\fR, it will no longer
be readable.
.SH "RETURN VALUE"
On success,
.BR mprotect ()
returns zero. On error, \-1 is returned, and
returns zero.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS
.TP
.B EACCES
The memory cannot be given the specified access. This can happen,
for example, if you
The memory cannot be given the specified access.
This can happen, for example, if you
.BR mmap (2)
a file to which you have read-only access, then ask
.BR mprotect ()

View File

@ -46,23 +46,28 @@ moving it at the same time (controlled by the \fIflags\fR argument and
the available virtual address space).
\fIold_address\fR is the old address of the virtual memory block that you
want to expand (or shrink). Note that \fIold_address\fR has to be page
want to expand (or shrink).
Note that \fIold_address\fR has to be page
aligned. \fIold_size\fR is the old size of the
virtual memory block. \fInew_size\fR is the requested size of the
virtual memory block after the resize.
In Linux the memory is divided into pages. A user process has (one or)
several linear virtual memory segments. Each virtual memory segment has one
or more mappings to real memory pages (in the page table). Each virtual
memory segment has its own protection (access rights), which may cause
In Linux the memory is divided into pages.
A user process has (one or)
several linear virtual memory segments.
Each virtual memory segment has one
or more mappings to real memory pages (in the page table).
Each virtual memory segment has its own
protection (access rights), which may cause
a segmentation violation if the memory is accessed incorrectly (e.g.,
writing to a read-only segment). Accessing virtual memory outside of the
writing to a read-only segment).
Accessing virtual memory outside of the
segments will also cause a segmentation violation.
\fBmremap\fR() uses the Linux page table scheme.
\fBmremap\fR() changes the
mapping between virtual addresses and memory pages. This can be used to
implement a very efficient \fBrealloc\fR().
mapping between virtual addresses and memory pages.
This can be used to implement a very efficient \fBrealloc\fR().
The \fIflags\fR bit-mask argument may be 0, or include the following flag:
.TP

View File

@ -34,10 +34,12 @@ msync \- synchronize a file with a memory map
flushes changes made to the in-core copy of a file that was mapped
into memory using
.BR mmap (2)
back to disk. Without use of this call
back to disk.
Without use of this call
there is no guarantee that changes are written back before
.BR munmap (2)
is called. To be more precise, the part of the file that
is called.
To be more precise, the part of the file that
corresponds to the memory area starting at
.I start
and having length
@ -64,7 +66,8 @@ asks for an update and waits for it to complete.
asks to invalidate other mappings of the same file
(so that they can be updated with the fresh values just written).
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS

View File

@ -39,7 +39,8 @@ nanosleep \- pause execution for a specified time
delays the execution of the program for at least the time specified in
.IR *req .
The function can return earlier if a signal has been delivered to the
process. In this case, it returns \-1, sets \fIerrno\fR to
process.
In this case, it returns \-1, sets \fIerrno\fR to
.BR EINTR ,
and writes the
remaining time into the structure pointed to by
@ -55,7 +56,8 @@ again and complete the specified pause.
The structure
.I timespec
is used to specify intervals of time with nanosecond precision. It is
is used to specify intervals of time with nanosecond precision.
It is
specified in
.I <time.h>
and has the form
@ -94,7 +96,8 @@ Problem with copying information from user space.
.TP
.B EINTR
The pause has been interrupted by a non-blocked signal that was
delivered to the process. The remaining sleep time has been written
delivered to the process.
The remaining sleep time has been written
into *\fIrem\fR so that the process can easily call
.BR nanosleep ()
again and continue with the pause.
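The restart pattern described for EINTR can be written as a short loop. This is a minimal sketch; the function name sleep_fully() is an assumption made for illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <time.h>

/*
 * Hypothetical helper: sleep for the whole interval, calling nanosleep()
 * again with the remaining time whenever a signal interrupts the pause.
 * Returns 0 once the interval has elapsed, or -1 with errno set for any
 * error other than EINTR.
 */
int sleep_fully(struct timespec req)
{
    struct timespec rem;

    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;
        req = rem;      /* continue with what is left of the pause */
    }
    return 0;
}
```

Because the remaining time may be rounded, a pause that is interrupted many times can end up longer than requested.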
@ -115,7 +118,8 @@ Therefore,
.BR nanosleep ()
pauses always for at least the specified time, however it can take up
to 10 ms longer than specified until the process becomes runnable
again. For the same reason, the value returned in case of a delivered
again.
For the same reason, the value returned in case of a delivered
signal in *\fIrem\fR is usually rounded to the next larger multiple of
1/\fIHZ\fR\ s.
.SS "Old behaviour"

View File

@ -41,7 +41,8 @@ union nfsctl_res {
};
.fi
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH "CONFORMING TO"

View File

@ -116,14 +116,16 @@ modified using
The full list of file creation flags and file status flags is as follows:
.TP
.B O_APPEND
The file is opened in append mode. Before each
The file is opened in append mode.
Before each
.BR write (),
the file offset is positioned at the end of the file,
as if with
.BR lseek ().
.B O_APPEND
may lead to corrupted files on NFS file systems if more than one process
appends data to a file at once. This is because NFS does not support
appends data to a file at once.
This is because NFS does not support
appending to a file, so the client kernel has to simulate it, which
can't be done without a race condition.
.TP
@ -141,7 +143,8 @@ for further details.
.B O_CREAT
If the file does not exist it will be created.
The owner (user ID) of the file is set to the effective user ID
of the process. The group ownership (group ID) is set either to
of the process.
The group ownership (group ID) is set either to
the effective group ID of the process or to the group ID of the
parent directory (depending on filesystem type and mount options,
and the mode of the parent directory, see, e.g., the mount options
@ -163,7 +166,8 @@ or
data is guaranteed to have been transferred.
Under Linux 2.4 transfer sizes, and the alignment of user buffer
and file offset must all be multiples of the logical block size
of the file system. Under Linux 2.6 alignment to 512-byte boundaries
of the file system.
Under Linux 2.6 alignment to 512-byte boundaries
suffices.
.\" Alignment should satisfy requirements for the underlying device
.\" There may be coherency problems.
@ -188,16 +192,20 @@ When used with
.BR O_CREAT ,
if the file already exists it is an error and the
.BR open ()
will fail. In this context, a symbolic link exists, regardless
will fail.
In this context, a symbolic link exists, regardless
of where it points to.
.B O_EXCL
is broken on NFS file systems; programs which rely on it for performing
locking tasks will contain a race condition. The solution for performing
locking tasks will contain a race condition.
The solution for performing
atomic file locking using a lockfile is to create a unique file on
the same file system (e.g., incorporating hostname and pid), use
.BR link (2)
to make a link to the lockfile. If \fBlink\fP() returns 0, the lock is
successful. Otherwise, use
to make a link to the lockfile.
If \fBlink\fP() returns 0, the lock is
successful.
Otherwise, use
.BR stat (2)
on the unique file to check if its link count has increased to 2,
in which case the lock is also successful.
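The lockfile procedure above might look like this in C. It is a sketch under stated assumptions: try_lock() and demo_lock() are hypothetical names, the unique file names stand in for ones built from hostname and pid, and the demonstration runs in a scratch directory under /tmp rather than on NFS.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to take the lock by linking a unique file to
 * the lockfile name.
 * Returns 1 if the lock was acquired, 0 if someone else holds it,
 * -1 if the unique file could not be created.
 */
int try_lock(const char *unique_path, const char *lock_path)
{
    struct stat st;
    int fd, got;

    /* The name itself is assumed to be unique, so O_EXCL is not needed. */
    fd = open(unique_path, O_WRONLY | O_CREAT, 0600);
    if (fd == -1)
        return -1;
    close(fd);

    if (link(unique_path, lock_path) == 0)
        got = 1;
    else    /* the reply may have been lost: trust the link count instead */
        got = (stat(unique_path, &st) == 0 && st.st_nlink == 2);

    unlink(unique_path);    /* the lockfile name keeps the inode alive */
    return got;
}

/*
 * Hypothetical demonstration: the first attempt should win and the second
 * should be refused. Returns 1 if both behave that way, 0 if not,
 * -1 if the setup failed.
 */
int demo_lock(void)
{
    char dir[] = "/tmp/lock_demo_XXXXXX";     /* assumed scratch location */
    char u1[64], u2[64], lock[64];
    int first, second;

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(u1, sizeof u1, "%s/hostA.1001", dir);
    snprintf(u2, sizeof u2, "%s/hostB.2002", dir);
    snprintf(lock, sizeof lock, "%s/lockfile", dir);

    first = try_lock(u1, lock);
    second = try_lock(u2, lock);

    unlink(lock);           /* releasing the lock is a plain unlink() */
    rmdir(dir);
    return first == 1 && second == 0;
}
```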
@ -241,8 +249,8 @@ refers to a terminal device \(em see
process does not have one.
.TP
.B O_NOFOLLOW
If \fIpathname\fR is a symbolic link, then the open fails. This is a
FreeBSD extension, which was added to Linux in version 2.1.126.
If \fIpathname\fR is a symbolic link, then the open fails.
This is a FreeBSD extension, which was added to Linux in version 2.1.126.
Symbolic links in earlier components of the pathname will still be
followed.
.\" The headers from glibc 2.0.100 and later include a
@ -250,7 +258,8 @@ followed.
.\" used\fR.
.TP
.BR O_NONBLOCK " or " O_NDELAY
When possible, the file is opened in non-blocking mode. Neither the
When possible, the file is opened in non-blocking mode.
Neither the
.BR open ()
nor any subsequent operations on the file descriptor which is
returned will cause the calling process to wait.
@ -262,7 +271,8 @@ in conjunction with mandatory file locks and with file leases, see
.BR fcntl (2).
.TP
.B O_SYNC
The file is opened for synchronous I/O. Any
The file is opened for synchronous I/O.
Any
.BR write ()s
on the resulting file descriptor will block the calling process until
the data has been physically written to the underlying hardware.
@ -272,7 +282,8 @@ the data has been physically written to the underlying hardware.
If the file already exists and is a regular file and the open mode allows
writing (i.e., is O_RDWR or O_WRONLY) it will be truncated to length 0.
If the file is a FIFO or terminal device file, the O_TRUNC
flag is ignored. Otherwise the effect of O_TRUNC is unspecified.
flag is ignored.
Otherwise the effect of O_TRUNC is unspecified.
.PP
Some of these optional flags can be altered using
.BR fcntl ()
@ -280,7 +291,8 @@ after the file has been opened.
The argument
.I mode
specifies the permissions to use in case a new file is created. It is
specifies the permissions to use in case a new file is created.
It is
modified by the process's
.BR umask
in the usual way: the permissions of the created file are
@ -523,8 +535,10 @@ On many systems the file is actually truncated.
The
.B O_DIRECT
flag was introduced in SGI IRIX, where it has alignment restrictions
similar to those of Linux 2.4. IRIX also has a fcntl(2) call to
query appropriate alignments and sizes. FreeBSD 4.x introduced
similar to those of Linux 2.4.
IRIX also has a fcntl(2) call to
query appropriate alignments and sizes.
FreeBSD 4.x introduced
a flag of the same name, but without alignment restrictions.
Support was added under Linux in kernel version 2.4.10.
Older Linux kernels simply ignore this flag.
@ -553,7 +567,8 @@ amongst others
POSIX provides for three different variants of synchronised I/O,
corresponding to the flags \fBO_SYNC\fR, \fBO_DSYNC\fR and
\fBO_RSYNC\fR. Currently (2.1.130) these are all synonymous under Linux.
\fBO_RSYNC\fR.
Currently (2.1.130) these are all synonymous under Linux.
.SH "SEE ALSO"
.BR close (2),
.BR dup (2),

View File

@ -42,7 +42,8 @@ but can be used from user space.
.\" in addition to that given in
.\" .BR outb (9).
.sp
You compile with \fB\-O\fP or \fB\-O2\fP or similar. The functions
You compile with \fB\-O\fP or \fB\-O2\fP or similar.
The functions
are defined as inline macros, and will not be substituted in without
optimization enabled, causing unresolved references at link time.
.sp
@ -51,10 +52,12 @@ You use
or alternatively
.BR iopl (2)
to tell the kernel to allow the user space application to access the
I/O ports in question. Failure to do this will cause the application
I/O ports in question.
Failure to do this will cause the application
to receive a segmentation fault.
.SH "CONFORMING TO"
\fBoutb\fP() and friends are hardware specific. The
\fBoutb\fP() and friends are hardware specific.
The
.I value
argument is passed first and the
.I port

View File

@ -28,12 +28,16 @@ Some Unix/Linux system calls have as parameter one or more filenames.
A filename (or pathname) is resolved as follows.
.SS "Step 1: Start of the resolution process"
If the pathname starts with the '/' character, the starting lookup directory
is the root directory of the current process. (A process inherits its
root directory from its parent. Usually this will be the root directory
of the file hierarchy. A process may get a different root directory
is the root directory of the current process.
(A process inherits its
root directory from its parent.
Usually this will be the root directory
of the file hierarchy.
A process may get a different root directory
by use of the
.BR chroot (2)
system call. A process may get an entirely private namespace in case
system call.
A process may get an entirely private namespace in case
it \(em or one of its ancestors \(em was started by an invocation of the
.BR clone (2)
system call that had the CLONE_NEWNS flag set.)
@ -41,7 +45,8 @@ This handles the '/' part of the pathname.
If the pathname does not start with the '/' character, the
starting lookup directory of the resolution process is the current working
directory of the process. (This is also inherited from the parent.
directory of the process.
(This is also inherited from the parent.
It can be changed by use of the
.BR chdir (2)
system call.)
@ -54,7 +59,8 @@ Now, for each non-final component of the pathname, where a component
is a substring delimited by '/' characters, this component is looked up
in the current lookup directory.
If the process does not have search permission on the current lookup directory,
If the process does not have search permission on
the current lookup directory,
an EACCES error is returned ("Permission denied").
If the component is not found, an ENOENT error is returned
@ -69,7 +75,8 @@ next component.
If the component is found and is a symbolic link (symlink), we first
resolve this symbolic link (with the current lookup directory
as starting lookup directory). Upon error, that error is returned.
as starting lookup directory).
Upon error, that error is returned.
If the result is not a directory, an ENOTDIR error is returned.
If the resolution of the symlink is successful and returns a directory,
we set the current lookup directory to that directory, and go to
@ -78,7 +85,8 @@ Note that the resolution process here involves recursion.
In order to protect the kernel against stack overflow, and also
to protect against denial of service, there are limits on the
maximum recursion depth, and on the maximum number of symlinks
followed. An ELOOP error is returned when the maximum is
followed.
An ELOOP error is returned when the maximum is
exceeded ("Too many levels of symbolic links").
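The ELOOP case can be provoked with two symbolic links that point at each other. The sketch below is illustrative only; symlink_loop_errno() is a hypothetical name and the scratch directory under /tmp is an assumption.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a -> b and b -> a, then try to open "a".
 * Returns the errno value from open(), 0 if the open unexpectedly
 * succeeded, or -1 if the setup failed.
 */
int symlink_loop_errno(void)
{
    char dir[] = "/tmp/eloop_demo_XXXXXX";    /* assumed scratch location */
    char a[64], b[64];
    int fd, err = -1;

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(a, sizeof a, "%s/a", dir);
    snprintf(b, sizeof b, "%s/b", dir);

    /* Relative targets are resolved in the directory holding the link. */
    if (symlink("b", a) == 0 && symlink("a", b) == 0) {
        fd = open(a, O_RDONLY);
        if (fd == -1) {
            err = errno;
        } else {
            close(fd);
            err = 0;
        }
    }

    unlink(a);
    unlink(b);
    rmdir(dir);
    return err;
}
```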
.\"
.\" presently: max recursion depth during symlink resolution: 5
@ -92,7 +100,8 @@ directory (at least as far as the path resolution process is concerned \(em
it may have to be a directory, or a non-directory, because of
the requirements of the specific system call), and (ii) it
is not necessarily an error if the component is not found \(em
maybe we are just creating it. The details on the treatment
maybe we are just creating it.
The details on the treatment
of the final entry are described in the manual pages of the specific
system calls.
.SS ". and .."
@ -129,20 +138,23 @@ will operate on the symlink, while
.BR stat (2)
operates on the file pointed to by the symlink.
.SS "Length limit"
There is a maximum length for pathnames. If the pathname (or some
There is a maximum length for pathnames.
If the pathname (or some
intermediate pathname obtained while resolving symbolic links)
is too long, an ENAMETOOLONG error is returned ("File name too long").
.SS "Empty pathname"
In the original Unix, the empty pathname referred to the current directory.
Nowadays POSIX decrees that an empty pathname must not be resolved
successfully. Linux returns ENOENT in this case.
successfully.
Linux returns ENOENT in this case.
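The empty-pathname behaviour is easy to confirm on Linux. In this minimal sketch, empty_path_errno() is a hypothetical helper name.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to open the empty pathname.
 * Returns the errno value from open(), or 0 if the open
 * unexpectedly succeeded.
 */
int empty_path_errno(void)
{
    int fd = open("", O_RDONLY);

    if (fd == -1)
        return errno;
    close(fd);
    return 0;
}
```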
.SS "Permissions"
The permission bits of a file consist of three groups of three bits, cf.\&
.BR chmod (1)
and
.BR stat (2).
The first group of three is used when the effective user ID of
the current process equals the owner ID of the file. The second group
the current process equals the owner ID of the file.
The second group
of three is used when the group ID of the file either equals the
effective group ID of the current process, or is one of the
supplementary group IDs of the current process (as set by
@ -161,11 +173,14 @@ changed by the system call
(Here "fsuid" stands for something like "file system user ID".
The concept was required for the implementation of a user space
NFS server at a time when processes could send a signal to a process
with the same effective user ID. It is obsolete now. Nobody should use
with the same effective user ID.
It is obsolete now.
Nobody should use
.BR setfsuid (2).)
Similarly, Linux uses the fsgid ("file system group ID")
instead of the effective group ID. See
instead of the effective group ID.
See
.BR setfsgid (2).
.\" FIXME say something about filesystem mounted read-only ?
.SS "Bypassing permission checks: superuser and capabilities"

View File

@ -48,7 +48,8 @@ to call a signal-catching function.
The
.BR pause ()
function only returns when a signal was caught and the
signal-catching function returned. In this case
signal-catching function returned.
In this case
.BR pause ()
returns \-1, and
.I errno

View File

@ -18,7 +18,9 @@ pciconfig_read, pciconfig_write, pciconfig_iobase \- pci device information hand
.fi
.SH DESCRIPTION
.TP
Most of the interaction with PCI devices is already handled by the kernel PCI layer, and thus these calls should not normally need to be accessed from userspace.
Most of the interaction with PCI devices is already handled by the
kernel PCI layer,
and thus these calls should not normally need to be accessed from userspace.
.TP
.BR pciconfig_read ()
Reads to
@ -45,19 +47,24 @@ off
value.
.TP
.BR pciconfig_iobase ()
You pass it a bus/devfn pair and get a physical address for either the memory offset (for things like prep, this is 0xc0000000), the IO base for PIO cycles, or the ISA holes if any.
You pass it a bus/devfn pair and get a physical address for either the
memory offset (for things like prep, this is 0xc0000000),
the IO base for PIO cycles, or the ISA holes if any.
.SH "RETURN VALUE"
.TP
.BR pciconfig_read ()
On success zero is returned. On error, \-1 is returned and errno is set appropriately.
On success zero is returned.
On error, \-1 is returned and errno is set appropriately.
.TP
.BR pciconfig_write ()
On success zero is returned. On error, \-1 is returned and errno is set appropriately.
On success zero is returned.
On error, \-1 is returned and errno is set appropriately.
.TP
.BR pciconfig_iobase ()
Returns information on locations of various I/O regions in physical memory according to the
.I which
value. Values for
value.
Values for
.I which
are: IOBASE_BRIDGE_NUMBER, IOBASE_MEMORY, IOBASE_IO, IOBASE_ISA_IO, IOBASE_ISA_MEM.
.SH ERRORS

View File

@ -36,8 +36,10 @@ personality \- set the process execution domain
.BI "int personality(unsigned long " persona );
.SH DESCRIPTION
Linux supports different execution domains, or personalities, for each
process. Among other things, execution domains tell Linux how to map
signal numbers into signal actions. The execution domain system allows
process.
Among other things, execution domains tell Linux how to map
signal numbers into signal actions.
The execution domain system allows
Linux to provide limited support for binaries compiled under other
Unix-like operating systems.
@ -45,14 +47,16 @@ This function will return the current
.BR personality ()
when
.I persona
equals 0xffffffff. Otherwise, it will make the execution domain
equals 0xffffffff.
Otherwise, it will make the execution domain
referenced by
.I persona
the new execution domain of the current process.
.SH "RETURN VALUE"
On success, the previous
.I persona
is returned. On error, \-1 is returned, and
is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS

View File

@ -44,7 +44,8 @@ is for reading,
.I filedes[1]
is for writing.
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
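As an illustration of the two descriptors and the return convention, here is a small sketch.
The function name pipe_roundtrip and its parameters are hypothetical; it assumes the message is short enough to fit in the pipe buffer, so the write does not block.

```c
#include <string.h>
#include <unistd.h>

/* Send msg through a fresh pipe and read it back into out (size outlen).
 * Returns the number of bytes read, or -1 on failure. */
ssize_t pipe_roundtrip(const char *msg, char *out, size_t outlen)
{
    int filedes[2];
    size_t len = strlen(msg);
    ssize_t n = -1;

    if (pipe(filedes) == -1)
        return -1;

    /* filedes[1] is the write end, filedes[0] the read end. */
    if (write(filedes[1], msg, len) == (ssize_t) len)
        n = read(filedes[0], out, outlen);

    close(filedes[0]);
    close(filedes[1]);
    return n;
}
```

A caller would pass a buffer at least as large as the message and compare the returned count with the message length.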
.SH ERRORS

View File

@ -32,24 +32,30 @@ the current root of all relevant processes or threads.
\fBpivot_root\fP() may or may not change the current root and the current
working directory (cwd) of any processes or threads which use the old
root directory. The caller of \fBpivot_root\fP()
root directory.
The caller of \fBpivot_root\fP()
must ensure that processes with root or cwd at the old root operate
correctly in either case. An easy way to ensure this is to change their
correctly in either case.
An easy way to ensure this is to change their
root and cwd to \fInew_root\fP before invoking \fBpivot_root\fP().
The paragraph above is intentionally vague because the implementation
of \fBpivot_root\fP() may change in the future. At the time of writing,
of \fBpivot_root\fP() may change in the future.
At the time of writing,
\fBpivot_root\fP() changes root and cwd of each process or
thread to \fInew_root\fP if they point to the old root directory. This
thread to \fInew_root\fP if they point to the old root directory.
This
is necessary in order to prevent kernel threads from keeping the old
root directory busy with their root and cwd, even if they never access
the file system in any way. In the future, there may be a mechanism for
the file system in any way.
In the future, there may be a mechanism for
kernel threads to explicitly relinquish any access to the file system,
such that this fairly intrusive mechanism can be removed from
\fBpivot_root\fP().
Note that this also applies to the current process: \fBpivot_root\fP() may
or may not affect its cwd. It is therefore recommended to call
or may not affect its cwd.
It is therefore recommended to call
\fBchdir("/")\fP immediately after \fBpivot_root\fP().
The following restrictions apply to \fInew_root\fP and \fIput_old\fP:
@ -71,15 +77,18 @@ If the current root is not a mount point (e.g. after \fBchroot(2)\fP or
\fBpivot_root\fP(), see also below), not the old root directory, but the
mount point of that file system is mounted on \fIput_old\fP.
.SH NOTES
\fInew_root\fP does not have to be a mount point. In this case,
\fInew_root\fP does not have to be a mount point.
In this case,
\fI/proc/mounts\fP will show the mount point of the file system containing
\fInew_root\fP as root (\fI/\fP).
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
\fIerrno\fP is set appropriately.
.SH ERRORS
\fBpivot_root\fP() may return (in \fIerrno\fP) any of the errors returned by
\fBstat(2)\fP. Additionally, it may return:
\fBstat(2)\fP.
Additionally, it may return:
.TP
.B EBUSY
\fInew_root\fP or \fIput_old\fP are on the current root file system,

View File

@ -240,7 +240,8 @@ the number of structures which have non-zero
.I revents
fields (in other words, those descriptors with events or errors reported).
A value of 0 indicates that the call timed out and no file
descriptors were ready. On error, \-1 is returned, and
descriptors were ready.
On error, \-1 is returned, and
.I errno
is set appropriately.
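The difference between a timeout (return value 0) and a ready descriptor (positive return value) can be sketched as follows.
The function name poll_demo, the use of a pipe as the watched descriptor, and the 10 millisecond timeout are all assumptions made for illustration.

```c
#include <poll.h>
#include <unistd.h>

/* Poll the read end of a pipe twice: once while it is empty and once
 * after a byte has been written. Both poll() return values are stored
 * through the pointers. Returns 0 on success, -1 on setup failure. */
int poll_demo(int *empty_result, int *ready_result)
{
    int filedes[2];
    int rc = 0;

    if (pipe(filedes) == -1)
        return -1;

    struct pollfd pfd = { .fd = filedes[0], .events = POLLIN };

    /* Nothing to read yet, so this call should time out. */
    *empty_result = poll(&pfd, 1, 10);

    if (write(filedes[1], "x", 1) != 1)
        rc = -1;
    else
        *ready_result = poll(&pfd, 1, 10);  /* data is now pending */

    close(filedes[0]);
    close(filedes[1]);
    return rc;
}
```

The first result should be 0 (timed out) and the second should be 1, the number of structures with a non-zero revents field.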
.SH ERRORS

View File

@ -40,15 +40,16 @@ to perform appropriate optimisations.
The \fIadvice\fP applies to a (not necessarily existent) region starting
at \fIoffset\fP and extending for \fIlen\fP bytes (or until the end of
the file if \fIlen\fP is 0) within the file referred to by \fIfd\fP. The
advice is not binding; it merely constitutes an expectation on behalf of
the file if \fIlen\fP is 0) within the file referred to by \fIfd\fP.
The advice is not binding; it merely constitutes an expectation on behalf of
the application.
Permissible values for \fIadvice\fP include:
.TP
.B POSIX_FADV_NORMAL
Indicates that the application has no advice to give about its access
pattern for the specified data. If no advice is given for an open file,
pattern for the specified data.
If no advice is given for an open file,
this is the default assumption.
.TP
.B POSIX_FADV_SEQUENTIAL
@ -103,8 +104,10 @@ same semantics as \fBPOSIX_FADV_WILLNEED\fP.
This was probably a bug; since kernel 2.6.18, this flag is a no-op.
\fBPOSIX_FADV_DONTNEED\fP attempts to free cached pages associated with
the specified region. This is useful, for example, while streaming large
files. A program may periodically request the kernel to free cached data
the specified region.
This is useful, for example, while streaming large
files.
A program may periodically request the kernel to free cached data
that has already been used, so that more useful cached pages are not
discarded instead.
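The streaming pattern mentioned above might look roughly like this.
It is only a sketch: the function name stream_and_drop and the 4096-byte chunk size are assumptions, and because the advice is not binding the return value of posix_fadvise() is deliberately ignored.

```c
#define _XOPEN_SOURCE 600
#include <fcntl.h>
#include <unistd.h>

/* Read fd to end of file in fixed-size chunks, advising the kernel after
 * each chunk that the bytes just consumed are no longer needed.
 * Returns the total number of bytes read, or -1 on a read error. */
long stream_and_drop(int fd)
{
    char buf[4096];
    off_t done = 0;
    ssize_t n;

    while ((n = read(fd, buf, sizeof buf)) > 0) {
        /* Advice only; a failure here does not affect the read loop. */
        (void) posix_fadvise(fd, done, n, POSIX_FADV_DONTNEED);
        done += n;
    }
    return n < 0 ? -1 : (long) done;
}
```

A real program would process each chunk before dropping it, and might issue the advice less often than once per read.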

View File

@ -84,7 +84,8 @@ via PTRACE_DETACH.
The value of \fIrequest\fP determines the action to be performed:
.TP
PTRACE_TRACEME
Indicates that this process is to be traced by its parent. Any signal
Indicates that this process is to be traced by its parent.
Any signal
(except SIGKILL) delivered to this process will cause it to stop and its
parent to be notified via
.BR wait ().
@ -132,7 +133,8 @@ Copies the word
.IR data
to location
.IR addr
in the child's memory. As above, the two requests are currently equivalent.
in the child's memory.
As above, the two requests are currently equivalent.
.TP
PTRACE_POKEUSR
Copies the word
@ -173,7 +175,8 @@ Set signal information.
Copies a \fIsiginfo_t\fP structure from location \fIdata\fP in the
parent to the child.
This will only affect signals that would normally be delivered to
the child and were caught by the tracer. It may be difficult to tell
the child and were caught by the tracer.
It may be difficult to tell
these normal signals from synthetic signals generated by
.BR ptrace ()
itself. (\fIaddr\fP is ignored.)
@ -215,7 +218,8 @@ tracing the newly cloned process, which will start with a SIGSTOP.
The PID for the new process can be retrieved with PTRACE_GETEVENTMSG.
This option may not catch
.BR clone ()
calls in all cases. If the child calls
calls in all cases.
If the child calls
.BR clone ()
with the CLONE_VFORK flag, PTRACE_EVENT_VFORK will be delivered instead
if PTRACE_O_TRACEVFORK is set; otherwise if the child calls
@ -249,15 +253,16 @@ Retrieve a message (as an
.IR "unsigned long" )
about the ptrace event
that just happened, placing it in the location \fIdata\fP in the parent.
For PTRACE_EVENT_EXIT this is the child's exit status. For
PTRACE_EVENT_FORK, PTRACE_EVENT_VFORK and PTRACE_EVENT_CLONE this
For PTRACE_EVENT_EXIT this is the child's exit status.
For PTRACE_EVENT_FORK, PTRACE_EVENT_VFORK and PTRACE_EVENT_CLONE this
is the PID of the new process.
Since Linux 2.6.18, the PID of the new process is also available
for PTRACE_EVENT_VFORK_DONE.
(\fIaddr\fP is ignored.)
.TP
PTRACE_CONT
Restarts the stopped child process. If \fIdata\fP is non-zero and not
Restarts the stopped child process.
If \fIdata\fP is non-zero and not
SIGSTOP, it is interpreted as a signal to be delivered to the child;
otherwise, no signal is delivered.
Thus, for example, the parent can control
@ -279,8 +284,10 @@ the system call at the second stop.
.TP
PTRACE_SYSEMU, PTRACE_SYSEMU_SINGLESTEP (since Linux 2.6.14)
For PTRACE_SYSEMU, continue and stop on entry to the next syscall,
which will not be executed. For PTRACE_SYSEMU_SINGLESTEP, do the same
but also singlestep if not a syscall. This call is used by programs like
which will not be executed.
For PTRACE_SYSEMU_SINGLESTEP, do the same
but also singlestep if not a syscall.
This call is used by programs like
User Mode Linux that want to emulate all the child's system calls.
(\fIaddr\fP and \fIdata\fP are ignored;
not supported on all architectures.)
@ -409,7 +416,8 @@ or there was a word-alignment violation,
or an invalid signal was specified during a restart request.
.TP
.B EPERM
The specified process cannot be traced. This could be because the
The specified process cannot be traced.
This could be because the
parent has insufficient privileges (the required capability is
.BR CAP_SYS_PTRACE );
non-root processes cannot trace processes that they

View File

@ -54,8 +54,8 @@ The returned buffer consists of a sequence of null-terminated strings;
is set to the number of modules.
.TP
.B QM_REFS
Returns the names of all modules using the indicated module. This is
the inverse of
Returns the names of all modules using the indicated module.
This is the inverse of
.BR QM_DEPS .
The returned buffer consists of a sequence of null-terminated strings;
.I ret
@ -84,8 +84,8 @@ is set to the number of symbols.
.RE
.TP
.B QM_INFO
Returns miscellaneous information about the indicated module. The output
buffer format is:
Returns miscellaneous information about the indicated module.
The output buffer format is:
.RS
.PP
.nf
@ -114,7 +114,8 @@ is set to the size of the
structure.
.RE
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned and
On success, zero is returned.
On error, \-1 is returned and
.I errno
is set appropriately.
.SH ERRORS

View File

@ -97,14 +97,17 @@ quotactl \- manipulate disk quota
.SH DESCRIPTION
The quota system defines for each user and/or group a soft limit
and a hard limit bounding the amount of disk space that can be
used on a given file system. The hard limit cannot be crossed.
The soft limit can be crossed, but warnings will ensue. Moreover,
used on a given file system.
The hard limit cannot be crossed.
The soft limit can be crossed, but warnings will ensue.
Moreover,
the user cannot be above the soft limit for more than one week (by default)
at a time: after this week the soft limit counts as hard limit.
The
.BR quotactl ()
system call manipulates these quota. Its first argument is
system call manipulates these quota.
Its first argument is
of the form
.BI QCMD( subcmd , type )
where
@ -135,7 +138,8 @@ The
is one of
.TP 1.1i
.B Q_QUOTAON
Enable quota. The
Enable quota.
The
.I addr
argument is the pathname of the file containing
the quota for the filesystem.
@ -144,7 +148,8 @@ the quota for the filesystem.
Disable quota.
.TP
.B Q_GETQUOTA
Get limits and current usage of disk space. The
Get limits and current usage of disk space.
The
.I addr
argument is a pointer to a dqblk structure (defined in
.IR <sys/quota.h> ).
@ -170,7 +175,8 @@ Get collected stats.
.SH "RETURN VALUE"
On success,
.BR quotactl ()
returns 0. On error, \-1 is returned, and
returns 0.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS
@ -189,7 +195,8 @@ value.
.TP
.B EINVAL
.I type
is not a known quota type. Or,
is not a known quota type.
Or,
.I special
could not be found.
.TP

View File

@ -66,7 +66,8 @@ because we are reading from a pipe, or from a terminal), or because
\fBread\fP() was interrupted by a signal.
On error, \-1 is returned, and
.I errno
is set appropriately. In this case it is left unspecified whether
is set appropriately.
In this case it is left unspecified whether
the file position (if any) changes.
.SH ERRORS
.TP
@ -98,10 +99,12 @@ the value specified in
or the current file offset is not suitably aligned.
.TP
.B EIO
I/O error. This will happen for example when the process is in a
I/O error.
This will happen for example when the process is in a
background process group, tries to read from its controlling tty,
and either it is ignoring or blocking SIGTTIN or its process group
is orphaned. It may also occur when there is a low-level I/O error
is orphaned.
It may also occur when there is a low-level I/O error
while reading from a disk or tape.
.TP
.B EISDIR
@ -120,19 +123,22 @@ set to EINTR) or to return the number of bytes already read.
SVr4, 4.3BSD, POSIX.1-2001.
.SH RESTRICTIONS
On NFS file systems, reading small amounts of data will only update the
time stamp the first time, subsequent calls may not do so. This is caused
time stamp the first time, subsequent calls may not do so.
This is caused
by client side attribute caching, because most if not all NFS clients
leave st_atime (last file access time)
updates to the server and client side reads satisfied from the
client's cache will not cause st_atime updates on the server as there are no
server side reads. UNIX semantics can be obtained by disabling client
server side reads.
UNIX semantics can be obtained by disabling client
side attribute caching, but in most situations this will substantially
increase server load and decrease performance.
.PP
Many filesystems and disks were considered to be fast enough that the
implementation of
.B O_NONBLOCK
was deemed unnecessary. So, O_NONBLOCK may not be available on files
was deemed unnecessary.
So, O_NONBLOCK may not be available on files
and/or disks.
.SH "SEE ALSO"
.BR close (2),

View File

@ -103,8 +103,8 @@ The sum of the
.I iov_len
values overflows an
.I ssize_t
value. Or,
the vector count \fIcount\fR is less than zero or greater than the
value.
Or, the vector count \fIcount\fR is less than zero or greater than the
permitted maximum.
.SH "CONFORMING TO"
4.4BSD (the
@ -130,7 +130,8 @@ On Linux, the limit advertised by these mechanisms is 1024,
which is the true kernel limit.
However, the glibc wrapper functions do some extra work if
they detect that the underlying kernel system call failed because this
limit was exceeded. In the case of
limit was exceeded.
In the case of
.BR readv ()
the wrapper function allocates a temporary buffer large enough
for all of the items specified by

View File

@ -123,7 +123,8 @@ anything at present (2.1.122), but the type of reboot can be
determined by kernel command line arguments (`reboot=...') to be
either warm or cold, and either hard or through the BIOS.
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS

View File

@ -95,7 +95,8 @@ with a NULL
parameter.
.PP
All three routines return the length of the message on successful
completion. If a message is too long to fit in the supplied buffer, excess
completion.
If a message is too long to fit in the supplied buffer, excess
bytes may be discarded depending on the type of socket the message is
received from.
.PP
@ -138,7 +139,8 @@ specifies that queued errors should be received from the socket error queue.
The error is passed in
an ancillary message with a type dependent on the protocol (for IPv4
.BR IP_RECVERR ).
The user should supply a buffer of sufficient size. See
The user should supply a buffer of sufficient size.
See
.BR cmsg (3)
and
.BR ip (7)
@ -193,7 +195,8 @@ struct sockaddr *SO_EE_OFFENDER(struct sock_extended_err *);
contains the errno number of the queued error.
.B ee_origin
is the origin code of where the error originated.
The other fields are protocol specific. The macro
The other fields are protocol specific.
The macro
.B SOCK_EE_OFFENDER
returns a pointer to the address of the network object
where the error originated from given a pointer to the ancillary message.
@ -205,8 +208,8 @@ contains
.B AF_UNSPEC
and the other fields of the
.B sockaddr
are undefined. The payload of the packet
that caused the error is passed as normal data.
are undefined.
The payload of the packet that caused the error is passed as normal data.
.IP
For local errors, no address is passed (this
can be checked with the
@ -224,22 +227,27 @@ on the next socket operation.
.TP
.B MSG_OOB
This flag requests receipt of out-of-band data that would not be received
in the normal data stream. Some protocols place expedited data
in the normal data stream.
Some protocols place expedited data
at the head of the normal data queue, and thus this flag cannot
be used with such protocols.
.TP
.B MSG_PEEK
This flag causes the receive operation to return data from the beginning of the
receive queue without removing that data from the queue. Thus, a
This flag causes the receive operation to
return data from the beginning of the
receive queue without removing that data from the queue.
Thus, a
subsequent receive call will return the same data.
.TP
.B MSG_TRUNC
Return the real length of the packet, even when it was longer than
the passed buffer. Only valid for packet sockets.
the passed buffer.
Only valid for packet sockets.
.TP
.B MSG_WAITALL
This flag requests that the operation block until the full request is
satisfied. However, the call may still return less data than requested if
satisfied.
However, the call may still return less data than requested if
a signal is caught, an error or disconnect occurs, or the next data to be
received is of a different type than that returned.
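Of the flags above, MSG_PEEK is perhaps the easiest to demonstrate.
The following sketch uses a hypothetical helper, peek_then_recv, and assumes a local AF_UNIX stream socket pair as the transport; it peeks at a short message and then receives it normally.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Send msg over a local socket pair, peek at it, then receive it.
 * peeked and taken must each have room for len bytes.
 * Returns 0 if both receives returned len bytes, -1 otherwise. */
int peek_then_recv(const char *msg, size_t len, char *peeked, char *taken)
{
    int sv[2];
    int rc = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    if (send(sv[0], msg, len, 0) == (ssize_t) len &&
        recv(sv[1], peeked, len, MSG_PEEK) == (ssize_t) len &&  /* data stays queued */
        recv(sv[1], taken, len, 0) == (ssize_t) len)            /* data is consumed */
        rc = 0;

    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

After a successful call both buffers should hold the same bytes, since the peek leaves the data in the receive queue.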
.PP
@ -247,8 +255,8 @@ The
.BR recvmsg ()
call uses a
.I msghdr
structure to minimize the number of directly supplied parameters. This
structure has the following form, as defined in
structure to minimize the number of directly supplied parameters.
This structure has the following form, as defined in
.IR <sys/socket.h> :
.in +0.25i
.nf
@ -283,7 +291,8 @@ The field
which has length
.IR msg_controllen ,
points to a buffer for other protocol control related messages or
miscellaneous ancillary data. When
miscellaneous ancillary data.
When
.BR recvmsg ()
is called,
.I msg_controllen
@ -339,12 +348,14 @@ indicates that no data was received but an extended error from the socket
error queue.
.SH "RETURN VALUE"
These calls return the number of bytes received, or \-1
if an error occurred. The return value will be 0 when the
if an error occurred.
The return value will be 0 when the
peer has performed an orderly shutdown.
.SH ERRORS
These are some standard errors generated by the socket layer. Additional errors
may be generated and returned from the underlying protocol modules; see their
manual pages.
These are some standard errors generated by the socket layer.
Additional errors
may be generated and returned from the underlying protocol modules;
see their manual pages.
.TP
.B EAGAIN
The socket is marked non-blocking and the receive operation

View File

@ -89,7 +89,8 @@ refers to a symbolic link the link is renamed; if
.I newpath
refers to a symbolic link the link will be overwritten.
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS
@ -217,10 +218,13 @@ even if the same filesystem is mounted on both.)
4.3BSD, C89, C99, POSIX.1-2001.
.SH BUGS
On NFS filesystems, you can not assume that if the operation
failed the file was not renamed. If the server does the rename operation
failed the file was not renamed.
If the server does the rename operation
and then crashes, the retransmitted RPC which will be processed when the
server is up again causes a failure. The application is expected to
deal with this. See
server is up again causes a failure.
The application is expected to
deal with this.
See
.BR link (2)
for a similar problem.
.SH "SEE ALSO"

View File

@ -38,7 +38,8 @@ rmdir \- delete a directory
.BR rmdir ()
deletes a directory, which must be empty.
.SH "RETURN VALUE"
On success, zero is returned. On error, \-1 is returned, and
On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS

View File

@ -43,7 +43,8 @@ returns the maximum priority value that can be used with the
scheduling algorithm identified by \fIpolicy\fR.
.BR sched_get_priority_min ()
returns the minimum priority value that can be used with the
scheduling algorithm identified by \fIpolicy\fR. Supported \fIpolicy\fR
scheduling algorithm identified by \fIpolicy\fR.
Supported \fIpolicy\fR
values are
.IR SCHED_FIFO ,
.IR SCHED_RR ,
@ -54,7 +55,8 @@ Further details about these policies can be found in
.BR sched_setscheduler (2).
Processes with numerically higher priority values are scheduled before
processes with numerically lower priority values. Thus, the value
processes with numerically lower priority values.
Thus, the value
returned by \fBsched_get_priority_max\fR() will be greater than the
value returned by \fBsched_get_priority_min\fR().
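A portable program can query the range instead of hard-coding it, as in this minimal sketch.
The helper name priority_span is hypothetical.

```c
#include <sched.h>

/* Return the width of the static priority range for policy
 * (maximum minus minimum), or -1 if either query fails. */
int priority_span(int policy)
{
    int lo = sched_get_priority_min(policy);
    int hi = sched_get_priority_max(policy);

    if (lo == -1 || hi == -1)
        return -1;
    return hi - lo;
}
```

For SCHED_FIFO and SCHED_RR the result should be positive; neither call requires special privileges.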

View File

@ -48,8 +48,10 @@ sched_setparam, sched_getparam \- set and get scheduling parameters
.SH DESCRIPTION
.BR sched_setparam ()
sets the scheduling parameters associated with the scheduling policy
for the process identified by \fIpid\fR. If \fIpid\fR is zero, then
the parameters of the current process are set. The interpretation of
for the process identified by \fIpid\fR.
If \fIpid\fR is zero, then
the parameters of the current process are set.
The interpretation of
the parameter \fIparam\fR depends on the scheduling
policy of the process identified by
.IR pid .
@ -59,12 +61,14 @@ for a description of the scheduling policies supported under Linux.
.BR sched_getparam ()
retrieves the scheduling parameters for the
process identified by \fIpid\fR. If \fIpid\fR is zero, then the parameters
process identified by \fIpid\fR.
If \fIpid\fR is zero, then the parameters
of the current process are retrieved.
.BR sched_setparam ()
checks the validity of \fIparam\fR for the scheduling policy of the
process. The parameter \fIparam->sched_priority\fR must lie within the
process.
The parameter \fIparam->sched_priority\fR must lie within the
range given by \fBsched_get_priority_min\fR(2) and
\fBsched_get_priority_max\fR(2).

View File

@ -57,9 +57,12 @@ set and get scheduling algorithm/parameters
.SH DESCRIPTION
.BR sched_setscheduler ()
sets both the scheduling policy and the associated parameters for the
process identified by \fIpid\fP. If \fIpid\fP equals zero, the
scheduler of the calling process will be set. The interpretation of
the parameter \fIparam\fP depends on the selected policy. Currently, the
process identified by \fIpid\fP.
If \fIpid\fP equals zero, the
scheduler of the calling process will be set.
The interpretation of
the parameter \fIparam\fP depends on the selected policy.
Currently, the
following three scheduling policies are supported under Linux:
.IR SCHED_FIFO ,
.IR SCHED_RR ,
@ -72,20 +75,26 @@ their respective semantics are described below.
.BR sched_getscheduler ()
queries the scheduling policy currently applied to the process
identified by \fIpid\fP. If \fIpid\fP equals zero, the policy of the
identified by \fIpid\fP.
If \fIpid\fP equals zero, the policy of the
calling process will be retrieved.
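Querying the calling process can be sketched as follows.
The function name current_sched is hypothetical, and pairing the call with sched_getparam(2) to obtain the static priority is a choice made for this example.

```c
#include <sched.h>

/* Fetch the scheduling policy and static priority of the calling
 * process (pid 0). Returns 0 on success, -1 on failure. */
int current_sched(int *policy, int *priority)
{
    struct sched_param param;
    int pol = sched_getscheduler(0);

    if (pol == -1 || sched_getparam(0, &param) == -1)
        return -1;

    *policy = pol;
    *priority = param.sched_priority;
    return 0;
}
```

For an ordinary process the policy is typically SCHED_OTHER with a static priority of 0.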
.SS Scheduling Policies
The scheduler is the kernel part that decides which runnable process
will be executed by the CPU next. The Linux scheduler offers three
will be executed by the CPU next.
The Linux scheduler offers three
different scheduling policies, one for normal processes and two for
real-time applications. A static priority value \fIsched_priority\fP
real-time applications.
A static priority value \fIsched_priority\fP
is assigned to each process and this value can be changed only via
system calls. Conceptually, the scheduler maintains a list of runnable
system calls.
Conceptually, the scheduler maintains a list of runnable
processes for each possible \fIsched_priority\fP value, and
\fIsched_priority\fP can have a value in the range 0 to 99. In order
\fIsched_priority\fP can have a value in the range 0 to 99.
In order
to determine the process that runs next, the Linux scheduler looks for
the non-empty list with the highest static priority and takes the
process at the head of this list. The scheduling policy determines for
process at the head of this list.
The scheduling policy determines for
each process, where it will be inserted into the list of processes
with equal static priority and how it will move inside this list.
@ -108,7 +117,8 @@ POSIX.1-2001 conforming systems.
All scheduling is preemptive: If a process with a higher static
priority gets ready to run, the current process will be preempted and
returned into its wait list. The scheduling policy only determines the
returned into its wait list.
The scheduling policy only determines the
ordering within the list of runnable processes with equal static
priority.
.SS SCHED_FIFO: First In-First Out scheduling
@ -117,13 +127,16 @@ priority.
it will always immediately preempt any currently running
\fISCHED_OTHER\fP or \fISCHED_BATCH\fP process.
\fISCHED_FIFO\fP is a simple scheduling
algorithm without time slicing. For processes scheduled under the
algorithm without time slicing.
For processes scheduled under the
\fISCHED_FIFO\fP policy, the following rules are applied: A
\fISCHED_FIFO\fP process that has been preempted by another process of
higher priority will stay at the head of the list for its priority and
will resume execution as soon as all processes of higher priority are
blocked again. When a \fISCHED_FIFO\fP process becomes runnable, it
will be inserted at the end of the list for its priority. A call to
blocked again.
When a \fISCHED_FIFO\fP process becomes runnable, it
will be inserted at the end of the list for its priority.
A call to
\fBsched_setscheduler\fP() or \fBsched_setparam\fP() will put the
\fISCHED_FIFO\fP (or \fISCHED_RR\fP) process identified by
\fIpid\fP at the start of the list if it was runnable.
@ -134,21 +147,27 @@ of the list.)
.\" In 2.2.x and 2.4.x, the process is placed at the front of the queue
.\" In 2.0.x, the Right Thing happened: the process went to the back -- MTK
A process calling \fBsched_yield\fP() will be
put at the end of the list. No other events will move a process
put at the end of the list.
No other events will move a process
scheduled under the \fISCHED_FIFO\fP policy in the wait list of
runnable processes with equal static priority. A \fISCHED_FIFO\fP
runnable processes with equal static priority.
A \fISCHED_FIFO\fP
process runs until either it is blocked by an I/O request, it is
preempted by a higher priority process, or it calls \fBsched_yield\fP().
.SS SCHED_RR: Round Robin scheduling
\fISCHED_RR\fP is a simple enhancement of \fISCHED_FIFO\fP. Everything
\fISCHED_RR\fP is a simple enhancement of \fISCHED_FIFO\fP.
Everything
described above for \fISCHED_FIFO\fP also applies to \fISCHED_RR\fP,
except that each process is only allowed to run for a maximum time
quantum. If a \fISCHED_RR\fP process has been running for a time
quantum.
If a \fISCHED_RR\fP process has been running for a time
period equal to or longer than the time quantum, it will be put at the
end of the list for its priority. A \fISCHED_RR\fP process that has
end of the list for its priority.
A \fISCHED_RR\fP process that has
been preempted by a higher priority process and subsequently resumes
execution as a running process will complete the unexpired portion of
its round robin time quantum. The length of the time quantum can be
its round robin time quantum.
The length of the time quantum can be
retrieved using \fBsched_rr_get_interval\fP(2).
.\" On Linux 2.4, the length of the RR interval is influenced
.\" by the process nice value -- MTK
@ -157,12 +176,15 @@ retrieved using \fBsched_rr_get_interval\fP(2).
\fISCHED_OTHER\fP can only be used at static priority 0.
\fISCHED_OTHER\fP is the standard Linux time-sharing scheduler that is
intended for all processes that do not require special static priority
real-time mechanisms. The process to run is chosen from the static
real-time mechanisms.
The process to run is chosen from the static
priority 0 list based on a dynamic priority that is determined only
inside this list. The dynamic priority is based on the nice level (set
inside this list.
The dynamic priority is based on the nice level (set
by \fBnice\fP(2) or \fBsetpriority\fP(2)) and increased for
each time quantum the process is ready to run, but denied to run by
the scheduler. This ensures fair progress among all \fISCHED_OTHER\fP
the scheduler.
This ensures fair progress among all \fISCHED_OTHER\fP
processes.
.SS SCHED_BATCH: Scheduling batch processes
(Since Linux 2.6.16.)
@ -235,7 +257,8 @@ processes ignore this limit; as with older kernels,
they can make arbitrary changes to scheduling policy and priority.
.SS Response time
A blocked high priority process waiting for the I/O has a certain
response time before it is scheduled again. The device driver writer
response time before it is scheduled again.
The device driver writer
can greatly reduce this response time by using a "slow interrupt"
interrupt handler.
.\" as described in
@ -256,7 +279,8 @@ As a non-blocking end-less loop in a process scheduled under
\fISCHED_FIFO\fP or \fISCHED_RR\fP will block all processes with lower
priority forever, a software developer should always keep available on
the console a shell scheduled under a higher static priority than the
tested application. This will allow an emergency kill of tested
tested application.
This will allow an emergency kill of tested
real-time applications that do not block or terminate as expected.
POSIX systems on which

View File

@ -127,7 +127,8 @@ those in
will be watched to see if a write will not block, and
those in
.I exceptfds
will be watched for exceptions. On exit, the sets are modified in place
will be watched for exceptions.
On exit, the sets are modified in place
to indicate which file descriptors actually changed status.
Each of the three file descriptor sets may be specified as NULL
if no file descriptors are to be watched for the corresponding class
@ -152,9 +153,12 @@ is the highest-numbered file descriptor in any of the three sets, plus 1.
.I timeout
is an upper bound on the amount of time elapsed before
.BR select ()
returns. It may be zero, causing
returns.
It may be zero, causing
.BR select ()
to return immediately. (This is useful for polling.) If
to return immediately.
(This is useful for polling.)
If
.I timeout
is NULL (no timeout),
.BR select ()
@ -199,7 +203,8 @@ is needed is that if one wants to wait for either a signal
or for a file descriptor to become ready, then
an atomic test is needed to prevent race conditions.
(Suppose the signal handler sets a global flag and
returns. Then a test of this global flag followed by a call of
returns.
Then a test of this global flag followed by a call of
.BR select ()
could hang indefinitely if the signal arrived just after the test
but just before the call.
@ -258,7 +263,8 @@ This causes problems both when Linux code which reads
is ported to other operating systems, and when code is ported to Linux
that reuses a struct timeval for multiple
.BR select ()s
in a loop without reinitializing it. Consider
in a loop without reinitializing it.
Consider
.I timeout
to be undefined after
.BR select ()
@ -343,9 +349,11 @@ main(void)
conforms to POSIX.1-2001 and
4.4BSD
.RB ( select ()
first appeared in 4.2BSD). Generally portable to/from
first appeared in 4.2BSD).
Generally portable to/from
non-BSD systems supporting clones of the BSD socket layer (including
System V variants). However, note that the System V variant typically
System V variants).
However, note that the System V variant typically
sets the timeout variable before exit, but the BSD variant does not.
.PP
.BR pselect ()
@ -362,7 +370,8 @@ or
with a value of
.I fd
that is negative or is equal to or larger than FD_SETSIZE will result
in undefined behavior. Moreover, POSIX requires
in undefined behavior.
Moreover, POSIX requires
.I fd
to be a valid file descriptor.
@ -463,9 +472,11 @@ in the main program.)
Under Linux,
.BR select ()
may report a socket file descriptor as "ready for reading", while
nevertheless a subsequent read blocks. This could for example
nevertheless a subsequent read blocks.
This could for example
happen when data has arrived but upon examination has wrong
checksum and is discarded. There may be other circumstances
checksum and is discarded.
There may be other circumstances
in which a file descriptor is spuriously reported as ready.
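
One common guard against the spurious readiness described above is to put the descriptor in non-blocking mode, so that a read after select() fails with EAGAIN instead of blocking. The following is a minimal sketch, not part of the manual page; the helper name read_when_ready is hypothetical, and it assumes fd is a valid descriptor smaller than FD_SETSIZE.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: wait until fd is reported readable, then read
   without risking a block if the readiness report was spurious.
   Returns the number of bytes read, 0 on end-of-file, or -1 with
   errno set.  Assumes 0 <= fd < FD_SETSIZE. */
static ssize_t
read_when_ready(int fd, void *buf, size_t len)
{
    /* Switch the descriptor to non-blocking mode. */
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;

    for (;;) {
        fd_set rfds;

        /* The set is modified in place, so rebuild it every time. */
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);

        if (select(fd + 1, &rfds, NULL, NULL, NULL) == -1) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }

        ssize_t n = read(fd, buf, len);
        if (n >= 0)
            return n;               /* data or end-of-file */
        if (errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR)
            return -1;              /* genuine error */
        /* Readiness was spurious (or a signal arrived): wait again. */
    }
}
```

The helper leaves the descriptor in non-blocking mode; a caller that needs the original flags back would have to restore them with fcntl().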
.\" Stevens discusses a case where accept can block after select
.\" returns successfully because of an intervening RST from the client.

View File

@ -68,10 +68,13 @@ synchronous I/O multiplexing
most C programs that
handle more than one simultaneous file descriptor (or socket handle)
in an efficient
manner. Its principal arguments are three arrays of file descriptors:
\fIreadfds\fP, \fIwritefds\fP, and \fIexceptfds\fP. The way that
manner.
Its principal arguments are three arrays of file descriptors:
\fIreadfds\fP, \fIwritefds\fP, and \fIexceptfds\fP.
The way that
\fBselect\fP() is usually used is to block while waiting for a "change of
status" on one or more of the file descriptors. A "change of status" is
status" on one or more of the file descriptors.
A "change of status" is
when more characters become available from the file descriptor, \fIor\fP
when space becomes available within the kernel's internal buffers for
more to be written to the file descriptor, \fIor\fP when a file
@ -85,7 +88,8 @@ The arrays of file descriptors are called \fIfile descriptor sets\fP.
Each set is declared as type \fBfd_set\fP, and its contents can be
altered with the macros \fBFD_CLR\fP(), \fBFD_ISSET\fP(), \fBFD_SET\fP(), and
\fBFD_ZERO\fP(). \fBFD_ZERO\fP() is usually the first function to be used on
a newly declared set. Thereafter, the individual file descriptors that
a newly declared set.
Thereafter, the individual file descriptors that
you are interested in can be added one by one with \fBFD_SET\fP().
\fBselect\fP() modifies the contents of the sets according to the rules
described below; after calling \fBselect\fP() you can test if your file
@ -96,7 +100,8 @@ it is not. \fBFD_CLR\fP() removes a file descriptor from the set.
.TP
\fIreadfds\fP
This set is watched to see if data is available for reading from any of
its file descriptors. After \fBselect\fP() has returned, \fIreadfds\fP will be
its file descriptors.
After \fBselect\fP() has returned, \fIreadfds\fP will be
cleared of all file descriptors except for those file descriptors that
are immediately available for reading with a \fBrecv\fP() (for sockets) or
\fBread\fP() (for pipes, files, and sockets) call.
@ -111,21 +116,29 @@ are immediately available for writing with a \fBsend\fP() (for sockets) or
.TP
\fIexceptfds\fP
This set is watched for exceptions or errors on any of the file
descriptors. However, that is actually just a rumor. How you use
\fIexceptfds\fP is to watch for \fIout\-of\-band\fP (OOB) data. OOB data
descriptors.
However, that is actually just a rumor.
How you use
\fIexceptfds\fP is to watch for \fIout\-of\-band\fP (OOB) data.
OOB data
is data sent on a socket using the \fBMSG_OOB\fP flag, and hence
\fIexceptfds\fP only really applies to sockets. See \fBrecv\fP(2) and
\fBsend\fP(2) about this. After \fBselect\fP() has returned,
\fIexceptfds\fP only really applies to sockets.
See \fBrecv\fP(2) and
\fBsend\fP(2) about this.
After \fBselect\fP() has returned,
\fIexceptfds\fP will be cleared of all file descriptors except for those
descriptors that are available for reading OOB data. You can only ever
descriptors that are available for reading OOB data.
You can only ever
read one byte of OOB data though (which is done with \fBrecv\fP()), and
writing OOB data (done with \fBsend\fP()) can be done at any time and will
not block. Hence there is no need for a fourth set to check if a socket
not block.
Hence there is no need for a fourth set to check if a socket
is available for writing OOB data.
.TP
\fInfds\fP
This is an integer one more than the maximum of any file descriptor in
any of the sets. In other words, while you are busy adding file descriptors
any of the sets.
In other words, while you are busy adding file descriptors
to your sets, you must calculate the maximum integer value of all of
them, then increment this value by one, and then pass this as \fInfds\fP to
\fBselect\fP().
@ -133,10 +146,12 @@ them, then increment this value by one, and then pass this as \fInfds\fP to
\fIutimeout\fP
.RS
This is the longest time \fBselect\fP() must wait before returning, even
if nothing interesting happened. If this value is passed as NULL,
if nothing interesting happened.
If this value is passed as NULL,
then \fBselect\fP() blocks indefinitely waiting for an event.
\fIutimeout\fP can be set to zero seconds, which causes \fBselect\fP() to
return immediately. The structure \fBstruct timeval\fP is defined as,
return immediately.
The structure \fBstruct timeval\fP is defined as,
.PP
.nf
struct timeval {
@ -164,26 +179,39 @@ This argument holds a set of signals to allow while performing a
\fBpselect\fP() call (see \fBsigaddset\fP(3) and \fBsigprocmask\fP(2)).
It can be passed
as NULL, in which case it does not modify the set of allowed signals on
entry and exit to the function. It will then behave just like \fBselect\fP().
entry and exit to the function.
It will then behave just like \fBselect\fP().
.SH COMBINING SIGNAL AND DATA EVENTS
\fBpselect\fP() must be used if you are waiting for a signal as well as
data from a file descriptor. Programs that receive signals as events
normally use the signal handler only to raise a global flag. The global
data from a file descriptor.
Programs that receive signals as events
normally use the signal handler only to raise a global flag.
The global
flag will indicate that the event must be processed in the main loop of
the program. A signal will cause the \fBselect\fP() (or \fBpselect\fP())
call to return with \fIerrno\fP set to \fBEINTR\fP. This behavior is
the program.
A signal will cause the \fBselect\fP() (or \fBpselect\fP())
call to return with \fIerrno\fP set to \fBEINTR\fP.
This behavior is
essential so that signals can be processed in the main loop of the
program, otherwise \fBselect\fP() would block indefinitely. Now, somewhere
in the main loop will be a conditional to check the global flag. So we
program, otherwise \fBselect\fP() would block indefinitely.
Now, somewhere
in the main loop will be a conditional to check the global flag.
So we
must ask: what if a signal arrives after the conditional, but before the
\fBselect\fP() call? The answer is that \fBselect\fP() would block
indefinitely, even though an event is actually pending. This race
condition is solved by the \fBpselect\fP() call. This call can be used to
indefinitely, even though an event is actually pending.
This race
condition is solved by the \fBpselect\fP() call.
This call can be used to
mask out signals that are not to be received except within the
\fBpselect\fP() call. For instance, let us say that the event in question
was the exit of a child process. Before the start of the main loop, we
would block \fBSIGCHLD\fP using \fBsigprocmask\fP(). Our \fBpselect\fP()
call would enable \fBSIGCHLD\fP by using the virgin signal mask. Our
\fBpselect\fP() call.
For instance, let us say that the event in question
was the exit of a child process.
Before the start of the main loop, we
would block \fBSIGCHLD\fP using \fBsigprocmask\fP().
Our \fBpselect\fP()
call would enable \fBSIGCHLD\fP by using the virgin signal mask.
Our
program would look like:
.PP
.nf
@ -222,10 +250,13 @@ So what is the point of \fBselect\fP()? Can't I just read and write to my
descriptors whenever I want?
The point of \fBselect\fP() is that it watches
multiple descriptors at the same time and properly puts the process to
sleep if there is no activity. It does this while enabling you to handle
multiple simultaneous pipes and sockets. Unix programmers often find
sleep if there is no activity.
It does this while enabling you to handle
multiple simultaneous pipes and sockets.
Unix programmers often find
themselves in a position where they have to handle I/O from more than one
file descriptor where the data flow may be intermittent. If you were to
file descriptor where the data flow may be intermittent.
If you were to
merely create a sequence of \fBread\fP() and \fBwrite\fP() calls, you would
find that one of your calls may block waiting for data from/to a file
descriptor, while another file descriptor is unused though available
@ -507,31 +538,41 @@ main(int argc, char **argv)
.fi
.PP
The above program properly forwards most kinds of TCP connections
including OOB signal data transmitted by \fBtelnet\fP servers. It
including OOB signal data transmitted by \fBtelnet\fP servers.
It
handles the tricky problem of having data flow in both directions
simultaneously. You might think it more efficient to use a \fBfork\fP()
call and devote a thread to each stream. This becomes more tricky than
you might suspect. Another idea is to set non-blocking I/O using an
\fBioctl\fP() call. This also has its problems because you end up having
simultaneously.
You might think it more efficient to use a \fBfork\fP()
call and devote a thread to each stream.
This becomes more tricky than
you might suspect.
Another idea is to set non-blocking I/O using an
\fBioctl\fP() call.
This also has its problems because you end up having
to have inefficient timeouts.
The program does not handle more than one simultaneous connection at a
time, although it could easily be extended to do this with a linked list
of buffers \(em one for each connection. At the moment, new
of buffers \(em one for each connection.
At the moment, new
connections cause the current connection to be dropped.
.SH SELECT LAW
Many people who try to use \fBselect\fP() come across behavior that is
difficult to understand and produces non-portable or borderline
results. For instance, the above program is carefully written not to
results.
For instance, the above program is carefully written not to
block at any point, even though it does not set its file descriptors to
non-blocking mode at all (see \fBioctl\fP(2)). It is easy to introduce
non-blocking mode at all (see \fBioctl\fP(2)).
It is easy to introduce
subtle errors that will remove the advantage of using \fBselect\fP(),
hence I will present a list of essentials to watch for when using the
\fBselect\fP() call.
.TP
\fB1.\fP
You should always try to use \fBselect\fP() without a timeout. Your program
should have nothing to do if there is no data available. Code that
You should always try to use \fBselect\fP() without a timeout.
Your program
should have nothing to do if there is no data available.
Code that
depends on timeouts is not usually portable and is difficult to debug.
.TP
\fB2.\fP
@ -541,7 +582,8 @@ explained above.
\fB3.\fP
No file descriptor must be added to any set if you do not intend
to check its result after the \fBselect\fP() call, and respond
appropriately. See next rule.
appropriately.
See next rule.
.TP
\fB4.\fP
After \fBselect\fP() returns, all file descriptors in all sets
@ -554,14 +596,18 @@ should be checked to see if they are ready.
\fB5.\fP
The functions \fBread\fP(), \fBrecv\fP(), \fBwrite\fP(), and
\fBsend\fP() do \fInot\fP necessarily read/write the full amount of data
that you have requested. If they do read/write the full amount, its
because you have a low traffic load and a fast stream. This is not
always going to be the case. You should cope with the case of your
that you have requested.
If they do read/write the full amount, its
because you have a low traffic load and a fast stream.
This is not
always going to be the case.
You should cope with the case of your
functions only managing to send or receive a single byte.
.TP
\fB6.\fP
Never read/write only in single bytes at a time unless your are really
sure that you have a small amount of data to process. It is extremely
sure that you have a small amount of data to process.
It is extremely
inefficient not to read/write as much data as you can buffer each time.
The buffers in the example above are 1024 bytes although they could
easily be made larger.
@ -575,9 +621,12 @@ or with
.I errno
set to \fBEAGAIN\fP (\fBEWOULDBLOCK\fP).
These results must be properly managed (not done properly
above). If your program is not going to receive any signals then
it is unlikely you will get \fBEINTR\fP. If your program does not
set non-blocking I/O, you will not get \fBEAGAIN\fP. Nonetheless
above).
If your program is not going to receive any signals then
it is unlikely you will get \fBEINTR\fP.
If your program does not
set non-blocking I/O, you will not get \fBEAGAIN\fP.
Nonetheless
you should still cope with these errors for completeness.
.TP
\fB8.\fP
@ -603,8 +652,10 @@ however does not modify its timeout structure.
.TP
\fB11.\fP
I have heard that the Windows socket layer does not cope with OOB data
properly. It also does not cope with \fBselect\fP() calls when no file
descriptors are set at all. Having no file descriptors set is a useful
properly.
It also does not cope with \fBselect\fP() calls when no file
descriptors are set at all.
Having no file descriptors set is a useful
way to sleep the process with sub-second precision by using the timeout.
(See further on.)
.SH USLEEP EMULATION
@ -630,7 +681,8 @@ The file descriptors set should be all
empty (but may not be on some systems).
A return value of \-1 indicates an error, with \fIerrno\fP being
set appropriately. In the case of an error, the returned sets and
set appropriately.
In the case of an error, the returned sets and
the timeout struct contents are undefined and should not be used.
\fBpselect\fP() however never modifies \fIntimeout\fP.
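
Rules 5 and 7 of the SELECT LAW section (short writes, EINTR, and EAGAIN) can be combined into one loop. The sketch below is an assumption about how a caller might do this, not code from the tutorial; the helper name write_all is hypothetical, and fd is assumed to be smaller than FD_SETSIZE.

```c
#include <errno.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: write all len bytes of buf to fd, coping with
   short writes, EINTR, and EAGAIN.  Returns 0 on success, -1 on error.
   Assumes 0 <= fd < FD_SETSIZE. */
static int
write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);

        if (n > 0) {
            /* Possibly a short write: advance and try the rest. */
            buf += n;
            len -= (size_t) n;
            continue;
        }
        if (n == -1 && errno == EINTR)
            continue;               /* interrupted by a signal: retry */
        if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            /* Non-blocking descriptor with no room: wait until
               select() reports it writable, then retry. */
            fd_set wfds;

            FD_ZERO(&wfds);
            FD_SET(fd, &wfds);
            if (select(fd + 1, NULL, &wfds, NULL, NULL) == -1
                    && errno != EINTR)
                return -1;
            continue;
        }
        return -1;                  /* genuine error, or write() returned 0 */
    }
    return 0;
}
```

The same structure applies to send() on sockets; only the call that transfers the data changes.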
.SH NOTES

View File

@ -283,7 +283,8 @@ time specified by the
.B timespec
structure whose address is passed in the
.I timeout
parameter. If the specified time limit has been reached,
parameter.
If the specified time limit has been reached,
.BR semtimedop ()
fails with
.I errno

View File

@ -96,7 +96,8 @@ and
.I tolen
are ignored (and the error EISCONN may be returned when they are
not NULL and 0), and the error ENOTCONN is returned when the socket was
not actually connected. Otherwise, the address of the target is given by
not actually connected.
Otherwise, the address of the target is given by
.I to
with
.I tolen
@ -137,7 +138,8 @@ Locally detected errors are indicated by a return value of \-1.
When the message does not fit into the send buffer of the socket,
.BR send ()
normally blocks, unless the socket has been placed in non-blocking I/O
mode. In non-blocking mode it would return
mode.
In non-blocking mode it would return
.B EAGAIN
in this case.
The
@ -152,20 +154,24 @@ of zero or more of the following flags.
.TP
.BR MSG_CONFIRM " (Linux 2.3+ only)"
Tell the link layer that forward progress happened: you got a successful
reply from the other side. If the link layer doesn't get this
reply from the other side.
If the link layer doesn't get this
it will regularly reprobe the neighbour (e.g. via a unicast ARP).
Only valid on
.B SOCK_DGRAM
and
.B SOCK_RAW
sockets and currently only implemented for IPv4 and IPv6. See
sockets and currently only implemented for IPv4 and IPv6.
See
.BR arp (7)
for details.
.TP
.B MSG_DONTROUTE
Don't use a gateway to send out the packet, only send to hosts on
directly connected networks. This is usually used only
by diagnostic or routing programs. This is only defined for protocol
directly connected networks.
This is usually used only
by diagnostic or routing programs.
This is only defined for protocol
families that route; packet sockets don't.
.TP
.B MSG_DONTWAIT
@ -201,7 +207,8 @@ socket option described in
Requests not to send
.B SIGPIPE
on errors on stream oriented sockets when the other end breaks the
connection. The
connection.
The
.B EPIPE
error is still returned.
.TP
@ -216,7 +223,8 @@ data.
.PP
The definition of the
.I msghdr
structure follows. See
structure follows.
See
.BR recv (2)
and below for an exact description of its fields.
.in +0.25i
@ -238,7 +246,8 @@ You may send control information using the
.I msg_control
and
.I msg_controllen
members. The maximum control buffer length the kernel can process is limited
members.
The maximum control buffer length the kernel can process is limited
per socket by the
.B net.core.optmem_max
sysctl; see
@ -253,15 +262,17 @@ On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS
These are some standard errors generated by the socket layer. Additional errors
may be generated and returned from the underlying protocol modules; see their
respective manual pages.
These are some standard errors generated by the socket layer.
Additional errors
may be generated and returned from the underlying protocol modules;
see their respective manual pages.
.TP
.B EACCES
(For Unix domain sockets, which are identified by pathname)
Write permission is denied on the destination socket file,
or search permission is denied for one of the directories
the path prefix. (See
the path prefix.
(See
.BR path_resolution (2).)
.TP
.BR EAGAIN " or " EWOULDBLOCK
@ -302,7 +313,8 @@ of the message to be sent made this impossible.
The output queue for a network interface was full.
This generally indicates that the interface has stopped sending,
but may be caused by transient congestion.
(Normally, this does not occur in Linux. Packets are just silently dropped
(Normally, this does not occur in Linux.
Packets are just silently dropped
when a device queue overflows.)
.TP
.B ENOMEM

View File

@ -114,7 +114,8 @@ changed the current offset of that file.
.SH "RETURN VALUE"
If the transfer was successful, the number of bytes written to
.I out_fd
is returned. On error, \-1 is returned, and
is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS

View File

@ -43,7 +43,8 @@ This system call defines the default policy for the process;
in addition a policy can be set for specific memory ranges using
.BR mbind (2).
The policy is only applied when a new page is allocated
for the process. For anonymous memory this is when the page is first
for the process.
For anonymous memory this is when the page is first
touched by the application.
Available policies are
@ -59,7 +60,8 @@ parameter.
.I nodemask
is pointer to a bit field of nodes that contains up to
.I maxnode
bits. The bit field size is rounded to the next multiple of
bits.
The bit field size is rounded to the next multiple of
.IR "sizeof(unsigned long)" ,
but the kernel will only use bits up to
.IR maxnode .
@ -89,7 +91,8 @@ at least 1MB or bigger.
sets the preferred node for allocation.
The kernel will try to allocate in this
node first and fall back to other nodes if the preferred node is low on free
memory. Only the first node in the
memory.
Only the first node in the
.I nodemask
is used.
If no node is set in the mask, then the memory is allocated on

Some files were not shown because too many files have changed in this diff.