mirror of https://github.com/mkerrisk/man-pages
1711 lines
46 KiB
Groff
1711 lines
46 KiB
Groff
'\" t
|
|
.\" This manpage is Copyright (C) 1992 Drew Eckhardt;
|
|
.\" and Copyright (C) 1993 Michael Haardt, Ian Jackson;
|
|
.\" and Copyright (C) 1998 Jamie Lokier;
|
|
.\" and Copyright (C) 2002-2010, 2014 Michael Kerrisk;
|
|
.\" and Copyright (C) 2014 Jeff Layton
|
|
.\"
|
|
.\" %%%LICENSE_START(VERBATIM)
|
|
.\" Permission is granted to make and distribute verbatim copies of this
|
|
.\" manual provided the copyright notice and this permission notice are
|
|
.\" preserved on all copies.
|
|
.\"
|
|
.\" Permission is granted to copy and distribute modified versions of this
|
|
.\" manual under the conditions for verbatim copying, provided that the
|
|
.\" entire resulting derived work is distributed under the terms of a
|
|
.\" permission notice identical to this one.
|
|
.\"
|
|
.\" Since the Linux kernel and libraries are constantly changing, this
|
|
.\" manual page may be incorrect or out-of-date. The author(s) assume no
|
|
.\" responsibility for errors or omissions, or for damages resulting from
|
|
.\" the use of the information contained herein. The author(s) may not
|
|
.\" have taken the same level of care in the production of this manual,
|
|
.\" which is licensed free of charge, as they might when working
|
|
.\" professionally.
|
|
.\"
|
|
.\" Formatted or processed versions of this manual, if unaccompanied by
|
|
.\" the source, must acknowledge the copyright and authors of this work.
|
|
.\" %%%LICENSE_END
|
|
.\"
|
|
.\" Modified 1993-07-24 by Rik Faith <faith@cs.unc.edu>
|
|
.\" Modified 1995-09-26 by Andries Brouwer <aeb@cwi.nl>
|
|
.\" and again on 960413 and 980804 and 981223.
|
|
.\" Modified 1998-12-11 by Jamie Lokier <jamie@imbolc.ucc.ie>
|
|
.\" Applied correction by Christian Ehrhardt - aeb, 990712
|
|
.\" Modified 2002-04-23 by Michael Kerrisk <mtk.manpages@gmail.com>
|
|
.\" Added note on F_SETFL and O_DIRECT
|
|
.\" Complete rewrite + expansion of material on file locking
|
|
.\" Incorporated description of F_NOTIFY, drawing on
|
|
.\" Stephen Rothwell's notes in Documentation/dnotify.txt.
|
|
.\" Added description of F_SETLEASE and F_GETLEASE
|
|
.\" Corrected and polished, aeb, 020527.
|
|
.\" Modified 2004-03-03 by Michael Kerrisk <mtk.manpages@gmail.com>
|
|
.\" Modified description of file leases: fixed some errors of detail
|
|
.\" Replaced the term "lease contestant" by "lease breaker"
|
|
.\" Modified, 27 May 2004, Michael Kerrisk <mtk.manpages@gmail.com>
|
|
.\" Added notes on capability requirements
|
|
.\" Modified 2004-12-08, added O_NOATIME after note from Martin Pool
|
|
.\" 2004-12-10, mtk, noted F_GETOWN bug after suggestion from aeb.
|
|
.\" 2005-04-08 Jamie Lokier <jamie@shareable.org>, mtk
|
|
.\" Described behavior of F_SETOWN/F_SETSIG in
|
|
.\" multithreaded processes, and generally cleaned
|
|
.\" up the discussion of F_SETOWN.
|
|
.\" 2005-05-20, Johannes Nicolai <johannes.nicolai@hpi.uni-potsdam.de>,
|
|
.\" mtk: Noted F_SETOWN bug for socket file descriptor in Linux 2.4
|
|
.\" and earlier. Added text on permissions required to send signal.
|
|
.\" 2009-09-30, Michael Kerrisk
|
|
.\" Note obsolete F_SETOWN behavior with threads.
|
|
.\" Document F_SETOWN_EX and F_GETOWN_EX
|
|
.\" 2010-06-17, Michael Kerrisk
|
|
.\" Document F_SETPIPE_SZ and F_GETPIPE_SZ.
|
|
.\"
|
|
.TH FCNTL 2 2014-09-06 "Linux" "Linux Programmer's Manual"
|
|
.SH NAME
|
|
fcntl \- manipulate file descriptor
|
|
.SH SYNOPSIS
|
|
.nf
|
|
.B #include <unistd.h>
|
|
.B #include <fcntl.h>
|
|
.sp
|
|
.BI "int fcntl(int " fd ", int " cmd ", ... /* " arg " */ );"
|
|
.fi
|
|
.SH DESCRIPTION
|
|
.BR fcntl ()
|
|
performs one of the operations described below on the open file descriptor
|
|
.IR fd .
|
|
The operation is determined by
|
|
.IR cmd .
|
|
|
|
.BR fcntl ()
|
|
can take an optional third argument.
|
|
Whether or not this argument is required is determined by
|
|
.IR cmd .
|
|
The required argument type is indicated in parentheses after each
|
|
.I cmd
|
|
name (in most cases, the required type is
|
|
.IR int ,
|
|
and we identify the argument using the name
|
|
.IR arg ),
|
|
or
|
|
.I void
|
|
is specified if the argument is not required.
|
|
|
|
Certain of the operations below are supported only since a particular
|
|
Linux kernel version.
|
|
The preferred method of checking whether the host kernel supports
|
|
a particular operation is to invoke
|
|
.BR fcntl ()
|
|
with the desired
|
|
.IR cmd
|
|
value and then test whether the call failed with
|
|
.BR EINVAL ,
|
|
indicating that the kernel does not recognize this value.
|
|
.SS Duplicating a file descriptor
|
|
.TP
|
|
.BR F_DUPFD " (\fIint\fP)"
|
|
Find the lowest numbered available file descriptor
|
|
greater than or equal to
|
|
.I arg
|
|
and make it be a copy of
|
|
.IR fd .
|
|
This is different from
|
|
.BR dup2 (2),
|
|
which uses exactly the descriptor specified.
|
|
.IP
|
|
On success, the new descriptor is returned.
|
|
.IP
|
|
See
|
|
.BR dup (2)
|
|
for further details.
|
|
.TP
|
|
.BR F_DUPFD_CLOEXEC " (\fIint\fP; since Linux 2.6.24)"
|
|
As for
|
|
.BR F_DUPFD ,
|
|
but additionally set the
|
|
close-on-exec flag for the duplicate descriptor.
|
|
Specifying this flag permits a program to avoid an additional
|
|
.BR fcntl ()
|
|
.B F_SETFD
|
|
operation to set the
|
|
.B FD_CLOEXEC
|
|
flag.
|
|
For an explanation of why this flag is useful,
|
|
see the description of
|
|
.B O_CLOEXEC
|
|
in
|
|
.BR open (2).
|
|
.SS File descriptor flags
|
|
The following commands manipulate the flags associated with
|
|
a file descriptor.
|
|
Currently, only one such flag is defined:
|
|
.BR FD_CLOEXEC ,
|
|
the close-on-exec flag.
|
|
If the
|
|
.B FD_CLOEXEC
|
|
bit is 0, the file descriptor will remain open across an
|
|
.BR execve (2),
|
|
otherwise it will be closed.
|
|
.TP
|
|
.BR F_GETFD " (\fIvoid\fP)"
|
|
Read the file descriptor flags;
|
|
.I arg
|
|
is ignored.
|
|
.TP
|
|
.BR F_SETFD " (\fIint\fP)"
|
|
Set the file descriptor flags to the value specified by
|
|
.IR arg .
|
|
.PP
|
|
In multithreaded programs, using
|
|
.BR fcntl ()
|
|
.B F_SETFD
|
|
to set the close-on-exec flag at the same time as another thread performs a
|
|
.BR fork (2)
|
|
plus
|
|
.BR execve (2)
|
|
is vulnerable to a race condition that may unintentionally leak
|
|
the file descriptor to the program executed in the child process.
|
|
See the discussion of the
|
|
.BR O_CLOEXEC
|
|
flag in
|
|
.BR open (2)
|
|
for details and a remedy to the problem.
|
|
.SS File status flags
|
|
Each open file description has certain associated status flags,
|
|
initialized by
|
|
.BR open (2)
|
|
.\" or
|
|
.\" .BR creat (2),
|
|
and possibly modified by
|
|
.BR fcntl ().
|
|
Duplicated file descriptors
|
|
(made with
|
|
.BR dup (2),
|
|
.BR fcntl (F_DUPFD),
|
|
.BR fork (2),
|
|
etc.) refer to the same open file description, and thus
|
|
share the same file status flags.
|
|
|
|
The file status flags and their semantics are described in
|
|
.BR open (2).
|
|
.TP
|
|
.BR F_GETFL " (\fIvoid\fP)"
|
|
Get the file access mode and the file status flags;
|
|
.I arg
|
|
is ignored.
|
|
.TP
|
|
.BR F_SETFL " (\fIint\fP)"
|
|
Set the file status flags to the value specified by
|
|
.IR arg .
|
|
File access mode
|
|
.RB ( O_RDONLY ", " O_WRONLY ", " O_RDWR )
|
|
and file creation flags
|
|
(i.e.,
|
|
.BR O_CREAT ", " O_EXCL ", " O_NOCTTY ", " O_TRUNC )
|
|
in
|
|
.I arg
|
|
are ignored.
|
|
On Linux this command can change only the
|
|
.BR O_APPEND ,
|
|
.BR O_ASYNC ,
|
|
.BR O_DIRECT ,
|
|
.BR O_NOATIME ,
|
|
and
|
|
.B O_NONBLOCK
|
|
flags.
|
|
It is not possible to change the
|
|
.BR O_DSYNC
|
|
and
|
|
.BR O_SYNC
|
|
flags; see BUGS, below.
|
|
.SS Advisory record locking
|
|
Linux implements traditional ("process-associated") UNIX record locks,
|
|
as standardized by POSIX.
|
|
For a Linux-specific alternative with better semantics,
|
|
see the discussion of open file description locks below.
|
|
|
|
.BR F_SETLK ,
|
|
.BR F_SETLKW ,
|
|
and
|
|
.BR F_GETLK
|
|
are used to acquire, release, and test for the existence of record
|
|
locks (also known as byte-range, file-segment, or file-region locks).
|
|
The third argument,
|
|
.IR lock ,
|
|
is a pointer to a structure that has at least the following fields
|
|
(in unspecified order).
|
|
.in +4n
|
|
.nf
|
|
.sp
|
|
struct flock {
|
|
...
|
|
short l_type; /* Type of lock: F_RDLCK,
|
|
F_WRLCK, F_UNLCK */
|
|
short l_whence; /* How to interpret l_start:
|
|
SEEK_SET, SEEK_CUR, SEEK_END */
|
|
off_t l_start; /* Starting offset for lock */
|
|
off_t l_len; /* Number of bytes to lock */
|
|
pid_t l_pid; /* PID of process blocking our lock
|
|
(set by F_GETLK and F_OFD_GETLK) */
|
|
...
|
|
};
|
|
.fi
|
|
.in
|
|
.P
|
|
The
|
|
.IR l_whence ", " l_start ", and " l_len
|
|
fields of this structure specify the range of bytes we wish to lock.
|
|
Bytes past the end of the file may be locked,
|
|
but not bytes before the start of the file.
|
|
|
|
.I l_start
|
|
is the starting offset for the lock, and is interpreted
|
|
relative to either:
|
|
the start of the file (if
|
|
.I l_whence
|
|
is
|
|
.BR SEEK_SET );
|
|
the current file offset (if
|
|
.I l_whence
|
|
is
|
|
.BR SEEK_CUR );
|
|
or the end of the file (if
|
|
.I l_whence
|
|
is
|
|
.BR SEEK_END ).
|
|
In the final two cases,
|
|
.I l_start
|
|
can be a negative number provided the
|
|
offset does not lie before the start of the file.
|
|
|
|
.I l_len
|
|
specifies the number of bytes to be locked.
|
|
If
|
|
.I l_len
|
|
is positive, then the range to be locked covers bytes
|
|
.I l_start
|
|
up to and including
|
|
.IR l_start + l_len \-1.
|
|
Specifying 0 for
|
|
.I l_len
|
|
has the special meaning: lock all bytes starting at the
|
|
location specified by
|
|
.IR l_whence " and " l_start
|
|
through to the end of file, no matter how large the file grows.
|
|
|
|
POSIX.1-2001 allows (but does not require)
|
|
an implementation to support a negative
|
|
.I l_len
|
|
value; if
|
|
.I l_len
|
|
is negative, the interval described by
|
|
.I lock
|
|
covers bytes
|
|
.IR l_start + l_len
|
|
up to and including
|
|
.IR l_start \-1.
|
|
This is supported by Linux since kernel versions 2.4.21 and 2.5.49.
|
|
|
|
The
|
|
.I l_type
|
|
field can be used to place a read
|
|
.RB ( F_RDLCK )
|
|
or a write
|
|
.RB ( F_WRLCK )
|
|
lock on a file.
|
|
Any number of processes may hold a read lock (shared lock)
|
|
on a file region, but only one process may hold a write lock
|
|
(exclusive lock).
|
|
An exclusive lock excludes all other locks,
|
|
both shared and exclusive.
|
|
A single process can hold only one type of lock on a file region;
|
|
if a new lock is applied to an already-locked region,
|
|
then the existing lock is converted to the new lock type.
|
|
(Such conversions may involve splitting, shrinking, or coalescing with
|
|
an existing lock if the byte range specified by the new lock does not
|
|
precisely coincide with the range of the existing lock.)
|
|
.TP
|
|
.BR F_SETLK " (\fIstruct flock *\fP)"
|
|
Acquire a lock (when
|
|
.I l_type
|
|
is
|
|
.B F_RDLCK
|
|
or
|
|
.BR F_WRLCK )
|
|
or release a lock (when
|
|
.I l_type
|
|
is
|
|
.BR F_UNLCK )
|
|
on the bytes specified by the
|
|
.IR l_whence ", " l_start ", and " l_len
|
|
fields of
|
|
.IR lock .
|
|
If a conflicting lock is held by another process,
|
|
this call returns \-1 and sets
|
|
.I errno
|
|
to
|
|
.B EACCES
|
|
or
|
|
.BR EAGAIN .
|
|
(The error returned in this case differs across implementations,
|
|
so POSIX requires a portable application to check for both errors.)
|
|
.TP
|
|
.BR F_SETLKW " (\fIstruct flock *\fP)"
|
|
As for
|
|
.BR F_SETLK ,
|
|
but if a conflicting lock is held on the file, then wait for that
|
|
lock to be released.
|
|
If a signal is caught while waiting, then the call is interrupted
|
|
and (after the signal handler has returned)
|
|
returns immediately (with return value \-1 and
|
|
.I errno
|
|
set to
|
|
.BR EINTR ;
|
|
see
|
|
.BR signal (7)).
|
|
.TP
|
|
.BR F_GETLK " (\fIstruct flock *\fP)"
|
|
On input to this call,
|
|
.I lock
|
|
describes a lock we would like to place on the file.
|
|
If the lock could be placed,
|
|
.BR fcntl ()
|
|
does not actually place it, but returns
|
|
.B F_UNLCK
|
|
in the
|
|
.I l_type
|
|
field of
|
|
.I lock
|
|
and leaves the other fields of the structure unchanged.
|
|
|
|
If one or more incompatible locks would prevent
|
|
this lock being placed, then
|
|
.BR fcntl ()
|
|
returns details about one of those locks in the
|
|
.IR l_type ", " l_whence ", " l_start ", and " l_len
|
|
fields of
|
|
.IR lock .
|
|
If the conflicting lock is a traditional (process-associated) record lock,
|
|
then the
|
|
.I l_pid
|
|
field is set to the PID of the process holding that lock.
|
|
If the conflicting lock is an open file description lock, then
|
|
.I l_pid
|
|
is set to \-1.
|
|
Note that the returned information
|
|
may already be out of date by the time the caller inspects it.
|
|
.P
|
|
In order to place a read lock,
|
|
.I fd
|
|
must be open for reading.
|
|
In order to place a write lock,
|
|
.I fd
|
|
must be open for writing.
|
|
To place both types of lock, open a file read-write.
|
|
|
|
When placing locks with
|
|
.BR F_SETLKW ,
|
|
the kernel detects
|
|
.IR deadlocks ,
|
|
whereby two or more processes have their
|
|
lock requests mutually blocked by locks held by the other processes.
|
|
For example, suppose process A holds a write lock on byte 100 of a file,
|
|
and process B holds a write lock on byte 200.
|
|
If each process then attempts to lock the byte already
|
|
locked by the other process using
|
|
.BR F_SETLKW ,
|
|
then, without deadlock detection,
|
|
both processes would remain blocked indefinitely.
|
|
When the kernel detects such deadlocks,
|
|
it causes one of the blocking lock requests to immediately fail with the error
|
|
.BR EDEADLK ;
|
|
an application that encounters such an error should release
|
|
some of its locks to allow other applications to proceed before
|
|
attempting regain the locks that it requires.
|
|
Circular deadlocks involving more than two processes are also detected.
|
|
Note, however, that there are limitations to the kernel's
|
|
deadlock-detection algorithm; see BUGS.
|
|
|
|
As well as being removed by an explicit
|
|
.BR F_UNLCK ,
|
|
record locks are automatically released when the process terminates.
|
|
.P
|
|
Record locks are not inherited by a child created via
|
|
.BR fork (2),
|
|
but are preserved across an
|
|
.BR execve (2).
|
|
.P
|
|
Because of the buffering performed by the
|
|
.BR stdio (3)
|
|
library, the use of record locking with routines in that package
|
|
should be avoided; use
|
|
.BR read (2)
|
|
and
|
|
.BR write (2)
|
|
instead.
|
|
|
|
The record locks described above are associated with the process
|
|
(unlike the open file description locks described below).
|
|
This has some unfortunate consequences:
|
|
.IP * 3
|
|
If a process closes
|
|
.I any
|
|
file descriptor referring to a file,
|
|
then all of the process's locks on that file are released,
|
|
regardless of the file descriptor(s) on which the locks were obtained.
|
|
.\" (Additional file descriptors referring to the same file
|
|
.\" may have been obtained by calls to
|
|
.\" .BR open "(2), " dup "(2), " dup2 "(2), or " fcntl ().)
|
|
This is bad: it means that a process can lose its locks on
|
|
a file such as
|
|
.I /etc/passwd
|
|
or
|
|
.I /etc/mtab
|
|
when for some reason a library function decides to open, read,
|
|
and close the same file.
|
|
.IP *
|
|
The threads in a process share locks.
|
|
In other words,
|
|
a multithreaded program can't use record locking to ensure
|
|
that threads don't simultaneously access the same region of a file.
|
|
.PP
|
|
Open file description locks solve both of these problems.
|
|
.SS Open file description locks (non-POSIX)
|
|
Open file description locks are advisory byte-range locks whose operation is
|
|
in most respects identical to the traditional record locks described above.
|
|
This lock type is Linux-specific,
|
|
and available since Linux 3.15.
|
|
For an explanation of open file descriptions, see
|
|
.BR open (2).
|
|
|
|
The principal difference between the two lock types
|
|
is that whereas traditional record locks
|
|
are associated with a process,
|
|
open file description locks are associated with the
|
|
open file description on which they are acquired,
|
|
much like locks acquired with
|
|
.BR flock (2).
|
|
Consequently (and unlike traditional advisory record locks),
|
|
open file description locks are inherited across
|
|
.BR fork (2)
|
|
(and
|
|
.BR clone (2)
|
|
with
|
|
.BR CLONE_FILES ),
|
|
and are only automatically released on the last close
|
|
of the open file description,
|
|
instead of being released on any close of the file.
|
|
.PP
|
|
Open file description locks always conflict with traditional record locks,
|
|
even when they are acquired by the same process on the same file descriptor.
|
|
|
|
Open file description locks placed via the same open file description
|
|
(i.e., via the same file descriptor,
|
|
or via a duplicate of the file descriptor created by
|
|
.BR fork (2),
|
|
.BR dup (2),
|
|
.BR fcntl (2)
|
|
.BR F_DUPFD ,
|
|
and so on) are always compatible:
|
|
if a new lock is placed on an already locked region,
|
|
then the existing lock is converted to the new lock type.
|
|
(Such conversions may result in splitting, shrinking, or coalescing with
|
|
an existing lock as discussed above.)
|
|
|
|
On the other hand, open file description locks may conflict with
|
|
each other when they are acquired via different open file descriptions.
|
|
Thus, the threads in a multithreaded program can use
|
|
open file description locks to synchronize access to a file region
|
|
by having each thread perform its own
|
|
.BR open (2)
|
|
on the file and applying locks via the resulting file descriptor.
|
|
.PP
|
|
As with traditional advisory locks, the third argument to
|
|
.BR fcntl (),
|
|
.IR lock ,
|
|
is a pointer to an
|
|
.IR flock
|
|
structure.
|
|
By contrast with traditional record locks, the
|
|
.I l_pid
|
|
field of that structure must be set to zero
|
|
when using the commands described below.
|
|
|
|
The commands for working with open file description locks are analogous
|
|
to those used with traditional locks:
|
|
.TP
|
|
.BR F_OFD_SETLK " (\fIstruct flock *\fP)"
|
|
Acquire an open file description lock (when
|
|
.I l_type
|
|
is
|
|
.B F_RDLCK
|
|
or
|
|
.BR F_WRLCK )
|
|
or release an open file description lock (when
|
|
.I l_type
|
|
is
|
|
.BR F_UNLCK )
|
|
on the bytes specified by the
|
|
.IR l_whence ", " l_start ", and " l_len
|
|
fields of
|
|
.IR lock .
|
|
If a conflicting lock is held by another process,
|
|
this call returns \-1 and sets
|
|
.I errno
|
|
to
|
|
.BR EAGAIN .
|
|
.TP
|
|
.BR F_OFD_SETLKW " (\fIstruct flock *\fP)"
|
|
As for
|
|
.BR F_OFD_SETLK ,
|
|
but if a conflicting lock is held on the file, then wait for that lock to be
|
|
released.
|
|
If a signal is caught while waiting, then the call is interrupted
|
|
and (after the signal handler has returned) returns immediately
|
|
(with return value \-1 and
|
|
.I errno
|
|
set to
|
|
.BR EINTR ;
|
|
see
|
|
.BR signal (7)).
|
|
.TP
|
|
.BR F_OFD_GETLK " (\fIstruct flock *\fP)"
|
|
On input to this call,
|
|
.I lock
|
|
describes an open file description lock we would like to place on the file.
|
|
If the lock could be placed,
|
|
.BR fcntl ()
|
|
does not actually place it, but returns
|
|
.B F_UNLCK
|
|
in the
|
|
.I l_type
|
|
field of
|
|
.I lock
|
|
and leaves the other fields of the structure unchanged.
|
|
If one or more incompatible locks would prevent this lock being placed,
|
|
then details about one of these locks are returned via
|
|
.IR lock ,
|
|
as described above for
|
|
.BR F_GETLK .
|
|
.PP
|
|
In the current implementation,
|
|
.\" commit 57b65325fe34ec4c917bc4e555144b4a94d9e1f7
|
|
no deadlock detection is performed for open file description locks.
|
|
(This contrasts with process-associated record locks,
|
|
for which the kernel does perform deadlock detection.)
|
|
.\"
|
|
.SS Mandatory locking
|
|
.IR Warning :
|
|
the Linux implementation of mandatory locking is unreliable.
|
|
See BUGS below.
|
|
|
|
By default, both traditional (process-associated) and open file description
|
|
record locks are advisory.
|
|
Advisory locks are not enforced and are useful only between
|
|
cooperating processes.
|
|
|
|
Both lock types can also be mandatory.
|
|
Mandatory locks are enforced for all processes.
|
|
If a process tries to perform an incompatible access (e.g.,
|
|
.BR read (2)
|
|
or
|
|
.BR write (2))
|
|
on a file region that has an incompatible mandatory lock,
|
|
then the result depends upon whether the
|
|
.B O_NONBLOCK
|
|
flag is enabled for its open file description.
|
|
If the
|
|
.B O_NONBLOCK
|
|
flag is not enabled, then
|
|
the system call is blocked until the lock is removed
|
|
or converted to a mode that is compatible with the access.
|
|
If the
|
|
.B O_NONBLOCK
|
|
flag is enabled, then the system call fails with the error
|
|
.BR EAGAIN .
|
|
|
|
To make use of mandatory locks, mandatory locking must be enabled
|
|
both on the filesystem that contains the file to be locked,
|
|
and on the file itself.
|
|
Mandatory locking is enabled on a filesystem
|
|
using the "\-o mand" option to
|
|
.BR mount (8),
|
|
or the
|
|
.B MS_MANDLOCK
|
|
flag for
|
|
.BR mount (2).
|
|
Mandatory locking is enabled on a file by disabling
|
|
group execute permission on the file and enabling the set-group-ID
|
|
permission bit (see
|
|
.BR chmod (1)
|
|
and
|
|
.BR chmod (2)).
|
|
|
|
Mandatory locking is not specified by POSIX.
|
|
Some other systems also support mandatory locking,
|
|
although the details of how to enable it vary across systems.
|
|
.SS Managing signals
|
|
.BR F_GETOWN ,
|
|
.BR F_SETOWN ,
|
|
.BR F_GETOWN_EX ,
|
|
.BR F_SETOWN_EX ,
|
|
.BR F_GETSIG
|
|
and
|
|
.B F_SETSIG
|
|
are used to manage I/O availability signals:
|
|
.TP
|
|
.BR F_GETOWN " (\fIvoid\fP)"
|
|
Return (as the function result)
|
|
the process ID or process group currently receiving
|
|
.B SIGIO
|
|
and
|
|
.B SIGURG
|
|
signals for events on file descriptor
|
|
.IR fd .
|
|
Process IDs are returned as positive values;
|
|
process group IDs are returned as negative values (but see BUGS below).
|
|
.I arg
|
|
is ignored.
|
|
.TP
|
|
.BR F_SETOWN " (\fIint\fP)"
|
|
Set the process ID or process group ID that will receive
|
|
.B SIGIO
|
|
and
|
|
.B SIGURG
|
|
signals for events on file descriptor
|
|
.IR fd
|
|
to the ID given in
|
|
.IR arg .
|
|
A process ID is specified as a positive value;
|
|
a process group ID is specified as a negative value.
|
|
Most commonly, the calling process specifies itself as the owner
|
|
(that is,
|
|
.I arg
|
|
is specified as
|
|
.BR getpid (2)).
|
|
|
|
.\" From glibc.info:
|
|
If you set the
|
|
.B O_ASYNC
|
|
status flag on a file descriptor by using the
|
|
.B F_SETFL
|
|
command of
|
|
.BR fcntl (),
|
|
a
|
|
.B SIGIO
|
|
signal is sent whenever input or output becomes possible
|
|
on that file descriptor.
|
|
.B F_SETSIG
|
|
can be used to obtain delivery of a signal other than
|
|
.BR SIGIO .
|
|
If this permission check fails, then the signal is
|
|
silently discarded.
|
|
|
|
Sending a signal to the owner process (group) specified by
|
|
.B F_SETOWN
|
|
is subject to the same permissions checks as are described for
|
|
.BR kill (2),
|
|
where the sending process is the one that employs
|
|
.B F_SETOWN
|
|
(but see BUGS below).
|
|
|
|
If the file descriptor
|
|
.I fd
|
|
refers to a socket,
|
|
.B F_SETOWN
|
|
also selects
|
|
the recipient of
|
|
.B SIGURG
|
|
signals that are delivered when out-of-band
|
|
data arrives on that socket.
|
|
.RB ( SIGURG
|
|
is sent in any situation where
|
|
.BR select (2)
|
|
would report the socket as having an "exceptional condition".)
|
|
.\" The following appears to be rubbish. It doesn't seem to
|
|
.\" be true according to the kernel source, and I can write
|
|
.\" a program that gets a terminal-generated SIGIO even though
|
|
.\" it is not the foreground process group of the terminal.
|
|
.\" -- MTK, 8 Apr 05
|
|
.\"
|
|
.\" If the file descriptor
|
|
.\" .I fd
|
|
.\" refers to a terminal device, then SIGIO
|
|
.\" signals are sent to the foreground process group of the terminal.
|
|
|
|
The following was true in 2.6.x kernels up to and including
|
|
kernel 2.6.11:
|
|
.RS
|
|
.IP
|
|
If a nonzero value is given to
|
|
.B F_SETSIG
|
|
in a multithreaded process running with a threading library
|
|
that supports thread groups (e.g., NPTL),
|
|
then a positive value given to
|
|
.B F_SETOWN
|
|
has a different meaning:
|
|
.\" The relevant place in the (2.6) kernel source is the
|
|
.\" 'switch' in fs/fcntl.c::send_sigio_to_task() -- MTK, Apr 2005
|
|
instead of being a process ID identifying a whole process,
|
|
it is a thread ID identifying a specific thread within a process.
|
|
Consequently, it may be necessary to pass
|
|
.B F_SETOWN
|
|
the result of
|
|
.BR gettid (2)
|
|
instead of
|
|
.BR getpid (2)
|
|
to get sensible results when
|
|
.B F_SETSIG
|
|
is used.
|
|
(In current Linux threading implementations,
|
|
a main thread's thread ID is the same as its process ID.
|
|
This means that a single-threaded program can equally use
|
|
.BR gettid (2)
|
|
or
|
|
.BR getpid (2)
|
|
in this scenario.)
|
|
Note, however, that the statements in this paragraph do not apply
|
|
to the
|
|
.B SIGURG
|
|
signal generated for out-of-band data on a socket:
|
|
this signal is always sent to either a process or a process group,
|
|
depending on the value given to
|
|
.BR F_SETOWN .
|
|
.\" send_sigurg()/send_sigurg_to_task() bypasses
|
|
.\" kill_fasync()/send_sigio()/send_sigio_to_task()
|
|
.\" to directly call send_group_sig_info()
|
|
.\" -- MTK, Apr 2005 (kernel 2.6.11)
|
|
.RE
|
|
.IP
|
|
The above behavior was accidentally dropped in Linux 2.6.12,
|
|
and won't be restored.
|
|
From Linux 2.6.32 onward, use
|
|
.BR F_SETOWN_EX
|
|
to target
|
|
.B SIGIO
|
|
and
|
|
.B SIGURG
|
|
signals at a particular thread.
|
|
.TP
|
|
.BR F_GETOWN_EX " (struct f_owner_ex *) (since Linux 2.6.32)"
|
|
Return the current file descriptor owner settings
|
|
as defined by a previous
|
|
.BR F_SETOWN_EX
|
|
operation.
|
|
The information is returned in the structure pointed to by
|
|
.IR arg ,
|
|
which has the following form:
|
|
.nf
|
|
.in +4n
|
|
|
|
struct f_owner_ex {
|
|
int type;
|
|
pid_t pid;
|
|
};
|
|
|
|
.in
|
|
.fi
|
|
The
|
|
.I type
|
|
field will have one of the values
|
|
.BR F_OWNER_TID ,
|
|
.BR F_OWNER_PID ,
|
|
or
|
|
.BR F_OWNER_PGRP .
|
|
The
|
|
.I pid
|
|
field is a positive integer representing a thread ID, process ID,
|
|
or process group ID.
|
|
See
|
|
.B F_SETOWN_EX
|
|
for more details.
|
|
.TP
|
|
.BR F_SETOWN_EX " (struct f_owner_ex *) (since Linux 2.6.32)"
|
|
This operation performs a similar task to
|
|
.BR F_SETOWN .
|
|
It allows the caller to direct I/O availability signals
|
|
to a specific thread, process, or process group.
|
|
The caller specifies the target of signals via
|
|
.IR arg ,
|
|
which is a pointer to a
|
|
.IR f_owner_ex
|
|
structure.
|
|
The
|
|
.I type
|
|
field has one of the following values, which define how
|
|
.I pid
|
|
is interpreted:
|
|
.RS
|
|
.TP
|
|
.BR F_OWNER_TID
|
|
Send the signal to the thread whose thread ID
|
|
(the value returned by a call to
|
|
.BR clone (2)
|
|
or
|
|
.BR gettid (2))
|
|
is specified in
|
|
.IR pid .
|
|
.TP
|
|
.BR F_OWNER_PID
|
|
Send the signal to the process whose ID
|
|
is specified in
|
|
.IR pid .
|
|
.TP
|
|
.BR F_OWNER_PGRP
|
|
Send the signal to the process group whose ID
|
|
is specified in
|
|
.IR pid .
|
|
(Note that, unlike with
|
|
.BR F_SETOWN ,
|
|
a process group ID is specified as a positive value here.)
|
|
.RE
|
|
.TP
|
|
.BR F_GETSIG " (\fIvoid\fP)"
|
|
Return (as the function result)
|
|
the signal sent when input or output becomes possible.
|
|
A value of zero means
|
|
.B SIGIO
|
|
is sent.
|
|
Any other value (including
|
|
.BR SIGIO )
|
|
is the
|
|
signal sent instead, and in this case additional info is available to
|
|
the signal handler if installed with
|
|
.BR SA_SIGINFO .
|
|
.I arg
|
|
is ignored.
|
|
.TP
|
|
.BR F_SETSIG " (\fIint\fP)"
|
|
Set the signal sent when input or output becomes possible
|
|
to the value given in
|
|
.IR arg .
|
|
A value of zero means to send the default
|
|
.B SIGIO
|
|
signal.
|
|
Any other value (including
|
|
.BR SIGIO )
|
|
is the signal to send instead, and in this case additional info
|
|
is available to the signal handler if installed with
|
|
.BR SA_SIGINFO .
|
|
.\"
|
|
.\" The following was true only up until 2.6.11:
|
|
.\"
|
|
.\" Additionally, passing a nonzero value to
|
|
.\" .B F_SETSIG
|
|
.\" changes the signal recipient from a whole process to a specific thread
|
|
.\" within a process.
|
|
.\" See the description of
|
|
.\" .B F_SETOWN
|
|
.\" for more details.
|
|
|
|
By using
|
|
.B F_SETSIG
|
|
with a nonzero value, and setting
|
|
.B SA_SIGINFO
|
|
for the
|
|
signal handler (see
|
|
.BR sigaction (2)),
|
|
extra information about I/O events is passed to
|
|
the handler in a
|
|
.I siginfo_t
|
|
structure.
|
|
If the
|
|
.I si_code
|
|
field indicates the source is
|
|
.BR SI_SIGIO ,
|
|
the
|
|
.I si_fd
|
|
field gives the file descriptor associated with the event.
|
|
Otherwise,
|
|
there is no indication which file descriptors are pending, and you
|
|
should use the usual mechanisms
|
|
.RB ( select (2),
|
|
.BR poll (2),
|
|
.BR read (2)
|
|
with
|
|
.B O_NONBLOCK
|
|
set etc.) to determine which file descriptors are available for I/O.
|
|
|
|
By selecting a real time signal (value >=
|
|
.BR SIGRTMIN ),
|
|
multiple I/O events may be queued using the same signal numbers.
|
|
(Queuing is dependent on available memory).
|
|
Extra information is available
|
|
if
|
|
.B SA_SIGINFO
|
|
is set for the signal handler, as above.
|
|
|
|
Note that Linux imposes a limit on the
|
|
number of real-time signals that may be queued to a
|
|
process (see
|
|
.BR getrlimit (2)
|
|
and
|
|
.BR signal (7))
|
|
and if this limit is reached, then the kernel reverts to
|
|
delivering
|
|
.BR SIGIO ,
|
|
and this signal is delivered to the entire
|
|
process rather than to a specific thread.
|
|
.\" See fs/fcntl.c::send_sigio_to_task() (2.4/2.6) sources -- MTK, Apr 05
|
|
.PP
|
|
Using these mechanisms, a program can implement fully asynchronous I/O
|
|
without using
|
|
.BR select (2)
|
|
or
|
|
.BR poll (2)
|
|
most of the time.
|
|
.PP
|
|
The use of
|
|
.BR O_ASYNC
|
|
is specific to BSD and Linux.
|
|
The only use of
|
|
.BR F_GETOWN
|
|
and
|
|
.B F_SETOWN
|
|
specified in POSIX.1 is in conjunction with the use of the
|
|
.B SIGURG
|
|
signal on sockets.
|
|
(POSIX does not specify the
|
|
.BR SIGIO
|
|
signal.)
|
|
.BR F_GETOWN_EX ,
|
|
.BR F_SETOWN_EX ,
|
|
.BR F_GETSIG ,
|
|
and
|
|
.B F_SETSIG
|
|
are Linux-specific.
|
|
POSIX has asynchronous I/O and the
|
|
.I aio_sigevent
|
|
structure to achieve similar things; these are also available
|
|
in Linux as part of the GNU C Library (Glibc).
|
|
.SS Leases
|
|
.B F_SETLEASE
|
|
and
|
|
.B F_GETLEASE
|
|
(Linux 2.4 onward) are used (respectively) to establish a new lease,
|
|
and retrieve the current lease, on the open file description
|
|
referred to by the file descriptor
|
|
.IR fd .
|
|
A file lease provides a mechanism whereby the process holding
|
|
the lease (the "lease holder") is notified (via delivery of a signal)
|
|
when a process (the "lease breaker") tries to
|
|
.BR open (2)
|
|
or
|
|
.BR truncate (2)
|
|
the file referred to by that file descriptor.
|
|
.TP
|
|
.BR F_SETLEASE " (\fIint\fP)"
|
|
Set or remove a file lease according to which of the following
|
|
values is specified in the integer
|
|
.IR arg :
|
|
.RS
|
|
.TP
|
|
.B F_RDLCK
|
|
Take out a read lease.
|
|
This will cause the calling process to be notified when
|
|
the file is opened for writing or is truncated.
|
|
.\" The following became true in kernel 2.6.10:
|
|
.\" See the man-pages-2.09 Changelog for further info.
|
|
A read lease can be placed only on a file descriptor that
|
|
is opened read-only.
|
|
.TP
|
|
.B F_WRLCK
|
|
Take out a write lease.
|
|
This will cause the caller to be notified when
|
|
the file is opened for reading or writing or is truncated.
|
|
A write lease may be placed on a file only if there are no
|
|
other open file descriptors for the file.
|
|
.TP
|
|
.B F_UNLCK
|
|
Remove our lease from the file.
|
|
.RE
|
|
.P
|
|
Leases are associated with an open file description (see
|
|
.BR open (2)).
|
|
This means that duplicate file descriptors (created by, for example,
|
|
.BR fork (2)
|
|
or
|
|
.BR dup (2))
|
|
refer to the same lease, and this lease may be modified
|
|
or released using any of these descriptors.
|
|
Furthermore, the lease is released by either an explicit
|
|
.B F_UNLCK
|
|
operation on any of these duplicate descriptors, or when all
|
|
such descriptors have been closed.
|
|
.P
|
|
Leases may be taken out only on regular files.
|
|
An unprivileged process may take out a lease only on a file whose
|
|
UID (owner) matches the filesystem UID of the process.
|
|
A process with the
|
|
.B CAP_LEASE
|
|
capability may take out leases on arbitrary files.
|
|
.TP
|
|
.BR F_GETLEASE " (\fIvoid\fP)"
|
|
Indicates what type of lease is associated with the file descriptor
|
|
.I fd
|
|
by returning either
|
|
.BR F_RDLCK ", " F_WRLCK ", or " F_UNLCK ,
|
|
indicating, respectively, a read lease , a write lease, or no lease.
|
|
.I arg
|
|
is ignored.
|
|
.PP
|
|
When a process (the "lease breaker") performs an
|
|
.BR open (2)
|
|
or
|
|
.BR truncate (2)
|
|
that conflicts with a lease established via
|
|
.BR F_SETLEASE ,
|
|
the system call is blocked by the kernel and
|
|
the kernel notifies the lease holder by sending it a signal
|
|
.RB ( SIGIO
|
|
by default).
|
|
The lease holder should respond to receipt of this signal by doing
|
|
whatever cleanup is required in preparation for the file to be
|
|
accessed by another process (e.g., flushing cached buffers) and
|
|
then either remove or downgrade its lease.
|
|
A lease is removed by performing an
|
|
.B F_SETLEASE
|
|
command specifying
|
|
.I arg
|
|
as
|
|
.BR F_UNLCK .
|
|
If the lease holder currently holds a write lease on the file,
|
|
and the lease breaker is opening the file for reading,
|
|
then it is sufficient for the lease holder to downgrade
|
|
the lease to a read lease.
|
|
This is done by performing an
|
|
.B F_SETLEASE
|
|
command specifying
|
|
.I arg
|
|
as
|
|
.BR F_RDLCK .
|
|
|
|
If the lease holder fails to downgrade or remove the lease within
|
|
the number of seconds specified in
|
|
.IR /proc/sys/fs/lease-break-time ,
|
|
then the kernel forcibly removes or downgrades the lease holder's lease.
|
|
|
|
Once a lease break has been initiated,
|
|
.B F_GETLEASE
|
|
returns the target lease type (either
|
|
.B F_RDLCK
|
|
or
|
|
.BR F_UNLCK ,
|
|
depending on what would be compatible with the lease breaker)
|
|
until the lease holder voluntarily downgrades or removes the lease or
|
|
the kernel forcibly does so after the lease break timer expires.
|
|
|
|
Once the lease has been voluntarily or forcibly removed or downgraded,
|
|
and assuming the lease breaker has not unblocked its system call,
|
|
the kernel permits the lease breaker's system call to proceed.
|
|
|
|
If the lease breaker's blocked
|
|
.BR open (2)
|
|
or
|
|
.BR truncate (2)
|
|
is interrupted by a signal handler,
|
|
then the system call fails with the error
|
|
.BR EINTR ,
|
|
but the other steps still occur as described above.
|
|
If the lease breaker is killed by a signal while blocked in
|
|
.BR open (2)
|
|
or
|
|
.BR truncate (2),
|
|
then the other steps still occur as described above.
|
|
If the lease breaker specifies the
|
|
.B O_NONBLOCK
|
|
flag when calling
|
|
.BR open (2),
|
|
then the call immediately fails with the error
|
|
.BR EWOULDBLOCK ,
|
|
but the other steps still occur as described above.
|
|
|
|
The default signal used to notify the lease holder is
|
|
.BR SIGIO ,
|
|
but this can be changed using the
|
|
.B F_SETSIG
|
|
command to
|
|
.BR fcntl ().
|
|
If a
|
|
.B F_SETSIG
|
|
command is performed (even one specifying
|
|
.BR SIGIO ),
|
|
and the signal
|
|
handler is established using
|
|
.BR SA_SIGINFO ,
|
|
then the handler will receive a
|
|
.I siginfo_t
|
|
structure as its second argument, and the
|
|
.I si_fd
|
|
field of this argument will hold the descriptor of the leased file
|
|
that has been accessed by another process.
|
|
(This is useful if the caller holds leases against multiple files).
|
|
.SS File and directory change notification (dnotify)
|
|
.TP
|
|
.BR F_NOTIFY " (\fIint\fP)"
|
|
(Linux 2.4 onward)
|
|
Provide notification when the directory referred to by
|
|
.I fd
|
|
or any of the files that it contains is changed.
|
|
The events to be notified are specified in
|
|
.IR arg ,
|
|
which is a bit mask specified by ORing together zero or more of
|
|
the following bits:
|
|
.RS
|
|
.sp
|
|
.PD 0
|
|
.TP 12
|
|
.B DN_ACCESS
|
|
A file was accessed
|
|
.RB ( read (2),
|
|
.BR pread (2),
|
|
.BR readv (2),
|
|
and similar)
|
|
.TP
|
|
.B DN_MODIFY
|
|
A file was modified
|
|
.RB ( write (2),
|
|
.BR pwrite (2),
|
|
.BR writev (2),
|
|
.BR truncate (2),
|
|
.BR ftruncate (2),
|
|
and similar).
|
|
.TP
|
|
.B DN_CREATE
|
|
A file was created
|
|
.RB ( open (2),
|
|
.BR creat (2),
|
|
.BR mknod (2),
|
|
.BR mkdir (2),
|
|
.BR link (2),
|
|
.BR symlink (2),
|
|
.BR rename (2)
|
|
into this directory).
|
|
.TP
|
|
.B DN_DELETE
|
|
A file was unlinked
|
|
.RB ( unlink (2),
|
|
.BR rename (2)
|
|
to another directory,
|
|
.BR rmdir (2)).
|
|
.TP
|
|
.B DN_RENAME
|
|
A file was renamed within this directory
|
|
.RB ( rename (2)).
|
|
.TP
|
|
.B DN_ATTRIB
|
|
The attributes of a file were changed
|
|
.RB ( chown (2),
|
|
.BR chmod (2),
|
|
.BR utime (2),
|
|
.BR utimensat (2),
|
|
and similar).
|
|
.PD
|
|
.RE
|
|
.IP
|
|
(In order to obtain these definitions, the
|
|
.B _GNU_SOURCE
|
|
feature test macro must be defined before including
|
|
.I any
|
|
header files.)
|
|
|
|
Directory notifications are normally "one-shot", and the application
|
|
must reregister to receive further notifications.
|
|
Alternatively, if
|
|
.B DN_MULTISHOT
|
|
is included in
|
|
.IR arg ,
|
|
then notification will remain in effect until explicitly removed.
|
|
|
|
.\" The following does seem a poor API-design choice...
|
|
A series of
|
|
.B F_NOTIFY
|
|
requests is cumulative, with the events in
|
|
.I arg
|
|
being added to the set already monitored.
|
|
To disable notification of all events, make an
|
|
.B F_NOTIFY
|
|
call specifying
|
|
.I arg
|
|
as 0.
|
|
|
|
Notification occurs via delivery of a signal.
|
|
The default signal is
|
|
.BR SIGIO ,
|
|
but this can be changed using the
|
|
.B F_SETSIG
|
|
command to
|
|
.BR fcntl ().
|
|
(Note that
|
|
.B SIGIO
|
|
is one of the nonqueuing standard signals;
|
|
switching to the use of a real-time signal means that
|
|
multiple notifications can be queued to the process.)
|
|
In the latter case, the signal handler receives a
|
|
.I siginfo_t
|
|
structure as its second argument (if the handler was
|
|
established using
|
|
.BR SA_SIGINFO )
|
|
and the
|
|
.I si_fd
|
|
field of this structure contains the file descriptor which
|
|
generated the notification (useful when establishing notification
|
|
on multiple directories).
|
|
|
|
Especially when using
|
|
.BR DN_MULTISHOT ,
|
|
a real time signal should be used for notification,
|
|
so that multiple notifications can be queued.
|
|
|
|
.B NOTE:
|
|
New applications should use the
|
|
.I inotify
|
|
interface (available since kernel 2.6.13),
|
|
which provides a much superior interface for obtaining notifications of
|
|
filesystem events.
|
|
See
|
|
.BR inotify (7).
|
|
.SS Changing the capacity of a pipe
|
|
.TP
|
|
.BR F_SETPIPE_SZ " (\fIint\fP; since Linux 2.6.35)"
|
|
Change the capacity of the pipe referred to by
|
|
.I fd
|
|
to be at least
|
|
.I arg
|
|
bytes.
|
|
An unprivileged process can adjust the pipe capacity to any value
|
|
between the system page size and the limit defined in
|
|
.IR /proc/sys/fs/pipe-max-size
|
|
(see
|
|
.BR proc (5)).
|
|
Attempts to set the pipe capacity below the page size are silently
|
|
rounded up to the page size.
|
|
Attempts by an unprivileged process to set the pipe capacity above the limit in
|
|
.IR /proc/sys/fs/pipe-max-size
|
|
yield the error
|
|
.BR EPERM ;
|
|
a privileged process
|
|
.RB ( CAP_SYS_RESOURCE )
|
|
can override the limit.
|
|
When allocating the buffer for the pipe,
|
|
the kernel may use a capacity larger than
|
|
.IR arg ,
|
|
if that is convenient for the implementation.
|
|
The actual capacity that is set is returned as the function result.
|
|
Attempting to set the pipe capacity smaller than the amount
|
|
of buffer space currently used to store data produces the error
|
|
.BR EBUSY .
|
|
.TP
|
|
.BR F_GETPIPE_SZ " (\fIvoid\fP; since Linux 2.6.35)"
|
|
Return (as the function result) the capacity of the pipe referred to by
|
|
.IR fd .
|
|
.SH RETURN VALUE
|
|
For a successful call, the return value depends on the operation:
|
|
.TP 0.9i
|
|
.B F_DUPFD
|
|
The new descriptor.
|
|
.TP
|
|
.B F_GETFD
|
|
Value of file descriptor flags.
|
|
.TP
|
|
.B F_GETFL
|
|
Value of file status flags.
|
|
.TP
|
|
.B F_GETLEASE
|
|
Type of lease held on file descriptor.
|
|
.TP
|
|
.B F_GETOWN
|
|
Value of descriptor owner.
|
|
.TP
|
|
.B F_GETSIG
|
|
Value of signal sent when read or write becomes possible, or zero
|
|
for traditional
|
|
.B SIGIO
|
|
behavior.
|
|
.TP
|
|
.BR F_GETPIPE_SZ ", " F_SETPIPE_SZ
|
|
The pipe capacity.
|
|
.TP
|
|
All other commands
|
|
Zero.
|
|
.PP
|
|
On error, \-1 is returned, and
|
|
.I errno
|
|
is set appropriately.
|
|
.SH ERRORS
|
|
.TP
|
|
.BR EACCES " or " EAGAIN
|
|
Operation is prohibited by locks held by other processes.
|
|
.TP
|
|
.B EAGAIN
|
|
The operation is prohibited because the file has been memory-mapped by
|
|
another process.
|
|
.TP
|
|
.B EBADF
|
|
.I fd
|
|
is not an open file descriptor, or the command was
|
|
.B F_SETLK
|
|
or
|
|
.B F_SETLKW
|
|
and the file descriptor open mode doesn't match with the
|
|
type of lock requested.
|
|
.TP
|
|
.B EDEADLK
|
|
It was detected that the specified
|
|
.B F_SETLKW
|
|
command would cause a deadlock.
|
|
.TP
|
|
.B EFAULT
|
|
.I lock
|
|
is outside your accessible address space.
|
|
.TP
|
|
.B EINTR
|
|
For
|
|
.BR F_SETLKW ,
|
|
the command was interrupted by a signal; see
|
|
.BR signal (7).
|
|
For
|
|
.BR F_GETLK " and " F_SETLK ,
|
|
the command was interrupted by a signal before the lock was checked or
|
|
acquired.
|
|
Most likely when locking a remote file (e.g., locking over
|
|
NFS), but can sometimes happen locally.
|
|
.TP
|
|
.B EINVAL
|
|
The value specified in
|
|
.I cmd
|
|
is not recognized by this kernel.
|
|
.TP
|
|
.B EINVAL
|
|
For
|
|
.BR F_DUPFD ,
|
|
.I arg
|
|
is negative or is greater than the maximum allowable value.
|
|
For
|
|
.BR F_SETSIG ,
|
|
.I arg
|
|
is not an allowable signal number.
|
|
.TP
|
|
.B EINVAL
|
|
.I cmd
|
|
is
|
|
.BR F_OFD_SETLK ,
|
|
.BR F_OFD_SETLKW ,
|
|
or
|
|
.BR F_OFD_GETLK ,
|
|
and
|
|
.I l_pid
|
|
was not specified as zero.
|
|
.TP
|
|
.B EMFILE
|
|
For
|
|
.BR F_DUPFD ,
|
|
the process already has the maximum number of file descriptors open.
|
|
.TP
|
|
.B ENOLCK
|
|
Too many segment locks open, lock table is full, or a remote locking
|
|
protocol failed (e.g., locking over NFS).
|
|
.TP
|
|
.B ENOTDIR
|
|
.B F_NOTIFY
|
|
was specified in
|
|
.IR cmd ,
|
|
but
|
|
.IR fd
|
|
does not refer to a directory.
|
|
.TP
|
|
.B EPERM
|
|
Attempted to clear the
|
|
.B O_APPEND
|
|
flag on a file that has the append-only attribute set.
|
|
.SH CONFORMING TO
|
|
SVr4, 4.3BSD, POSIX.1-2001.
|
|
Only the operations
|
|
.BR F_DUPFD ,
|
|
.BR F_GETFD ,
|
|
.BR F_SETFD ,
|
|
.BR F_GETFL ,
|
|
.BR F_SETFL ,
|
|
.BR F_GETLK ,
|
|
.BR F_SETLK ,
|
|
and
|
|
.BR F_SETLKW
|
|
are specified in POSIX.1-2001.
|
|
|
|
.BR F_GETOWN
|
|
and
|
|
.B F_SETOWN
|
|
are specified in POSIX.1-2001.
|
|
(To get their definitions, define either
|
|
.BR _BSD_SOURCE ,
|
|
or
|
|
.BR _XOPEN_SOURCE
|
|
with the value 500 or greater, or
|
|
.BR _POSIX_C_SOURCE
|
|
with the value 200809L or greater.)
|
|
|
|
.B F_DUPFD_CLOEXEC
|
|
is specified in POSIX.1-2008.
|
|
(To get this definition, define
|
|
.B _POSIX_C_SOURCE
|
|
with the value 200809L or greater, or
|
|
.B _XOPEN_SOURCE
|
|
with the value 700 or greater.)
|
|
|
|
.BR F_GETOWN_EX ,
|
|
.BR F_SETOWN_EX ,
|
|
.BR F_SETPIPE_SZ ,
|
|
.BR F_GETPIPE_SZ ,
|
|
.BR F_GETSIG ,
|
|
.BR F_SETSIG ,
|
|
.BR F_NOTIFY ,
|
|
.BR F_GETLEASE ,
|
|
and
|
|
.B F_SETLEASE
|
|
are Linux-specific.
|
|
(Define the
|
|
.B _GNU_SOURCE
|
|
macro to obtain these definitions.)
|
|
.\" .PP
|
|
.\" SVr4 documents additional EIO, ENOLINK and EOVERFLOW error conditions.
|
|
|
|
.BR F_OFD_SETLK ,
|
|
.BR F_OFD_SETLKW ,
|
|
and
|
|
.BR F_OFD_GETLK
|
|
are Linux-specific (and one must define
|
|
.BR _GNU_SOURCE
|
|
to obtain their definitions),
|
|
but work is being done to have them included in the next version of POSIX.1.
|
|
.SH NOTES
|
|
The errors returned by
|
|
.BR dup2 (2)
|
|
are different from those returned by
|
|
.BR F_DUPFD .
|
|
.\"
|
|
.SS File locking
|
|
The original Linux
|
|
.BR fcntl ()
|
|
system call was not designed to handle large file offsets
|
|
(in the
|
|
.I flock
|
|
structure).
|
|
Consequently, an
|
|
.BR fcntl64 ()
|
|
system call was added in Linux 2.4.
|
|
The newer system call employs a different structure for file locking,
|
|
.IR flock64 ,
|
|
and corresponding commands,
|
|
.BR F_GETLK64 ,
|
|
.BR F_SETLK64 ,
|
|
and
|
|
.BR F_SETLKW64 .
|
|
However, these details can be ignored by applications using glibc, whose
|
|
.BR fcntl ()
|
|
wrapper function transparently employs the more recent system call
|
|
where it is available.
|
|
|
|
The errors returned by
|
|
.BR dup2 (2)
|
|
are different from those returned by
|
|
.BR F_DUPFD .
|
|
.SS Record locks
|
|
Since kernel 2.0, there is no interaction between the types of lock
|
|
placed by
|
|
.BR flock (2)
|
|
and
|
|
.BR fcntl ().
|
|
|
|
Several systems have more fields in
|
|
.I "struct flock"
|
|
such as, for example,
|
|
.IR l_sysid .
|
|
.\" e.g., Solaris 8 documents this field in fcntl(2), and Irix 6.5
|
|
.\" documents it in fcntl(5). mtk, May 2007
|
|
.\" Also, FreeBSD documents it (Apr 2014).
|
|
Clearly,
|
|
.I l_pid
|
|
alone is not going to be very useful if the process holding the lock
|
|
may live on a different machine.
|
|
|
|
The original Linux
|
|
.BR fcntl ()
|
|
system call was not designed to handle large file offsets
|
|
(in the
|
|
.I flock
|
|
structure).
|
|
Consequently, an
|
|
.BR fcntl64 ()
|
|
system call was added in Linux 2.4.
|
|
The newer system call employs a different structure for file locking,
|
|
.IR flock64 ,
|
|
and corresponding commands,
|
|
.BR F_GETLK64 ,
|
|
.BR F_SETLK64 ,
|
|
and
|
|
.BR F_SETLKW64 .
|
|
However, these details can be ignored by applications using glibc, whose
|
|
.BR fcntl ()
|
|
wrapper function transparently employs the more recent system call
|
|
where it is available.
|
|
.SS Record locking and NFS
|
|
Before Linux 3.12, if an NFSv4 client
|
|
loses contact with the server for a period of time
|
|
(defined as more than 90 seconds with no communication),
|
|
.\"
|
|
.\" Neil Brown: With NFSv3 the failure mode is the reverse. If
|
|
.\" the server loses contact with a client then any lock stays in place
|
|
.\" indefinitely ("why can't I read my mail"... I remember it well).
|
|
.\"
|
|
it might lose and regain a lock without ever being aware of the fact.
|
|
(The period of time after which contact is assumed lost is known as
|
|
the NFSv4 leasetime.
|
|
On a Linux NFS server, this can be determined by looking at
|
|
.IR /proc/fs/nfsd/nfsv4leasetime ,
|
|
which expresses the period in seconds.
|
|
The default value for this file is 90.)
|
|
.\"
|
|
.\" Jeff Layton:
|
|
.\" Note that this is not a firm timeout. The server runs a job
|
|
.\" periodically to clean out expired stateful objects, and it's likely
|
|
.\" that there is some time (maybe even up to another whole lease period)
|
|
.\" between when the timeout expires and the job actually runs. If the
|
|
.\" client gets a RENEW in there within that window, its lease will be
|
|
.\" renewed and its state preserved.
|
|
.\"
|
|
This scenario potentially risks data corruption,
|
|
since another process might acquire a lock in the intervening period
|
|
and perform file I/O.
|
|
|
|
Since Linux 3.12,
|
|
.\" commit ef1820f9be27b6ad158f433ab38002ab8131db4d
|
|
if an NFSv4 client loses contact with the server,
|
|
any I/O to the file by a process which "thinks" it holds
|
|
a lock will fail until that process closes and reopens the file.
|
|
A kernel parameter,
|
|
.IR nfs.recover_lost_locks ,
|
|
can be set to 1 to obtain the pre-3.12 behavior,
|
|
whereby the client will attempt to recover lost locks
|
|
when contact is reestablished with the server.
|
|
Because of the attendant risk of data corruption,
|
|
.\" commit f6de7a39c181dfb8a2c534661a53c73afb3081cd
|
|
this parameter defaults to 0 (disabled).
|
|
.SH BUGS
|
|
.SS F_SETFL
|
|
It is not possible to use
|
|
.BR F_SETFL
|
|
to change the state of the
|
|
.BR O_DSYNC
|
|
and
|
|
.BR O_SYNC
|
|
flags.
|
|
.\" FIXME . According to POSIX.1-2001, O_SYNC should also be modifiable
|
|
.\" via fcntl(2), but currently Linux does not permit this
|
|
.\" See http://bugzilla.kernel.org/show_bug.cgi?id=5994
|
|
Attempts to change the state of these flags are silently ignored.
|
|
.SS F_GETOWN
|
|
A limitation of the Linux system call conventions on some
|
|
architectures (notably i386) means that if a (negative)
|
|
process group ID to be returned by
|
|
.B F_GETOWN
|
|
falls in the range \-1 to \-4095, then the return value is wrongly
|
|
interpreted by glibc as an error in the system call;
|
|
.\" glibc source: sysdeps/unix/sysv/linux/i386/sysdep.h
|
|
that is, the return value of
|
|
.BR fcntl ()
|
|
will be \-1, and
|
|
.I errno
|
|
will contain the (positive) process group ID.
|
|
The Linux-specific
|
|
.BR F_GETOWN_EX
|
|
operation avoids this problem.
|
|
.\" mtk, Dec 04: some limited testing on alpha and ia64 seems to
|
|
.\" indicate that ANY negative PGID value will cause F_GETOWN
|
|
.\" to misinterpret the return as an error. Some other architectures
|
|
.\" seem to have the same range check as i386.
|
|
Since glibc version 2.11, glibc makes the kernel
|
|
.B F_GETOWN
|
|
problem invisible by implementing
|
|
.B F_GETOWN
|
|
using
|
|
.BR F_GETOWN_EX .
|
|
.SS F_SETOWN
|
|
In Linux 2.4 and earlier, there is bug that can occur
|
|
when an unprivileged process uses
|
|
.B F_SETOWN
|
|
to specify the owner
|
|
of a socket file descriptor
|
|
as a process (group) other than the caller.
|
|
In this case,
|
|
.BR fcntl ()
|
|
can return \-1 with
|
|
.I errno
|
|
set to
|
|
.BR EPERM ,
|
|
even when the owner process (group) is one that the caller
|
|
has permission to send signals to.
|
|
Despite this error return, the file descriptor owner is set,
|
|
and signals will be sent to the owner.
|
|
.\"
|
|
.SS Deadlock detection
|
|
The deadlock-detection algorithm employed by the kernel when dealing with
|
|
.BR F_SETLKW
|
|
requests can yield both
|
|
false negatives (failures to detect deadlocks,
|
|
leaving a set of deadlocked processes blocked indefinitely)
|
|
and false positives
|
|
.RB ( EDEADLK
|
|
errors when there is no deadlock).
|
|
For example,
|
|
the kernel limits the lock depth of its dependency search to 10 steps,
|
|
meaning that circular deadlock chains that exceed
|
|
that size will not be detected.
|
|
In addition, the kernel may falsely indicate a deadlock
|
|
when two or more processes created using the
|
|
.BR clone (2)
|
|
.B CLONE_FILES
|
|
flag place locks that appear (to the kernel) to conflict.
|
|
.\"
|
|
.SS Mandatory locking
|
|
The Linux implementation of mandatory locking
|
|
is subject to race conditions which render it unreliable:
|
|
.\" http://marc.info/?l=linux-kernel&m=119013491707153&w=2
|
|
.\"
|
|
.\" Reconfirmed by Jeff Layton
|
|
.\" From: Jeff Layton <jlayton <at> redhat.com>
|
|
.\" Subject: Re: Status of fcntl() mandatory locking
|
|
.\" Newsgroups: gmane.linux.file-systems
|
|
.\" Date: 2014-04-28 10:07:57 GMT
|
|
.\" http://thread.gmane.org/gmane.linux.file-systems/84481/focus=84518
|
|
a
|
|
.BR write (2)
|
|
call that overlaps with a lock may modify data after the mandatory lock is
|
|
acquired;
|
|
a
|
|
.BR read (2)
|
|
call that overlaps with a lock may detect changes to data that were made
|
|
only after a write lock was acquired.
|
|
Similar races exist between mandatory locks and
|
|
.BR mmap (2).
|
|
It is therefore inadvisable to rely on mandatory locking.
|
|
.SH SEE ALSO
|
|
.BR dup2 (2),
|
|
.BR flock (2),
|
|
.BR open (2),
|
|
.BR socket (2),
|
|
.BR lockf (3),
|
|
.BR capabilities (7),
|
|
.BR feature_test_macros (7)
|
|
|
|
.IR locks.txt ,
|
|
.IR mandatory-locking.txt ,
|
|
and
|
|
.I dnotify.txt
|
|
in the Linux kernel source directory
|
|
.IR Documentation/filesystems/
|
|
(on older kernels, these files are directly under the
|
|
.I Documentation/
|
|
directory, and
|
|
.I mandatory-locking.txt
|
|
is called
|
|
.IR mandatory.txt )
|