.\" Copyright (C) 2019 Aleksa Sarai <cyphar@cyphar.com>
.\"
.\" %%%LICENSE_START(VERBATIM)
.\" Permission is granted to make and distribute verbatim copies of this
.\" manual provided the copyright notice and this permission notice are
.\" preserved on all copies.
.\"
.\" Permission is granted to copy and distribute modified versions of this
.\" manual under the conditions for verbatim copying, provided that the
.\" entire resulting derived work is distributed under the terms of a
.\" permission notice identical to this one.
.\"
.\" Since the Linux kernel and libraries are constantly changing, this
.\" manual page may be incorrect or out-of-date. The author(s) assume no
.\" responsibility for errors or omissions, or for damages resulting from
.\" the use of the information contained herein. The author(s) may not
.\" have taken the same level of care in the production of this manual,
.\" which is licensed free of charge, as they might when working
.\" professionally.
.\"
.\" Formatted or processed versions of this manual, if unaccompanied by
.\" the source, must acknowledge the copyright and authors of this work.
.\" %%%LICENSE_END
.TH OPENAT2 2 2020-04-11 "Linux" "Linux Programmer's Manual"
.SH NAME
openat2 \- open and possibly create a file (extended)
.SH SYNOPSIS
.nf
.B #include <sys/types.h>
.B #include <sys/stat.h>
.B #include <fcntl.h>
.B #include <linux/openat2.h>
.PP
.BI "int openat2(int " dirfd ", const char *" pathname ,
.BI "            struct open_how *" how ", size_t " size ");"
.fi
.PP
.IR Note :
There is no glibc wrapper for this system call; see NOTES.
.SH DESCRIPTION
The
.BR openat2 ()
system call is an extension of
.BR openat (2)
and provides a superset of its functionality.
.PP
The
.BR openat2 ()
system call opens the file specified by
.IR pathname .
If the specified file does not exist, it may optionally (if
.B O_CREAT
is specified in
.IR how.flags )
be created.
.PP
As with
.BR openat (2),
if
.I pathname
is a relative pathname, then it is interpreted relative to the
directory referred to by the file descriptor
.I dirfd
(or the current working directory of the calling process, if
.I dirfd
is the special value
.BR AT_FDCWD ).
If
.I pathname
is an absolute pathname, then
.I dirfd
is ignored (unless
.I how.resolve
contains
.BR RESOLVE_IN_ROOT ,
in which case
.I pathname
is resolved relative to
.IR dirfd ).
.PP
Rather than taking a single
.I flags
argument, an extensible structure (\fIhow\fP) is passed to allow for
future extensions.
The
.I size
argument must be specified as
.IR "sizeof(struct open_how)" .
.\"
.SS The open_how structure
The
.I how
argument specifies how
.I pathname
should be opened, and acts as a superset of the
.IR flags
and
.IR mode
arguments to
.BR openat (2).
This argument is a pointer to a structure of the following form:
.PP
.in +4n
.EX
struct open_how {
    u64 flags;    /* O_* flags */
    u64 mode;     /* Mode for O_{CREAT,TMPFILE} */
    u64 resolve;  /* RESOLVE_* flags */
    /* ... */
};
.EE
.in
.PP
Any future extensions to
.BR openat2 ()
will be implemented as new fields appended to the above structure,
with a zero value in a new field resulting in the kernel behaving
as though that extension field was not present.
Therefore, the caller
.I must
zero-fill this structure on
initialization.
(See the "Extensibility" section of the
.B NOTES
for more detail on why this is necessary.)
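.PP
As an illustration only, the following sketch shows one way a caller might invoke
.BR openat2 ()
through
.BR syscall (2)
with a zero-filled structure.
The helper names
.I my_openat2
and
.I open_readonly
are hypothetical, and the sketch assumes kernel headers that provide
.I <linux/openat2.h>
and
.BR SYS_openat2 .

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/openat2.h>   /* struct open_how, RESOLVE_* */
#include <string.h>
#include <sys/syscall.h>     /* SYS_openat2 */
#include <unistd.h>

/* Hypothetical helper: thin wrapper, since glibc provides none. */
static int
my_openat2(int dirfd, const char *pathname,
           const struct open_how *how, size_t size)
{
    return (int) syscall(SYS_openat2, dirfd, pathname, how, size);
}

/* Hypothetical helper: open an existing file read-only. */
static int
open_readonly(int dirfd, const char *pathname)
{
    struct open_how how;

    /* Zero-fill first so that any fields added later stay zero. */
    memset(&how, 0, sizeof(how));
    how.flags = O_RDONLY | O_CLOEXEC;

    return my_openat2(dirfd, pathname, &how, sizeof(how));
}
```

.PP
On kernels older than Linux 5.6 the call is expected to fail with
.BR ENOSYS ,
so a caller of such a helper may want to fall back to
.BR openat (2).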
.PP
The fields of the
.I open_how
structure are as follows:
.TP
.I flags
This field specifies
the file creation and file status flags to use when opening the file.
All of the
.B O_*
flags defined for
.BR openat (2)
are valid
.BR openat2 ()
flag values.
.IP
Whereas
.BR openat (2)
ignores unknown bits in its
.I flags
argument,
.BR openat2 ()
returns an error if unknown or conflicting flags are specified in
.IR how.flags .
.TP
.I mode
This field specifies the
mode for the new file, with identical semantics to the
.I mode
argument of
.BR openat (2).
.IP
Whereas
.BR openat (2)
ignores bits other than those in the range
.I 07777
in its
.I mode
argument,
.BR openat2 ()
returns an error if
.I how.mode
contains bits other than
.IR 07777 .
Similarly, an error is returned if
.BR openat2 ()
is called with a nonzero
.IR how.mode
and
.IR how.flags
does not contain
.BR O_CREAT
or
.BR O_TMPFILE .
.TP
.I resolve
This is a bit-mask of flags that modify the way in which
.B all
components of
.I pathname
will be resolved.
(See
.BR path_resolution (7)
for background information.)
.IP
The primary use case for these flags is to allow trusted programs to restrict
how untrusted paths (or paths inside untrusted directories) are resolved.
The full list of
.I resolve
flags is as follows:
.RS
.TP
.B RESOLVE_BENEATH
.\" commit adb21d2b526f7f196b2f3fdca97d80ba05dd14a0
Do not permit the path resolution to succeed if any component of the resolution
is not a descendant of the directory indicated by
.IR dirfd .
This causes absolute symbolic links (and absolute values of
.IR pathname )
to be rejected.
.IP
Currently, this flag also disables magic-link resolution (see below).
However, this may change in the future.
Therefore, to ensure that magic links are not resolved,
the caller should explicitly specify
.BR RESOLVE_NO_MAGICLINKS .
.TP
.B RESOLVE_IN_ROOT
.\" commit 8db52c7e7ee1bd861b6096fcafc0fe7d0f24a994
Treat the directory referred to by
.I dirfd
as the root directory while resolving
.IR pathname .
Absolute symbolic links are interpreted relative to
.IR dirfd .
If a prefix component of
.I pathname
equates to
.IR dirfd ,
then an immediately following
.IR ..\&
component likewise equates to
.IR dirfd
(just as
.I /..\&
is traditionally equivalent to
.IR / ).
If
.I pathname
is an absolute path, it is also interpreted relative to
.IR dirfd .
.IP
The effect of this flag is as though the calling process had used
.BR chroot (2)
to (temporarily) modify its root directory (to the directory
referred to by
.IR dirfd ).
However, unlike
.BR chroot (2)
(which changes the filesystem root permanently for a process),
.B RESOLVE_IN_ROOT
allows a program to efficiently restrict path resolution on a per-open basis.
.IP
Currently, this flag also disables magic-link resolution.
However, this may change in the future.
Therefore, to ensure that magic links are not resolved,
the caller should explicitly specify
.BR RESOLVE_NO_MAGICLINKS .
.TP
.B RESOLVE_NO_MAGICLINKS
.\" commit 278121417a72d87fb29dd8c48801f80821e8f75a
Disallow all magic-link resolution during path resolution.
.IP
Magic links are symbolic link-like objects that are most notably found in
.BR proc (5);
examples include
.IR /proc/[pid]/exe
and
.IR /proc/[pid]/fd/* .
(See
.BR symlink (7)
for more details.)
.IP
Unknowingly opening magic links can be risky for some applications.
Examples of such risks include the following:
.RS
.IP \(bu 2
If the process opening a pathname is a controlling process that
currently has no controlling terminal (see
.BR credentials (7)),
then opening a magic link inside
.IR /proc/[pid]/fd
that happens to refer to a terminal
would cause the process to acquire a controlling terminal.
.IP \(bu
.\" From https://lwn.net/Articles/796868/:
.\" The presence of this flag will prevent a path lookup operation
.\" from traversing through one of these magic links, thus blocking
.\" (for example) attempts to escape from a container via a /proc
.\" entry for an open file descriptor.
In a containerized environment,
a magic link inside
.I /proc
may refer to an object outside the container,
and thus may provide a means to escape from the container.
.RE
.IP
Because of such risks,
an application may prefer to disable magic link resolution using the
.BR RESOLVE_NO_MAGICLINKS
flag.
.IP
If the trailing component (i.e., basename) of
.I pathname
is a magic link,
.I how.resolve
contains
.BR RESOLVE_NO_MAGICLINKS ,
and
.I how.flags
contains both
.BR O_PATH
and
.BR O_NOFOLLOW ,
then an
.B O_PATH
file descriptor referencing the magic link will be returned.
.TP
.B RESOLVE_NO_SYMLINKS
.\" commit 278121417a72d87fb29dd8c48801f80821e8f75a
Disallow resolution of symbolic links during path resolution.
This option implies
.BR RESOLVE_NO_MAGICLINKS .
.IP
If the trailing component (i.e., basename) of
.I pathname
is a symbolic link,
.I how.resolve
contains
.BR RESOLVE_NO_SYMLINKS ,
and
.I how.flags
contains both
.BR O_PATH
and
.BR O_NOFOLLOW ,
then an
.B O_PATH
file descriptor referencing the symbolic link will be returned.
.IP
Note that the effect of the
.BR RESOLVE_NO_SYMLINKS
flag,
which affects the treatment of symbolic links in all of the components of
.IR pathname ,
differs from the effect of the
.BR O_NOFOLLOW
file creation flag (in
.IR how.flags ),
which affects the handling of symbolic links only in the final component of
.IR pathname .
.IP
Applications that employ the
.BR RESOLVE_NO_SYMLINKS
flag are encouraged to make its use configurable
(unless it is used for a specific security purpose),
as symbolic links are very widely used by end-users.
Setting this flag indiscriminately\(emi.e.,
for purposes not specifically related to security\(emfor all uses of
.BR openat2 ()
may result in spurious errors on previously-functional systems.
This may occur if, for example,
a system pathname that is used by an application is modified
(e.g., in a new distribution release)
so that a pathname component (now) contains a symbolic link.
.TP
.B RESOLVE_NO_XDEV
.\" commit 72ba29297e1439efaa54d9125b866ae9d15df339
Disallow traversal of mount points during path resolution (including all bind
mounts).
Consequently,
.I pathname
must either be on the same mount as the directory referred to by
.IR dirfd ,
or on the same mount as the current working directory if
.I dirfd
is specified as
.BR AT_FDCWD .
.IP
Applications that employ the
.B RESOLVE_NO_XDEV
flag are encouraged to make its use configurable (unless it is
used for a specific security purpose),
as bind mounts are widely used by end-users.
Setting this flag indiscriminately\(emi.e.,
for purposes not specifically related to security\(emfor all uses of
.BR openat2 ()
may result in spurious errors on previously-functional systems.
This may occur if, for example,
a system pathname that is used by an application is modified
(e.g., in a new distribution release)
so that a pathname component (now) contains a bind mount.
.RE
.IP
If any bits other than those listed above are set in
.IR how.resolve ,
an error is returned.
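.PP
The validity rules stated above for
.I how.mode
and
.I how.resolve
can be sketched as a user-space pre-check.
This is an illustration, not the kernel's implementation: the function name
.I check_open_how
and the macro
.I KNOWN_RESOLVE
are hypothetical, only the
.B RESOLVE_*
flags listed in this page are treated as known, and flag conflicts in
.I how.flags
are not checked.

```c
#define _GNU_SOURCE          /* for O_TMPFILE */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/openat2.h>   /* struct open_how, RESOLVE_* */

/* Hypothetical mask: the RESOLVE_* flags described in this page. */
#define KNOWN_RESOLVE (RESOLVE_BENEATH | RESOLVE_IN_ROOT | \
                       RESOLVE_NO_MAGICLINKS | RESOLVE_NO_SYMLINKS | \
                       RESOLVE_NO_XDEV)

/* Hypothetical pre-check; returns 0 or a negative errno value. */
static int
check_open_how(const struct open_how *how)
{
    /* O_TMPFILE is a multi-bit flag, so compare against the full value. */
    int creates = (how->flags & O_CREAT) != 0 ||
                  (how->flags & O_TMPFILE) == O_TMPFILE;

    /* how.mode may not contain bits other than 07777. */
    if (how->mode & ~07777ULL)
        return -EINVAL;

    /* A nonzero how.mode requires O_CREAT or O_TMPFILE. */
    if (how->mode != 0 && !creates)
        return -EINVAL;

    /* Bits outside the listed RESOLVE_* flags are rejected. */
    if (how->resolve & ~(unsigned long long) KNOWN_RESOLVE)
        return -EINVAL;

    return 0;
}
```

.PP
A check of this kind is optional, because the kernel performs its own validation; it may nevertheless help a program report a more specific diagnostic before making the call.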
.SH RETURN VALUE
On success, a new file descriptor is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS
The set of errors returned by
.BR openat2 ()
includes all of the errors returned by
.BR openat (2),
as well as the following additional errors:
.TP
.B E2BIG
An extension that this kernel does not support was specified in
.IR how .
(See the "Extensibility" section of
.B NOTES
for more detail on how extensions are handled.)
.TP
.B EAGAIN
.I how.resolve
contains either
.BR RESOLVE_IN_ROOT
or
.BR RESOLVE_BENEATH ,
and the kernel could not ensure that a ".." component didn't escape (due to a
race condition or potential attack).
The caller may choose to retry the
.BR openat2 ()
call.
.TP
.B EINVAL
An unknown flag or invalid value was specified in
.IR how .
.TP
.B EINVAL
.I how.mode
is nonzero, but
.I how.flags
does not contain
.BR O_CREAT
or
.BR O_TMPFILE .
.TP
.B EINVAL
.I size
was smaller than any known version of
.IR "struct open_how" .
.TP
.B ELOOP
.I how.resolve
contains
.BR RESOLVE_NO_SYMLINKS ,
and one of the path components was a symbolic link (or magic link).
.TP
.B ELOOP
.I how.resolve
contains
.BR RESOLVE_NO_MAGICLINKS ,
and one of the path components was a magic link.
.TP
.B EXDEV
.I how.resolve
contains either
.BR RESOLVE_IN_ROOT
or
.BR RESOLVE_BENEATH ,
and an escape from the root during path resolution was detected.
.TP
.B EXDEV
.I how.resolve
contains
.BR RESOLVE_NO_XDEV ,
and a path component crosses a mount point.
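.PP
Because
.B EAGAIN
indicates a condition that the caller may retry, a bounded retry loop is one possible way to handle it.
The sketch below is an assumption-laden illustration: the helper name
.I open_beneath
and the retry limit are hypothetical, and it assumes headers that provide
.I <linux/openat2.h>
and
.BR SYS_openat2 .

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/openat2.h>   /* struct open_how, RESOLVE_* */
#include <sys/syscall.h>     /* SYS_openat2 */
#include <unistd.h>

/*
 * Hypothetical helper: open pathname read-only beneath dirfd,
 * retrying at most max_tries times while the kernel reports EAGAIN.
 * Returns a file descriptor, or -1 with errno set by the last attempt.
 */
static int
open_beneath(int dirfd, const char *pathname, int max_tries)
{
    /* The designated initializer zero-fills all other fields. */
    struct open_how how = {
        .flags = O_RDONLY | O_CLOEXEC,
        .resolve = RESOLVE_BENEATH | RESOLVE_NO_MAGICLINKS,
    };
    int fd = -1;

    for (int i = 0; i < max_tries; i++) {
        fd = (int) syscall(SYS_openat2, dirfd, pathname,
                           &how, sizeof(how));
        if (fd >= 0 || errno != EAGAIN)
            break;      /* success, or an error not worth retrying */
    }
    return fd;
}
```

.PP
With
.B RESOLVE_BENEATH
set, an absolute
.I pathname
such as
.I /etc
is rejected rather than opened, which is the behavior described for
.B EXDEV
above.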
.SH VERSIONS
.BR openat2 ()
first appeared in Linux 5.6.
.\" commit fddb5d430ad9fa91b49b1d34d0202ffe2fa0e179
.SH CONFORMING TO
This system call is Linux-specific.
.PP
The semantics of
.B RESOLVE_BENEATH
were modeled after FreeBSD's
.BR O_BENEATH .
.SH NOTES
Glibc does not provide a wrapper for this system call; call it using
.BR syscall (2).
.\"
.SS Extensibility
In order to allow for future extensibility,
.BR openat2 ()
requires the user-space application to specify the size of the
.I open_how
structure that it is passing.
By providing this information, it is possible for
.BR openat2 ()
to provide both forwards- and backwards-compatibility, with
.I size
acting as an implicit version number.
(Because new extension fields will always
be appended, the structure size will always increase.)
This extensibility design is very similar to other system calls such as
.BR sched_setattr (2),
.BR perf_event_open (2),
and
.BR clone3 (2).
.PP
If we let
.I usize
be the size of the structure as specified by the user-space application, and
.I ksize
be the size of the structure which the kernel supports, then there are
three cases to consider:
.IP \(bu 2
If
.IR ksize
equals
.IR usize ,
then there is no version mismatch and
.I how
can be used verbatim.
.IP \(bu
If
.IR ksize
is larger than
.IR usize ,
then there are some extension fields that the kernel supports
which the user-space application
is unaware of.
Because a zero value in any added extension field signifies a no-op,
the kernel
treats all of the extension fields not provided by the user-space application
as having zero values.
This provides backwards-compatibility.
.IP \(bu
If
.IR ksize
is smaller than
.IR usize ,
then there are some extension fields which the user-space application
is aware of but which the kernel does not support.
Because any extension field must have its zero values signify a no-op,
the kernel can
safely ignore the unsupported extension fields if they are all-zero.
If any unsupported extension fields are nonzero, then \-1 is returned and
.I errno
is set to
.BR E2BIG .
This provides forwards-compatibility.
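.PP
The three cases above can be modeled in user space as follows.
This is a sketch of the described behavior, not kernel code; the function name
.I copy_versioned_struct
is hypothetical, and the structures are treated as plain byte buffers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * User-space model (not kernel code): copy a usize-byte structure
 * from src into a ksize-byte structure at dst, following the three
 * cases described in the text. Returns 0 on success or -E2BIG.
 */
static int
copy_versioned_struct(void *dst, size_t ksize,
                      const void *src, size_t usize)
{
    const unsigned char *s = src;
    size_t common = usize < ksize ? usize : ksize;

    /* ksize < usize: fields the "kernel" lacks must be all-zero. */
    for (size_t i = ksize; i < usize; i++)
        if (s[i] != 0)
            return -E2BIG;

    /* The part both sides know about is used verbatim. */
    memcpy(dst, src, common);

    /* ksize > usize: fields the caller omitted are treated as zero. */
    if (ksize > usize)
        memset((unsigned char *) dst + usize, 0, ksize - usize);

    return 0;
}
```

.PP
When the two sizes are equal, neither the loop nor the
.BR memset (3)
call does anything, and the structure is copied unchanged.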
.PP
Because the definition of
.I struct open_how
may change in the future (with new fields being added when system headers are
updated), user-space applications should zero-fill
.I struct open_how
to ensure that recompiling the program with new headers will not result in
spurious errors at runtime.
The simplest way is to use a designated
initializer:
.PP
.in +4n
.EX
struct open_how how = { .flags = O_RDWR,
                        .resolve = RESOLVE_IN_ROOT };
.EE
.in
.PP
or explicitly using
.BR memset (3)
or similar:
.PP
.in +4n
.EX
struct open_how how;
memset(&how, 0, sizeof(how));
how.flags = O_RDWR;
how.resolve = RESOLVE_IN_ROOT;
.EE
.in
.PP
A user-space application that wishes to determine which extensions
the running kernel supports can do so by conducting a binary search on
.IR size
with a structure which has every byte nonzero (to find the largest value
which doesn't produce an error of
.BR E2BIG ).
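.PP
The binary search described above might be structured as in the following sketch.
The search is written against a caller-supplied probe so that it can be exercised without a kernel; the names
.IR too_big_fn ,
.IR largest_supported_size ,
and
.I mock_too_big
are hypothetical.
A real probe would call
.BR openat2 ()
with a buffer in which every byte is nonzero and report whether
.I errno
is
.BR E2BIG ;
the mock instead assumes a kernel whose structure is 24 bytes (three 64-bit fields).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical probe: nonzero if a call with this size gives E2BIG. */
typedef int (*too_big_fn)(size_t size);

/*
 * Return the largest size in [lo, hi] for which probe() reports no
 * E2BIG. Assumes probe(lo) is zero and that once a size is too big,
 * every larger size is too big as well.
 */
static size_t
largest_supported_size(too_big_fn probe, size_t lo, size_t hi)
{
    while (lo < hi) {
        /* Round the midpoint up so that lo always advances. */
        size_t mid = lo + (hi - lo + 1) / 2;

        if (probe(mid))
            hi = mid - 1;   /* mid is too big; look below it */
        else
            lo = mid;       /* mid is accepted; look at or above it */
    }
    return lo;
}

/* Mock standing in for a kernel whose struct open_how is 24 bytes. */
static int
mock_too_big(size_t size)
{
    return size > 24;
}
```

.PP
Note that a probe of this kind is expected to fail with some other error (for example,
.BR EINVAL ,
because the nonzero bytes form invalid flags) even when the size is supported; only
.B E2BIG
is significant to the search.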
.SH SEE ALSO
.BR openat (2),
.BR path_resolution (7),
.BR symlink (7)