.\" Copyright (C) 2019 Aleksa Sarai
.\"
.\" %%%LICENSE_START(VERBATIM)
.\" Permission is granted to make and distribute verbatim copies of this
.\" manual provided the copyright notice and this permission notice are
.\" preserved on all copies.
.\"
.\" Permission is granted to copy and distribute modified versions of this
.\" manual under the conditions for verbatim copying, provided that the
.\" entire resulting derived work is distributed under the terms of a
.\" permission notice identical to this one.
.\"
.\" Since the Linux kernel and libraries are constantly changing, this
.\" manual page may be incorrect or out-of-date.  The author(s) assume no
.\" responsibility for errors or omissions, or for damages resulting from
.\" the use of the information contained herein.  The author(s) may not
.\" have taken the same level of care in the production of this manual,
.\" which is licensed free of charge, as they might when working
.\" professionally.
.\"
.\" Formatted or processed versions of this manual, if unaccompanied by
.\" the source, must acknowledge the copyright and authors of this work.
.\" %%%LICENSE_END
.TH OPENAT2 2 2019-12-20 "Linux" "Linux Programmer's Manual"
.SH NAME
openat2 \- open and possibly create a file (extended)
.SH SYNOPSIS
.nf
.B #include
.B #include
.B #include
.B #include
.PP
.BI "int openat2(int " dirfd ", const char *" pathname ", \
struct open_how *" how ", size_t " size ");
.fi
.PP
.IR Note :
There is no glibc wrapper for this system call; see NOTES.
.SH DESCRIPTION
The
.BR openat2 ()
system call opens the file specified by
.IR pathname .
If the specified file does not exist, it may optionally (if
.B O_CREAT
is specified in
.IR how.flags )
be created by
.BR openat2 ().
.PP
As with
.BR openat (2),
if
.I pathname
is relative, then it is interpreted relative to the directory referred to by
the file descriptor
.I dirfd
(or the current working directory of the calling process, if
.I dirfd
is the special value
.BR AT_FDCWD .)
If
.I pathname
is absolute, then
.I dirfd
is ignored (unless
.I how.resolve
contains
.BR RESOLVE_IN_ROOT ,
in which case
.I pathname
is resolved relative to
.IR dirfd .)
.PP
The
.BR openat2 ()
system call is an extension of
.BR openat (2)
and provides a superset of its functionality.
Rather than taking a single
.I flags
argument, an extensible structure (\fIhow\fP) is passed instead to allow for
future extensions.
.I size
must be set to
.IR "sizeof(struct open_how)" ,
to facilitate future extensions (see the "Extensibility" section of the
.B NOTES
for more detail on how extensions are handled.)
.\"
.SS The open_how structure
The following structure indicates how
.I pathname
should be opened, and acts as a superset of the
.IR flags " and " mode
arguments to
.BR openat (2).
.PP
.in +4n
.EX
struct open_how {
    u64 flags;    /* O_* flags */
    u64 mode;     /* Mode for O_{CREAT,TMPFILE} */
    u64 resolve;  /* RESOLVE_* flags */
    /* ... */
};
.EE
.in
.PP
Any future extensions to
.BR openat2 ()
will be implemented as new fields appended to the above structure, with the
zero value of the new fields acting as though the extension were not present.
Therefore, users must ensure that they zero-fill this structure on
initialization (see the "Extensibility" section of the
.B NOTES
for more detail on why this is necessary.)
.PP
The meaning of each field is as follows:
.TP
.I flags
The file creation and status flags to use for this operation.
All of the
.B O_*
flags defined for
.BR openat (2)
are valid
.BR openat2 ()
flag values.
.IP
Unlike
.BR openat (2),
it is an error to provide
.BR openat2 ()
with unknown or conflicting flags in
.IR flags .
.TP
.I mode
File mode for the new file, with identical semantics to the
.I mode
argument to
.BR openat (2).
.IP
Unlike
.BR openat (2),
it is an error to provide
.BR openat2 ()
with a
.I mode
which contains bits other than
.IR 07777 ,
or to provide
.BR openat2 ()
with a non-zero
.IR mode " if " flags
does not contain
.BR O_CREAT " or " O_TMPFILE .
.TP
.I resolve
Change how
.B all
components of
.I pathname
will be resolved (see
.BR path_resolution (7)
for background information.)
The primary use case for these flags is to allow trusted programs to restrict
how untrusted paths (or paths inside untrusted directories) are resolved.
The full list of
.I resolve
flags is given below.
.RS
.TP
.B RESOLVE_NO_XDEV
Disallow traversal of mount points during path resolution (including all bind
mounts).
.IP
Users of this flag are encouraged to make its use configurable (unless it is
used for a specific security purpose), as bind mounts are very widely used by
end-users.
Setting this flag indiscriminately for all uses of
.BR openat2 ()
may result in spurious errors on previously-functional systems.
.TP
.B RESOLVE_NO_SYMLINKS
Disallow resolution of symbolic links during path resolution.
This option implies
.BR RESOLVE_NO_MAGICLINKS .
.IP
If the trailing component is a symbolic link, and
.I flags
contains both
.BR O_PATH " and " O_NOFOLLOW ","
then an
.B O_PATH
file descriptor referencing the symbolic link will be returned.
.IP
Users of this flag are encouraged to make its use configurable (unless it is
used for a specific security purpose), as symbolic links are very widely used
by end-users.
Setting this flag indiscriminately for all uses of
.BR openat2 ()
may result in spurious errors on previously-functional systems.
.TP
.B RESOLVE_NO_MAGICLINKS
Disallow all magic link resolution during path resolution.
.IP
If the trailing component is a magic link, and
.I flags
contains both
.BR O_PATH " and " O_NOFOLLOW ","
then an
.B O_PATH
file descriptor referencing the magic link will be returned.
.IP
Magic links are symbolic link-like objects that are most notably found in
.BR proc (5)
(examples include
.IR /proc/[pid]/exe " and " /proc/[pid]/fd/* .)
Due to the potential danger of unknowingly opening these magic links, it may be
preferable for users to disable their resolution entirely (see
.BR symlink (7)
for more details.)
.TP
.B RESOLVE_BENEATH
Do not permit the path resolution to succeed if any component of the
resolution is not a descendant of the directory indicated by
.IR dirfd .
This causes absolute symbolic links (and absolute values of
.IR pathname )
to be rejected.
.IP
Currently, this flag also disables magic link resolution.
However, this may change in the future.
The caller should explicitly specify
.B RESOLVE_NO_MAGICLINKS
to ensure that magic links are not resolved.
.TP
.B RESOLVE_IN_ROOT
Treat
.I dirfd
as the root directory while resolving
.I pathname
(as though the user called
.BR chroot (2)
with
.I dirfd
as the argument.)
Absolute symbolic links and ".." path components will be scoped to
.IR dirfd .
If
.I pathname
is an absolute path, it is also treated relative to
.IR dirfd .
.IP
However, unlike
.BR chroot (2)
(which changes the filesystem root permanently for a process),
.B RESOLVE_IN_ROOT
allows a program to efficiently restrict path resolution for only certain
operations.
It also has several hardening features (such as detecting escape attempts
during
.I ".."
resolution) which
.BR chroot (2)
does not.
.IP
Currently, this flag also disables magic link resolution.
However, this may change in the future.
The caller should explicitly specify
.B RESOLVE_NO_MAGICLINKS
to ensure that magic links are not resolved.
.RE
.PP
It is an error to provide
.BR openat2 ()
with unknown flags in
.IR resolve .
.SH RETURN VALUE
On success, a new file descriptor is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.SH ERRORS
The set of errors returned by
.BR openat2 ()
includes all of the errors returned by
.BR openat (2),
as well as the following additional errors:
.TP
.B EINVAL
An unknown flag or invalid value was specified in
.IR how .
.TP
.B EINVAL
.I mode
is non-zero, but
.I flags
does not contain
.BR O_CREAT " or " O_TMPFILE .
.TP
.B EINVAL
.I size
was smaller than any known version of
.IR "struct open_how" .
.TP
.B E2BIG
An extension was specified in
.IR how ,
which the current kernel does not support (see the "Extensibility" section of
the
.B NOTES
for more detail on how extensions are handled.)
.TP
.B EAGAIN
.I resolve
contains either
.BR RESOLVE_IN_ROOT " or " RESOLVE_BENEATH ,
and the kernel could not ensure that a ".." component didn't escape (due to a
race condition or potential attack.)
Callers may choose to retry the
.BR openat2 ()
call.
.TP
.B EXDEV
.I resolve
contains either
.BR RESOLVE_IN_ROOT " or " RESOLVE_BENEATH ,
and an escape from the root during path resolution was detected.
.TP
.B EXDEV
.I resolve
contains
.BR RESOLVE_NO_XDEV ,
and a path component attempted to cross a mount point.
.TP
.B ELOOP
.I resolve
contains
.BR RESOLVE_NO_SYMLINKS ,
and one of the path components was a symbolic link (or magic link).
.TP
.B ELOOP
.I resolve
contains
.BR RESOLVE_NO_MAGICLINKS ,
and one of the path components was a magic link.
.SH VERSIONS
.BR openat2 ()
first appeared in Linux 5.6.
.SH CONFORMING TO
This system call is Linux-specific.
.PP
The semantics of
.B RESOLVE_BENEATH
were modeled after FreeBSD's
.BR O_BENEATH .
.SH NOTES
Glibc does not provide a wrapper for this system call; call it using
.BR syscall (2).
.\"
.SS Extensibility
In order to allow for
.I struct open_how
to be extended in future kernel revisions,
.BR openat2 ()
requires userspace to specify the size of the
.I struct open_how
structure that it is passing.
By providing this information, it is possible for
.BR openat2 ()
to provide both forwards- and backwards-compatibility, with
.I size
acting as an implicit version number (because new extension fields will always
be appended, the structure size will always increase.)
This extensibility design is very similar to other system calls such as
.BR sched_setattr "(2), " perf_event_open "(2), and " clone3 (2).
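.PP
As a rough illustration of the two points above (no glibc wrapper, and
.I size
passed explicitly), the following sketch invokes the system call through
.BR syscall (2).
The names openat2_sketch and struct open_how_sketch are hypothetical and exist
only for this example, and the fallback system call number is an assumption
for systems whose headers do not define SYS_openat2.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: 437 is the openat2 number on most architectures; the value
 * from <sys/syscall.h> is preferred whenever the headers provide it. */
#ifndef SYS_openat2
#define SYS_openat2 437
#endif

/* Hypothetical local mirror of the three fields shown in this page.
 * Real programs should use struct open_how from the kernel headers. */
struct open_how_sketch {
    uint64_t flags;    /* O_* flags */
    uint64_t mode;     /* Mode for O_{CREAT,TMPFILE} */
    uint64_t resolve;  /* RESOLVE_* flags */
};

/* Hypothetical helper: forwards to the raw system call, passing the size
 * of the structure this program was compiled against as the fourth
 * argument, so that the kernel can tell which fields are present. */
static int openat2_sketch(int dirfd, const char *pathname,
                          const struct open_how_sketch *how)
{
    assert(how != NULL);
    return (int) syscall(SYS_openat2, dirfd, pathname, how, sizeof(*how));
}
```

A caller would zero-fill the structure, set the fields it needs, and check the
result as with any other system call; on a kernel that predates
.BR openat2 ()
the call fails with
.BR ENOSYS .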
.PP
If we let
.I usize
be the size of the structure according to userspace and
.I ksize
be the size of the structure which the kernel supports, then there are only
three cases to consider:
.RS
.IP * 3
If
.IR ksize " equals " usize ,
then there is no version mismatch and
.I how
can be used verbatim.
.IP *
If
.IR ksize " is larger than " usize ,
then there are some extensions the kernel supports which the userspace program
is unaware of.
Because the zero value of every extension must be a no-op, the kernel treats
all of the extension fields not set by userspace as having zero values.
This provides backwards-compatibility.
.IP *
If
.IR ksize " is smaller than " usize ,
then there are some extensions which the userspace program is aware of but the
kernel does not support.
Because the zero value of every extension must be a no-op, the kernel can
safely ignore the unsupported extension fields if they are all-zero.
If any unsupported extension fields are non-zero, then \-1 is returned and
.I errno
is set to
.BR E2BIG .
This provides forwards-compatibility.
.RE
.PP
Therefore, most userspace programs will not need any special handling of
extensions.
.PP
However, because the definition of
.I struct open_how
may change in the future (with new fields being added when system headers are
updated), userspace programs should zero-fill
.I struct open_how
to ensure that re-compiling the program with new headers will not result in
spurious errors at runtime.
The simplest way is to use a designated initializer:
.PP
.in +4n
.EX
struct open_how how = { .flags = O_RDWR,
                        .resolve = RESOLVE_IN_ROOT };
.EE
.in
.PP
or explicitly using something like
.BR memset (3):
.PP
.in +4n
.EX
struct open_how how;
memset(&how, 0, sizeof(how));
how.flags = O_RDWR;
how.resolve = RESOLVE_IN_ROOT;
.EE
.in
.PP
If a userspace program wishes to determine what extensions the running kernel
supports, it may conduct a binary search on
.I size
with a structure which has every byte non-zero (to find the largest value which
doesn't produce an error of
.BR E2BIG .)
.SH SEE ALSO
.BR openat (2),
.BR path_resolution (7),
.BR symlink (7)
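.PP
As a supplement to the "Extensibility" notes, the three
.IR usize / ksize
cases can be modelled in user space.
The function below is a hypothetical sketch written for this page (it is not
the kernel's implementation, and the name copy_struct_model is invented); it
only mirrors the rules stated in the notes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model of the size rules described under "Extensibility".
 * dst/ksize stand for the structure the "kernel" knows about, and
 * src/usize for the structure passed in by "userspace".
 * Returns 0 on success, or -E2BIG if userspace set a field that the
 * modelled kernel does not know about.
 */
static int copy_struct_model(void *dst, size_t ksize,
                             const void *src, size_t usize)
{
    const unsigned char *s = src;
    size_t common = usize < ksize ? usize : ksize;

    assert(dst != NULL && src != NULL);

    /* ksize < usize: unknown trailing fields are accepted only if zero.
     * (The loop body never runs when usize <= ksize.) */
    for (size_t i = ksize; i < usize; i++) {
        if (s[i] != 0)
            return -E2BIG;
    }

    /* ksize > usize: fields userspace did not supply read as zero.
     * ksize == usize: the memcpy below copies everything verbatim. */
    memset(dst, 0, ksize);
    memcpy(dst, src, common);
    return 0;
}
```

With this model, an older caller's shorter structure is padded with zeros, and
a newer caller's longer structure is accepted as long as the extra bytes are
zero.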