Commit Graph

9334 Commits

Author SHA1 Message Date
Michael Kerrisk ae6b221882 prctl.2: Rewrite the description of PR_SET_SECCOMP to defer to seccomp(2)
There is a lot of unnecessary duplication of content of the seccomp
material in prctl(2) and seccomp(2).  Trevor Woerner also noted that
there is an error in prctl(2), where it says that the filters
"are run in order until the first non-allow result is seen", which
contradicts the correct statement in seccomp(2) that *all* filters
are executed.

So, rewrite the seccomp material in prctl(2) to strip out most of
the content duplicated in seccomp(2), and replace the removed
text with statements deferring to to seccomp(2).

Reported-by: Trevor Woerner <twoerner@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-09-02 00:15:52 +02:00
Michael Kerrisk 2da936fe2b prctl.2: Note that seccomp(2) is preferred over prctl(2) for setting seccomp mode
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-09-01 23:57:17 +02:00
Michael Kerrisk bf4f7a7867 exit_group.2: Remove a confusing reference to _exit(2) in DESCRIPTION
As noted by Jakub:

    BTW, the exit_group.2 man page could use an update (possibly
    by merging it into exit.2): it says that the "system
    call is is equivalent to _exit(2) except that it terminates
    not only the calling thread, but all threads in the calling
    process's thread group", which isn't helpful these days.

Reported-by: Jakub Wilk <jwilk@jwilk.net>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-31 01:21:39 +02:00
Michael Kerrisk db141dbfca exit_group.2: SEE ALSO: s/exit(2)/_exit(2)/
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-31 01:21:38 +02:00
Michael Kerrisk d99b5be0d8 _exit.2: Clarify the distinction between the raw syscall and the wrapper function
Further clarify the difference between the raw _exit() system call
and the C library wrapper.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-31 01:21:36 +02:00
Pali Rohár 44803dd03b ioctl_tty.2: TIOCGSID is equivalent to tcgetsid()
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-30 23:15:14 +02:00
Michael Kerrisk fabb1a2a0b syscalls.2: Add Linux 5.14 system calls
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-27 22:43:25 +02:00
Michael Kerrisk d5ee9f931e memfd_secret.2: SEE ALSO: add memfd_create(2)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-27 22:43:25 +02:00
Michael Kerrisk e817f70a5f memfd_create.2: SEE ALSO: add memfd_secret(2)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-27 22:43:25 +02:00
Michael Kerrisk 84a2ce0f18 memfd_secret.2: Minor edits to Mike Rapoport's patch
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-27 22:43:25 +02:00
Michael Kerrisk eabb03a4d2 memfd_secret.2: wfix
Added "RAM-based" after consultation with Mike Rapoport

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-27 22:43:19 +02:00
Mike Rapoport ac5edfeb1d memfd_secret.2: New page describing memfd_secret() system call
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-27 22:40:49 +02:00
Michael Kerrisk 6e00b7a858 iconv.1, ldd.1, accept.2, access.2, add_key.2, arch_prctl.2, bpf.2, chmod.2, chown.2, close_range.2, copy_file_range.2, execve.2, execveat.2, fanotify_mark.2, futex.2, futimesat.2, getpriority.2, intro.2, ioctl_tty.2, keyctl.2, link.2, membarrier.2, mkdir.2, mknod.2, mlock.2, mount.2, mount_setattr.2, open.2, open_by_handle_at.2, perf_event_open.2, pidfd_open.2, readlink.2, readv.2, rename.2, request_key.2, seccomp.2, sigaction.2, stat.2, statx.2, symlink.2, syscalls.2, umount.2, unlink.2, utimensat.2, wait.2, bsearch.3, fflush.3, getaddrinfo.3, getauxval.3, getopt.3, getsubopt.3, mkfifo.3, pthread_mutex_consistent.3, pthread_setname_np.3, pthread_tryjoin_np.3, scandir.3, sem_wait.3, stailq.3, strlen.3, strstr.3, termios.3, tsearch.3, wcslen.3, wcstok.3, wordexp.3, proc.5, capabilities.7, cgroups.7, fanotify.7, mount_namespaces.7, namespaces.7, path_resolution.7, pipe.7, posixoptions.7, user_namespaces.7, vdso.7, iconvconfig.8, ld.so.8: tstamp
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-27 02:44:07 +02:00
Michael Kerrisk 74ed673c59 mount.2: ERRORS: add EPERM error for case where a mount is locked
Refer the reader to mount_namespaces(7) for details.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-20 23:37:47 +02:00
Michael Kerrisk b3987057c6 umount.2: ERRORS: add EINVAL for case where mount is locked
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-20 23:37:47 +02:00
Michael Kerrisk 7127973e0c add_key.2, keyctl.2, request_key.2: Note that the "libkeyutils" package provides <keyutils.h>
See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=992377

Reported-by: Dominique Brazziel <dbrazziel@snet.net>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-18 11:45:20 +02:00
Michael Kerrisk 7b7252ed41 fanotify_mark.2: Revert cruft added in commit 717c3a7dcf
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-18 10:10:52 +02:00
Michael Kerrisk f99cea2c35 prctl.2: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-18 07:46:16 +02:00
Michael Kerrisk dae872dd27 readv.2, pthread_tryjoin_np.3, stailq.3, strlen.3, wcslen.3: Arrange .SH sections in correct order
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-18 07:38:15 +02:00
Michael Kerrisk 85a7ae7344 intro.2, mount_setattr.2, seccomp_unotify.2, fflush.3, pthread_mutex_consistent.3: Place SEE ALSO entries in correct order
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-18 07:30:52 +02:00
Michael Kerrisk 90879cbd20 chmod.2, chown.2, open.2, mkdir.2, mknod.2, readlink.2, stat.2, symlink.2, mkfifo.3, scandir.3, sem_wait.3: ERRORS: combine errors into a single alphabetic list
These pages split out extra errors for some APIs into a separate
list.  Probably, the pages are easier to ready if all errors are
combined into a single list.

Note that there still remain a few pages where the errors are
listed separately for different APIs. For the moment, it seems
best to leave those pages as is, since the error lists are
largely distinct in those pages.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-18 03:47:26 +02:00
Michael Kerrisk 97e2d8e602 arch_prctl.2, perf_event_open.2, pthread_tryjoin_np.3: ERRORS: correct alphabetic order
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-18 03:46:58 +02:00
Michael Kerrisk 18ce9c4a1b accept.2, access.2, getpriority.2, mlock.2: ERRORS: combine errors into a single list
These split out errors into separate lists (perhaps per API,
perhaps "may" vs "shall", perhaps "Linux-specific" vs
standard(??)), but there's no good reason to do this.  It makes
the error list harder to read, and is inconsistent with other
pages. So, combine the errors into a single list.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-18 01:38:14 +02:00
Michael Kerrisk 65f96dae10 shmop.2: wfix
Reported-by: Helge Kreutzmann <debian@helgefjell.de>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-18 01:38:14 +02:00
Michael Kerrisk 525a8b5461 fanotify_mark.2, link.2, mount.2, umount.2, proc.5, cgroups.7, fanotify.7: Terminology clean-up: "mount point" ==> "mount"
Many times, these pages use the terminology "mount point", where
"mount" would be better. A "mount point" is the location at which
a mount is attached. A "mount" is an association between a
filesystem and a mount point.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-18 01:38:14 +02:00
Michael Kerrisk 7ccfe34995 rename.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-18 01:38:14 +02:00
Michael Kerrisk 053132a882 statx.2: srcfix: semantic newlines
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-13 01:21:53 +02:00
NeilBrown 728009a474 statx.2: Document STATX_MNT_ID
Linux 5.8 adds STATX_MNT_ID and stx_mnt_id.
Add description to statx.2

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-13 01:21:53 +02:00
Michael Kerrisk 4c313d979d mount_setattr.2: srcfix: add note explaining Christian's use of -ve dirfd values
From email with Christian Brauner:

>>>>>>           int fd_tree = open_tree(-EBADF, source,
>>>>>>                        OPEN_TREE_CLONE | OPEN_TREE_CLOEXEC |
>>>>>>                        AT_EMPTY_PATH | (recursive ? AT_RECURSIVE : 0));
>>>>>
>>>>> ???
>>>>> What is the significance of -EBADF here? As far as I can tell, it
>>>>> is not meaningful to open_tree()?
>>>>
>>>> I always pass -EBADF for similar reasons to [2]. Feel free to just use -1.
>>>
>>> ????
>>> But here, both -EBADF and -1 seem to be wrong. This argument
>>> is a dirfd, and so should either be a file descriptor or the
>>> value AT_FDCWD, right?
>>
>> [1]: In this code "source" is expected to be absolute. If it's not
>>      absolute we should fail. This can be achieved by passing -1/-EBADF,
>>      afaict.
>
> D'oh! Okay. I hadn't considered that use case for an invalid dirfd.
> (And now I've done some adjustments to openat(2),which contains a
> rationale for the *at() functions.)
>
> So, now I understand your purpose, but still the code is obscure,
> since
>
> * You use a magic value (-EBADF) rather than (say) -1.
> * There's no explanation (comment about) of the fact that you want
>   to prevent relative pathnames.
>
> So, I've changed the code to use -1, not -EBADF, and I've added some
> comments to explain that the intent is to prevent relative pathnames.
> Okay?

Sounds good.

>
> But, there is still the meta question: what's the problem with using
> a relative pathname?

Nothing per se. Ok, you asked so it's your fault:
When writing programs I like to never use relative paths with AT_FDCWD
because. Because making assumptions about the current working directory
of the calling process is just too easy to get wrong; especially when
pivot_root() or chroot() are in play.
My absolut preference (joke intended) is to open a well-known starting
point with an absolute path to get a dirfd and then scope all future
operations beneath that dirfd. This already works with old-style
openat() and _very_ cautious programming but openat2() and its
resolve-flag space have made this **chef's kiss**.
If I can't operate based on a well-known dirfd I use absolute paths with
a -EBADF dirfd passed to *at() functions.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-13 01:21:46 +02:00
Michael Kerrisk 669d6302cb fanotify_mark.2: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-13 01:21:08 +02:00
Michael Kerrisk d53b1b1730 open.2: Add mount_setattr(2) to list of 'dirfd' APIs
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-13 01:21:08 +02:00
Michael Kerrisk af35474e4f mount_setattr.2: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-13 01:21:08 +02:00
Michael Kerrisk a3ade5b2b1 mount.2: SEE ALSO: add mount_setattr(2)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-13 01:21:01 +02:00
Michael Kerrisk 9e11604c6c mount_setattr.2: Further tweaks after feedback from Christian Brauner
Reported-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-12 07:33:53 +02:00
Michael Kerrisk 20e6e6ed79 mount_setattr.2: Clarify the description of "detached" mounts
From email:

>> Thanks. I made it "detached". Elsewhere, the page already explains
>> that a detached mount is one that:
>>
>>           must have been created by calling open_tree(2) with the
>>           OPEN_TREE_CLONE flag and it must not already have been
>>           visible in the filesystem.
>>
>> Which seems a fine explanation.
>>
>> ????
>> But, just a thought... "visible in the filesystem" seems not quite accurate.
>> What you really mean I guess is that it must not already have been
>> /visible in the filesystem hierarchy/previously mounted/something else/,
>> right?
I suppose that I should have clarified that my main problem was
that you were using the word "filesystem" in a way that I find
unconventional/ambiguous. I mean, I normally take the term
"filesystem" to be "a storage system for folding files".
Here, you are using "filesystem" to mean something else, what
I might call like "the single directory hierarchy" or "the
filesystem hierarchy" or "the list of mount points".

> A detached mount is created via the OPEN_TREE_CLONE flag. It is a
> separate new mount so "previously mounted" is not applicable.
> A detached mount is _related_ to what the MS_BIND flag gives you with
> mount(2). However, they differ conceptually and technically. A MS_BIND
> mount(2) is always visible in the fileystem when mount(2) returns, i.e.
> it is discoverable by regular path-lookup starting within the
> filesystem.
>
> However, a detached mount can be seen as a split of MS_BIND into two
> distinct steps:
> 1. fd_tree = open_tree(OPEN_TREE_CLONE): create a new mount
> 2. move_mount(fd_tree, <somewhere>):     attach the mount to the filesystem
>
> 1. and 2. together give you the equivalent of MS_BIND.
> In between 1. and 2. however the mount is detached. For the kernel
> "detached" means that an anonymous mount namespace is attached to it
> which doen't appear in proc and has a 0 sequence number (Technically,
> there's a bit of semantical argument to be made that "attached" and
> "detached" are ambiguous as they could also be taken to mean "does or
> does not have a parent mount". This ambiguity e.g. appears in
> do_move_mount(). That's why the kernel itself calls it an "anonymous
> mount". However, an OPEN_TREE_CLONE-detached mount of course doesn't
> have a parent mount so it works.).
>
> For userspace it's better to think of detached and attached in terms of
> visibility in the filesystem or in a mount namespace. That's more
> straightfoward, more relevant, and hits the target in 90% of the cases.
>
> However, the better and clearer picture is to say that a
> OPEN_TREE_CLONE-detached mount is a mount that has never been
> move_mount()ed. Which in turn can be defined as the detached mount has
> never been made visible in a mount namespace. Once that has happened the
> mount is irreversibly an attached mount.
>
> I keep thinking that maybe we should just say "anonymous mount"
> everywhere. So changing the wording to:
I'm not against the word "detached". To user space, I think it is a
little more meaningful than "anonymous". For the moment, I'll stay with
"detached", but if you insist on "anonymous", I'll probably change it.

> [...]
> EINVAL The mount that is to be ID mapped is not an anonymous mount;
> that is, the mount has already been visible in a mount namespace.
I like that text *a lot* better! Thanks very much for suggesting
wordings. It makes my life much easier.

I've made the text:

       EINVAL The mount that is to be ID mapped is not a detached
              mount; that is, the mount has not previously been
              visible in a mount namespace.

> [...]
> The mount must be an anonymous mount; that is, it must have been
> created by calling open_tree(2) with the OPEN_TREE_CLONE flag and it
> must not already have been visible in a mount namespace, i.e. it must
> not have been attached to the filesystem hierarchy with syscalls such
> as move_mount() syscall.
And that too! I've made the text:

       •  The mount must be a detached mount; that is, it must have
          been created by calling open_tree(2) with the
          OPEN_TREE_CLONE flag and it must not already have been
          visible in a mount namespace.  (To put things another way:
          the mount must not have been attached to the filesystem
          hierarchy with a system call such as move_mount(2).)

Reported-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-12 07:33:53 +02:00
Michael Kerrisk 45ea537cf2 mount_setattr.2: EXAMPLES: use -1 rather than -EBADF
From email with Christian Braner:

> [1]: In this code "source" is expected to be absolute. If it's not
>      absolute we should fail. This can be achieved by passing -1/-EBADF,
>      afaict.
D'oh! Okay. I hadn't considered that use case for an invalid dirfd.
(And now I've done some adjustments to openat(2),which contains a
rationale for the *at() functions.)

So, now I understand your purpose, but still the code is obscure,
since

* You use a magic value (-EBADF) rather than (say) -1.
* There's no explanation (comment about) of the fact that you want
  to prevent relative pathnames.

So, I've changed the code to use -1, not -EBADF, and I've added some
comments to explain that the intent is to prevent relative pathnames.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-12 07:33:53 +02:00
Michael Kerrisk 717c3a7dcf fanotify_mark.2, futimesat.2, mount_setattr.2, statx.2, symlink.2, mkfifo.3: Refer the reader to openat(2) for explanation of why 'dirfd' is useful
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-12 05:45:21 +02:00
Michael Kerrisk 3c39ce8598 fanotify_mark.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-12 05:45:21 +02:00
Michael Kerrisk 0a5c96dbc4 open.2: Minor tweaks to list of functions that take a dirfd argument
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-12 05:45:21 +02:00
Michael Kerrisk 9f4e736ad0 access.2, chmod.2, chown.2, execveat.2, futimesat.2, link.2, mkdir.2, mknod.2, mount_setattr.2, open.2, open_by_handle_at.2, readlink.2, rename.2, stat.2, statx.2, symlink.2, unlink.2, utimensat.2, mkfifo.3, scandir.3: Fix EBADF error description
Make the description of the EBADF error for invalid 'dirfd' more
uniform. In particular, note that the error only occurs when the
pathname is relative, and that it occurs when the 'dirfd' is
neither valid *nor* has the value AT_FDCWD.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-12 05:45:21 +02:00
Michael Kerrisk 38a350061e fanotify_mark.2: ERRORS: add missing EBADF error for invalid 'dirfd'
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-12 05:45:21 +02:00
Michael Kerrisk 2b7b1f385e open_by_handle_at.2: ERRORS: add missing EBADF error for invalid 'dirfd'
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-12 05:45:21 +02:00
Michael Kerrisk 5a9ebeba72 mount_setattr.2: Rename 'path' to 'pathname'
For consistency with other pages

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-12 05:45:21 +02:00
Michael Kerrisk 73434f4003 open.2: Explicitly describe the EBADF error that can occur with openat()
In particular, specifying an invalid file descriptor number
in 'dirfd' can be used as a check that 'pathname' is absolute.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-12 05:45:21 +02:00
Michael Kerrisk a9db6c1ba3 open.2: Clarify that openat()'s dirfd must be opened with O_RDONLY or O_PATH
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-12 05:45:21 +02:00
Michael Kerrisk 56dddcbad5 open.2: Reorder list of cases for 'dirfd' argument of openat()
In preparation for subsequent commits

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-12 05:45:21 +02:00
Michael Kerrisk 5241f3cce5 open.2: Minor reworking of the description of the 'dirfd' argument of openat()
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-12 05:45:16 +02:00
Michael Kerrisk 9f275af155 syscalls.2: Add quotactl_fd(); remove quotactl_path()
quotactl_path() was never wired up in Linux 5.13.
It was replaced instead by quotactl_fd(),

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-11 04:40:43 +02:00
Michael Kerrisk ed655e3c21 sigaction.2: Add kernel version for SA_UNSUPPORTED flag
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-11 03:31:52 +02:00
Michael Kerrisk 2d369e8e5d wait.2: ERRORS: document EAGAIN for waitid() on a PID file descriptor
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2021-08-11 02:01:46 +02:00