Commit Graph

20179 Commits

Author SHA1 Message Date
Michael Kerrisk ee81d7e418 namespaces.7: Include manual page references in the summary table of namespace types
Make the page more compact by removing the stub subsections that
list the manual pages for the namespace types. And while we're
here, add an explanation of the table columns.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-09 08:59:22 +02:00
Michael Kerrisk 4d75df3711 mount_namespaces.7: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-09 08:25:04 +02:00
Michael Kerrisk 19416046c5 mount_namespaces.7: Tweak discussion of "less privileged" mount namespace
Eric Biederman:

    I hate to nitpick, but I am going to say that when I read
    the text above the phrase "mount namespace of the process
    that created the new mount namespace" feels wrong.

    Either you use unshare(2) and the mount namespace of the
    process that created the mount namespace changes.

    Or you use clone(2) and you could argue it is the new child
    that created the mount namespace.

    Having a different mount namespace at the end of the
    creation operation feels like it makes your phrase confusing
    about what the starting mount namespace is.  I hate to use
    references that are ambiguous when things are changing.

Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-08 23:30:55 +02:00
Michael Kerrisk 534755eed9 mount_namespaces.7: Explain how a namespace's mount point list is initialized
Provide a more detailed explanation of the initialization of
the mount point list in a new mount namespace.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-08 22:51:59 +02:00
Michael Kerrisk 47b69a37cf pivot_root.2: srcfix: FIXME
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-08 21:29:28 +02:00
Michael Kerrisk 8f2a9129e6 pivot_root.2: Remove the term 'old_root'
Reid noted a confusion between 'old_root' (my attempt at a
shorthand for the old root point) and 'put_old. Eliminate the
confusion by replacing the shorthand with "old root mount point".

Reported-by: Reid Priedhorsky <reidpr@lanl.gov>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-08 20:57:55 +02:00
Michael Kerrisk 459fe99546 mount.2: Describe the concept of "parent mounts"
Reported-by: Reid Priedhorsky <reidpr@lanl.gov>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-08 17:26:51 +02:00
Michael Kerrisk e0e0ba7d01 mount.2: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-08 17:23:20 +02:00
Michael Kerrisk dd858bfd5e mount.2: Rework the text on mount namespaces a little
Eliminate the term "Per-process namespaces" and add a reference
to mount_namespaces(7).

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-08 16:44:58 +02:00
Michael Kerrisk 5d3bcce72d mount.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-08 16:44:58 +02:00
Michael Kerrisk 632940d96d mount.2: NOTES: add subsection heading for /proc/[pid]/{mounts,mountinfo}
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-08 16:44:58 +02:00
Michael Kerrisk ed425459c5 mount_namespaces.7: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-08 16:26:15 +02:00
Michael Kerrisk a0c9733194 mount_namespaces.7: Clarify description of "less privileged" mount namespaces
The current text talks about "parent mount namespaces", but there
is no such concept. As confirmed by Eric Biederman, what is mean
here is "the mount namespace this mount namespace started as a
copy of". So, this change writes up Eric's description in a more
detailed way.

Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-08 16:20:59 +02:00
Michael Kerrisk 93cc3b3827 pivot_root.2: Simplify pivot_root(".", ".") example
Eric Biederman notes that the change in commit f646ac88ef was
not strictly necessary for this example, since one of the already
documented requirements is that various mount points must not have
shared propagation, or else pivot_root() will fail. So, simplify
the example.

Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-07 14:02:42 +03:00
Michael Kerrisk a2fc45a9f8 mount_namespaces.7: It may be desirable to disable propagation after creating a namespace
After creating a new mount namespace, it may be desirable to
disable mount propagation. Give the reader a more explicit
hint about this.

Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-07 12:11:30 +03:00
Michael Kerrisk 0b6cf5d26e pthreads.7: Minor tweaks to Carlos O'Donell's patch
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-05 14:54:24 +03:00
Michael Kerrisk 50639a2a18 pthread_setcancelstate.3, pthreads.7: srcfix: wrap source lines at sentence boundaries
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-05 14:54:15 +03:00
Carlos O'Donell dbb01cbbdb pthread_setcancelstate.3, pthreads.7, signal-safety.7: Describe issues with cancellation points in signal handlers
In a recent conversation with Mathieu Desnoyers I was reminded
that we haven't written up anything about how deferred
cancellation and asynchronous signal handlers interact. Mathieu
ran into some of this behaviour and I promised to improve the
documentation in this area to point out the potential pitfall.

Thoughts?

8< --- 8< --- 8<
In pthread_setcancelstate.3, pthreads.7, and signal-safety.7 we
describe that if you have an asynchronous signal nesting over a
deferred cancellation region that any cancellation point in the
signal handler may trigger a cancellation that will behave
as-if it was an asynchronous cancellation. This asynchronous
cancellation may have unexpected effects on the consistency of
the application. Therefore care should be taken with asynchronous
signals and deferred cancellation.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-05 14:54:02 +03:00
Michael Kerrisk c6ed23c5da perf_event_open.2: SEE ALSO: add Documentation/admin-guide/perf-security.rst
Reported-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-05 11:30:42 +03:00
Michael Kerrisk 1ff5960b23 prctl.2: Clarify that PR_MCE_KILL_GET returns value via function result
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-02 07:20:45 +03:00
Michael Kerrisk 035a7bf179 prctl.2: wfix (for consistency)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-02 07:19:53 +03:00
Michael Kerrisk 7f5d84426c prctl.2: RETURN VALUE: add some missing entries
Note success return for PR_GET_SPECULATION_CTRL and PR_GET_FP_MODE.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-02 07:09:38 +03:00
Michael Kerrisk 1cea09b38b prctl.2: Clarify that PR_GET_SPECULATION_CTRL returns value as function result
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-02 06:56:50 +03:00
Michael Kerrisk f1bb579885 prctl.2: grfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-02 06:11:02 +03:00
Michael Kerrisk f1ba3ad272 prctl.2: wfix (for consistency with usage in rest of this page)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-02 06:07:52 +03:00
Michael Kerrisk 3946602978 prctl.2: Clarify that PR_GET_FP_MODE returns value as function result
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-02 06:05:50 +03:00
Michael Kerrisk 27f942adbc sched_setparam.2, pthread_mutexattr_init.3, pthread_mutexattr_setrobust.3, pthread_mutex_consistent.3, strtol.3, sched.7, uts_namespaces.7: SEE ALSO: correct list order
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-27 14:18:46 +02:00
Michael Kerrisk 1c6f266407 res_nclose.3: Add NEW link to resolver.3
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-27 14:18:46 +02:00
Michael Kerrisk c148832982 veth.4, persistent-keyring.7, process-keyring.7, session-keyring.7, thread-keyring.7, user-keyring.7, user-session-keyring.7: srcfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-27 14:18:46 +02:00
Michael Kerrisk 549597a85f close.2, execve.2, io_submit.2, prctl.2, write.2: Remove section number from references to function in its own page
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-27 14:18:46 +02:00
Michael Kerrisk 8f4f2de329 getlogin.3: grfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-27 14:18:46 +02:00
Michael Kerrisk 49a2a1052b copy_file_range.2, fanotify_mark.2, inotify_add_watch.2, ioctl_fideduperange.2, kcmp.2, prctl.2, get_robust_list.2, tkill.2, ttyname.3: ERRORS: correct alphabetical order
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-27 14:18:08 +02:00
Amir Goldstein 88e75e2c56 copy_file_range.2: Kernel v5.3 updates
Update with all the missing errors the syscall can return, the
behaviour the syscall should have w.r.t. to copies within single
files, etc.

[Amir] updates for final released version.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-27 13:26:03 +02:00
Michael Kerrisk 4985364098 epoll_wait.2: tfix
Reported-by: nilsocket <nilsocket@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-27 08:36:52 +02:00
Michael Kerrisk e72ae85154 strtok.3: Add portability note for strtok_r() '*saveptr' value
On some implementations, '*saveptr' must be NULL on first call
to strtok_r().

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-26 08:01:52 +02:00
Michael Kerrisk 85987f9818 strtok.3: The caller should not modify 'saveptr' between strtok_r() calls
Reported-by: eponymous alias <eponymousalias@yahoo.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-26 07:53:24 +02:00
Michael Kerrisk 362310a7bd signalfd.2: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-25 23:20:08 +02:00
Jakub Wilk bf421740d4 pivot_root.2: tfix
Remove duplicated words.

Signed-off-by: Jakub Wilk <jwilk@jwilk.net>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-25 20:41:48 +02:00
Michael Kerrisk d703afe9a6 sched_setaffinity.2: RETURN VALUE: sched_getaffinity() syscall differs from the wrapper
In RETURN VALUE, point reader at subsection noting that the return
value of the raw sched_setaffinity() system call differs from the
wrapper function in glibc.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-25 14:35:27 +02:00
Michael Kerrisk 618b1e7eca getauxval.3: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-24 22:02:22 +02:00
Michael Kerrisk be2e899b49 getauxval.3: srcfix: rewrap source lines
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-24 22:01:19 +02:00
Raphael Moreira Zinsly a81869d0e6 getauxval.3: Add new cache geometry entries
Add entries for the new cache geometry values of the auxiliary
vector that got included in the kernel.

Signed-off-by: Raphael Moreira Zinsly <rzinsly@linux.vnet.ibm.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-24 21:57:51 +02:00
Michael Kerrisk f3fdbe2812 open.2: tfix
Reported-by: Дилян Палаузов <dilyan.palauzov@aegee.org>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-24 12:16:07 +02:00
Michael Kerrisk b892d64f4f signalfd.2: Rewrite the text on epoll semantics
I also verified the behavior reported by Andrew Clayton
with the program below.

$ ./epoll_signalfd
PID of parent: 5661
PID of child:  5662
epoll_wait() returned 0
PID 5662: got signal 10
Successfully read signal, even though epoll_wait() didn't say FD was ready!

8x----8x----8x----8x----8x----8x----8x----8x----8x----8x----8x----8x----
/* epoll_signalfd.c */

#include <sys/signalfd.h>
#include <signal.h>
#include <sys/epoll.h>
#include <sys/wait.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \
                        } while (0)

static void
signalTest(int sfd, int epfd)
{
    struct signalfd_siginfo fdsi;
    struct epoll_event rev;
    int ready;
    ssize_t s;

    usleep(50000);
    ready = epoll_wait(epfd, &rev, 1, 0);
    if (ready == -1)
        errExit("epoll_wait");

    printf("epoll_wait() returned %d\n", ready);

    s = read(sfd, &fdsi, sizeof(struct signalfd_siginfo));
    if (s != sizeof(struct signalfd_siginfo))
        errExit("read");

    printf("PID %ld: got signal %d\n", (long) getpid(), fdsi.ssi_signo);

    if (ready == 0 && s > 0)
        printf("Successfully read signal, even though epoll_wait() "
                "didn't say FD was ready!\n");
}

int
main(int argc, char *argv[])
{
    struct epoll_event ev;
    sigset_t mask;
    int sfd, epfd;

    sigfillset(&mask);
    sigdelset(&mask, SIGINT);

    if (sigprocmask(SIG_BLOCK, &mask, NULL) == -1)
        errExit("sigprocmask");

    sfd = signalfd(-1, &mask, SFD_NONBLOCK);
    if (sfd == -1)
        errExit("signalfd");

    epfd = epoll_create(5);
    if (epfd == -1)
        errExit("epoll_create");

    ev.data.fd = sfd;
    ev.events = EPOLLIN;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, sfd, &ev) == -1)
        errExit("epoll_ctl");

    switch (fork()) {
    case -1:
        errExit("fork");
    case 0:
        printf("PID of child:  %ld\n", (long) getpid());
        raise(SIGUSR1);
        signalTest(sfd, epfd);
        break;
    default:
        printf("PID of parent: %ld\n", (long) getpid());
        wait(NULL);
        break;
    }

    exit(EXIT_SUCCESS);
}
8x----8x----8x----8x----8x----8x----8x----8x----8x----8x----8x----8x----

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 16:48:36 +02:00
Andrew Clayton e95f6bf482 signalfd.2: Note about interactions with epoll & fork
Using signalfd(2) with epoll(7) and fork(2) can lead to some head
scratching.

It seems that when a signalfd file descriptor is added to epoll
you will only get notifications for signals sent to the process
that added the file descriptor to epoll.

So if you have a signalfd fd registered with epoll and then call
fork(2), perhaps by way of daemon(3) for example. Then you will
find that you no longer get notifications for signals sent to the
newly forked process.

User kentonv on ycombinator[0] explained it thus

    "One place where the inconsistency gets weird is when you
     use signalfd with epoll. The epoll will flag events on the
     signalfd based on the process where the signalfd was
     registered with epoll, not the process where the epoll is
     being used. One case where this can be surprising is if you
     set up a signalfd and an epoll and then fork() for the
     purpose of daemonizing -- now you will find that your epoll
     mysteriously doesn't deliver any events for the signalfd
     despite the signalfd otherwise appearing to function as
     expected."

And another post from the same person[1].

And then there is this snippet from this kernel commit message[2]

    "If you share epoll fd which contains our sigfd with another
     process you should blame yourself. signalfd is "really
     special"."

So add a note to the man page that points this out where people
will hopefully find it sooner rather than later!

[0]: https://news.ycombinator.com/item?id=9564975
[1]: https://stackoverflow.com/questions/26701159/sending-signalfd-to-another-process/29751604#29751604
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d80e731ecab420ddcb79ee9d0ac427acbc187b4b

Signed-off-by: Andrew Clayton <andrew@digital-domain.net>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 15:57:21 +02:00
Michael Kerrisk 6c578de29d strtok.3: Correct description of use of 'saveptr' argument in strtok_r()
Reported-by: eponymous alias <eponymousalias@yahoo.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 15:57:11 +02:00
Michael Kerrisk 9d33e03b95 pivot_root.2: Explain why various mount points can't have shared propagation
Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:19 +02:00
Michael Kerrisk d4b2104ae5 pivot_root.2: Correct the list of mount points that can't be MS_SHARED
Eric Biederman noted that my list of directories that could not
have shared propagation was incorrect.  I had written that
new_root could not be shared; rather it should be: the parent of
the current root mount point.

Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:19 +02:00
Michael Kerrisk f646ac88ef pivot_root.2: Tweak pivot_root(".", ".") example
Quoting Eric Biederman:

    The concern from our conversation at the container
    mini-summit was that there is a pathology if in your initial
    mount namespace all of the mounts are marked MS_SHARED like
    systemd does (and is almost necessary if you are going to
    use mount propagation), that if new_root itself is MS_SHARED
    then unmounting the old_root could propagate.

    So I believe the desired sequence is:

    >>>            chdir(new_root);
    +++            mount("", ".", MS_SLAVE | MS_REC, NULL);
    >>>            pivot_root(".", ".");
    >>>            umount2(".", MNT_DETACH);

    The change to new new_root could be either MS_SLAVE or
    MS_PRIVATE.  So long as it is not MS_SHARED the mount won't
    propagate back to the parent mount namespace.

Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:19 +02:00
Michael Kerrisk 57bab66a92 pivot_root.2: pivot_root(".", ".") really is a thing
LXC uses this [1]. I tested, to double-check, and it works.

The fchdir() dance done by LXC is not needed though:

fchdir(old_root); umount(".", MNT_DETACH); fchdir(new_root);

As far as I can see, just the umount() is sufficient, since,
after pivot_root(), oldi_root is at the top of the stack
of mounts at "/" and thus (so long as CWD is at "/")
the umount will remove the mount at the top of the stack.
Eric Biederman confirmed my understanding by mail, and
Philipp Wendler verified my results by experiment.

[1] See the following commit in LXC:

    commit 2d489f9e87fa0cccd8a1762680a43eeff2fe1b6e
    Author: Serge Hallyn <serge.hallyn@ubuntu.com>
    Date:   Sat Sep 20 03:15:44 2014 +0000

        pivot_root: switch to a new mechanism (v2)

Helped-by: Eric W. Biederman <ebiederm@xmission.com>
Helped-by: Philipp Wendler <ml@philippwendler.de>
Helped-by: Aleksa Sarai <asarai@suse.de>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:19 +02:00