19886 Commits

Author SHA1 Message Date
Vince Weaver 34211ee3f2 perf_event_open.2: Fix wording in multiplexing description
Back in 2014 (37bee118ad) the text
describing when multiplexing happens was changed in a confusing way.
This is an attempt to clarify things a bit.

Signed-off-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-03 22:10:30 +01:00
Michael Kerrisk 07ca8b34a0 madvise.2: Minor tweaks to Michal Hocko's patch
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-01 09:53:19 +01:00
Michal Hocko 9bbc50e6e0 madvise.2: MADV_FREE clarify swapless behavior
Since 93e06c7a6453 ("mm: enable MADV_FREE for swapless system") we
handle MADV_FREE on a swapless system the same way as on a system
with swap available. Clarify that fact in the man page.

Reported-by: Niklas Hambüchen <mail@nh2.me>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-01 09:48:46 +01:00
Konst Mayer 081ec61f02 tcp.7: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-01 09:47:28 +01:00
Michael Kerrisk 7f11e32c39 accept.2, copy_file_range.2, eventfd.2, inotify_init.2, pipe.2, readahead.2, signalfd.2, socket.2, timerfd_create.2: Clarify the distinction between "file descriptor" and "file description"
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-01 09:42:01 +01:00
Michael Kerrisk 735e291284 eventfd.2: Move text noting that eventfd() creates a FD earlier in the page
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-01 09:30:07 +01:00
Michael Kerrisk 839d161f0f clone.2: srcfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-01 09:21:05 +01:00
Michael Kerrisk a202ed9396 ioctl_console.2, ctime.3: tfix
Reported-by: Anatoly Borodin <anatoly.borodin@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-27 18:28:31 +01:00
Dmitry V. Levin b29cd73f56 ptrace.2: Do not say that PTRACE_O_TRACESYSGOOD may not work
Remove the old statement that PTRACE_O_TRACESYSGOOD may not work
on all architectures.  As far as I can tell, all kernel code
properly tests PT_TRACESYSGOOD flag and sets the 7th bit in the
exit code passed to ptrace_notify().

Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-27 08:14:54 +01:00
Michael Kerrisk 4a5a783d8f prctl.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-24 20:54:48 +01:00
Michael Kerrisk a32c96b894 prctl.2: Explain the circumstances in which the parent-death signal is sent
To test the behavior documented by this patch, the following
demos employ the program shown at the foot of this commit message.

First, show that the pdeath signal is sent when the parent
terminates:

$ ./pdeath_signal 0 10 4
Parent (18595) about to sleep for 4 seconds
Child about to set PR_SET_PDEATHSIG
Child about to sleep
Parent (18595) terminating
*********** Child (18596) got signal; si_pid = 18595; si_uid = 1000
            Parent PID is now 1403
$ Child about to exit

But the signal is not sent if the parent terminates before the
child uses PR_SET_PDEATHSIG:

$ ./pdeath_signal 2 10  0
Parent (18707) about to sleep for 0 seconds
Parent (18707) terminating
Child about to sleep 2 seconds before setting PR_SET_PDEATHSIG
$ Child about to set PR_SET_PDEATHSIG
Child about to sleep
Child about to exit

Demonstrate that the pdeath signal is sent on termination of each
ancestor subreaper process:

$ ./pdeath_signal 2 10 3 7 6 5
18786 marked itself as a subreaper
18786 subreaper about to sleep 7 seconds
18787 marked itself as a subreaper
18787 subreaper about to sleep 6 seconds
18788 marked itself as a subreaper
18788 subreaper about to sleep 5 seconds
Parent (18789) about to sleep for 3 seconds
Child about to sleep 2 seconds before setting PR_SET_PDEATHSIG
Child about to set PR_SET_PDEATHSIG
Child about to sleep
Parent (18789) terminating
*********** Child (18790) got signal; si_pid = 18789; si_uid = 1000
            Parent PID is now 18788
18788 subreaper about to terminate
*********** Child (18790) got signal; si_pid = 18788; si_uid = 1000
            Parent PID is now 18787
18787 subreaper about to terminate
*********** Child (18790) got signal; si_pid = 18787; si_uid = 1000
            Parent PID is now 18786
18786 subreaper about to terminate
*********** Child (18790) got signal; si_pid = 18786; si_uid = 1000
            Parent PID is now 1403
$ Child about to exit

But in the case where some subreapers terminate before they
have a chance to adopt the child, the terminations of those
subreapers do not result in a signal for the child:

$ ./pdeath_signal 2 10 3 5 6 7
18836 marked itself as a subreaper
18836 subreaper about to sleep 5 seconds
18837 marked itself as a subreaper
18837 subreaper about to sleep 6 seconds
18838 marked itself as a subreaper
18838 subreaper about to sleep 7 seconds
Parent (18839) about to sleep for 3 seconds
Child about to sleep 2 seconds before setting PR_SET_PDEATHSIG
Child about to set PR_SET_PDEATHSIG
Child about to sleep
Parent (18839) terminating
*********** Child (18840) got signal; si_pid = 18839; si_uid = 1000
            Parent PID is now 18838
18836 subreaper about to terminate
$ 18837 subreaper about to terminate
18838 subreaper about to terminate
*********** Child (18840) got signal; si_pid = 18838; si_uid = 1000
            Parent PID is now 1403
Child about to exit

============================

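/* pdeath_signal.c */

#include <sys/prctl.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \
                        } while (0)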

static void
handler(int sig, siginfo_t *si, void *ucontext)
{
    printf("*********** Child (%ld) got signal; si_pid = %d; si_uid = %d\n",
            (long) getpid(), si->si_pid, si->si_uid);
    printf("            Parent PID is now %ld\n", (long) getppid());
}

int
main(int argc, char *argv[])
{
    struct sigaction sa;
    int childPreSleep, childPostSleep = 0, parentSleep = 0;

    if (argc < 2) {
        fprintf(stderr, "Usage: %s child-pre-sleep "
                "[child-post-sleep [parent-sleep [subreaper-sleep...]]]\n",
                argv[0]);
        exit(EXIT_FAILURE);
    }

    childPreSleep = atoi(argv[1]);
    if (argc > 2)
        childPostSleep = atoi(argv[2]);
    if (argc > 3)
        parentSleep = atoi(argv[3]);

    /* Optionally create a series of subreapers */

    if (argc > 4) {
        for (int sr = 4; sr < argc; sr++) {
            if (prctl(PR_SET_CHILD_SUBREAPER, 1) == -1)
                errExit("prctl");
            printf("%ld marked itself as a subreaper\n", (long) getpid());
            switch (fork()) {
            case -1:
                errExit("fork");
            case 0:
                break;
            default:
                printf("%ld subreaper about to sleep %s seconds\n",
                        (long) getpid(), argv[sr]);
                sleep(atoi(argv[sr]));
                printf("%ld subreaper about to terminate\n", (long) getpid());
                exit(EXIT_SUCCESS);
            }
        }
    }

    switch (fork()) {
    case -1:
        errExit("fork");

    case 0:
        sa.sa_flags = SA_SIGINFO;
        sigemptyset(&sa.sa_mask);
        sa.sa_sigaction = handler;
        if (sigaction(SIGUSR1, &sa, NULL) == -1)
            errExit("sigaction");

        if (childPreSleep > 0) {
            printf("Child about to sleep %d seconds before setting "
                    "PR_SET_PDEATHSIG\n", childPreSleep);
            sleep(childPreSleep);
        }

        printf("Child about to set PR_SET_PDEATHSIG\n");
        if (prctl(PR_SET_PDEATHSIG, SIGUSR1) == -1)
            errExit("prctl");

        printf("Child about to sleep\n");
        for (int j = 0; j < childPostSleep; j++)
            sleep(1);

        printf("Child about to exit\n");
        exit(EXIT_SUCCESS);

    default:
        printf("Parent (%ld) about to sleep for %d seconds\n",
                (long) getpid(), parentSleep);
        sleep(parentSleep);
        printf("Parent (%ld) terminating\n", (long) getpid());
        exit(EXIT_SUCCESS);
    }
}

Reported-by: Jann Horn <jann@thejh.net>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-24 13:00:52 +01:00
Michael Kerrisk 29b249db56 prctl.2: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-24 12:44:27 +01:00
Michael Kerrisk fdda93639e prctl.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-24 11:25:28 +01:00
Michael Kerrisk e256205a55 prctl.2: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-24 11:23:55 +01:00
Michael Kerrisk 300a9c78f3 prctl.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-24 11:22:47 +01:00
Michael Kerrisk a09b5995c3 prctl.2: Add additional info on PR_SET_PDEATHSIG
The signal is process directed and the siginfo_t->si_pid
field contains the PID of the terminating parent.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-24 11:20:09 +01:00
Michael Kerrisk 910b068989 prctl.2: Rework the PR_SET_PDEATHSIG description a little, for easier readability
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-24 10:47:21 +01:00
Michael Kerrisk c5236575ca prctl.2: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-24 10:38:07 +01:00
Jann Horn c62b945324 ptrace.2: BUGS: ptrace() may set errno to zero
ptrace() with requests PTRACE_PEEKTEXT, PTRACE_PEEKDATA and
PTRACE_PEEKUSER can set errno to zero. AFAICS this is for a good
reason (so that you can tell the difference between a successful
PEEK with a result of -1 and a failed PEEK, even if you forget to
clear errno yourself), but it technically violates the rules
described in the errno.3 manpage.

glibc snippet from sysdeps/unix/sysv/linux/ptrace.c:

  res = INLINE_SYSCALL (ptrace, 4, request, pid, addr, data);
  if (res >= 0 && request > 0 && request < 4)
    {
      __set_errno (0);
      return ret;
    }

reproducer:

$ cat ptrace_test.c
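#include <sys/prctl.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <err.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

char foobar_data[4] = "ABCD";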
int main(void) {
  pid_t child = fork();
  if (child == -1) err(1, "fork");
  if (child == 0) {
    if (prctl(PR_SET_PDEATHSIG, SIGKILL)) err(1, "prctl");
    while (1) sleep(1);
  }
  int status;
  if (ptrace(PTRACE_ATTACH, child, NULL, NULL)) err(1, "attach");
  if (waitpid(child, &status, 0) != child) err(1, "wait");
  errno = EINVAL;
  unsigned int res = ptrace(PTRACE_PEEKDATA, child, foobar_data, NULL);
  printf("errno after PEEKDATA: %d\n", errno);
  printf("PEEKDATA result: 0x%x\n", res);
}
$ gcc -o ptrace_test ptrace_test.c -Wall
$ ./ptrace_test
errno after PEEKDATA: 0
PEEKDATA result: 0x44434241

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-24 08:16:03 +01:00
Jann Horn d6868c69b3 clone.2: Pending CLONE_NEWPID prevents thread creation
See copy_process() in kernel/fork.c:

	if (clone_flags & CLONE_THREAD) {
		if ((clone_flags & (CLONE_NEWUSER | CLONE_NEWPID)) ||
		    (task_active_pid_ns(current) !=
				current->nsproxy->pid_ns_for_children))
			return ERR_PTR(-EINVAL);
	}

current->nsproxy->pid_ns_for_children is where unshare(CLONE_NEWPID)
stashes the pending namespace.

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-24 08:13:11 +01:00
Anthony Iliopoulos 6684e3e4ff fanotify.7: wfix
Use "FAN_OPEN_PERM" consistently rather than "FAN_PERM_OPEN".

Signed-off-by: Anthony Iliopoulos <ailiopoulos@suse.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-24 08:04:42 +01:00
Michael Kerrisk 63e59e0d31 system.3: Note that system() can fail for the same reasons as fork(2)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-24 08:02:48 +01:00
Michael Kerrisk 0cb0dd1c5d system.3: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-24 08:00:34 +01:00
Arkadiusz Drabczyk 80d274454b system.3: Mention that 'errno' is set on error
Corresponding system.3p already mentions that.
Tested with glibc and musl.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-24 07:59:58 +01:00
Michael Kerrisk 2fc546f9bf proc.5: Minor clean-ups for Alan Jenkins' patch
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-20 14:35:49 +01:00
Michael Kerrisk a15ad24df8 proc.5: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-20 14:32:42 +01:00
Alan Jenkins bfe9256a15 proc.5: Vmalloc information is no longer calculated (Linux 4.4)
See Linux commit a5ad88ce8c7fae7ddc72ee49a11a75aa837788e0,
"mm: get rid of 'vmalloc_info' from /proc/meminfo".

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-20 14:31:21 +01:00
Michael Kerrisk 4f1a13fe85 pid_namespaces.7: Clarify the semantics for the adoption of orphaned processes
Because of setns() semantics, the parent of a process may reside
in the outer PID namespace. If that parent terminates, then the
child is adopted by the "init" in the outer PID namespace (rather
than the "init" of the PID namespace of the child).

Thus, in a scenario such as the following, if process M
terminates, P is adopted by the init process in the initial
PID namespace, and if P terminates, Q is adopted by the init
process in the inner PID namespace.

    +---------------------------------------------+
    | Initial PID NS                              |
    |                           +---------------+ |
    |  +-+                      | inner PID NS  | |
    |  |1|                      |               | |
    |  +-+                      |    +-+        | |
    |                           |    |1|        | |
    |                           |    +-+        | |
    |                           |               | |
    |  +-+   setns(), fork()    |    +-+        | |
    |  |M|----------------------+--> |P|        | |
    |  +-+                      |    +-+        | |
    |                           |     | fork()  | |
    |                           |     v         | |
    |                           |    +-+        | |
    |                           |    |Q|        | |
    |                           |    +-+        | |
    |                           +---------------+ |
    +---------------------------------------------+

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-19 16:55:50 +01:00
Michael Kerrisk d6bec36eca clone.2, prctl.2, st.4, proc.5: Change references to '2.6.0-test*' series kernels to just '2.6.0'
The extra detail of noting which 2.6.0-test release added a
particular feature has little value these days,
and is likely to confuse some readers who don't know
(and probably don't care) about the historical details.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-19 13:09:55 +01:00
Michael Kerrisk 44645ac4db getgroups.2: Note that a process can drop all groups with: setgroups(0, NULL)
Checking the FreeBSD source code, there's explicit support for
this to accommodate non-BSD systems (such as Linux).

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-19 13:09:46 +01:00
Michael Kerrisk 1fa9fdb1e9 signal.7: Unify signal lists into a signal table that embeds standards info
Having the signals listed in three different tables reduces
readability, and would require more table splits if future
standards specify other signals.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-17 10:17:39 +01:00
Michael Kerrisk 6043ed9d54 signal.7: Insert standards info into tables
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-17 10:17:39 +01:00
Michael Kerrisk 9a10a14487 signal.7: Place signal numbers in a separate table
The current tables of signal information are unwieldy,
as they try to cram in too much information.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-17 10:17:39 +01:00
Michael Kerrisk bdbc9b4475 signal.7: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-17 08:59:02 +01:00
Benjamin Peterson 915c4ba36f futex.2: Make the example use C11 atomics rather than GCC builtins
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-17 08:21:51 +01:00
Michael Kerrisk 1605ddac8f getaddrinfo.3: Fix off-by-one error in example client program
Reported-by: Eric Sanchis <eric.sanchis@iut-rodez.fr>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-17 08:11:07 +01:00
Michael Kerrisk da3ed81b42 pthread_rwlockattr_setkind_np.3: tfix
Reported-by: G. Branden Robinson <g.branden.robinson@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-17 07:41:18 +01:00
Carlos O'Donell 0d255e74c0 pthread_rwlockattr_setkind_np.3: Remove bug notes
The notes in pthread_rwlockattr_setkind_np.3 imply there is a bug
in glibc's implementation of PTHREAD_RWLOCK_PREFER_WRITER_NP (a
non-portable constant anyway), but this is not true. The
implementation of PTHREAD_RWLOCK_PREFER_WRITER_NP is made almost
impossible by the POSIX standard requirement that reader locks be
allowed to be recursive, and that requirement makes writer
preference deadlock without an impossibly complex requirement that
we track all reader locks. Therefore the only sensible solution
was to add PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP and
disallow recursive reader locks if you want writer preference.

This patch removes the bug description and documents the current
state and recommendations for glibc. I have also updated bug 7057
with this information, answering Steven Munroe's almost 10-year-old
question :-) I hope Steven is enjoying his much-earned
retirement.

Should we move the glibc discussion to some footnote? Some libc
may be able to implement the requirement to avoid deadlocks in the
future, but I doubt it (fundamental CS stuff).

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-17 07:35:53 +01:00
Mike Rapoport a2463bae6f ioctl_userfaultfd.2, madvise.2, memfd_create.2, migrate_pages.2, mmap.2, shmget.2, subpage_prot.2, userfaultfd.2, malloc.3, proc.5, sysfs.5, tmpfs.5: Update paths for in-kernel memory management documentation files
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-17 07:29:53 +01:00
Michael Kerrisk d893df00d9 capabilities.7: Update URL for libcap tarballs
The previous location does not seem to be getting updated.
(For example, at the time of this commit, libcap-2.26
had been out for two months, but was not present at
http://www.kernel.org/pub/linux/libs/security/linux-privs.)

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-17 07:26:22 +01:00
Michael Kerrisk cf0866501d prctl.2: Note libcap(3) APIs for operating on ambient capability set
(However, the libcap APIs do not yet seem to have
manual pages...)

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-16 21:45:10 +01:00
Michael Kerrisk d9a0d1d7b7 prctl.2: Mention libcap APIs for operating on capability bounding set
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-16 21:32:45 +01:00
Michael Kerrisk 6a1634dc09 syscalls.2: Update syscall list for Linux 4.18
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-11 08:07:35 +01:00
Michael Kerrisk 35f2e598f0 system.3: Use '(char *) NULL' rather than '(char *) 0'
Reported-by: Jonny Grant <jg@jguk.org>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-10 07:15:40 +01:00
Anthony Iliopoulos 99de80c58b ioctl_userfaultfd.2, userfaultfd.2: wfix
Use "UFFDIO_ZEROPAGE" consistently rather than "UFFDIO_ZERO".

Signed-off-by: Anthony Iliopoulos <ailiopoulos@suse.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-10 07:12:40 +01:00
Michael Kerrisk 4f5bbd6115 system.3: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-09 23:17:15 +01:00
Michael Kerrisk f80fdeaf61 system.3: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-09 23:14:27 +01:00
Jakub Wilk b784b9d50f user_namespaces.7: tfix
Signed-off-by: Jakub Wilk <jwilk@jwilk.net>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-09 16:02:07 +01:00
Michael Kerrisk 52fc743c1b pivot_root.2: Minor fixes to Elvira Khabirova's patch
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-09 08:54:20 +01:00
Elvira Khabirova 82320f4201 pivot_root.2: Explain the initramfs case and point to switch_root(8).
Based on text from Documentation/filesystems/ramfs-rootfs-initramfs.txt.

Signed-off-by: Elvira Khabirova <lineprinter@altlinux.org>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-09 08:52:40 +01:00