Commit Graph

19813 Commits

Author SHA1 Message Date
Michael Kerrisk d40e0bfc27 open.2: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-01-10 06:34:38 +13:00
Michael Kerrisk 319e9b31ed open.2: Minor fixes to Eugene's patch
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-01-10 06:32:39 +13:00
Eugene Syromiatnikov 8132c11589 open.2: Document FASYNC usage in Linux UAPI headers
Linux's <asm/fcntl.h> defines FASYNC instead of O_ASYNC; document
this peculiarity.

Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-01-10 06:30:47 +13:00
Michael Kerrisk ecb110e846 rename.2: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-01-10 06:25:18 +13:00
Eugene Syromiatnikov 5fc33f3827 rename.2: Some additional notes regarding RENAME_WHITEOUT
Add a note regarding other implementations of whiteout inodes
and update filesystem support information.

Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-01-10 06:24:50 +13:00
Michael Kerrisk b0f5ce2757 getgid.2, getpid.2, getuid.2, pipe.2: Remove mention of other syscalls that use second retval register
This information is already summarized in syscall(2), so there's
no need to repeat it in each page.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-01-10 06:21:17 +13:00
Michael Kerrisk 127f815c99 syscall.2: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-01-10 06:18:39 +13:00
Michael Kerrisk 17eb18e79a syscall.2: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-01-10 06:18:39 +13:00
Michael Kerrisk 0e80287f28 syscall.2: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-01-10 06:18:39 +13:00
Michael Kerrisk 7548a84a6c syscall.2: Rework table to render within 80 columns
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-01-10 06:18:39 +13:00
Michael Kerrisk 03936fb4ce getgid.2, getpid.2, getuid.2, pipe.2, syscall.2: Minor tweaks to Eugene's patch
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-01-10 06:18:39 +13:00
Eugene Syromiatnikov 70ea1968cc getgid.2, getpid.2, getuid.2, pipe.2, syscall.2: Describe 2nd return value peculiarity
Some architectures (ab)use second return value register for additional
return value in some system calls. Let's describe this.

Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-01-10 05:46:35 +13:00
Eugene Syromiatnikov f3f3ab82ee syscall.2: ffix
Add missing .RE.

Fixes: 2ad7b4c46c ("syscall.2: Elaborate x32 ABI specifics")
Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-01-09 16:58:31 +13:00
Michael Kerrisk b5986acdcc errno.3: Mention that errno(1) is part of the 'moreutils' package
Reported-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-01-05 21:21:02 +01:00
Michael Kerrisk 928c3e7c95 unix.7: wfix
Reported-by: Felipe Gasper <felipe@felipegasper.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-23 19:09:33 +01:00
Michael Kerrisk 421d34c632 unix.7: Clarify that SO_PASSCRED behavior
Clarify that SO_PASSCRED results in SCM_CREDENTIALS data in each
subsequently received message.

See https://bugzilla.kernel.org/show_bug.cgi?id=201805

Reported-by: Felipe Gasper <felipe@felipegasper.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-23 19:09:33 +01:00
Michael Kerrisk 3d3cddde94 unix.7: Rework SO_PEERCRED text for greater clarity
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-23 19:09:33 +01:00
Michael Kerrisk 88fedfa061 unix.7: Explicitly note that SO_PASSCRED provides SCM_CREDENTIALS messages
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-23 19:09:33 +01:00
Michael Kerrisk c02ed554e9 unix.7: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-23 19:09:33 +01:00
Michael Kerrisk 744c8fa8d2 unix.7: Improve wording describing socket option argument/return values
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-23 19:09:33 +01:00
Michael Kerrisk 43e3c5518b select.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-23 19:09:33 +01:00
Michael Kerrisk 3d335319e7 write.2: RETURN VALUE: clarify details of partial write and
As reported by Nadav Har'El in
https://bugzilla.kernel.org/show_bug.cgi?id=197961

    The write(2) manual page has this paragraph:

    "On  success,  the  number  of bytes written is returned
    (zero indicates nothing was written).  It is not an error
    if  this  number  is  smaller than the number of bytes
    requested; this may happen for example because the disk
    device was filled.  See also NOTES."

    I find a few problems with this paragraph:

    1. It's not clear what "See also NOTES." refers to (does it
       refer to anything?). What in the NOTES is relevant here?

    2. The paragraph seems to suggest that write(2) of a
       non-empty buffer may sometimes return even 0 in case of an
       error like the device being filled. I think this is wrong
       - if there was an error after already writing some number
       of bytes, this non-zero number is returned. But if there's
       an error before writing any bytes, -1 will be returned
       (and the error reason in errno) - 0 will not be returned
       unless the given count is 0 (that case is explained in the
       following paragraph).

    3. The paragraph doesn't explain what a user should do
       after a short write (i.e., write(2) returning less than
       count). How would the user know why there was an error, or
       if there even was one? I think users should be told what
       to do next because this information is part of how to use
       this API correctly. I think users should be told to retry
       the rest of the write (i.e., write(fd, buf+ret, count-ret)
       and this will either succeed in writing some more data if
       the error reason was solved, or the second write will
       return -1 and the error reason in errno.

Reported-by: Nadav Har'El <nyh@math.technion.ac.il>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-23 19:09:33 +01:00
Michael Kerrisk acac1139a6 getxattr.2, removexattr.2, setxattr.2: ERRORS: replace ENOATTR with ENODATA
ENOATTR is not a standard error code, but rather one that is
defined in 'libattr' as a synonym for ENODATA. The manual pages
should use the error code actually returned by the kernel APIs.

See also https://bugzilla.kernel.org/show_bug.cgi?id=201995

Reported-by: Enrico Scholz <enrico.scholz@sigma-chemnitz.de>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-23 19:09:18 +01:00
Radostin Stoyanov 69fc6c6761 namespaces.7: tfix
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-18 21:09:16 +01:00
Michael Kerrisk d70a2e571d mount.2: wfix (clarify effect of unbindable propagation type)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-14 11:35:40 +01:00
Michael Kerrisk adab7ac810 proc.5: Refer to mount(2) for explanation of mount vs superblock options
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-14 09:11:29 +01:00
snyh 2e75672790 syscall.2: Fix wrong retval register number in alpha architecture
alpha use v0 e.g. $0 as the return value register both in
syscall ABI and C ABI.

see also
https://github.com/torvalds/linux/blob/master/arch/alpha/kernel/entry.S#L479

The normal Alpha C ABI use a0~a5 to pass arguments and use v0 as
the return value register. See here
https://www2.cs.arizona.edu/projects/alto/Doc/local/alpha.register.html

The syscall ABI use v0 as the trap number, a0~a5 to pass arguments
and use a3 as a indicator (bool type) whether has a error occurred.

We can also see the libc's syscall wrapper implements at
https://code.woboq.org/userspace/glibc/sysdeps/unix/sysv/linux/alpha/syscall.S.html
The v0 is the normal used as return register, and we can see the
return processing doesn't do anything about a0 which is the wrong
register of currently syscall(2) description.

p.s. I found this wrong description because I'm porting Go gc to
a new CPU architecture which is similar to Alpha, And I use the
wrong register at first, then I have inspect the kernel code and
objdump to ensure the right syscall ABI.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-08 14:44:15 +01:00
Vince Weaver 34211ee3f2 perf_event_open.2: Fix wording in multiplexing description
Back in 2014 (37bee118ad) the text
describing when multiplexing happens was changed in a confusing way.
This is an attempt to clarify things a bit.

Signed-off-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-03 22:10:30 +01:00
Michael Kerrisk 07ca8b34a0 madvise.2: Minor tweaks to Michal Hocko's patch
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-01 09:53:19 +01:00
Michal Hocko 9bbc50e6e0 madvise.2: MADV_FREE clarify swapless behavior
Since 93e06c7a6453 ("mm: enable MADV_FREE for swapless system") we
handle MADV_FREE on a swapless system the same way as with the
swap available. Clarify that fact in the man page.

Reported-by: Niklas Hambüchen <mail@nh2.me>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-01 09:48:46 +01:00
Konst Mayer 081ec61f02 tcp.7: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-01 09:47:28 +01:00
Michael Kerrisk 7f11e32c39 accept.2, copy_file_range.2, eventfd.2, inotify_init.2, pipe.2, readahead.2, signalfd.2, socket.2, timerfd_create.2: Clarify the distinction between "file descriptor" and "file description"
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-01 09:42:01 +01:00
Michael Kerrisk 735e291284 eventfd.2: Move text noting that eventfd() creates a FD earlier in the page
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-01 09:30:07 +01:00
Michael Kerrisk 839d161f0f clone.2: srcfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-01 09:21:05 +01:00
Michael Kerrisk a202ed9396 ioctl_console.2, ctime.3: tfix
Reported-by: Anatoly Borodin <anatoly.borodin@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-27 18:28:31 +01:00
Dmitry V. Levin b29cd73f56 ptrace.2: Do not say that PTRACE_O_TRACESYSGOOD may not work
Remove the old statement that PTRACE_O_TRACESYSGOOD may not work
on all architectures.  As far as I can tell, all kernel code
properly tests PT_TRACESYSGOOD flag and sets the 7th bit in the
exit code passed to ptrace_notify().

Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-27 08:14:54 +01:00
Michael Kerrisk 4a5a783d8f prctl.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-24 20:54:48 +01:00
Michael Kerrisk a32c96b894 prctl.2: Explain the circumstances in which the parent-death signal is sent
To test the behavior documented by this patch, the following
demos employ the program shown at the foot of this commit message.

First, show that the pdeath signal is sent when the parent
terminates:

$ ./pdeath_signal 0 10 4
Parent (18595) about to sleep for 4 seconds
Child about to set PR_SET_PDEATHSIG
Child about to sleep
Parent (18595) terminating
*********** Child (18596) got signal; si_pid = 18595; si_uid = 1000
            Parent PID is now 1403
$ Child about to exit

But the signal is not sent if the parent terminates before the
child uses PR_SET_PDEATHSIG:

$ ./pdeath_signal 2 10  0
Parent (18707) about to sleep for 0 seconds
Parent (18707) terminating
Child about to sleep 2 seconds before setting PR_SET_PDEATHSIG
$ Child about to set PR_SET_PDEATHSIG
Child about to sleep
Child about to exit

Demonstrate that the pdeath signal is sent on termination of each
ancestor subreaper process:

$ ./pdeath_signal 2 10 3 7 6 5
18786 marked itself as a subreaper
18786 subreaper about to sleep 7 seconds
18787 marked itself as a subreaper
18787 subreaper about to sleep 6 seconds
18788 marked itself as a subreaper
18788 subreaper about to sleep 5 seconds
Parent (18789) about to sleep for 3 seconds
Child about to sleep 2 seconds before setting PR_SET_PDEATHSIG
Child about to set PR_SET_PDEATHSIG
Child about to sleep
Parent (18789) terminating
*********** Child (18790) got signal; si_pid = 18789; si_uid = 1000
            Parent PID is now 18788
18788 subreaper about to terminate
*********** Child (18790) got signal; si_pid = 18788; si_uid = 1000
            Parent PID is now 18787
18787 subreaper about to terminate
*********** Child (18790) got signal; si_pid = 18787; si_uid = 1000
            Parent PID is now 18786
18786 subreaper about to terminate
*********** Child (18790) got signal; si_pid = 18786; si_uid = 1000
            Parent PID is now 1403
$ Child about to exit

But in the case where some subreapers terminate before they
have a chance to adopt the child, the terminations of those
subreapers do not result in a signal for the child:

$ ./pdeath_signal 2 10 3 5 6 7
18836 marked itself as a subreaper
18836 subreaper about to sleep 5 seconds
18837 marked itself as a subreaper
18837 subreaper about to sleep 6 seconds
18838 marked itself as a subreaper
18838 subreaper about to sleep 7 seconds
Parent (18839) about to sleep for 3 seconds
Child about to sleep 2 seconds before setting PR_SET_PDEATHSIG
Child about to set PR_SET_PDEATHSIG
Child about to sleep
Parent (18839) terminating
*********** Child (18840) got signal; si_pid = 18839; si_uid = 1000
            Parent PID is now 18838
18836 subreaper about to terminate
$ 18837 subreaper about to terminate
18838 subreaper about to terminate
*********** Child (18840) got signal; si_pid = 18838; si_uid = 1000
            Parent PID is now 1403
Child about to exit

============================

/* pdeath_signal.c */

                        } while (0)

static void
handler(int sig, siginfo_t *si, void *ucontext)
{
    printf("*********** Child (%ld) got signal; si_pid = %d; si_uid = %d\n",
            (long) getpid(), si->si_pid, si->si_uid);
    printf("            Parent PID is now %ld\n", (long) getppid());
}

int
main(int argc, char *argv[])
{
    struct sigaction sa;
    int childPreSleep, childPostSleep, parentSleep;

    if (argc < 2) {
        fprintf(stderr, "Usage: %s child-pre-sleep "
                "[child-post-sleep [parent-sleep [subreaper-sleep...]]]\n",
                argv[0]);
        exit(EXIT_FAILURE);
    }

    childPreSleep = atoi(argv[1]);
    if (argc > 2)
        childPostSleep = atoi(argv[2]);
    if (argc > 3)
        parentSleep = atoi(argv[3]);

    /* Optionally create a series of subreapers */

    if (argc > 4) {
        for (int sr = 4; sr < argc; sr++) {
            if (prctl(PR_SET_CHILD_SUBREAPER, 1) == -1)
                errExit("prctl");
            printf("%ld marked itself as a subreaper\n", (long) getpid());
            switch (fork()) {
            case -1:
                errExit("fork");
            case 0:
                break;
            default:
                printf("%ld subreaper about to sleep %s seconds\n",
                        (long) getpid(), argv[sr]);
                sleep(atoi(argv[sr]));
                printf("%ld subreaper about to terminate\n", (long) getpid());
                exit(EXIT_SUCCESS);
            }
        }
    }

    switch (fork()) {
    case -1:
        errExit("fork");

    case 0:
        sa.sa_flags = SA_SIGINFO;
        sigemptyset(&sa.sa_mask);
        sa.sa_sigaction = handler;
        if (sigaction(SIGUSR1, &sa, NULL) == -1)
            errExit("sigaction");

        if (childPreSleep > 0) {
            printf("Child about to sleep %d seconds before setting "
                    "PR_SET_PDEATHSIG\n", childPreSleep);
            sleep(childPreSleep);
        }

        printf("Child about to set PR_SET_PDEATHSIG\n");
        if (prctl(PR_SET_PDEATHSIG, SIGUSR1) == -1)
            errExit("prctl");

        printf("Child about to sleep\n");
        for (int j = 0; j < childPostSleep; j++)
            sleep(1);

        printf("Child about to exit\n");
        exit(EXIT_SUCCESS);

    default:
        printf("Parent (%ld) about to sleep for %d seconds\n",
                (long) getpid(), parentSleep);
        sleep(parentSleep);
        printf("Parent (%ld) terminating\n", (long) getpid());
        exit(EXIT_SUCCESS);
    }
}

Reported-by: Jann Horn <jann@thejh.net>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-24 13:00:52 +01:00
Michael Kerrisk 29b249db56 prctl.2: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-24 12:44:27 +01:00
Michael Kerrisk fdda93639e prctl.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-24 11:25:28 +01:00
Michael Kerrisk e256205a55 prctl.2: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-24 11:23:55 +01:00
Michael Kerrisk 300a9c78f3 prctl.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-24 11:22:47 +01:00
Michael Kerrisk a09b5995c3 prctl.2: Add additional info on PR_SET_PDEATHSIG
The signal is process directed and the siginfo_t->si_pid
filed contains the PID of the terminating parent.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-24 11:20:09 +01:00
Michael Kerrisk 910b068989 prctl.2: Rework the PR_SET_PDEATHSIG description a little, for easier readability
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-24 10:47:21 +01:00
Michael Kerrisk c5236575ca prctl.2: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-24 10:38:07 +01:00
Jann Horn c62b945324 ptrace.2: BUGS: ptrace() may set errno to zero
ptrace() with requests PTRACE_PEEKTEXT, PTRACE_PEEKDATA and
PTRACE_PEEKUSER can set errno to zero. AFAICS this is for a good
reason (so that you can tell the difference between a successful
PEEK with a result of -1 and a failed PEEK, even if you forget to
clear errno yourself), but it technically violates the rules
described in the errno.3 manpage.

glibc snippet from sysdeps/unix/sysv/linux/ptrace.c:

  res = INLINE_SYSCALL (ptrace, 4, request, pid, addr, data);
  if (res >= 0 && request > 0 && request < 4)
    {
      __set_errno (0);
      return ret;
    }

reproducer:

$ cat ptrace_test.c
char foobar_data[4] = "ABCD";
int main(void) {
  pid_t child = fork();
  if (child == -1) err(1, "fork");
  if (child == 0) {
    if (prctl(PR_SET_PDEATHSIG, SIGKILL)) err(1, "prctl");
    while (1) sleep(1);
  }
  int status;
  if (ptrace(PTRACE_ATTACH, child, NULL, NULL)) err(1, "attach");
  if (waitpid(child, &status, 0) != child) err(1, "wait");
  errno = EINVAL;
  unsigned int res = ptrace(PTRACE_PEEKDATA, child, foobar_data, NULL);
  printf("errno after PEEKDATA: %d\n", errno);
  printf("PEEKDATA result: 0x%x\n", res);
}
$ gcc -o ptrace_test ptrace_test.c -Wall
$ ./ptrace_test
errno after PEEKDATA: 0
PEEKDATA result: 0x44434241

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-24 08:16:03 +01:00
Jann Horn d6868c69b3 clone.2: Pending CLONE_NEWPID prevents thread creation
See copy_process() in kernel/fork.c:

	if (clone_flags & CLONE_THREAD) {
		if ((clone_flags & (CLONE_NEWUSER | CLONE_NEWPID)) ||
		    (task_active_pid_ns(current) !=
				current->nsproxy->pid_ns_for_children))
			return ERR_PTR(-EINVAL);
	}

current->nsproxy->pid_ns_for_children is where unshare(CLONE_NEWPID)
stashes the pending namespace.

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-24 08:13:11 +01:00
Anthony Iliopoulos 6684e3e4ff fanotify.7: wfix
Use "FAN_OPEN_PERM" consistently rather than "FAN_PERM_OPEN".

Signed-off-by: Anthony Iliopoulos <ailiopoulos@suse.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-24 08:04:42 +01:00
Michael Kerrisk 63e59e0d31 system.3: Note that system() can fail for the same reasons as fork(2)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-24 08:02:48 +01:00
Michael Kerrisk 0cb0dd1c5d system.3: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-24 08:00:34 +01:00