8567 Commits

Author SHA1 Message Date
Michael Kerrisk 96d8887df7 timer_create.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-06 10:24:47 +02:00
Michael Kerrisk 65ff4e238d clock_getres.2: Minor clarification in description of CLOCK_BOOTTIME
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-06 10:24:36 +02:00
Michael Kerrisk f3c29937e6 prctl.2: Note semantics of IO_FLUSHER state with respect to fork(2) and execve(2)
Reported-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-06 10:07:04 +02:00
Michael Kerrisk 22a2e0553b lseek.2: ERRORS: ENXIO can also occur for SEEK_DATA in middle of hole at end of file
Quoting Matthew Wilcox:

    The current text of the lseek manpage is ambiguous about
    the behaviour of lseek(SEEK_DATA) for a file which is
    entirely a hole (or the end of the file is a hole and the
    pos lies within the hole).  The draft POSIX language is
    specific (ENXIO is returned when whence is SEEK_DATA and
    offset lies within the final hole of the file).  Could I
    trouble you to wordsmith that in?

    If you want to look at the draft POSIX text, it's here:
    https://www.austingroupbugs.net/view.php?id=415
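
The case described above can be probed with a small helper. This is a minimal sketch, not part of the original commit: the function name and the scratch path are made up for illustration, and the SEEK_DATA fallback value assumes Linux. It creates a file that consists only of a hole and reports what lseek(SEEK_DATA) does at a given offset.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes SEEK_DATA on glibc */
#endif
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SEEK_DATA
#define SEEK_DATA 3             /* assumed Linux value, used only as a fallback */
#endif

/* Create a temporary file of 'size' bytes that contains no data
   (one big hole), call lseek(fd, offset, SEEK_DATA), and return 0 if
   the call succeeded or the errno value if it failed. */
static int
seek_data_errno(off_t size, off_t offset)
{
    char path[] = "/tmp/seek_data_XXXXXX";  /* hypothetical scratch path */
    int fd = mkstemp(path);
    if (fd == -1)
        return errno;
    unlink(path);               /* the file disappears once fd is closed */

    int err = 0;
    if (ftruncate(fd, size) == -1)          /* extend without writing */
        err = errno;
    else if (lseek(fd, offset, SEEK_DATA) == (off_t) -1)
        err = errno;

    close(fd);
    return err;
}
```

On a filesystem that tracks holes, seek_data_errno(1 << 20, 4096) would be expected to return ENXIO. A filesystem without hole support may treat the whole file as data and return 0 instead.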

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-06 10:06:56 +02:00
Michael Kerrisk ab366b4567 lseek.2: Minor fix to wording of ENXIO error
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-06 06:59:35 +02:00
Michael Kerrisk bef940caef clock_getres.2: Minor tweaks to example
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-04 10:52:16 +02:00
Michael Kerrisk a04e44bd3a clock_getres.2: Clarify that CLOCK_MONOTONIC is system-wide
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-04 09:49:03 +02:00
Michael Kerrisk 9d69bebbd6 clock_getres.2: Clarify that CLOCK_TAI is nonsettable
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-04 09:49:03 +02:00
Michael Kerrisk 16fa57813e clock_getres.2: Add an example program
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-04 09:32:33 +02:00
Michael Kerrisk a48d19162d clock_getres.2: wfix: EOPNOTSUPP --> ENOTSUP
The two error codes are synonymous, but ENOTSUP is what is used
in other related pages.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 21:54:07 +02:00
Eric Rannaud f873b37560 clock_getres.2: Dynamic POSIX clock devices can return other errors
See Linux source as of v5.4:
  kernel/time/posix-clock.c

Signed-off-by: Eric Rannaud <e@nanocritical.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 21:52:01 +02:00
Michael Kerrisk 0f1553b5fd timerfd_create.2: Note a case where timerfd_settime() can fail with ECANCELED
From email discussions with Thomas Gleixner:

======

Hello Thomas, et al,

Following on from our discussion of read() on a timerfd [1], I
happened to remember a Debian bug report [2] that points out that
timer_settime() can fail with the error ECANCELED, which is both
surprising and odd (because despite the error, the timer does get
updated).

The relevant kernel code (I think, from your commit [3]) seems to be
the following in timerfd_setup():

        if (texp != 0) {
                if (flags & TFD_TIMER_ABSTIME)
                        texp = timens_ktime_to_host(clockid, texp);
                if (isalarm(ctx)) {
                        if (flags & TFD_TIMER_ABSTIME)
                                alarm_start(&ctx->t.alarm, texp);
                        else
                                alarm_start_relative(&ctx->t.alarm, texp);
                } else {
                        hrtimer_start(&ctx->t.tmr, texp, htmode);
                }

                if (timerfd_canceled(ctx))
                        return -ECANCELED;
        }

Using a small test program [4] shows the behavior. The program loops,
repeatedly calling timerfd_settime() (with a delay of a few seconds
before each call). In another terminal window, enter the following
command a few times:

    $ sudo date -s "5 seconds"       # Add 5 secs to wall-clock time

I see behavior as follows (the /sudo date -s "5 seconds"/ command was
executed before loop iterations 0, 2, and 4):

[[
$ ./timerfd_settime_ECANCELED
0
Current time is 1585729978 secs, 868510078 nsecs
Timer value is now 0 secs, 0 nsecs
timerfd_settime() succeeded
Timer value is now 9 secs, 999991977 nsecs

1
Current time is 1585729982 secs, 716339545 nsecs
Timer value is now 6 secs, 152167990 nsecs
timerfd_settime() succeeded
Timer value is now 9 secs, 999992940 nsecs

2
Current time is 1585729991 secs, 567377831 nsecs
Timer value is now 1 secs, 148959376 nsecs
timerfd_settime: Operation canceled
Timer value is now 9 secs, 999976294 nsecs

3
Current time is 1585729995 secs, 405385503 nsecs
Timer value is now 6 secs, 161989917 nsecs
timerfd_settime() succeeded
Timer value is now 9 secs, 999993317 nsecs

4
Current time is 1585730004 secs, 225036165 nsecs
Timer value is now 1 secs, 180346909 nsecs
timerfd_settime: Operation canceled
Timer value is now 9 secs, 999984345 nsecs
]]

I note the following from the above:

(1) If the wall-clock is changed before the first timerfd_settime()
call, the call succeeds. This is of course expected.
(2) If the wall-clock is changed after a timerfd_settime() call, then
the next timerfd_settime() call fails with ECANCELED.
(3) Even if the timerfd_settime() call fails, the timer is still updated(!).

Some questions:
(a) What is the rationale for timerfd_settime() failing with ECANCELED
in this case? (Currently, the manual page says nothing about this.)
(b) It seems at the least surprising, but more likely a bug, that
timerfd_settime() fails with ECANCELED while at the same time
successfully updating the timer value.

Your thoughts?

Thanks,

Michael

[1] https://lore.kernel.org/lkml/3cbd0919-c82a-cb21-c10f-0498433ba5d1@gmail.com/

[2] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=947091

[3]
commit 99ee5315dac6211e972fa3f23bcc9a0343ff58c4
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Wed Apr 27 14:16:42 2011 +0200

    timerfd: Allow timers to be cancelled when clock was set

[4]
/* timerfd_settime_ECANCELED.c */
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <inttypes.h>
#include <sys/timerfd.h>

#define errExit(msg) do { perror(msg); exit(EXIT_FAILURE); } while (0)

int
main(int argc, char *argv[])
{
    struct itimerspec ts, gts;
    struct timespec start;

    int tfd = timerfd_create(CLOCK_REALTIME, 0);
    if (tfd == -1)
        errExit("timerfd_create");

    ts.it_interval.tv_sec = 0;
    ts.it_interval.tv_nsec = 10;

    int flags = TFD_TIMER_ABSTIME | TFD_TIMER_CANCEL_ON_SET;

    for (long j = 0; ; j++) {

        /* Inject a delay into each loop, by calling getppid()
           many times */

        for (int k = 0; k < 10000000; k++)
            getppid();

        if (j % 1 == 0)
            printf("%ld\n", j);

        /* Display the current wall-clock time */

        if (clock_gettime(CLOCK_REALTIME, &start) == -1)
            errExit("clock_gettime");
        printf("Current time is %ld secs, %ld nsecs\n",
                start.tv_sec, start.tv_nsec);

        /* Before resetting the timer, retrieve its current value
           so that after the timerfd_settime() call, we can see
           whether the value has changed */

        if (timerfd_gettime(tfd, &gts) == -1)
            perror("timerfd_gettime");
        printf("Timer value is now %ld secs, %ld nsecs\n",
            gts.it_value.tv_sec, gts.it_value.tv_nsec);

        /* Reset the timer to now + 10 secs */

        ts.it_value.tv_sec = start.tv_sec + 10;
        ts.it_value.tv_nsec = start.tv_nsec;
        if (timerfd_settime(tfd, flags, &ts, NULL) == -1)
            perror("timerfd_settime");
        else
            printf("timerfd_settime() succeeded\n");

        /* Display the timer value once again */

        if (timerfd_gettime(tfd, &gts) == -1)
            perror("timerfd_gettime");
        printf("Timer value is now %ld secs, %ld nsecs\n",
            gts.it_value.tv_sec, gts.it_value.tv_nsec);

        printf("\n");
    }
}

=======

Subject: Re: timer_settime() and ECANCELED
Date: Wed, 01 Apr 2020 19:42:42 +0200
From: Thomas Gleixner <tglx@linutronix.de>

Michael,

"Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> writes:
> Following on from our discussion of read() on a timerfd [1], I
> happened to remember a Debian bug report [2] that points out that
> timer_settime() can fail with the error ECANCELED, which is both
> surprising and odd (because despite the error, the timer does get
> updated).
...
> (1) If the wall-clock is changed before the first timerfd_settime()
> call, the call succeeds. This is of course expected.
> (2) If the wall-clock is changed after a timerfd_settime() call, then
> the next timerfd_settime() call fails with ECANCELED.
> (3) Even if the timerfd_settime() call fails, the timer is still updated(!).
>
> Some questions:
> (a) What is the rationale for timerfd_settime() failing with ECANCELED
> in this case? (Currently, the manual page says nothing about this.)
> (b) It seems at the least surprising, but more likely a bug, that
> timerfd_settime() fails with ECANCELED while at the same time
> successfully updating the timer value.

Really good question and TBH I can't remember why this is implemented in
the way it is, but I have a faint memory that at least (a) is
intentional.

After staring at the code for a while I came up with the following
answers:

(a): If the clock was set event ("date -s ...") which triggered the
     cancel was not yet consumed by user space via read(), then that
     information would get lost because arming the timer to the new
     value has to reset the state.

(b): Arming the timer in that case is indeed very questionable, but it
     could be argued that because the clock was set event happened with
     the old expiry value that the new expiry value is not affected.

     I'd be happy to change that and not arm the timer in the case of a
     pending cancel, but I fear that some user space already depends on
     that behaviour.

Thanks,

        tglx

======

Subject: Re: timer_settime() and ECANCELED
Date: Thu, 02 Apr 2020 10:49:18 +0200
From: Thomas Gleixner <tglx@linutronix.de>
To: Michael Kerrisk (man-pages) <mtk.manpages@gmail.com>

"Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com> writes:
> On 4/1/20 7:42 PM, Thomas Gleixner wrote:
>> (b): Arming the timer in that case is indeed very questionable, but it
>>      could be argued that because the clock was set event happened with
>>      the old expiry value that the new expiry value is not affected.
>>
>>      I'd be happy to change that and not arm the timer in the case of a
>>      pending cancel, but I fear that some user space already depends on
>>      that behaviour.
>
> Yes, that's the risk, of course. So, shall we just document all
> this in the manual page?

I think so.

Thanks,

        tglx
======

Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 21:43:21 +02:00
Michael Kerrisk b5b0b2882e prctl.2: Reword description of PR_GET_IO_FLUSHER
Reported-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 14:13:51 +02:00
Michael Kerrisk 3872a3d621 prctl.2: Unused args must be zero for PR_GET_IO_FLUSHER and PR_SET_IO_FLUSHER
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 14:08:39 +02:00
Michael Kerrisk 4222606d2a prctl.2: f
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 14:07:12 +02:00
Michael Kerrisk 91e015066f prctl.2: Minor tweaks to Mike Christie's patch
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 14:06:28 +02:00
Mike Christie 308eb2f636 prctl.2: Document PR_SET_IO_FLUSHER/PR_GET_IO_FLUSHER
This patch documents the PR_SET_IO_FLUSHER and PR_GET_IO_FLUSHER
prctl commands added to the Linux kernel for 5.6 in commit:

    commit 8d19f1c8e1937baf74e1962aae9f90fa3aeab463
    Author: Mike Christie <mchristi@redhat.com>
    Date:   Mon Nov 11 18:19:00 2019 -0600

        prctl: PR_{G,S}ET_IO_FLUSHER to support controlling memory reclaim
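
As an illustration of the two commands, here is a minimal sketch. The wrapper names are hypothetical, and the numeric fallbacks are only an assumption for headers that predate Linux 5.6. The unused prctl() arguments are passed as zero.

```c
#include <errno.h>
#include <sys/prctl.h>

#ifndef PR_SET_IO_FLUSHER           /* assumed values for older headers */
#define PR_SET_IO_FLUSHER 57
#define PR_GET_IO_FLUSHER 58
#endif

/* Return the caller's IO_FLUSHER state (0 or 1), or -errno on failure. */
static int
get_io_flusher(void)
{
    int r = prctl(PR_GET_IO_FLUSHER, 0UL, 0UL, 0UL, 0UL);
    return (r == -1) ? -errno : r;
}

/* Enable (nonzero) or disable (0) the IO_FLUSHER state.
   Returns 0 on success or -errno on failure. */
static int
set_io_flusher(int enable)
{
    unsigned long state = (enable != 0);
    if (prctl(PR_SET_IO_FLUSHER, state, 0UL, 0UL, 0UL) == -1)
        return -errno;
    return 0;
}
```

Both calls are likely to fail with EPERM for an unprivileged caller, and with EINVAL on kernels that lack the feature, so callers should treat a negative result as "state unknown" rather than as a fatal error.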

Reviewed-by: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 13:59:05 +02:00
Michael Kerrisk ba1c6b2081 clock_getres.2: wfix: s/clk_id/clockid/ throughout
Most other manual pages use 'clockid' for the 'clockid_t'
argument.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 12:57:25 +02:00
Michael Kerrisk 717096082d clock_nanosleep.2: wfix: s/clock_id/clockid/ throughout
Most other section 2 pages use 'clockid' as the name
of the 'clockid_t' argument.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 12:57:25 +02:00
Michael Kerrisk 96d951a401 clock_nanosleep.2, timer_create.2, timerfd_create.2: Add various missing errors
Mostly verified by testing and reading the code.

There is unfortunately quite a bit of inconsistency across APIs:

                  clock_gettime  clock_settime  clock_nanosleep  timer_create  timerfd_create

CLOCK_BOOTTIME            y        n (EINVAL)     y                y             y
CLOCK_BOOTTIME_ALARM      y        n (EINVAL)     y [1]            y [1]         y [1]
CLOCK_MONOTONIC           y        n (EINVAL)     y                y             y
CLOCK_MONOTONIC_COARSE    y        n (EINVAL)     n (ENOTSUP)      n (ENOTSUP)   n (EINVAL)
CLOCK_MONOTONIC_RAW       y        n (EINVAL)     n (ENOTSUP)      n (ENOTSUP)   n (EINVAL)
CLOCK_REALTIME            y        y              y                y             y
CLOCK_REALTIME_ALARM      y        n (EINVAL)     y [1]            y [1]         y [1]
CLOCK_REALTIME_COARSE     y        n (EINVAL)     n (ENOTSUP)      n (ENOTSUP)   n (EINVAL)
CLOCK_TAI                 y        n (EINVAL)     y                y             n (EINVAL)
CLOCK_PROCESS_CPUTIME_ID  y        n (EINVAL)     y                y             n (EINVAL)
CLOCK_THREAD_CPUTIME_ID   y        n (EINVAL)     n (EINVAL [2])   y             n (EINVAL)
pthread_getcpuclockid()   y        n (EINVAL)     y                y             n (EINVAL)

[1] The caller must have CAP_WAKE_ALARM, or the error EPERM results.

[2] This error is generated in the glibc wrapper.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 12:57:25 +02:00
Michael Kerrisk 04e2e313fc timerfd_create.2: Rework text for EINVAL for invalid clock ID
The error description was crufty. There are more valid
clock IDs these days.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 12:57:25 +02:00
Michael Kerrisk d53b0f4822 clock_nanosleep.2: srcfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 12:57:25 +02:00
Michael Kerrisk 0e7984ff40 clock_nanosleep.2: clock_nanosleep() can also sleep against CLOCK_TAI
Presumably since Linux 3.10, when CLOCK_TAI was added to the
kernel.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 12:57:25 +02:00
Michael Kerrisk b24db7cb8a clock_nanosleep.2: srcfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 12:57:25 +02:00
Michael Kerrisk 14df252bf8 clock_getres.2: CLOCK_REALTIME_COARSE is not settable
In kernel/time/posix-timers.c, 'CLOCK_REALTIME_COARSE' has
no 'clock_set' method.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 12:57:18 +02:00
Michael Kerrisk 41043c0bd6 clock_getres.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 12:42:54 +02:00
Michael Kerrisk ac90b58942 clock_getres.2: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 12:42:54 +02:00
Michael Kerrisk eb6567fb00 clock_getres.2: Add CLOCK_REALTIME_ALARM and CLOCK_BOOTTIME_ALARM
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 12:42:54 +02:00
Michael Kerrisk da8a95bca1 timer_create.2: timer_create(2) also supports CLOCK_TAI
Presumably (and from a quick glance at the source code)
since Linux 3.10, when CLOCK_TAI was introduced.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 12:42:54 +02:00
Michael Kerrisk 0e4b87c4fd timer_create.2: Mention clock_getres(2) for further details on the various clocks
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 12:42:54 +02:00
Michael Kerrisk 966051ca74 clock_nanosleep.2: clock_nanosleep() also supports CLOCK_BOOTTIME
Presumably (and from a quick glance at the source code)
since Linux 2.6.39, when CLOCK_BOOTTIME was introduced.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 09:34:37 +02:00
Michael Kerrisk 2c16f1bc28 clock_getres.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-01 21:30:37 +02:00
Michael Kerrisk 066dcd09cb timerfd_create.2: wfix
See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=947091

Reported-by: Marc Lehmann <debian-reportbug@plan9.de>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-01 10:04:32 +02:00
Michael Kerrisk 372b58573a openat2.2: srcfix: remove a FIXME
Aleksa Sarai is okay with my text changes.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-01 08:36:49 +02:00
Michael Kerrisk 08ba10a6d5 openat2.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-01 08:18:45 +02:00
Michael Kerrisk b4e1568256 openat2.2: Improve text describing caveat for use of RESOLVE_NO_XDEV
From email discussions with Aleksa Sarai:

> .\" FIXME I find the "previously-functional systems" in the previous
> .\" sentence a little odd (since openat2() is a new syscall), so I would
> .\" like to clarify a little...
> .\" Are you referring to the scenario where someone might take an
> .\" existing application that uses openat() and replaces the uses
> .\" of openat() with openat2()? In which case, is it correct to
> .\" understand that you mean that one should not just indiscriminately
> .\" add the RESOLVE_NO_XDEV flag to all of the openat2() calls?
> .\" If I'm not on the right track, could you point me in the right
> .\" direction please.

This is mostly meant as a warning to hopefully avoid applications breaking
because the developer didn't realise that system paths may contain
symlinks or bind-mounts. For an application which has switched to
openat2() and then uses RESOLVE_NO_SYMLINKS for a non-security reason,
it's possible that on some distributions (or future versions of a
distribution) that their application will stop working because a system
path suddenly contains a symlink or is a bind-mount.

This was a concern which was brought up on LWN some time ago. If you can
think of a phrasing that makes this more clear, I'd appreciate it.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-01 08:16:40 +02:00
Michael Kerrisk c85ebb3c94 openat2.2: Various tweaks to the discussion of 'resolve' flags
Some tweaks inspired by https://lwn.net/Articles/796868/

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-31 09:50:48 +02:00
Michael Kerrisk e31d5bfd36 openat2.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-31 09:11:20 +02:00
Michael Kerrisk 193f7fb272 openat2.2: Place 'resolve' flags in alphabetical order
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-31 09:10:05 +02:00
Michael Kerrisk 1ae24555ba timerfd_create.2: Negative changes to CLOCK_REALTIME may cause read() to return 0
Devi R K reported this issue, and went on to note:

> We have written a program using real time clock and it has been raised to
> the community.
>
> https://lore.kernel.org/lkml/alpine.DEB.2.21.1908191943280.1796@nanos.tec.linutronix.de/T/

[...]

Thanks for pointing me at that thread. In particular, the test
program at
https://lore.kernel.org/lkml/alpine.DEB.2.21.1908191943280.1796@nanos.tec.linutronix.de/T/#m489d81abdfbb2699743e18c37657311f8d52a4cd

[...]

I think this patch does not really capture the details
properly. The immediately preceding paragraph says:

         If  the  associated  clock  is  either  CLOCK_REALTIME   or
         CLOCK_REALTIME_ALARM,     the     timer     is     absolute
         (TFD_TIMER_ABSTIME), and the  flag  TFD_TIMER_CANCEL_ON_SET
         was  specified when calling timerfd_settime(), then read(2)
         fails with the  error  ECANCELED  if  the  real-time  clock
         undergoes a discontinuous change.  (This allows the reading
         application to discover such discontinuous changes  to  the
         clock.)

Following on from that, I think we should have a paragraph that says
something like:

         If  the  associated  clock  is  either  CLOCK_REALTIME   or
         CLOCK_REALTIME_ALARM,     the     timer     is     absolute
         (TFD_TIMER_ABSTIME), and the  flag  TFD_TIMER_CANCEL_ON_SET
         was not specified when calling timerfd_settime(), then a
         discontinuous negative change to the clock
         (e.g., clock_settime(2)) may cause read(2) to unblock, but
         return a value of 0 (i.e., no bytes read), if the clock
         change occurs after the time expired, but before the
         read(2) on the timerfd file descriptor.

This seems consistent with Thomas's observations in
https://lore.kernel.org/lkml/alpine.DEB.2.21.1908191943280.1796@nanos.tec.linutronix.de/T/#m49b78122b573a2749a05b720dc9fa036546db490

==
Thomas Gleixner replied:

Yes, that's correct. Accurate as always!

This is pretty much in line with clock_nanosleep(CLOCK_REALTIME,
TIMER_ABSTIME) which has a similar problem vs. observability in user
space.

clock_nanosleep(2) mutters:

  "POSIX.1 specifies that after changing the value of the CLOCK_REALTIME
   clock via clock_settime(2), the new clock value shall be used to
   determine the time at which a thread blocked on an absolute
   clock_nanosleep() will wake up; if the new clock value falls past the
   end of the sleep interval, then the clock_nanosleep() call will return
   immediately."

which can be interpreted as guarantee that clock_nanosleep() never
returns prematurely, i.e. the assert() in the below code would indicate
a kernel failure:

   ret = clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME, &expiry, NULL);
   if (!ret) {
         clock_gettime(CLOCK_REALTIME, &now);
         assert(now >= expiry);
   }

But that assert can trigger when CLOCK_REALTIME was modified after the
timer fired and the kernel decided to wake up the task and let it return
to user space.

   clock_nanosleep(..., &expiry)
     arm_timer(expires);
     schedule();

   -> timer interrupt
      now = ktime_get_real();
      if (expires <= now)
              -------------------------------- After this point
         wakeup();                             clock_settime(2) or
                                               adjtimex(2) which
                                               makes CLOCK_REALTIME
                                               jump back far enough will
                                               cause the above assert
                                               to trigger.

   ...
   return from syscall (retval == 0)

There is no guarantee against clock_settime() coming after the
wakeup. Even if we put another check into the return to user path then
we won't catch a clock_settime() which comes right after that and before
user space invokes clock_gettime().

POSIX spec Issue 7 (2018 edition) says:

 The suspension for the absolute clock_nanosleep() function (that is,
 with the TIMER_ABSTIME flag set) shall be in effect at least until the
 value of the corresponding clock reaches the absolute time specified by
 rqtp.

And that's what the kernel implements for clock_nanosleep() and timerfd
behaves exactly the same way.

The wakeup of the waiter, i.e. task blocked in clock_nanosleep(2),
read(2), poll(2), is not happening _before_ the absolute time specified
is reached.

If clock_settime() happens right before the expiry check, then it does
the right thing, but any modification to the clock after the wakeup
cannot be mitigated. At least not in a way which would make the assert()
in the example code above a reliable indicator for a kernel fail.

That's the reason why I rejected the attempt to mitigate that particular
0 tick issue in timerfd as it would just scratch a particular itch but
still not provide any guarantee. So having the '0' return documented is
the right way to go.

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reported-by: devi R.K <devi.feb27@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 22:52:58 +02:00
Michael Kerrisk 1f4cf8e85e openat2.2: srcfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 22:41:25 +02:00
Michael Kerrisk 7a18f60e4d openat2.2: Minor tweaks to the text
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 22:26:50 +02:00
Michael Kerrisk 552f379960 openat2.2: Further tweaks to the RESOLVE_IN_ROOT text
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 21:52:32 +02:00
Michael Kerrisk 9e0168b018 openat2.2: Minor tweaks to RESOLVE_IN_ROOT text
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 21:41:47 +02:00
Michael Kerrisk 75cd77e3c1 openat2.2: Minor change: reword a sentence
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 21:20:41 +02:00
Michael Kerrisk 39bfd04683 openat2.2: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 21:17:52 +02:00
Michael Kerrisk a424f7c064 openat2.2: srcfix: Disfavor multiargument .BR and .IR usage
For me, source lines such as:

    .BR perf_setattr "(2), " perf_event_open "(2), and " clone3 (2).

are harder to read than:

    .BR perf_setattr (2),
    .BR perf_event_open (2),
    and
    .BR clone3 (2).

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 14:52:47 +02:00
Michael Kerrisk d144dc36b8 openat2.2: Rework RESOLVE_IN_ROOT text
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 14:52:44 +02:00
Michael Kerrisk 36c9d56de6 openat2.2: Reorganize and rework introductory text a little
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 14:52:40 +02:00
Michael Kerrisk 6c6945d461 openat2.2: Remove one of the forward references to the "Extensibility" subsection
There are currently three of these forward references (two in
DESCRIPTION, one in ERRORS). This is a little redundant.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 14:44:59 +02:00
Michael Kerrisk 4ec6d407a9 openat2.2: Various wording improvements to Aleksa Sarai's text
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 14:44:59 +02:00
Michael Kerrisk 7a11fc63b8 open.2: Clarify that O_NOFOLLOW is relevant (only) for basename of 'pathname'
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 12:32:07 +02:00
Michael Kerrisk 7b7aad695b openat2.2: wfix: explicitly qualify fields of 'how' argument with "how."
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 12:32:07 +02:00
Michael Kerrisk 3fcaeb806a openat2.2: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 12:32:07 +02:00
Michael Kerrisk 03625dc12d openat2.2: Place ERRORS in alphabetical order
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 12:32:07 +02:00
Michael Kerrisk 0105739e8b openat2.2: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 10:20:06 +02:00
Michael Kerrisk 0389373e6e openat2.2: ffix (mainly: replace blank lines by .IP or .PP)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 10:19:20 +02:00
Michael Kerrisk 669403e99e openat2.2: spfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 10:19:20 +02:00
Michael Kerrisk 2d82152f53 openat2.2: srcfix: eliminate redundant blank lines
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 10:19:20 +02:00
Michael Kerrisk 2359744f97 openat2.2: srcfix: semantic newlines and rewrap some long source lines
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 10:19:20 +02:00
Michael Kerrisk 4b322a2fc8 open.2: Minor tweaks to Aleksa Sarai's patch
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 08:45:58 +02:00
Aleksa Sarai a2dbb2e378 open.2: Add references to new openat2(2) page
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 08:37:02 +02:00
Aleksa Sarai 89de505522 openat2.2: Document new openat2(2) syscall
Rather than trying to merge the new syscall documentation into
open.2 (which would probably result in the man-page being
incomprehensible), instead the new syscall gets its own dedicated
page with links between open(2) and openat2(2) to avoid
duplicating information such as the list of O_* flags or common
errors.

In addition to describing all of the key flags, information about
the extensibility design is provided so that users can better
understand why they need to pass sizeof(struct open_how) and how
their programs will work across kernels. After some discussions
with David Laight, I also included explicit instructions to zero
the structure to avoid issues when recompiling with new headers.
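
The zeroing and size-passing convention described above can be sketched as follows. This is an illustration rather than text from the page itself: the wrapper name is made up, the call goes through syscall(2) on the assumption that the C library provides no wrapper, and the fallback syscall number is an assumption for headers that lack SYS_openat2.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <fcntl.h>              /* O_* constants */
#include <linux/openat2.h>      /* struct open_how, RESOLVE_* constants */
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef SYS_openat2
#define SYS_openat2 437         /* assumed number, used only as a fallback */
#endif

/* Open 'path' read-only relative to 'dirfd' with the given RESOLVE_*
   flags.  Returns a file descriptor, or -errno on failure. */
static int
openat2_rdonly(int dirfd, const char *path, unsigned long long resolve)
{
    struct open_how how;

    /* Zero the whole structure so that fields added by newer headers
       do not carry garbage after a recompile */
    memset(&how, 0, sizeof(how));
    how.flags = O_RDONLY | O_CLOEXEC;
    how.resolve = resolve;

    /* The size tells the kernel which version of the structure this
       program was compiled against */
    long fd = syscall(SYS_openat2, dirfd, path, &how, sizeof(how));
    return (fd == -1) ? -errno : (int) fd;
}
```

For instance, openat2_rdonly(AT_FDCWD, "some/path", RESOLVE_NO_SYMLINKS) would refuse to traverse symbolic links. A result of -ENOSYS indicates a kernel without openat2().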

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 08:34:45 +02:00
Michael Kerrisk 238442a2de clock_getres.2: ERRORS: add EINVAL for attempt to set a nonsettable clock
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-29 22:36:19 +02:00
Michael Kerrisk c009a15c2e clock_getres.2: Improve description of CPU-time clocks
The current descriptions are a bit terse. Make the description
a little clearer.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-29 22:14:00 +02:00
Michael Kerrisk 71b7e2a5dc clock_getres.2: Note that CPU-time clocks are not settable.
Explicitly note that CLOCK_PROCESS_CPUTIME_ID and
CLOCK_THREAD_CPUTIME_ID are not settable.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-29 22:11:24 +02:00
Michael Kerrisk a215069794 clock_getres.2: Move text in BUGS to NOTES
The fact that CLOCK_PROCESS_CPUTIME_ID and
CLOCK_THREAD_CPUTIME_ID are not settable isn't a bug,
since POSIX does allow the possibility that these clocks
are not settable.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-29 22:08:10 +02:00
Michael Kerrisk 6cfa7458f7 clock_getres.2: Minor wording improvement
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-29 21:42:21 +02:00
Benjamin Peterson 3b9aa39b85 clock_getres.2: tfix
Signed-off-by: Benjamin Peterson <benjamin@python.org>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-29 09:20:58 +02:00
André Almeida c1e04f0116 futex.2: wfix
The fifth argument of futex is uaddr2, instead of uaddr.
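
For reference, the positional order of the arguments to the raw system call can be sketched with a thin wrapper. The wrapper names are hypothetical, and glibc is assumed to provide no futex() function of its own.

```c
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Thin wrapper that spells out the position of each argument. */
static long
futex_call(uint32_t *uaddr,                     /* 1st */
           int futex_op,                        /* 2nd */
           uint32_t val,                        /* 3rd */
           const struct timespec *timeout,      /* 4th */
           uint32_t *uaddr2,                    /* 5th */
           uint32_t val3)                       /* 6th */
{
    return syscall(SYS_futex, uaddr, futex_op, val, timeout, uaddr2, val3);
}

/* Wake at most one waiter blocked on 'word'.  Returns the number of
   waiters woken, or -1 with errno set. */
static long
futex_wake_one(uint32_t *word)
{
    return futex_call(word, FUTEX_WAKE, 1, NULL, NULL, 0);
}
```

FUTEX_WAKE ignores the last three arguments, so the sketch passes NULL and 0 for them.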

Signed-off-by: André Almeida <andrealmeid@collabora.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-26 09:48:33 +01:00
Michael Kerrisk bf981e8b3b clock_getres.2: Minor tweaks to Benjamin Peterson's patch
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-26 08:45:20 +01:00
Michael Kerrisk da9fe87d7c clock_getres.2: srcfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-26 08:44:37 +01:00
Benjamin Peterson 3eee751583 clock_getres.2: Document CLOCK_TAI
Signed-off-by: Benjamin Peterson <benjamin@python.org>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-26 08:41:25 +01:00
Michael Kerrisk e5d8f6046b timerfd_create.2: srcfix: remove obsolete FIXME
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-12 09:24:44 +01:00
Michael Kerrisk cbf25811f8 timer_getoverrun.2: srcfix: Update FIXME
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-12 09:24:44 +01:00
Michael Kerrisk 7f4a75814f gettid.2: Remove obsolete FIXME
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-12 09:24:44 +01:00
Michael Kerrisk b51c940445 execve.2: ERRORS: ENOENT does not occur for missing shared libraries
See http://sourceware.org/bugzilla/show_bug.cgi?id=12241.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-12 09:24:44 +01:00
Michael Kerrisk a0cdf1e1b3 shmget.2: Add a reference to the example in shmop(2)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-12 09:24:44 +01:00
Michael Kerrisk 5d7a184304 semctl.2: Add a reference to the example in shmop(2)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-12 09:24:44 +01:00
Michael Kerrisk 26c8d4bb8c semop.2: Add a reference to the semop(2) example in shmop(2)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-12 09:24:44 +01:00
Michael Kerrisk fc4774234e shmop.2: EXAMPLE: add a pair of example programs
Add example programs demonstrating usage of shmget(2), shmat(2),
semget(2), semctl(2), and semop(2).

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-12 09:24:44 +01:00
Michael Kerrisk 90f986d35f semget.2: EXAMPLE: add an example program
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-12 09:24:44 +01:00
Michael Kerrisk c436f71fa0 statx.2: Minor tweaks to Eric Biggers's patch
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-12 09:20:50 +01:00
Eric Biggers ba2a520059 statx.2: Document STATX_ATTR_VERITY
Document the verity attribute for statx(), which was added in
Linux 5.5.

For more context, see the fs-verity documentation:
https://www.kernel.org/doc/html/latest/filesystems/fsverity.html

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-12 09:18:57 +01:00
Jakub Wilk 765a67c3ad semctl.2: tfix
Signed-off-by: Jakub Wilk <jwilk@jwilk.net>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-12 09:16:22 +01:00
Krzysztof Małysa d5d482ec7a clone.2: tfix
Fix clone3() syscall description for CLONE_PARENT_SETTID: kernel uses
cl_args.parent_tid instead of the specified cl_args.child_tid.

Signed-off-by: Krzysztof Małysa <varqox@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-12 09:14:11 +01:00
Mike Frysinger deb1825fcf mlock.2: ffix
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-12 06:43:37 +01:00
Michael Kerrisk cd356fa192 _exit.2: Clarify that raw _exit() system call terminates only the calling thread
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-04 22:36:31 +01:00
Michael Kerrisk 07f462e9f2 _exit.2: Minor wording tweaks
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-04 22:35:11 +01:00
Michael Kerrisk b2a8e05384 stat.2: Clarify definitions of timestamp fields
In particular, make it clear that atime and mtime relate to the
file *data*.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-02 17:45:05 +01:00
Michael Kerrisk 64cde6e3bf poll.2: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-28 22:12:33 +01:00
Michael Kerrisk 232057c309 inotify_add_watch.2: EXAMPLE: add reference to example in inotify(7)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-28 22:08:38 +01:00
Michael Kerrisk ce160b21ab semctl.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-28 08:46:58 +01:00
Michael Kerrisk 1923607c30 execve.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-28 08:41:51 +01:00
Michael Kerrisk 63433c537c execve.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-28 08:38:19 +01:00
Michael Kerrisk 5d92031a43 execve.2: Explicitly note that argv[argc] == NULL in the new program
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-28 08:38:19 +01:00
Michael Kerrisk 0262995818 execve.2: wfix 2020-02-28 08:32:32 +01:00
Michael Kerrisk 01b08fe410 msgop.2: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-27 14:00:09 +01:00
Michael Kerrisk 5abea51fba semop.2: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-25 21:48:53 +01:00
Michael Kerrisk 410ad57327 shmop.2: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-25 21:47:57 +01:00
Michael Kerrisk 55f2c84816 msgop.2, shmop.2: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-25 21:44:42 +01:00
Michael Kerrisk a63243b9e2 msgctl.2: Add information on permission bits (based on sysvipc(7) text)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-25 16:53:25 +01:00
Michael Kerrisk cce90cf33a msgctl.2: Copy information on 'msqid_ds' fields from sysvipc(7)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-25 16:53:25 +01:00
Michael Kerrisk dd7f869f66 semctl.2: Minor wording fixes
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-25 16:53:25 +01:00
Michael Kerrisk 702f1a76e1 semctl.2: Add information on permission bits (based on sysvipc(7) text)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-25 16:53:25 +01:00
Michael Kerrisk c7d32a9ee7 semctl.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-25 16:53:25 +01:00
Michael Kerrisk 1c9b3f5ff2 semctl.2: Copy information on 'semid_ds' fields from sysvipc(7)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-25 16:53:25 +01:00
Michael Kerrisk bede9ac097 shmctl.2: Note that execute permission is not needed for shmat() SHM_EXEC
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-25 16:53:25 +01:00
Michael Kerrisk aa3b28505e shmctl.2: Add information on permission bits (based on sysvipc(7) text)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-25 16:53:25 +01:00
Michael Kerrisk 5ec2120137 shmctl.2: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-25 16:53:25 +01:00
Michael Kerrisk fc11d0e500 shmctl.2: Some small improvements to the description of the 'shmid_ds' structure
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-25 16:53:25 +01:00
Michael Kerrisk c41cf60e5c shmctl.2: Copy information on 'shmid_ds' fields from sysvipc(7)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-25 16:53:25 +01:00
Alexander Miller 3b9cbcdc61 pidfd_open.2: wfix
Signed-off-by: Alexander Miller <alex.miller@gmx.de>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-23 22:47:39 +01:00
Michael Kerrisk 7b9f319555 clock_getres.2: Tweaks to Helge Deller's patch
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-23 22:47:39 +01:00
Helge Deller 7b02075b19 clock_getres.2: Consecutive calls for CLOCK_MONOTONIC may return same value
Consecutive calls to clock_gettime(CLOCK_MONOTONIC) are guaranteed
to return MONOTONIC values, which means that they either return
the *SAME* time value as the last call, or a later (higher) time
value.

Due to high resolution counters, like TSC on x86, most people see
that the values returned increase, but on other less common
platforms it's less likely that consecutive calls return newer
values, and instead users may unexpectedly get back the SAME time
value.

I think it makes sense to document that people should not expect
to see "always-growing" time values. For example, in Debian I've
seen quite a few source packages in which the return values of
consecutive calls are compared against each other and the
package build fails if they are equal (e.g.  ruby-hitimes, ...).

Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-23 22:47:26 +01:00
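The point made in the commit message above can be illustrated with a minimal sketch. It is not taken from the man page: the helper names monotonic_ns() and monotonic_is_nondecreasing() are hypothetical, and the only assumption is a POSIX system that provides clock_gettime() and CLOCK_MONOTONIC. Two back-to-back readings are compared with >=, since equal values are valid.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: read CLOCK_MONOTONIC as nanoseconds.
   Returns -1 if clock_gettime() fails. */
int64_t monotonic_ns(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_MONOTONIC, &ts) == -1)
        return -1;
    return (int64_t) ts.tv_sec * 1000000000 + ts.tv_nsec;
}

/* Hypothetical check: returns 1 if two consecutive readings are
   nondecreasing, 0 otherwise. The second reading may EQUAL the first
   on clocks with coarse resolution, so the test is >=, not >. */
int monotonic_is_nondecreasing(void)
{
    int64_t first = monotonic_ns();
    int64_t second = monotonic_ns();

    if (first == -1 || second == -1)
        return 0;
    return second >= first;
}
```

A test suite that instead requires second > first is the kind of check that can fail on platforms without a high-resolution counter.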
Eugene Syromyatnikov ea09dfe60c syscalls.2: ffix (trying to fit the table into 80 columns)
* man2/syscalls.2 (.SH DESCRIPTION) <\fBgetdtablesize\fP(2)>: Remove "since
Linux 2.0" part for the osf_getdtablesize note, as syscall is generally
available since Linux 2.0; add line break after the word "as".
(.SH DESCRIPTION) <\fBpwrite\fP(2)>: Add line breaks.
(.SH DESCRIPTION) <\fBvm86old\fP(2)>: Add a line break after "in".

Signed-off-by: Eugene Syromyatnikov <evgsyr@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-23 21:46:22 +01:00
Michael Kerrisk ea5c73026f syscalls.2: Note that the 5.x series followed 4.20
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-23 21:42:31 +01:00
Michael Kerrisk 3ed6ea8db5 msgget.2, semget.2, shmget.2: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-23 09:58:23 +01:00
Michael Kerrisk 0f1e53ec43 ioctl_ficlonerange.2, ioctl_fideduperange.2: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-23 09:58:23 +01:00
Michael Kerrisk 58a114f1d9 execve.2: Add a subhead for the discussion of effect on process attributes
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-23 09:58:23 +01:00
Michael Kerrisk 1e980a0e8b sched_setattr.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-23 09:58:23 +01:00
Michael Kerrisk a2d56dbbe3 mmap.2: Add a subhead for the 'flags' argument
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-23 09:58:23 +01:00
Michael Kerrisk dd6ceee6f7 mmap.2: Move some text hidden at the end of DESCRIPTION to NOTES
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-23 09:58:23 +01:00
Michael Kerrisk 8762c93c83 ioctl_fat.2: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-23 09:58:23 +01:00
Michael Kerrisk 8a76077272 read.2: srcfix: Add self to copyright
By now, I'm the author of perhaps the majority of the text.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-23 09:58:23 +01:00
Michael Kerrisk e6356d613f msgop.2: srcfix: add Bill Pemberton to copyright
The example program in this page is from Bill Pemberton.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-23 09:58:23 +01:00
Michael Kerrisk b2a44c2ecc capget.2: srcfix: Add self to copyright
By now, I'm the author of a substantial part of the text.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-23 09:58:23 +01:00
Michael Kerrisk 7a5f235074 brk.2: srcfix: Add self to copyright
I'm the author of the majority of the text
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-23 09:58:23 +01:00
Michael Kerrisk 35e6d17bde bind.2: srcfix: Add self to copyright
I'm the author of the example program and various
other additions to the page.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-23 09:58:23 +01:00
Michael Kerrisk 50fd1cfe6f poll.2, atoi.3, gsignal.3, posix_memalign.3, scanf.3: Remove a few mentions of the ancient "Linux libc"
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-23 09:58:23 +01:00
Michael Kerrisk 38e17cbacc sigaction.2: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-23 09:58:23 +01:00
Michael Kerrisk 0019177eac getent.1, localedef.1, clock_nanosleep.2, fcntl.2, getitimer.2, getsockopt.2, inotify_init.2, ioctl.2, mlock.2, mprotect.2, quotactl.2, s390_sthyi.2, semctl.2, shmctl.2, shmget.2, wait.2, CPU_SET.3, aio_init.3, des_crypt.3, fmemopen.3, fopencookie.3, fts.3, getaddrinfo.3, getrpcent.3, lio_listio.3, posix_spawn.3, shm_open.3, st.4, elf.5, group.5, proc.5, services.5, aio.7, feature_test_macros.7, keyrings.7, man-pages.7, sigevent.7, tcp.7, udp.7: Global formatting fix: disfavor nonstandard .TP indents
In many cases, these don't improve readability, and (when stacked)
they sometimes have the side effect of forcing text
to be justified within a narrow column range.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-23 09:58:23 +01:00
Michael Kerrisk e41f05af22 splice.2: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-23 09:58:23 +01:00
Michael Kerrisk 962a269d97 epoll_create.2: srcfix: Add self to copyright
By now, I'm the author of the majority of the text.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-23 09:58:23 +01:00
Michael Kerrisk cce4f97420 epoll_create.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-23 09:58:22 +01:00
Michael Kerrisk 23a169f749 epoll_wait.2: A few minor additions and rewrites
And add self to copyright, since, by now, the majority of the
text in the page has been (re)written by me.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-23 09:58:22 +01:00
Michael Kerrisk ee9c94436f epoll_ctl.2: Various minor additions and clarifications
And add self to copyright, since, by now, I'm the author of
substantial parts of the page.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-23 09:58:22 +01:00
Michael Kerrisk 733192cba6 poll.2: srcfix: Amend copyright
By now, I've written or rewritten pretty much the entire page

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-23 09:58:22 +01:00
Michael Kerrisk 342819c832 poll.2: Improve description of EFAULT error
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-23 09:58:22 +01:00
Michael Kerrisk 8bcfd68889 poll.2: Fix description of ENOMEM error
No file descriptors are being allocated...

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-23 09:58:22 +01:00
Michael Kerrisk 836efc18d7 poll.2: Add an example program
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-23 09:58:22 +01:00
Michael Kerrisk 83bb822c4f poll.2: Mention epoll(7) in the introductory paragraph
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-19 09:38:22 +01:00
Michael Kerrisk 6c485bbb3a poll.2: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-19 09:38:22 +01:00
Michael Kerrisk 4628e3eca7 select.2: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-19 09:38:22 +01:00
Michael Kerrisk 602b388d07 select.2: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-19 09:38:22 +01:00
Michael Kerrisk 95a85f1438 select.2: Note that FD_SET() and FD_CLR() do not return errors
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-19 09:38:22 +01:00
Michael Kerrisk a84ed70098 select.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-19 09:38:22 +01:00
Michael Kerrisk 0a4d109d22 select.2: Consolidate historical glibc pselect() details under one subhead
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-19 09:38:22 +01:00
Michael Kerrisk d221e421a5 select.2: Place the discussion of the self-pipe technique in a headed subsection
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-19 09:38:22 +01:00
Michael Kerrisk 73e8f3b4c1 select_tut.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-19 09:38:22 +01:00
Michael Kerrisk 4551a1b1d1 select_tut.2: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-18 22:59:35 +01:00
Michael Kerrisk a1f163d6c0 select_tut.2: wfix: break up a long paragraph
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-18 22:59:35 +01:00
Michael Kerrisk 97a5a8d838 select_tut.2: Adjust header file includes in example
Employ <sys/select.h>, rather than the historical header files.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-18 22:59:35 +01:00
Michael Kerrisk 21677b1bb5 select_tut.2: SEE ALSO: shorten this list
select(2) already lists most of these.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-18 22:59:35 +01:00
Michael Kerrisk 82cf8e8850 select_tut.2: RETURN VALUE: defer to select(2)
Defer to select(2), rather than repeating the information
in this page.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-18 22:59:35 +01:00
Michael Kerrisk c9a275a703 select_tut.2: DESCRIPTION: defer to select(2)
Avoid duplicating the same information in two pages.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-18 22:59:35 +01:00
Michael Kerrisk b3b45b2b16 select_tut.2: SYNOPSIS: defer to select(2), rather than repeating the same info
Remove the prototypes, which are detailed in select(2).

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-18 22:59:35 +01:00
Michael Kerrisk 095d8388cf select.2: Rewrite DESCRIPTION
Improve structure and readability, at the same time incorporating
text and details that were formerly in select_tut(2). Also
move a few details in other parts of the page into DESCRIPTION.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-18 22:59:35 +01:00
Michael Kerrisk f7cd286592 select.2: Minor wording fix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-18 11:55:51 +01:00
Michael Kerrisk e5704b1a7a select.2: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-18 11:55:51 +01:00
Michael Kerrisk b397824f99 select.2: srcfix: add myself to copyright
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-18 11:55:51 +01:00
Michael Kerrisk bd89babbed select.2, select_tut.2: Consolidate info on usleep() emulation in one place
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-18 11:55:51 +01:00
Michael Kerrisk a63fef4359 select.2: Remove details of historical #include requirements
The POSIX situation has been the norm for a long time now,
and including ancient details overcomplicates the page.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-18 11:55:51 +01:00
Michael Kerrisk f0e902c3a1 select.2: Remove some ancient information about pre-POSIX types for 'timeout'
The discussion about pre-POSIX types for 'timeval' and 'timespec'
is rather old, and these days serves mainly to complicate the
page. Remove it.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-18 10:14:13 +01:00
Michael Kerrisk b8f8864d29 select.2: Minor fix: add forward reference to 'timeval' description
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-18 10:14:13 +01:00
Michael Kerrisk 01901530b2 select.2: Consolidate the discussion of pselect into a headed subsection
The current text layout is a little hard to parse, with details of
pselect() spread in the main description.  Move some of that text
to a headed subsection, and add a one-sentence introduction
describing the purpose of pselect().

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-18 10:14:13 +01:00
Michael Kerrisk 1eda1a3a5b select.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-18 08:40:54 +01:00
Michael Kerrisk 2054c92aa1 syscalls.2: Add new Linux 5.6 system calls
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-15 23:18:58 +01:00
Michael Kerrisk 6b621d05b3 _exit.2, capget.2, fcntl.2, futex.2, listen.2, memfd_create.2, modify_ldt.2, move_pages.2, open.2, perf_event_open.2, ptrace.2, set_thread_area.2, stime.2, syscall.2, sysctl.2, userfaultfd.2, cmsg.3, exit.3, ftime.3, getpt.3, malloc.3, console_codes.4, loop.4, inotify.7, netlink.7, packet.7, rtnetlink.7, tcp.7, unix.7, vsock.7, ldconfig.8: tstamp
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-09 11:53:28 +01:00
Michael Kerrisk 93902a96eb stime.2: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-09 10:25:48 +01:00
Michael Kerrisk f3aa51b217 fcntl.2: Further tweaks to F_SEAL_FUTURE_WRITE text
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-08 14:10:31 +01:00
Michael Kerrisk e15b10ba32 memfd_create.2: Minor tweaks to Joel Fernandes's patch
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-08 13:04:07 +01:00
Michael Kerrisk fc6a14f557 memfd_create.2: srcfix: semantic line breaks
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-08 13:04:07 +01:00
Joel Fernandes (Google) 98eff9f7e5 memfd_create.2: Update manpage with new memfd F_SEAL_FUTURE_WRITE seal
More details of the seal can be found in the LKML patch:
https://lore.kernel.org/lkml/20181120052137.74317-1-joel@joelfernandes.org/T/#t

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-08 13:04:07 +01:00
Michael Kerrisk e1cd30eb5e fcntl.2: Note kernel version for F_SEAL_FUTURE_WRITE
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-08 13:04:07 +01:00
Michael Kerrisk e9a2c239b3 fcntl.2: Minor tweaks to Joel Fernandes's patch
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-08 12:37:39 +01:00
Michael Kerrisk 9341a793d0 fcntl.2: srcfix: rewrap source lines
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-08 12:31:11 +01:00
Michael Kerrisk e38bb9196d fcntl.2: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-08 12:09:56 +01:00
Joel Fernandes (Google) 7b7d3b200a fcntl.2: Update manpage with new memfd F_SEAL_FUTURE_WRITE seal
More details of the seal can be found in the LKML patch:
https://lore.kernel.org/lkml/20181120052137.74317-1-joel@joelfernandes.org/T/#t

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-08 12:09:07 +01:00
Benjamin Peterson f3491e47ba exit.3: Use hex for the status mask
Admittedly, the POSIX specification for exit() also uses octal.
However, 0xFF immediately indicates the lowest 8 bits to me
whereas I had to think a bit about the octal mask.

Cowritten-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-08 11:59:37 +01:00
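As a rough illustration of the mask discussed in the commit above, the following sketch (not from the man page) forks a child that exits with a given status and returns what the parent sees through waitpid(). The helper name observed_exit_status() is hypothetical. On Linux, the parent should see only the low 8 bits, that is, status & 0xFF.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run a child that terminates with 'status' and
   return the exit status the parent observes, or -1 on error. */
int observed_exit_status(int status)
{
    int wstatus;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(status);          /* Child: terminate immediately */

    /* Parent: collect the child and extract its exit status */
    if (waitpid(pid, &wstatus, 0) == -1 || !WIFEXITED(wstatus))
        return -1;
    return WEXITSTATUS(wstatus);
}
```

For example, a child that exits with 259 (0x103) is reported to the parent as 3, because only the least significant byte of the status is made available.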
Michael Kerrisk ba9ae75ddb clone.2: Add old EINVAL error for AArch64
Reported-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-08 09:11:01 +01:00
Michael Kerrisk f44b032900 listen.2: The 'somaxconn' default value has increased to 4096
See https://bugzilla.suse.com/show_bug.cgi?id=1162464

Reported-by: Peter Gajdos <pgajdos@suse.cz>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-07 18:02:52 +01:00
Michael Kerrisk a2587fbb2e open.2: In O_TMPFILE example, describe alternative linkat() call
This was already shown in an earlier version of the page,
but Adam Borowski's patch replaced it with an alternative.
Probably, it is better to show both possibilities.

Reported-by: "Joseph C. Sible" <josephcsible@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-07 17:59:34 +01:00
Adam Borowski fc29a199d9 open.2: No need for /proc to make an O_TMPFILE file permanent
In the example snippet, we already have the fd, thus there's no
need to refer to the file by name.  And, /proc/ might be not
mounted or not accessible.

Noticed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Adam Borowski <kilobyte@angband.pl>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-07 17:30:45 +01:00
Yu Jian Wu 03ba66f3a6 ioctl_userfaultfd.2: wfix
Hi,

Patch as attached. I think the comment on the variables in the struct is
reversed.

Thanks!

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-02-07 16:21:18 +01:00
Michael Kerrisk c98fe9f8ad seccomp.2: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-01-22 05:23:50 +01:00
Michael Kerrisk b386cee344 clone.2: Note the kernel version that added the 'set_tid' feature
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-01-22 02:19:09 +01:00
Michael Kerrisk 27f14b447a clone.2: Document CLONE_CLEAR_SIGHAND
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-01-22 02:09:17 +01:00
Denys Vlasenko 302c512cef ptrace.2: PTRACE_EVENT_STOP does not always report SIGTRAP
PTRACE_EVENT_STOP does not always report SIGTRAP, can be the
signal which stopped us

While at it, fix an obvious copy/paste error in
PTRACE_GET_SYSCALL_INFO description.

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-01-22 01:14:19 +01:00
Michael Kerrisk 2b6923ba65 userfaultfd.2: Note that CAP_SYS_PTRACE is checked in the *initial* user namespace
(Add a detail missing in Yang Xu's patch.)

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-01-21 22:02:04 +01:00
Michael Kerrisk c4f13bc72a userfaultfd.2: Tweaks to Yang Xu's patch
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-01-21 22:01:35 +01:00
Yang Xu 339b899c4c userfaultfd.2: Add EPERM error
Signed-off-by: Yang Xu <xuyang2018.jy@cn.fujitsu.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-01-21 21:46:23 +01:00
Daniel Colascione f46304f747 perf_event_open.2: Mention EINTR for perf_event_open
Somewhat surprisingly, perf_event_open() can fail with EINTR when
trying to enable perf reporting for a uprobe that's already been
configured for use with ftrace. Mention this error in the man
page.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-01-21 20:48:28 +01:00
Michael Kerrisk a473f8a707 fanotify_init.2: srcfix
Reported-by: Sam Varshavchik <mrsam@courier-mta.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-01-21 20:07:34 +01:00
Michael Kerrisk ee8bb310d8 clone.2: Minor tweaks to Adrian Reber's 'set_tid' patch
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-01-12 21:47:35 +01:00
Michael Kerrisk 09007c4b88 clone.2: srcfix: semantic line breaks
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-01-12 21:47:35 +01:00
Adrian Reber bf031aaa54 clone.2: Add clone3() set_tid information
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-01-12 21:47:29 +01:00
Ponnuvel Palaniyappan 09e456c2d0 futex.2: Fix a bug in the example
The man page contains a trivial bug that's discussed here:
https://stackoverflow.com/q/59628958

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-01-10 21:30:29 +01:00
Michael Kerrisk 4897b19d4e syscall.2: Minor tweaks to Petr Vorel's patch
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-01-10 21:12:20 +01:00
Petr Vorel ce0f522790 syscall.2: Update feature test macro requirements
Reported-by: Cyril Hrubis <chrubis@suse.cz>
Signed-off-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-01-10 21:07:55 +01:00
John Hubbard e15ff1e76d move_pages.2: Remove ENOENT from the list of possible return values
Linux kernel commit e78bbfa82624 ("mm: stop returning -ENOENT from
sys_move_pages() if nothing got migrated") had the effect of
*never* returning -ENOENT, in any situation. So we need to update
the man page to reflect that ENOENT is not a possible return
value.

Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Brice Goglin <Brice.Goglin@inria.fr>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-12-30 19:53:31 +01:00
Andy Lutomirski 59b191dc04 modify_ldt.2, set_thread_area.2: Fix type of base_addr
Historically (before Linux 2.6.23), base_addr was unsigned long
for 32-bit code and unsigned int for 64-bit code.  In other words,
it was always a 32-bit value.  When the ldt.h header files were
unified, the type became unsigned int on all systems.  Update
modify_ldt.2 and set_thread_area.2 accordingly.

Indeed, on x86, the GDT and LDT specify 32-bit bases for code and
data segments, and this has nothing to do with the kernel.

Reported-by: "Metzger, Markus T" <markus.t.metzger@intel.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-12-14 05:53:47 +01:00
Michael Kerrisk 36a35d6735 quotactl.2: Don't show numeric values of Q_XQUOTAON XFS_QUOTA_?DQ_* flags
The programmer should not need to care about the numeric values,
and their inclusion is verbosity.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-12-13 18:07:27 +01:00
Michael Kerrisk 0674437054 quotactl.2: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-12-13 18:03:33 +01:00
Michael Kerrisk fcd4007bfa quotactl.2: srcfix: semantic line breaks
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-12-13 18:03:14 +01:00
Michael Kerrisk 64e4eac9ea quotactl.2: Tweaks to Yang Xu's Q_XQUOTARM EINVAL patch
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-12-13 18:03:08 +01:00
Yang Xu ef9e5be04f quotactl.2: Add EINVAL error of Q_XQUOTARM operation
Since kernel commit 3dd4d40b4208 ("xfs: Sanity check flags
of Q_XQUOTARM call"), the kernel checks the flags. If they specify
anything other than the usr, grp, or prj quota types, it reports EINVAL.

Signed-off-by: Yang Xu <xuyang2018.jy@cn.fujitsu.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-12-13 17:51:09 +01:00
Nikola Forró b9827733ba copy_file_range.2: tfix
Signed-off-by: Nikola Forró <nforro@redhat.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-12-12 09:36:30 +01:00
Michael Kerrisk 782715806c capget.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-12-12 09:33:00 +01:00
Michael Kerrisk c0188da633 capget.2: Add missing details in EPERM error for setting inheritable capabilities
Reported-by: Yang Xu <xuyang2018.jy@cn.fujitsu.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-12-12 09:27:28 +01:00
Michael Kerrisk 5dc3d7b78f sysctl.2: This system call was removed in Linux 5.5; adjust the page accordingly
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-12-11 23:01:08 +01:00
Adrian Reber bc03b11659 clone.2: tfix
Added two missing parentheses

Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-12-01 09:03:31 +01:00
Michael Kerrisk 445fc03eeb stime.2: Note that stime() is deprecated
As per glibc 2.31 feature notes.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-23 10:53:56 +01:00
Michael Kerrisk d279876353 gettimeofday.2: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-23 10:45:43 +01:00
Christian Brauner 97883faea2 clone.2: tfix
This surely meant to say clone3() and not clone(3).

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-21 12:24:24 +01:00
Michael Kerrisk be479fdf02 clone.2: ERRORS: add EINVAL for use of CLONE_PARENT by an init process
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-21 10:52:14 +01:00
Michael Kerrisk 4269a6ab8b clone.2: Some reworking of Christian Brauner's CLONE_PARENT init text
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-21 10:52:14 +01:00
Michael Kerrisk d36198870c clone.2: srcfix: rewrap source lines
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-21 10:52:14 +01:00
Christian Brauner a17b9d28c3 clone.2: Mention that CLONE_PARENT is off-limits for inits
The CLONE_PARENT flag cannot be used by init processes. Let's mention
this in the manpages to prevent surprises.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-21 10:52:14 +01:00
Michael Kerrisk a10c5a33de clone.2: Note that CLONE_THREAD causes similar behavior to CLONE_PARENT
The introductory paragraphs note that "the calling process" is
normally synonymous with the "the parent process", except in the
case of CLONE_PARENT. The same is also true of CLONE_THREAD.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-21 10:52:14 +01:00
Michael Kerrisk 324f6154f4 Removed trailing white space at end of lines 2019-11-19 15:31:20 +01:00
Michael Kerrisk a5409de92c clone.2, fallocate.2, ioctl_iflags.2, ioctl_list.2, pidfd_open.2, pivot_root.2, quotactl.2, seccomp.2, select.2, wait.2, proc.5, cgroups.7, netdevice.7, uts_namespaces.7: tstamp
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-19 15:31:20 +01:00
Christian Brauner be66dbc7a7 clone.2: Use pid_t for clone3() {child,parent}_tid
Advertise to userspace that they should use proper pid_t types
for arguments returning a pid.

The kernel-internal struct kernel_clone_args currently uses int
as type and since POSIX mandates that pid_t is a signed integer
type and glibc and friends use int this is not an issue. After
the merge window for v5.5 closes we can switch struct
kernel_clone_args over to using pid_t as well without any danger
in regressing current userspace.

Also note, that the new set tid feature which will be merged for
v5.5 uses pid_t types as well.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-17 18:58:24 +01:00
Christian Brauner 8eea66b8bb clone.2: Check for MAP_FAILED not NULL on mmap()
If mmap() fails, it will return MAP_FAILED, which according to the manpage
is (void *) -1, not NULL.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-17 18:56:07 +01:00
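
The check described in commit 8eea66b8bb can be sketched as follows. This is a minimal illustration, not the code from clone.2; the helper name alloc_region() is hypothetical and exists only for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not part of any man page example): returns an
 * anonymous, private, read/write mapping of `size` bytes, or NULL if
 * mmap() fails.
 */
void *alloc_region(size_t size)
{
    assert(size > 0);   /* a zero-length mapping is rejected by mmap() anyway */

    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    /* mmap() signals failure with MAP_FAILED, that is (void *) -1.
       Comparing against NULL would miss the error. */
    if (p == MAP_FAILED)
        return NULL;

    return p;
}
```

The helper translates MAP_FAILED into NULL only so that callers can use a familiar convention; the comparison against MAP_FAILED is the point of the example.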
Christian Brauner 225f5da8ac clone.2: tfix
Fix two spelling mistakes in manpage describing the clone{2,3}()
syscalls/syscall wrappers.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-17 18:55:46 +01:00
Michael Kerrisk efc7fb935e mmap.2: tfix
Reported-by: Marko Myllynen <myllynen@redhat.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-16 23:35:14 +01:00
Michael Kerrisk 91243dad42 mmap.2: Some rewording of the description of MAP_STACK
Reword a little to allow for the fact that there are now
*two* reasons to consider using this flag.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-14 22:24:52 +01:00
Michael Kerrisk d3d881232b mmap.2: Note that MAP_STACK exists on some other systems
As noted in man-pages commit 99c3a00027,
MAP_STACK exists on at least OpenBSD and FreeBSD.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-14 22:24:52 +01:00
Michael Kerrisk 1b54731692 pivot_root.2: EXAMPLE: allocate stack using mmap() MAP_STACK rather than malloc()
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-14 22:24:45 +01:00
Michael Kerrisk 99c3a00027 clone.2: Allocate child's stack using mmap(2) rather than malloc(3)
Christian Brauner suggested mmap(MAP_STACK), rather than
malloc(), as the canonical way of allocating a stack for the
child of clone(), and Jann Horn noted some reasons why:

    Not on Linux, but on OpenBSD, they do use MAP_STACK now
    AFAIK; this was announced here:
    <http://openbsd-archive.7691.n7.nabble.com/stack-register-checking-td338238.html>.
    Basically they periodically check whether the userspace
    stack pointer points into a MAP_STACK region, and if not,
    they kill the process. So even if it's a no-op on Linux, it
    might make sense to advise people to use the flag to improve
    portability? I'm not sure if that's something that belongs
    in Linux manpages.

    Another reason against malloc() is that when setting up
    thread stacks in proper, reliable software, you'll probably
    want to place a guard page (in other words, a 4K PROT_NONE
    VMA) at the bottom of the stack to reliably catch stack
    overflows; and you probably don't want to do that with
    malloc, in particular with non-page-aligned allocations.

And the OpenBSD 6.5 manual page says:

    MAP_STACK
        Indicate that the mapping is used as a stack. This
        flag must be used in combination with MAP_ANON and
        MAP_PRIVATE.

And I then noticed that MAP_STACK seems to have been available on
FreeBSD for a long time:

    MAP_STACK
        Map the area as a stack.  MAP_ANON is implied.
        Offset should be 0, fd must be -1, and prot should
        include at least PROT_READ and PROT_WRITE.  This
        option creates a memory region that grows to at
        most len bytes in size, starting from the stack
        top and growing down.  The stack top is the
        starting address returned by the call, plus len bytes.
        The bottom of the stack at maximum growth is the
        starting address returned by the call.

        The entire area is reserved from the point of view
        of other mmap() calls, even if not faulted in yet.

Reported-by: Jann Horn <jannh@google.com>
Reported-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-14 12:19:21 +01:00
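
The approach discussed in commit 99c3a00027 (an mmap(2)-allocated stack with MAP_STACK, plus the guard page Jann Horn mentions) might look roughly like the sketch below. It is not the example program from clone.2. The names STACK_SIZE, child_fn() and run_child_on_mmap_stack(), the 1 MiB stack size, and the child's exit status of 7 are all assumptions made for illustration, and the code assumes an architecture where the stack grows downward.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (1024 * 1024)    /* illustrative size, not a recommendation */

/* Child entry point; the return value becomes the child's exit status. */
static int child_fn(void *arg)
{
    (void) arg;
    return 7;
}

/* Hypothetical helper: runs child_fn() in a clone() child whose stack was
   allocated with mmap(). Returns the child's exit status, or -1 on error. */
int run_child_on_mmap_stack(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0 && page < STACK_SIZE);

    char *stack = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
    if (stack == MAP_FAILED)
        return -1;

    /* Guard page at the lowest address of the mapping: an overflow of the
       downward-growing stack faults instead of corrupting other memory. */
    if (mprotect(stack, (size_t) page, PROT_NONE) == -1) {
        munmap(stack, STACK_SIZE);
        return -1;
    }

    /* clone() expects a pointer to the top of the stack. */
    pid_t pid = clone(child_fn, stack + STACK_SIZE, SIGCHLD, NULL);
    if (pid == -1) {
        munmap(stack, STACK_SIZE);
        return -1;
    }

    int status;
    pid_t waited = waitpid(pid, &status, 0);
    munmap(stack, STACK_SIZE);

    if (waited == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Because mmap() returns page-aligned memory, protecting the first page is straightforward; with malloc() the allocation would generally not be page-aligned, which is one of the reasons given in the commit message.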
Michael Kerrisk 8dd6b0bcd2 clone.2: Minor tweaks after feedback from Christian Brauner
Reported-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-10 20:39:17 +01:00
Jakub Wilk edf93e146d clone.2: tfix
Remove duplicated word.

Signed-off-by: Jakub Wilk <jwilk@jwilk.net>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-09 12:54:41 +01:00
Michael Kerrisk baa435c66c clone.2: Tidy up the description of CLONE_DETACHED
The obsolete CLONE_DETACHED flag has never been properly
documented, but now the discussion of CLONE_PIDFD also requires
mention of CLONE_DETACHED. So, properly document CLONE_DETACHED,
and mention its interactions with CLONE_PIDFD.

Reported-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-09 09:09:18 +01:00
Michael Kerrisk f6183e5b21 clone.2: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-09 09:09:18 +01:00
Michael Kerrisk 981eda4aa5 clone.2: Consistently order paragraphs for CLONE_NEW* flags
Sometimes the descriptions of these flags mentioned the
corresponding section 7 namespace manual page and then the
required capabilities, and sometimes the order was the
reverse. Make it consistent.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-09 09:09:18 +01:00
Michael Kerrisk d2799a466c clone.2: Remove various details that are already covered in namespaces pages
Remove details of UTS, IPC, and network namespaces that are
already covered in the corresponding namespaces pages in
section 7. This change is for consistency, since corresponding
details were not provided for other namespace types in clone(2)
and these details do not appear in unshare(2).

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-09 09:09:18 +01:00
Michael Kerrisk 1270276bc3 clone.2: Remove wording that suggests CLONE_NEW* flags are for containers
These flags are used for implementing many other interesting
things by now.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-09 09:09:18 +01:00
Michael Kerrisk f5d5180f5c clone.2: Adjustments to clone3() text as well as some other details in the page
After feedback from Christian Brauner [1], I've adjusted a few pieces
of the clone3() text, and also adjusted some of the older text in
the page.

[1] https://lore.kernel.org/linux-man/20191107151941.dw4gtul5lrtax4se@wittgenstein/

Reported-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-09 09:09:02 +01:00
Michael Kerrisk 1033756742 clone.2: Give the introductory paragraph a new coat of paint
Change the text in the introductory paragraph (which was written
20 years ago) to reflect the fact that clone*() does more things
nowadays.

Cowritten-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-08 16:32:38 +01:00
Michael Kerrisk 1373b98190 ioctl_iflags.2: Emphasize that FS_IOC_GETFLAGS and FS_IOC_SETFLAGS argument is 'int *'
Reported-by: Robert Edmonds <edmonds@debian.org>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-05 10:31:39 +01:00
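
The point made in commit 1373b98190 (the argument is an 'int *') can be illustrated with a short sketch. The helper name get_inode_flags() is hypothetical, and whether the ioctl succeeds depends on the filesystem that holds the file.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <linux/fs.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: stores the inode flags of `path` in *flags.
   Returns 0 on success, -1 if the file cannot be opened or the
   filesystem does not support the operation. */
int get_inode_flags(const char *path, int *flags)
{
    assert(path != NULL && flags != NULL);

    int fd = open(path, O_RDONLY | O_NONBLOCK);
    if (fd == -1)
        return -1;

    /* The kernel reads and writes an 'int' here, even though the
       request code is defined in terms of 'long'. */
    int attr = 0;
    int rc = ioctl(fd, FS_IOC_GETFLAGS, &attr);
    close(fd);

    if (rc == -1)
        return -1;

    *flags = attr;
    return 0;
}
```

On success, the result can be tested against the FS_*_FL constants from <linux/fs.h>, for example FS_IMMUTABLE_FL.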
Michael Kerrisk 556e715a8a ioctl_list.2: Add reference to ioctl(2) SEE ALSO section
The referenced section lists various pages that document ioctls.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-05 10:06:03 +01:00
Andrew Price 3bf86e7d53 fallocate.2: Add gfs2 to the list of punch hole-capable filesystems
Also remove a stray " from the previous item.

Signed-off-by: Andrew Price <anprice@redhat.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-01 09:35:12 +01:00
Michael Kerrisk 75e28ebad4 clone.2: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-31 14:22:13 +01:00
Michael Kerrisk 400027959e clone.2, proc.5: Adjust references to namespaces(7)
Adjust references to namespaces(7) to be references to pages
describing specific namespace types.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-31 11:34:08 +01:00
Michael Kerrisk 96e60ae500 quotactl.2: wfix: consistently use 'operation', rather than 'command'
A mix of the two words was being used, with 'operation' being
more common.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-31 07:02:10 +01:00
Michael Kerrisk a5394cba1c quotactl.2: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-31 07:02:10 +01:00
Michael Kerrisk f5fd82cc4e quotactl.2: Minor tweaks to Yang Xu's patch
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-31 07:02:10 +01:00
Yang Xu ae848b1d80 quotactl.2: Add some details about Q_QUOTAON
For Q_QUOTAON, on old kernels we can use quotacheck -ug to generate
quota files. On current kernels, we can also store them in hidden
system inodes, which is enabled by the "quota" or "project" filesystem
feature.

For user or group quota, we can do as below (e.g., on ext4):

mkfs.ext4 -F -O quota /dev/sda5
mount /dev/sda5 /mnt
quotactl(QCMD(Q_QUOTAON, USRQUOTA), "/dev/sda5", QFMT_VFS_V0, NULL);

For project quota, we can do as below (e.g., on ext4):

mkfs.ext4 -F -O quota,project /dev/sda5
mount /dev/sda5 /mnt
quotactl(QCMD(Q_QUOTAON, PRJQUOTA), "/dev/sda5", QFMT_VFS_V0, NULL);

Reported-by: Jan Kara <jack@suse.cz>
Signed-off-by: Yang Xu <xuyang2018.jy@cn.fujitsu.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-31 07:01:59 +01:00
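
The quotactl() call from commit ae848b1d80 could be wrapped in C roughly as shown here. This is a sketch under assumptions: the helper name enable_user_quota() is hypothetical, the fallback definition of QFMT_VFS_V0 is only used if the system headers do not provide it, and the call needs appropriate privileges and a filesystem that stores quota information in hidden system inodes.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/quota.h>

/* Assumption: if <sys/quota.h> does not expose QFMT_VFS_V0, fall back
   to the value used in <linux/quota.h>. */
#ifndef QFMT_VFS_V0
#define QFMT_VFS_V0 2
#endif

/* Hypothetical helper: turns on user quota accounting for the mounted
   block device `device` (for example "/dev/sda5"). Returns 0 on
   success, or -1 with errno set by quotactl(). */
int enable_user_quota(const char *device)
{
    assert(device != NULL);

    /* As in the commit message, the last argument is NULL: no path to
       a visible quota file is supplied. */
    return quotactl(QCMD(Q_QUOTAON, USRQUOTA), device, QFMT_VFS_V0, NULL);
}
```

For project quota, the same call would use PRJQUOTA in place of USRQUOTA, provided the C library headers define it.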
Yang Xu 13a07cc485 copy_file_range.2: tfix
Signed-off-by: Yang Xu <xuyang2018.jy@cn.fujitsu.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-31 06:24:58 +01:00
Jakub Wilk a9e52b437f clone.2: Include clone3 in NAME section.
Signed-off-by: Jakub Wilk <jwilk@jwilk.net>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-31 06:22:15 +01:00