21163 Commits

Author SHA1 Message Date
Michael Kerrisk 9e099e691e clock_getcpuclockid.3, pthread_getcpuclockid.3: wfix: use 'clockid' rather than 'clock_id'
For consistency across pages.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 12:57:25 +02:00
Michael Kerrisk ba1c6b2081 clock_getres.2: wfix: s/clk_id/clockid/ throughout
Most other manual pages use 'clockid' for the 'clockid_t'
argument.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 12:57:25 +02:00
Michael Kerrisk 717096082d clock_nanosleep.2: wfix: s/clock_id/clockid/ throughout
Most other section 2 pages use 'clockid' as the name
of the 'clockid_t' argument.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 12:57:25 +02:00
Michael Kerrisk 96d951a401 clock_nanosleep.2, timer_create.2, timerfd_create.2: Add various missing errors
Mostly verified by testing and reading the code.

There is unfortunately quite a bit of inconsistency across APIs:

                          clock_gettime  clock_settime  clock_nanosleep  timer_create  timerfd_create

CLOCK_BOOTTIME            y              n (EINVAL)     y                y             y
CLOCK_BOOTTIME_ALARM      y              n (EINVAL)     y [1]            y [1]         y [1]
CLOCK_MONOTONIC           y              n (EINVAL)     y                y             y
CLOCK_MONOTONIC_COARSE    y              n (EINVAL)     n (ENOTSUP)      n (ENOTSUP)   n (EINVAL)
CLOCK_MONOTONIC_RAW       y              n (EINVAL)     n (ENOTSUP)      n (ENOTSUP)   n (EINVAL)
CLOCK_REALTIME            y              y              y                y             y
CLOCK_REALTIME_ALARM      y              n (EINVAL)     y [1]            y [1]         y [1]
CLOCK_REALTIME_COARSE     y              n (EINVAL)     n (ENOTSUP)      n (ENOTSUP)   n (EINVAL)
CLOCK_TAI                 y              n (EINVAL)     y                y             n (EINVAL)
CLOCK_PROCESS_CPUTIME_ID  y              n (EINVAL)     y                y             n (EINVAL)
CLOCK_THREAD_CPUTIME_ID   y              n (EINVAL)     n (EINVAL [2])   y             n (EINVAL)
pthread_getcpuclockid()   y              n (EINVAL)     y                y             n (EINVAL)

[1] The caller must have CAP_WAKE_ALARM, or the error EPERM results.

[2] This error is generated in the glibc wrapper.
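
For anyone wanting to recheck the timerfd_create column, a minimal
sketch of a probe follows. The helper name probe_timerfd_clocks() is
made up for illustration; it records 0 for each clock ID that
timerfd_create() accepts and the errno value otherwise. Results will
depend on the running kernel and on the caller's capabilities (see
[1]).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/timerfd.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from any manual page): for each
 * clock ID, store 0 in errs[i] if timerfd_create() accepts it,
 * otherwise store the errno value that the call produced.
 */
void probe_timerfd_clocks(const clockid_t *clockids, size_t n, int *errs)
{
    assert(clockids != NULL && errs != NULL);

    for (size_t i = 0; i < n; i++) {
        int fd = timerfd_create(clockids[i], 0);

        if (fd == -1) {
            errs[i] = errno;
        } else {
            errs[i] = 0;
            close(fd);      /* Only the accept/reject result matters */
        }
    }
}
```

Calling it with an array such as { CLOCK_MONOTONIC, CLOCK_MONOTONIC_RAW }
and printing strerror() for each nonzero entry gives one column of the
table; the other columns would need analogous probes.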

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 12:57:25 +02:00
Michael Kerrisk 04e2e313fc timerfd_create.2: Rework text for EINVAL for invalid clock ID
The error description was crufty. There are more valid
clock IDs these days.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 12:57:25 +02:00
Michael Kerrisk d53b0f4822 clock_nanosleep.2: srcfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 12:57:25 +02:00
Michael Kerrisk 0e7984ff40 clock_nanosleep.2: clock_nanosleep() can also sleep against CLOCK_TAI
Presumably since Linux 3.10, when CLOCK_TAI was added to the
kernel.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 12:57:25 +02:00
Michael Kerrisk b24db7cb8a clock_nanosleep.2: srcfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 12:57:25 +02:00
Michael Kerrisk 14df252bf8 clock_getres.2: CLOCK_REALTIME_COARSE is not settable
In kernel/time/posix-timers.c, 'CLOCK_REALTIME_COARSE' has
no 'timer_set' method.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 12:57:18 +02:00
Michael Kerrisk 41043c0bd6 clock_getres.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 12:42:54 +02:00
Michael Kerrisk ac90b58942 clock_getres.2: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 12:42:54 +02:00
Michael Kerrisk eb6567fb00 clock_getres.2: Add CLOCK_REALTIME_ALARM and CLOCK_BOOTTIME_ALARM
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 12:42:54 +02:00
Michael Kerrisk da8a95bca1 timer_create.2: timer_create(2) also supports CLOCK_TAI
Presumably (and from a quick glance at the source code)
since Linux 3.10, when CLOCK_TAI was introduced.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 12:42:54 +02:00
Michael Kerrisk 0e4b87c4fd timer_create.2: Mention clock_getres(2) for further details on the various clocks
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 12:42:54 +02:00
Michael Kerrisk 966051ca74 clock_nanosleep.2: clock_nanosleep() also supports CLOCK_BOOTTIME
Presumably (and from a quick glance at the source code)
since Linux 2.6.39, when CLOCK_BOOTTIME was introduced.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-02 09:34:37 +02:00
Michael Kerrisk 2c16f1bc28 clock_getres.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-01 21:30:37 +02:00
Michael Kerrisk 066dcd09cb timerfd_create.2: wfix
See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=947091

Reported-by: Marc Lehmann <debian-reportbug@plan9.de>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-01 10:04:32 +02:00
Michael Kerrisk 372b58573a openat2.2: srcfix: remove a FIXME
Aleksa Sarai is okay with my text changes.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-01 08:36:49 +02:00
Michael Kerrisk 08ba10a6d5 openat2.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-01 08:18:45 +02:00
Michael Kerrisk b4e1568256 openat2.2: Improve text describing caveat for use of RESOLVE_NO_XDEV
From email discussions with Aleksa Sarai:

> .\" FIXME I find the "previously-functional systems" in the previous
> .\" sentence a little odd (since openat2() is a new syscall), so I would
> .\" like to clarify a little...
> .\" Are you referring to the scenario where someone might take an
> .\" existing application that uses openat() and replaces the uses
> .\" of openat() with openat2()? In which case, is it correct to
> .\" understand that you mean that one should not just indiscriminately
> .\" add the RESOLVE_NO_XDEV flag to all of the openat2() calls?
> .\" If I'm not on the right track, could you point me in the right
> .\" direction please.

This is mostly meant as a warning to hopefully avoid applications breaking
because the developer didn't realise that system paths may contain
symlinks or bind-mounts. For an application which has switched to
openat2() and then uses RESOLVE_NO_SYMLINKS for a non-security reason,
it's possible that on some distributions (or future versions of a
distribution) that their application will stop working because a system
path suddenly contains a symlink or is a bind-mount.

This was a concern which was brought up on LWN some time ago. If you can
think of a phrasing that makes this more clear, I'd appreciate it.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-04-01 08:16:40 +02:00
Michael Kerrisk c85ebb3c94 openat2.2: Various tweaks to the discussion of 'resolve' flags
Some tweaks inspired by https://lwn.net/Articles/796868/

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-31 09:50:48 +02:00
Michael Kerrisk e31d5bfd36 openat2.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-31 09:11:20 +02:00
Michael Kerrisk 193f7fb272 openat2.2: Place 'resolve' flags in alphabetical order
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-31 09:10:05 +02:00
Michael Kerrisk 1ae24555ba timerfd_create.2: Negative changes to CLOCK_REALTIME may cause read() to return 0
Devi R K reported this issue, and went on to note:

> We have written a program using real time clock and it has been raised to
> the community.
>
> https://lore.kernel.org/lkml/alpine.DEB.2.21.1908191943280.1796@nanos.tec.linutronix.de/T/

[...]

Thanks for pointing me at that thread. In particular, the test
program at
https://lore.kernel.org/lkml/alpine.DEB.2.21.1908191943280.1796@nanos.tec.linutronix.de/T/#m489d81abdfbb2699743e18c37657311f8d52a4cd

[...]

I think this patch does not really capture the details
properly. The immediately preceding paragraph says:

         If  the  associated  clock  is  either  CLOCK_REALTIME   or
         CLOCK_REALTIME_ALARM,     the     timer     is     absolute
         (TFD_TIMER_ABSTIME), and the  flag  TFD_TIMER_CANCEL_ON_SET
         was  specified when calling timerfd_settime(), then read(2)
         fails with the  error  ECANCELED  if  the  real-time  clock
         undergoes a discontinuous change.  (This allows the reading
         application to discover such discontinuous changes  to  the
         clock.)

Following on from that, I think we should have a paragraph that says
something like:

         If  the  associated  clock  is  either  CLOCK_REALTIME   or
         CLOCK_REALTIME_ALARM,     the     timer     is     absolute
         (TFD_TIMER_ABSTIME), and the  flag  TFD_TIMER_CANCEL_ON_SET
         was not specified when calling timerfd_settime(), then a
         discontinuous negative change to the clock
         (e.g., clock_settime(2)) may cause read(2) to unblock, but
         return a value of 0 (i.e., no bytes read), if the clock
         change occurs after the time expired, but before the
         read(2) on the timerfd file descriptor.

This seems consistent with Thomas's observations in
https://lore.kernel.org/lkml/alpine.DEB.2.21.1908191943280.1796@nanos.tec.linutronix.de/T/#m49b78122b573a2749a05b720dc9fa036546db490

==
Thomas Gleixner replied:

Yes, that's correct. Accurate as always!

This is pretty much in line with clock_nanosleep(CLOCK_REALTIME,
TIMER_ABSTIME) which has a similar problem vs. observability in user
space.

clock_nanosleep(2) mutters:

  "POSIX.1 specifies that after changing the value of the CLOCK_REALTIME
   clock via clock_settime(2), the new clock value shall be used to
   determine the time at which a thread blocked on an absolute
   clock_nanosleep() will wake up; if the new clock value falls past the
   end of the sleep interval, then the clock_nanosleep() call will return
   immediately."

which can be interpreted as guarantee that clock_nanosleep() never
returns prematurely, i.e. the assert() in the below code would indicate
a kernel failure:

   ret = clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME, &expiry, NULL);
   if (!ret) {
         clock_gettime(CLOCK_REALTIME, &now);
         assert(now >= expiry);
   }

But that assert can trigger when CLOCK_REALTIME was modified after the
timer fired and the kernel decided to wake up the task and let it return
to user space.

   clock_nanosleep(..., &expiry)
     arm_timer(expires);
     schedule();

   -> timer interrupt
      now = ktime_get_real();
      if (expires <= now)
              -------------------------------- After this point
         wakeup();                             clock_settime(2) or
                                               adjtimex(2) which
                                               makes CLOCK_REALTIME
                                               jump back far enough will
                                               cause the above assert
                                               to trigger.

   ...
   return from syscall (retval == 0)

There is no guarantee against clock_settime() coming after the
wakeup. Even if we put another check into the return to user path then
we won't catch a clock_settime() which comes right after that and before
user space invokes clock_gettime().
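
The assert() in the example above compares two struct timespec values
directly, which C does not allow. A compilable sketch of the same
check is shown here; the helper names timespec_ge() and
sleep_until_realtime() are made up for illustration. Rather than
asserting, it reports through 'early' whether the follow-up
clock_gettime() read a time before the requested expiry, which, per
the explanation above, can legitimately happen if the clock is set
backward after the wakeup.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper: true if *a is at or after *b. */
bool timespec_ge(const struct timespec *a, const struct timespec *b)
{
    if (a->tv_sec != b->tv_sec)
        return a->tv_sec > b->tv_sec;
    return a->tv_nsec >= b->tv_nsec;
}

/*
 * Hypothetical helper: sleep until the absolute CLOCK_REALTIME time
 * *expiry, restarting if a signal handler interrupts the sleep.
 * Returns 0 on success or a positive error number (clock_nanosleep()
 * returns the error number directly rather than setting errno).
 * On success, *early is true if the clock read after the wakeup is
 * still before *expiry.
 */
int sleep_until_realtime(const struct timespec *expiry, bool *early)
{
    struct timespec now;
    int ret;

    assert(expiry != NULL && early != NULL);

    do {
        ret = clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME, expiry, NULL);
    } while (ret == EINTR);
    if (ret != 0)
        return ret;

    if (clock_gettime(CLOCK_REALTIME, &now) == -1)
        return errno;

    /* Not a kernel bug if true: the clock may have jumped back. */
    *early = !timespec_ge(&now, expiry);
    return 0;
}
```

A caller that needs to react to such jumps would have to treat
'early' as a normal outcome rather than as a failure.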

POSIX spec Issue 7 (2018 edition) says:

 The suspension for the absolute clock_nanosleep() function (that is,
 with the TIMER_ABSTIME flag set) shall be in effect at least until the
 value of the corresponding clock reaches the absolute time specified by
 rqtp.

And that's what the kernel implements for clock_nanosleep() and timerfd
behaves exactly the same way.

The wakeup of the waiter, i.e. task blocked in clock_nanosleep(2),
read(2), poll(2), is not happening _before_ the absolute time specified
is reached.

If clock_settime() happens right before the expiry check, then it does
the right thing, but any modification to the clock after the wakeup
cannot be mitigated. At least not in a way which would make the assert()
in the example code above a reliable indicator for a kernel fail.

That's the reason why I rejected the attempt to mitigate that particular
0 tick issue in timerfd as it would just scratch a particular itch but
still not provide any guarantee. So having the '0' return documented is
the right way to go.

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reported-by: devi R.K <devi.feb27@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 22:52:58 +02:00
Michael Kerrisk 1f4cf8e85e openat2.2: srcfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 22:41:25 +02:00
Michael Kerrisk 6b6505af4d path_resolution.7: srcfix: semantic newlines
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 22:36:13 +02:00
Aleksa Sarai 61d24bff30 path_resolution.7: Update to mention openat2(2) features
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 22:35:33 +02:00
Michael Kerrisk 7a18f60e4d openat2.2: Minor tweaks to the text
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 22:26:50 +02:00
Michael Kerrisk 552f379960 openat2.2: Further tweaks to the RESOLVE_IN_ROOT text
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 21:52:32 +02:00
Michael Kerrisk 9e0168b018 openat2.2: Minor tweaks to RESOLVE_IN_ROOT text
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 21:41:47 +02:00
Michael Kerrisk 75cd77e3c1 openat2.2: Minor change: reword a sentence
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 21:20:41 +02:00
Michael Kerrisk 39bfd04683 openat2.2: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 21:17:52 +02:00
Michael Kerrisk a424f7c064 openat2.2: srcfix: Disfavor multiargument .BR and .IR usage
For me, source lines such as:

    .BR perf_setattr "(2), " perf_event_open "(2), and " clone3 (2).

are harder to read than:

    .BR perf_setattr (2),
    .BR perf_event_open (2),
    and
    .BR clone3 (2).

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 14:52:47 +02:00
Michael Kerrisk d144dc36b8 openat2.2: Rework RESOLVE_IN_ROOT text
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 14:52:44 +02:00
Michael Kerrisk 36c9d56de6 openat2.2: Reorganize and rework introductory text a little
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 14:52:40 +02:00
Michael Kerrisk 6c6945d461 openat2.2: Remove one of the forward references to the "Extensibility" subsection
There are currently three of these forward references (two in
DESCRIPTION, one in ERRORS). This is a little redundant.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 14:44:59 +02:00
Michael Kerrisk 4ec6d407a9 openat2.2: Various wording improvements to Aleksa Sarai's text
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 14:44:59 +02:00
Michael Kerrisk 7a11fc63b8 open.2: Clarify that O_NOFOLLOW is relevant (only) for basename of 'pathname'
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 12:32:07 +02:00
Michael Kerrisk 7b7aad695b openat2.2: wfix: explicitly qualify fields of 'how' argument with "how."
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 12:32:07 +02:00
Michael Kerrisk 3fcaeb806a openat2.2: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 12:32:07 +02:00
Michael Kerrisk 03625dc12d openat2.2: Place ERRORS in alphabetical order
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 12:32:07 +02:00
Michael Kerrisk 0105739e8b openat2.2: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 10:20:06 +02:00
Michael Kerrisk 0389373e6e openat2.2: ffix (mainly: replace blank lines by .IP or .PP)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 10:19:20 +02:00
Michael Kerrisk 669403e99e openat2.2: spfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 10:19:20 +02:00
Michael Kerrisk 2d82152f53 openat2.2: srcfix: eliminate redundant blank lines
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 10:19:20 +02:00
Michael Kerrisk 2359744f97 openat2.2: srcfix: semantic newlines and rewrap some long source lines
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 10:19:20 +02:00
Michael Kerrisk 4b322a2fc8 open.2: Minor tweaks to Aleksa Sarai's patch
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 08:45:58 +02:00
Aleksa Sarai a2dbb2e378 open.2: Add references to new openat2(2) page
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 08:37:02 +02:00
Aleksa Sarai 89de505522 openat2.2: Document new openat2(2) syscall
Rather than trying to merge the new syscall documentation into
open.2 (which would probably result in the man-page being
incomprehensible), instead the new syscall gets its own dedicated
page with links between open(2) and openat2(2) to avoid
duplicating information such as the list of O_* flags or common
errors.

In addition to describing all of the key flags, information about
the extensibility design is provided so that users can better
understand why they need to pass sizeof(struct open_how) and how
their programs will work across kernels. After some discussions
with David Laight, I also included explicit instructions to zero
the structure to avoid issues when recompiling with new headers.
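
A minimal sketch of the zero-then-fill pattern described above,
assuming kernel headers that provide <linux/openat2.h> and a
SYS_openat2 definition. The wrapper name openat2_zeroed() is made up
for illustration, and the call goes through syscall(2) directly.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/openat2.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: zero the whole structure first, so that any
 * fields added by newer headers are 0 after a recompile, then pass
 * sizeof(struct open_how) so the kernel knows which layout this
 * program was built against.
 */
int openat2_zeroed(int dirfd, const char *pathname,
                   uint64_t flags, uint64_t resolve)
{
    struct open_how how;

    assert(pathname != NULL);

    memset(&how, 0, sizeof(how));
    how.flags = flags;
    how.resolve = resolve;

    return (int) syscall(SYS_openat2, dirfd, pathname, &how, sizeof(how));
}
```

For example, openat2_zeroed(AT_FDCWD, "/", O_RDONLY | O_DIRECTORY, 0)
should return a file descriptor on a kernel that provides openat2(),
and -1 with errno set to ENOSYS on one that does not.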

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-30 08:34:45 +02:00
Michael Kerrisk 238442a2de clock_getres.2: ERRORS: add EINVAL for attempt to set a nonsettable clock
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-03-29 22:36:19 +02:00