.\" Copyright (c) 2020 by Michael Kerrisk <mtk.manpages@gmail.com>
.\"
.\" %%%LICENSE_START(VERBATIM)
.\" Permission is granted to make and distribute verbatim copies of this
.\" manual provided the copyright notice and this permission notice are
.\" preserved on all copies.
.\"
.\" Permission is granted to copy and distribute modified versions of this
.\" manual under the conditions for verbatim copying, provided that the
.\" entire resulting derived work is distributed under the terms of a
.\" permission notice identical to this one.
.\"
.\" Since the Linux kernel and libraries are constantly changing, this
.\" manual page may be incorrect or out-of-date. The author(s) assume no
.\" responsibility for errors or omissions, or for damages resulting from
.\" the use of the information contained herein. The author(s) may not
.\" have taken the same level of care in the production of this manual,
.\" which is licensed free of charge, as they might when working
.\" professionally.
.\"
.\" Formatted or processed versions of this manual, if unaccompanied by
.\" the source, must acknowledge the copyright and authors of this work.
.\" %%%LICENSE_END
.\"
.\"
.TH TIME_NAMESPACES 7 2021-03-22 "Linux" "Linux Programmer's Manual"
.SH NAME
time_namespaces \- overview of Linux time namespaces
.SH DESCRIPTION
Time namespaces virtualize the values of two system clocks:
.IP \(bu 2
.BR CLOCK_MONOTONIC
(and likewise
.BR CLOCK_MONOTONIC_COARSE
and
.BR CLOCK_MONOTONIC_RAW ),
a nonsettable clock that represents monotonic time since\(emas
described by POSIX\(em"some unspecified point in the past".
.IP \(bu
.BR CLOCK_BOOTTIME
(and likewise
.BR CLOCK_BOOTTIME_ALARM ),
a nonsettable clock that is identical to
.BR CLOCK_MONOTONIC ,
except that it also includes any time that the system is suspended.
.PP
Thus, the processes in a time namespace share per-namespace values
for these clocks.
This affects various APIs that measure against these clocks, including:
.BR clock_gettime (2),
.BR clock_nanosleep (2),
.BR nanosleep (2),
.BR timer_settime (2),
.BR timerfd_settime (2),
and
.IR /proc/uptime .
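.PP
As a rough illustration of the clocks involved,
the following Python sketch (not part of the original page) reads
.B CLOCK_MONOTONIC
and
.B CLOCK_BOOTTIME
with the standard-library
.I time.clock_gettime()
function, and reads the first field of
.IR /proc/uptime .
The helper names
.I read_clocks
and
.I read_uptime
are hypothetical.

```python
import time


def read_clocks():
    """Return the current values (in seconds) of the two virtualized clocks."""
    return {
        "monotonic": time.clock_gettime(time.CLOCK_MONOTONIC),
        "boottime": time.clock_gettime(time.CLOCK_BOOTTIME),
    }


def read_uptime(path="/proc/uptime"):
    """Return the first field of /proc/uptime (seconds since boot)."""
    with open(path) as f:
        return float(f.read().split()[0])
```

Calling these helpers from a process inside a time namespace and
from one in the initial namespace should show values that differ
by the offsets configured for the namespace.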
.PP
Currently, the only way to create a time namespace is by calling
.BR unshare (2)
with the
.BR CLONE_NEWTIME
flag.
This call creates a new time namespace but does
.I not
place the calling process in the new namespace.
Instead, the calling process's
subsequently created children are placed in the new namespace.
This allows clock offsets (see below) for the new namespace
to be set before the first process is placed in the namespace.
The
.IR /proc/[pid]/ns/time_for_children
symbolic link shows the time namespace in which
the children of a process will be created.
(A process can use a file descriptor opened on
this symbolic link in a call to
.BR setns (2)
in order to move into the namespace.)
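.PP
One way to observe this split is to compare the two symbolic links
for a process.
The following Python sketch assumes that the link text has the form
.IR time:[inode] ;
the helper names
.I ns_inode
and
.I time_ns_inodes
are hypothetical.

```python
import os
import re


def ns_inode(link_text):
    """Extract the inode number from link text such as 'time:[4026531834]'."""
    match = re.fullmatch(r"(\w+):\[(\d+)\]", link_text)
    if match is None:
        raise ValueError(f"unexpected namespace link text: {link_text!r}")
    return int(match.group(2))


def time_ns_inodes(pid="self"):
    """Return (current, for_children) time namespace inodes for a process."""
    base = f"/proc/{pid}/ns"
    return (
        ns_inode(os.readlink(f"{base}/time")),
        ns_inode(os.readlink(f"{base}/time_for_children")),
    )
```

For a process that has called
.BR unshare (2)
with
.B CLONE_NEWTIME
and has not since moved into the new namespace,
the two returned numbers are expected to differ.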
.\"
.SS /proc/PID/timens_offsets
Associated with each time namespace are offsets,
expressed with respect to the initial time namespace,
that define the values of the monotonic and
boot-time clocks in that namespace.
These offsets are exposed via the file
.IR /proc/PID/timens_offsets .
Within this file,
the offsets are expressed as lines consisting of
three space-delimited fields:
.PP
.in +4n
.EX
<clock-id> <offset-secs> <offset-nanosecs>
.EE
.in
.PP
The
.I clock-id
is a string that identifies the clock whose offsets are being shown.
This field is either
.IR monotonic ,
for
.BR CLOCK_MONOTONIC ,
or
.IR boottime ,
for
.BR CLOCK_BOOTTIME .
The remaining fields express the offset (seconds plus nanoseconds) for the
clock in this time namespace.
These offsets are expressed relative to the clock values in
the initial time namespace.
The
.I offset-secs
value can be negative, subject to restrictions noted below;
.I offset-nanosecs
is an unsigned value.
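.PP
The three-field record format lends itself to simple parsing.
The following Python sketch turns the text of a
.I timens_offsets
file into a dictionary; the function name
.I parse_timens_offsets
is hypothetical, and the code assumes only that fields are separated
by one or more spaces.

```python
def parse_timens_offsets(text):
    """Parse timens_offsets content into {clock_id: (secs, nanosecs)}."""
    offsets = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        clock_id, secs, nanosecs = line.split()
        offsets[clock_id] = (int(secs), int(nanosecs))
    return offsets
```

A caller would typically pass in the result of reading
.IR /proc/self/timens_offsets .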
.PP
In the initial time namespace, the contents of the
.I timens_offsets
file are as follows:
.PP
.in +4n
.EX
$ \fBcat /proc/self/timens_offsets\fP
monotonic 0 0
boottime 0 0
.EE
.in
.PP
In a new time namespace that has had no member processes,
the clock offsets can be modified by writing newline-terminated
records of the same form to the
.I timens_offsets
file.
The file can be written to multiple times,
but after the first process has been created in or has entered the namespace,
.BR write (2)s
on this file fail with the error
.BR EACCES .
In order to write to the
.IR timens_offsets
file, a process must have the
.BR CAP_SYS_TIME
capability in the user namespace that owns the time namespace.
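.PP
The sequence described above (create the namespace, write the offsets,
and only then create the first member process) might be sketched in
Python as follows.
This is an untested outline under stated assumptions: it relies on
.I os.unshare()
and
.IR os.CLONE_NEWTIME ,
which are available in Python 3.12 and later, and it needs sufficient
privileges to create the namespace and write the offsets.
The function names
.I offset_record
and
.I run_with_offsets
are hypothetical.

```python
import os


def offset_record(clock_id, secs, nanosecs=0):
    """Format one newline-terminated timens_offsets record."""
    return f"{clock_id} {secs} {nanosecs}\n"


def run_with_offsets(offsets, child):
    """Create a time namespace, set its offsets, then run child() inside it.

    offsets maps a clock-id ("monotonic" or "boottime") to whole seconds.
    Needs Python 3.12+ and suitable privileges; nothing here calls it.
    """
    os.unshare(os.CLONE_NEWTIME)  # the caller stays in its old namespace
    for clock_id, secs in offsets.items():
        # Write each record before any process enters the namespace.
        with open("/proc/self/timens_offsets", "w") as f:
            f.write(offset_record(clock_id, secs))
    pid = os.fork()  # the child is the first member of the new namespace
    if pid == 0:
        try:
            child()
        finally:
            os._exit(0)
    os.waitpid(pid, 0)
```

The writes go through
.I /proc/self/timens_offsets
in the parent because that file reflects the namespace in which
the parent's children will be created.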
.PP
Writes to the
.I timens_offsets
file can fail with the following errors:
.TP
.B EINVAL
An
.I offset-nanosecs
value is greater than 999,999,999.
.TP
.B EINVAL
A
.I clock-id
value is not valid.
.TP
.B EPERM
The caller does not have the
.BR CAP_SYS_TIME
capability.
.TP
.B ERANGE
An
.I offset-secs
value is out of range.
In particular:
.RS
.IP \(bu 2
.I offset-secs
can't be set to a value which would make the current
time on the corresponding clock inside the namespace a negative value; and
.IP \(bu
.I offset-secs
can't be set to a value such that the time on the corresponding clock
inside the namespace would exceed half of the value of the kernel constant
.BR KTIME_SEC_MAX
(this limits the clock value to a maximum of approximately 146 years).
.RE
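.PP
The documented range checks can be mirrored in user space before
attempting a write.
The following Python sketch is not the kernel's code: it assumes that
.B KTIME_SEC_MAX
is the largest signed 64-bit nanosecond count divided by one billion,
it ignores the nanoseconds field when computing the resulting clock value,
and it omits the capability check that yields
.BR EPERM .
The function name
.I check_offset
is hypothetical.

```python
KTIME_SEC_MAX = (2**63 - 1) // 10**9  # assumed: KTIME_MAX / NSEC_PER_SEC
VALID_CLOCK_IDS = {"monotonic", "boottime"}


def check_offset(clock_id, offset_secs, offset_nanosecs, current_secs):
    """Return the errno name documented for a bad record, else None.

    current_secs is the present value of the clock in the initial namespace.
    """
    if clock_id not in VALID_CLOCK_IDS:
        return "EINVAL"
    if not 0 <= offset_nanosecs <= 999_999_999:
        return "EINVAL"
    resulting = current_secs + offset_secs
    if resulting < 0 or resulting > KTIME_SEC_MAX // 2:
        return "ERANGE"
    return None
```

Under that assumption, half of
.B KTIME_SEC_MAX
is 4,611,686,018 seconds; dividing by 31,557,600 seconds per
(365.25-day) year gives roughly 146 years, matching the limit noted above.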
.PP
In a new time namespace created by
.BR unshare (2),
the contents of the
.I timens_offsets
file are inherited from the time namespace of the creating process.
.SH NOTES
Use of time namespaces requires a kernel that is configured with the
.B CONFIG_TIME_NS
option.
.PP
Note that time namespaces do not virtualize the
.BR CLOCK_REALTIME
clock.
Virtualization of this clock was avoided for reasons of complexity
and overhead within the kernel.
.PP
For compatibility with the initial implementation, when writing a
.I clock-id
to the
.IR /proc/[pid]/timens_offsets
file, the numerical values of the IDs can be written
instead of the symbolic names shown above; i.e., 1 instead of
.IR monotonic ,
and 7 instead of
.IR boottime .
For readability, the use of the symbolic names over the numbers is preferred.
.PP
The motivation for adding time namespaces was to allow
the monotonic and boot-time clocks to maintain consistent values
during container migration and checkpoint/restore.
.SH EXAMPLES
The following shell session demonstrates the operation of time namespaces.
We begin by displaying the inode number of the time namespace
of a shell in the initial time namespace:
.PP
.in +4n
.EX
$ \fBreadlink /proc/$$/ns/time\fP
time:[4026531834]
.EE
.in
.PP
Continuing in the initial time namespace, we display the system uptime using
.BR uptime (1)
and use the
.I clock_times
example program shown in
.BR clock_getres (2)
to display the values of various clocks:
.PP
.in +4n
.EX
$ \fBuptime \-\-pretty\fP
up 21 hours, 17 minutes
$ \fB./clock_times\fP
CLOCK_REALTIME : 1585989401.971 (18356 days + 8h 36m 41s)
CLOCK_TAI      : 1585989438.972 (18356 days + 8h 37m 18s)
CLOCK_MONOTONIC:      56338.247 (15h 38m 58s)
CLOCK_BOOTTIME :      76633.544 (21h 17m 13s)
.EE
.in
.PP
We then use
.BR unshare (1)
to create a time namespace and execute a
.BR bash (1)
shell.
From the new shell, we use the built-in
.B echo
command to write records to the
.I timens_offsets
file adjusting the offset for the
.B CLOCK_MONOTONIC
clock forward 2 days
and the offset for the
.B CLOCK_BOOTTIME
clock forward 7 days:
.PP
.in +4n
.EX
$ \fBPS1="ns2# " sudo unshare \-T \-\- bash \-\-norc\fP
ns2# \fBecho "monotonic $((2*24*60*60)) 0" > /proc/$$/timens_offsets\fP
ns2# \fBecho "boottime $((7*24*60*60)) 0" > /proc/$$/timens_offsets\fP
.EE
.in
.PP
Above, we started the
.BR bash (1)
shell with the
.B \-\-norc
option so that no start-up scripts were executed.
This ensures that no child processes are created from the
shell before we have a chance to update the
.I timens_offsets
file.
.PP
We then use
.BR cat (1)
to display the contents of the
.I timens_offsets
file.
The execution of
.BR cat (1)
creates the first process in the new time namespace,
after which further attempts to update the
.I timens_offsets
file produce an error.
.PP
.in +4n
.EX
ns2# \fBcat /proc/$$/timens_offsets\fP
monotonic 172800 0
boottime 604800 0
ns2# \fBecho "boottime $((9*24*60*60)) 0" > /proc/$$/timens_offsets\fP
bash: echo: write error: Permission denied
.EE
.in
.PP
Continuing in the new namespace, we execute
.BR uptime (1)
and the
.I clock_times
example program:
.PP
.in +4n
.EX
ns2# \fBuptime \-\-pretty\fP
up 1 week, 21 hours, 18 minutes
ns2# \fB./clock_times\fP
CLOCK_REALTIME : 1585989457.056 (18356 days + 8h 37m 37s)
CLOCK_TAI      : 1585989494.057 (18356 days + 8h 38m 14s)
CLOCK_MONOTONIC:     229193.332 (2 days + 15h 39m 53s)
CLOCK_BOOTTIME :     681488.629 (7 days + 21h 18m 8s)
.EE
.in
.PP
From the above output, we can see that the monotonic
and boot-time clocks have different values in the new time namespace.
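.PP
The relationship between the two sets of readings can be checked
with a little arithmetic.
The following Python sketch copies the values from the two
.I ./clock_times
runs shown above, subtracts the offset written for each clock
(zero for
.BR CLOCK_REALTIME ,
which is not virtualized), and compares what is left.
The function name
.I elapsed_after_offset
is hypothetical.

```python
from decimal import Decimal

# Clock values copied from the two ./clock_times runs shown above.
initial = {
    "CLOCK_REALTIME": Decimal("1585989401.971"),
    "CLOCK_MONOTONIC": Decimal("56338.247"),
    "CLOCK_BOOTTIME": Decimal("76633.544"),
}
new_ns = {
    "CLOCK_REALTIME": Decimal("1585989457.056"),
    "CLOCK_MONOTONIC": Decimal("229193.332"),
    "CLOCK_BOOTTIME": Decimal("681488.629"),
}
# Offsets in seconds: 2 days for monotonic, 7 days for boottime.
offsets = {"CLOCK_REALTIME": 0, "CLOCK_MONOTONIC": 172800, "CLOCK_BOOTTIME": 604800}


def elapsed_after_offset(clock):
    """Difference between the two readings once the offset is removed."""
    return new_ns[clock] - initial[clock] - offsets[clock]
```

For all three clocks the remainder is 55.085 seconds,
the wall-clock time that passed between the two runs,
so the only per-namespace difference is the configured offset.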
.PP
Examining the
.I /proc/[pid]/ns/time
and
.I /proc/[pid]/ns/time_for_children
symbolic links, we see that the shell is a member of the initial time
namespace, but its children are created in the new namespace.
.PP
.in +4n
.EX
ns2# \fBreadlink /proc/$$/ns/time\fP
time:[4026531834]
ns2# \fBreadlink /proc/$$/ns/time_for_children\fP
time:[4026532900]
ns2# \fBreadlink /proc/self/ns/time\fP # Creates a child process
time:[4026532900]
.EE
.in
.PP
Returning to the shell in the initial time namespace,
we see that the monotonic and boot-time clocks
are unaffected by the
.I timens_offsets
changes that were made in the other time namespace:
.PP
.in +4n
.EX
$ \fBuptime \-\-pretty\fP
up 21 hours, 19 minutes
$ \fB./clock_times\fP
CLOCK_REALTIME : 1585989401.971 (18356 days + 8h 38m 51s)
CLOCK_TAI      : 1585989438.972 (18356 days + 8h 39m 28s)
CLOCK_MONOTONIC:      56338.247 (15h 41m 8s)
CLOCK_BOOTTIME :      76633.544 (21h 19m 23s)
.EE
.in
.SH SEE ALSO
.BR nsenter (1),
.BR unshare (1),
.BR clock_settime (2),
.\" clone3() support for time namespaces is a work in progress
.\" .BR clone3 (2),
.BR setns (2),
.BR unshare (2),
.BR namespaces (7),
.BR time (7)