mirror of https://github.com/mkerrisk/man-pages
namespaces.7: Rework discussion of cgroup namespaces
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
parent 99ef85aba8
commit 8079aefa6f
@@ -193,10 +193,10 @@ This file is a handle for the UTS namespace of the process.
 .\" ==================== Cgroup namespaces ====================
 .\"
 .SS Cgroup namespaces (CLONE_NEWCGROUP)
-Cgroup namespaces virtualize the view of a process's cgroups as seen via
-.IR /proc/[pid]/cgroup
-(see
-.BR cgroups (7)).
+Cgroup namespaces virtualize the view of a process's cgroups (see
+.BR cgroups (7))
+as seen via
+.IR /proc/[pid]/cgroup .
 
 Each cgroup namespace has its own set of cgroup root directories,
 which are the base points for the relative locations displayed in
@@ -209,7 +209,7 @@ with the
 .BR CLONE_NEWCGROUP
 flag, then its current cgroups directories become its cgroup root directories.
 (This applies both for the cgroups version 1 hierarchies
-as well as the cgroups version 2 unified hierarchy.)
+and the cgroups version 2 unified hierarchy.)
 
 When viewing
 .IR /proc/[pid]/cgroup ,
@@ -223,28 +223,28 @@ entries for each ancestor level in the cgroup hierarchy.
 
 The following shell session demonstrates the effect of creating
 a new cgroup namespace.
-First, we create child cgroup in the
+First, (as superuser) we create a child cgroup in the
 .I freezer
 hierarchy, and put the shell into that cgroup:
 
 .nf
 .in +4n
-$ \fBsudo mkdir \-p /sys/fs/cgroup/freezer/sub\fP
-$ \fBecho $$\fP # Show PID of this shell
+# \fBmkdir \-p /sys/fs/cgroup/freezer/sub\fP
+# \fBecho $$\fP # Show PID of this shell
 30655
-$ \fBsudo sh \-c 'echo 30655 > /sys/fs/cgroup/sub'\fP
-$ \fBcat /proc/self/cgroup | grep freezer\fP
+# \fBsh \-c 'echo 30655 > /sys/fs/cgroup/freezer/sub/cgroup.procs'\fP
+# \fBcat /proc/self/cgroup | grep freezer\fP
 7:freezer:/sub
 .in
 .fi
 
 Next, we use
 .BR unshare (1)
-to create a process running a shell in new user and cgroup namespaces:
+to create a process running a shell in a new cgroup namespace:
 
 .nf
 .in +4n
-$ \fBunshare -U -C bash\fP
+# \fBunshare \-C bash\fP
 .in
 .fi
 
@@ -267,26 +267,65 @@ $ \fBcat /proc/20124/cgroup | grep freezer\fP
 .in
 .fi
 
-The virtualization provided by cgroup namespaces serves at least two purposes.
-First, it can be used to prevent
-information leaks whereby cgroup directory paths outside of
-a container would otherwise be visible to processes in the container.
-More importantly, this allows easier and more flexible
-confinement of container root tasks, because they can mount
-their own cgroup filesystems without needing to gain access to ancestor
-cgroup directories.
-So, for example, even if
-.I /cg/1
-is owned by uid 100000, a task namespaced under
-.I /cg/1/2
-owned by UID 100000 can mount that cgroup but not change settings in
-.IR /cg/1 .
-Combined with correct enforcement of hierarchical limits,
-this prevents that task from escaping its limits.
-
 Use of cgroup namespaces requires a kernel that is configured with the
 .B CONFIG_CGROUPS
 option.
+
+Among the purposes served by the
+virtualization provided by cgroup namespaces are the following:
+.IP * 2
+It prevents information leaks whereby cgroup directory paths outside of
+a container would otherwise be visible to processes in the container.
+Such leakages could, for example,
+reveal information about the container framework
+to containerized applications.
+.IP *
+It allows easier and more flexible
+confinement of container root tasks, because they can mount
+their own cgroup filesystems without gaining access to ancestor
+cgroup directories.
+Consider, for example, the following scenario:
+.RS 4
+.IP \(bu 2
+We have a cgroup directory,
+.IR /cg/1 ,
+that is owned by user ID 9000.
+.IP \(bu
+We have a process,
+.IR X ,
+also owned by user ID 9000,
+that is namespaced under the cgroup
+.IR /cg/1/2
+(i.e.,
+.I X
+was placed in a new cgroup namespace via
+.BR clone (2)
+or
+.BR unshare (2)
+with the
+.BR CLONE_NEWCGROUP
+flag).
+.RE
+.IP
+In the absence of cgroup namespacing, because the cgroup directory
+.IR /cg/1
+is owned (and writable) by user ID 9000 and process X is also owned
+by user ID 9000, process X would be able to modify the contents
+of cgroups files (i.e., change cgroup settings) not only in
+.IR /cg/1/2
+but also in the ancestor cgroup directory
+.IR /cg/1 .
+Namespacing process
+.IR X
+under the cgroup directory
+.IR /cg/1/2
+prevents it from modifying files in
+.IR /cg/1 ,
+since it cannot even see the contents of that directory
+(or of further removed cgroup ancestor directories).
+Combined with correct enforcement of hierarchical limits,
+this prevents process X from escaping the limits imposed
+by ancestor cgroups.
 .\"
 .\" ==================== IPC namespaces ====================
 .\"
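
Note on the demonstration in this patch: the session picks out the freezer entry with `cat /proc/self/cgroup | grep freezer`. As a rough sketch only, the same lookup could be wrapped in a small shell function that splits each `hierarchy-ID:controller-list:cgroup-path` record of `/proc/[pid]/cgroup`. The function name `cgroup_path` and its optional file argument are hypothetical, introduced here for illustration; they are not part of the man page or of any existing tool.

```shell
#!/bin/sh
# cgroup_path CONTROLLER [FILE]   (hypothetical helper, illustration only)
# Print the cgroup path recorded for CONTROLLER in a file that uses the
# /proc/[pid]/cgroup layout: hierarchy-ID:controller-list:cgroup-path
# FILE defaults to /proc/self/cgroup. Prints nothing if there is no match.
cgroup_path() (
    controller=$1
    file=${2:-/proc/self/cgroup}
    awk -F: -v want="$controller" '
        {
            # The second field is a comma-separated controller list,
            # for example "cpu,cpuacct".
            n = split($2, ctl, ",")
            for (i = 1; i <= n; i++) {
                if (ctl[i] == want) {
                    # Rejoin fields 3..NF in case the path contains ":".
                    path = $3
                    for (j = 4; j <= NF; j++)
                        path = path ":" $j
                    print path
                    exit
                }
            }
        }' "$file"
)
```

With the shell placed in the `sub` cgroup as in the session above, `cgroup_path freezer` would be expected to print `/sub`, matching the `7:freezer:/sub` line shown there. The cgroups version 2 entry has an empty controller list, so this sketch never matches it.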