namespaces.7: Rework discussion of cgroup namespaces

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2016-05-06 15:01:11 +02:00 · 2016-05-06 15:01:11 +02:00 · 8079aefa6f
parent 99ef85aba8
commit 8079aefa6f
1 changed files with 68 additions and 29 deletions
--- a/man7/namespaces.7
+++ b/man7/namespaces.7
@@ -193,10 +193,10 @@ This file is a handle for the UTS namespace of the process.
 .\" ==================== Cgroup namespaces ====================
 .\"
 .SS Cgroup namespaces (CLONE_NEWCGROUP)
-Cgroup namespaces virtualize the view of a process's cgroups as seen via
-.IR /proc/[pid]/cgroup
-(see
-.BR cgroups (7)).
+Cgroup namespaces virtualize the view of a process's cgroups (see
+.BR cgroups (7))
+as seen via
+.IR /proc/[pid]/cgroup .

 Each cgroup namespace has its own set of cgroup root directories,
 which are the base points for the relative locations displayed in
@@ -209,7 +209,7 @@ with the
 .BR CLONE_NEWCGROUP
 flag, then its current cgroups directories become its cgroup root directories.
 (This applies both for the cgroups version 1 hierarchies
-as well as the cgroups version 2 unified hierarchy.)
+and the cgroups version 2 unified hierarchy.)

 When viewing
 .IR /proc/[pid]/cgroup ,
@@ -223,28 +223,28 @@ entries for each ancestor level in the cgroup hierarchy.

 The following shell session demonstrates the effect of creating
 a new cgroup namespace.
-First, we create child cgroup in the
+First (as superuser), we create a child cgroup in the
 .I freezer
 hierarchy, and put the shell into that cgroup:

 .nf
 .in +4n
-$ \fBsudo mkdir \-p /sys/fs/cgroup/freezer/sub\fP
-$ \fBecho $$\fP                      # Show PID of this shell
+# \fBmkdir \-p /sys/fs/cgroup/freezer/sub\fP
+# \fBecho $$\fP                      # Show PID of this shell
 30655
-$ \fBsudo sh \-c 'echo 30655 > /sys/fs/cgroup/sub'\fP
-$ \fBcat /proc/self/cgroup | grep freezer\fP
+# \fBsh \-c 'echo 30655 > /sys/fs/cgroup/freezer/sub/cgroup.procs'\fP
+# \fBcat /proc/self/cgroup | grep freezer\fP
 7:freezer:/sub
 .in
 .fi

 Next, we use
 .BR unshare (1)
-to create a process running a shell in new user and cgroup namespaces:
+to create a process running a shell in a new cgroup namespace:

 .nf
 .in +4n
-$ \fBunshare -U -C bash\fP
+# \fBunshare \-C bash\fP
 .in
 .fi

@@ -267,26 +267,65 @@ $ \fBcat /proc/20124/cgroup | grep freezer\fP
 .in
 .fi

-The virtualization provided by cgroup namespaces serves at least two purposes.
-First, it can be used to prevent
-information leaks whereby cgroup directory paths outside of
-a container would otherwise be visible to processes in the container.
-More importantly, this allows easier and more flexible
-confinement of container root tasks, because they can mount
-their own cgroup filesystems without needing to gain access to ancestor
-cgroup directories.
-So, for example, even if
-.I /cg/1
-is owned by uid 100000, a task namespaced under
-.I /cg/1/2
-owned by UID 100000 can mount that cgroup but not change settings in
-.IR /cg/1 .
-Combined with correct enforcement of hierarchical limits,
-this prevents that task from escaping its limits.
-
 Use of cgroup namespaces requires a kernel that is configured with the
 .B CONFIG_CGROUPS
 option.
+
+Among the purposes served by the
+virtualization provided by cgroup namespaces are the following:
+.IP * 2
+It prevents information leaks whereby cgroup directory paths outside of
+a container would otherwise be visible to processes in the container.
+Such leaks could, for example,
+reveal information about the container framework
+to containerized applications.
+.IP *
+It allows easier and more flexible
+confinement of container root tasks, because they can mount
+their own cgroup filesystems without gaining access to ancestor
+cgroup directories.
+Consider, for example, the following scenario:
+.RS 4
+.IP \(bu 2
+We have a cgroup directory,
+.IR /cg/1 ,
+that is owned by user ID 9000.
+.IP \(bu
+We have a process,
+.IR X ,
+also owned by user ID 9000,
+that is namespaced under the cgroup
+.IR /cg/1/2
+(i.e.,
+.I X
+was placed in a new cgroup namespace via
+.BR clone (2)
+or
+.BR unshare (2)
+with the
+.BR CLONE_NEWCGROUP
+flag).
+.RE
+.IP
+In the absence of cgroup namespacing, because the cgroup directory
+.IR /cg/1
+is owned (and writable) by user ID 9000 and process X is also owned
+by user ID 9000, process X would be able to modify the contents
+of cgroup files (i.e., change cgroup settings) not only in
+.IR /cg/1/2
+but also in the ancestor cgroup directory
+.IR /cg/1 .
+Namespacing process
+.IR X
+under the cgroup directory
+.IR /cg/1/2
+prevents it from modifying files in
+.IR /cg/1 ,
+since it cannot even see the contents of that directory
+(or of further removed cgroup ancestor directories).
+Combined with correct enforcement of hierarchical limits,
+this prevents process X from escaping the limits imposed
+by ancestor cgroups.
 .\"
 .\" ==================== IPC namespaces ====================
 .\"
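
Illustrative note (not part of the commit): the shell session in the patch reads entries such as "7:freezer:/sub" from /proc/[pid]/cgroup. The following is a minimal Python sketch of how such a line could be split into its three colon-separated fields, assuming the "hierarchy-ID:controller-list:path" layout visible in that example. The names CgroupEntry, parse_cgroup_line, and read_cgroups are hypothetical helpers introduced here for illustration only.

```python
from typing import List, NamedTuple


class CgroupEntry(NamedTuple):
    """One line of /proc/[pid]/cgroup (hypothetical helper type)."""
    hierarchy_id: int
    controllers: List[str]
    path: str


def parse_cgroup_line(line: str) -> CgroupEntry:
    """Parse a line such as '7:freezer:/sub'.

    Assumes three colon-separated fields; only the first two colons
    are treated as separators, so a colon inside the path is preserved.
    """
    hierarchy_id, controllers, path = line.rstrip("\n").split(":", 2)
    return CgroupEntry(
        int(hierarchy_id),
        # An empty controller list yields [] rather than [''].
        controllers.split(",") if controllers else [],
        path,
    )


def read_cgroups(pid: str = "self") -> List[CgroupEntry]:
    """Read and parse every entry in /proc/<pid>/cgroup (Linux only)."""
    with open(f"/proc/{pid}/cgroup") as f:
        return [parse_cgroup_line(line) for line in f if line.strip()]
```

The path field returned for each entry is relative to the cgroup root directory of the reading process's cgroup namespace, so the same target process may yield different paths when read from inside and outside a new cgroup namespace.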