clone.2, namespaces.7: Move some CLONE_NEWUSER text from clone.2 to namespaces.7

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2013-01-14 04:49:29 +01:00 · 2013-01-14 04:49:29 +01:00 · 9d005472a8
parent 3dd2331ce7
commit 9d005472a8
2 changed files with 165 additions and 158 deletions
--- a/man2/clone.2
+++ b/man2/clone.2
@@ -379,90 +379,6 @@ in the same
 .BR clone ()
 call.

-.TP
-.BR CLONE_NEWUSER
-(This flag first became meaningful for
-.BR clone ()
-in Linux 2.6.23,
-the current
-.BR clone()
-semantics were merged in Linux 3.5,
-and the final pieces to make the user namespaces completely usable were
-merged in Linux 3.8.)
-
-If
-.B CLONE_NEWUSER
-is set, then create the process in a new user namespace.
-If this flag is not set, then (as with
-.BR fork (2))
-the process is created in the same user namespace as the calling process.
-
-A user namespace provides an isolated environment for
-security related identifiers, in particular,
-user IDs, group IDs, keys (see
-.BR keyctl (2)),
-and capabilities.
-
-When a user namespace is created,
-it starts out without a mapping of user IDs (group IDs)
-to the parent user namespace.
-The desired mapping of user IDs (group IDs) to the parent user namespace
-may be set by writing into  
-.IR /proc/[pid]/uid_map
-.RI ( /proc/[pid]/gid_map );
-see
-.BR proc (5).
-
-The first process in a user namespace starts out with a complete set
-of capabilities with respect to the new user namespace.  
-
-System calls that return user IDs (group IDs) will return
-either the user ID (group ID) mapped into the current
-user namespace if there is a mapping, or the overflow user ID (group ID);
-the default value for the overflow user ID (group ID) is 65534.
-See the descriptions of
-.IR /proc/sys/kernel/overflowuid
-and
-.IR /proc/sys/kernel/overflowgid
-in
-.BR proc (5).
-
-Use of this flag requires a kernel configured with the
-.BR CONFIG_USER_NS 
-option.
-Before Linux 3.8, use of
-.BR CLONE_NEWUSER
-required that the caller have three capabilities:
-.BR CAP_SYS_ADMIN ,
-.BR CAP_SETUID ,
-and
-.BR CAP_SETGID .
-.\" Before Linux 2.6.29, it appears that only CAP_SYS_ADMIN was needed
-Starting with Linux 3.8,
-no privileges are needed to create a user namespace,
-and mount, PID, IPC, network, and UTS namespaces can be created with just the
-.B CAP_SYS_ADMIN
-capability in the caller's user namespace.
-
-If
-.BR CLONE_NEWUSER
-is specified along with other
-.B CLONE_NEW*
-flags in a single
-.BR clone()
-call, the user namespace is guaranteed to be created first,
-giving the caller privileges over the remaining
-namespaces created by the call.
-Thus, it possible for an unprivileged caller to specify this combination
-of flags.
-
-Over the years, there have been a lot of features that have been added
-to the Linux kernel that are only available to privileged users
-because of their potential to confuse set-user-ID-root applications.
-In general, it becomes safe to allow the root user in a user namespace to
-use those features because it is impossible, while in a user namespace,
-to gain more privilege than the root user of a user namespace has.
-
 .TP
 .BR CLONE_NEWPID " (since Linux 2.6.24)"
 .\" This explanation draws a lot of details from
@@ -481,68 +397,47 @@ the process is created in the same PID namespace as
 the calling process.
 This flag is intended for the implementation of containers.

-A PID namespace provides an isolated environment for PIDs:
-PIDs in a new namespace start at 1,
-somewhat like a standalone system, and calls to
-.BR fork (2),
-.BR vfork (2),
-or
-.BR clone ()
-will produce processes with PIDs that are unique within the namespace.
+For further information on PID namespaces, see
+.BR namespaces (7).

-The first process created in a new namespace
-(i.e., the process created using the
-.BR CLONE_NEWPID
-flag) has the PID 1, and is the "init" process for the namespace.
-Children that are orphaned within the namespace will be reparented
-to this process rather than
-.BR init (8).
-Unlike the traditional
-.B init
-process, the "init" process of a PID namespace can terminate,
-and if it does, all of the processes in the namespace are terminated.
-
-PID namespaces form a hierarchy.
-When a new PID namespace is created,
-the processes in that namespace are visible
-in the PID namespace of the process that created the new namespace;
-analogously, if the parent PID namespace is itself
-the child of another PID namespace,
-then processes in the child and parent PID namespaces will both be
-visible in the grandparent PID namespace.
-Conversely, the processes in the "child" PID namespace do not see
-the processes in the parent namespace.
-The existence of a namespace hierarchy means that each process
-may now have multiple PIDs:
-one for each namespace in which it is visible;
-each of these PIDs is unique within the corresponding namespace.
-(A call to
-.BR getpid (2)
-always returns the PID associated with the namespace in which
-the process lives.)
-
-After creating the new namespace,
-it is useful for the child to change its root directory
-and mount a new procfs instance at
-.I /proc
-so that tools such as
-.BR ps (1)
-work correctly.
-.\" mount -t proc proc /proc
-(If
-.BR CLONE_NEWNS
-is also included in
-.IR flags ,
-then it isn't necessary to change the root directory:
-a new procfs instance can be mounted directly over
-.IR /proc .)
-
-Use of this flag requires: a kernel configured with the
-.B CONFIG_PID_NS
-option and that the process be privileged
+Use of this flag requires
+that the process be privileged
 .RB ( CAP_SYS_ADMIN ).
 This flag can't be specified in conjunction with
 .BR CLONE_THREAD .
+
+.TP
+.BR CLONE_NEWUSER
+(This flag first became meaningful for
+.BR clone ()
+in Linux 2.6.23,
+the current
+.BR clone ()
+semantics were merged in Linux 3.5,
+and the final pieces to make the user namespaces completely usable were
+merged in Linux 3.8.)
+
+If
+.B CLONE_NEWUSER
+is set, then create the process in a new user namespace.
+If this flag is not set, then (as with
+.BR fork (2))
+the process is created in the same user namespace as the calling process.
+
+For further information on user namespaces, see
+.BR namespaces (7).
+
+Before Linux 3.8, use of
+.BR CLONE_NEWUSER
+required that the caller have three capabilities:
+.BR CAP_SYS_ADMIN ,
+.BR CAP_SETUID ,
+and
+.BR CAP_SETGID .
+.\" Before Linux 2.6.29, it appears that only CAP_SYS_ADMIN was needed
+Starting with Linux 3.8,
+no privileges are needed to create a user namespace.
+
 .TP
 .BR CLONE_NEWUTS " (since Linux 2.6.19)"
 If
--- a/man7/namespaces.7
+++ b/man7/namespaces.7
@@ -292,27 +292,88 @@ PID namespaces isolate the process ID number space,
 meaning that processes in different PID namespaces can have the same PID.
 PID namespaces allow containers to migrate to a new hosts
 while the processes inside the container maintain the same PIDs.
-Each PID namespace has its own init (PID 1, see
-.BR init (1)),
-the "ancestor of all processes" that
-manages various system initialization tasks and
-reaps orphaned child processes when they terminate.

-From the point of view of a particular PID namespace instance,
-a process has two PIDs: the PID inside the namespace,
-and the PID outside the namespace on the host system.
-PID namespaces can be nested:
-a process will have one PID for each of the layers of the hierarchy
+PIDs in a new PID namespace start at 1,
+somewhat like a standalone system, and calls to
+.BR fork (2),
+.BR vfork (2),
+or
+.BR clone (2)
+will produce processes with PIDs that are unique within the namespace.
+
+The first process created in a new namespace
+(i.e., the process created using
+.BR clone (2)
+with the
+.BR CLONE_NEWPID
+flag, or the first child created by a process after a call to
+.BR unshare (2)
+using the
+.BR CLONE_NEWPID
+flag) has the PID 1, and is the "init" process for the namespace (see
+.BR init (1)).
+Children that are orphaned within the namespace will be reparented
+to this process rather than
+.BR init (8).
+Unlike the traditional
+.B init
+process, the "init" process of a PID namespace can terminate,
+and if it does, all of the processes in the namespace are terminated.
+
+PID namespaces can be nested.
+When a new PID namespace is created,
+the processes in that namespace are visible
+in the PID namespace of the process that created the new namespace;
+analogously, if the parent PID namespace is itself
+the child of another PID namespace,
+then processes in the child and parent PID namespaces will both be
+visible in the grandparent PID namespace.
+Conversely, the processes in the "child" PID namespace do not see
+the processes in the parent namespace.
+More succinctly: a process can see (e.g., send signals with
+.BR kill (2))
+only processes contained in its own PID namespace
+and the namespaces nested below that PID namespace.
+
+A process will have one PID for each of the layers of the hierarchy
 starting from the PID namespace in which it resides
 through to the root PID namespace.
-A process can see (e.g., send signals with
-.BR kill(2))
-only processes contained in its own PID namespace
-and the namespaces nested below that PID namespace.
+A call to
+.BR getpid (2)
+always returns the PID associated with the namespace in which
+the process resides.
+
+After creating a new PID namespace,
+it is useful for the child to change its root directory
+and mount a new procfs instance at
+.I /proc
+so that tools such as
+.BR ps (1)
+work correctly.
+.\" mount -t proc proc /proc
+(If
+.BR CLONE_NEWNS
+is also included in the
+.IR flags
+argument of
+.BR clone (2)
+or
+.BR unshare (2)),
+then it isn't necessary to change the root directory:
+a new procfs instance can be mounted directly over
+.IR /proc .)
+
+Use of PID namespaces requires a kernel that is configured with the
+.B CONFIG_PID_NS
+option.

 .SS User namespaces (CLONE_NEWUSER)

-User namespaces isolate the user and group ID number spaces.
+User namespaces isolate
+security-related identifiers, in particular,
+user IDs, group IDs, keys (see
+.BR keyctl (2)),
+and capabilities.
 In other words, a process's user and group IDs can be different
 inside and outside a user namespace.
 A process can have a normal unprivileged user ID outside a user namespace
@@ -321,7 +382,58 @@ in other words,
 the process has full privileges for operations inside the user namespace,
 but is unprivileged for operations outside the namespace.

-Starting in Linux 3.8, unprivileged processes can create user namespaces.
+When a user namespace is created,
+it starts out without a mapping of user IDs (group IDs)
+to the parent user namespace.
+The desired mapping of user IDs (group IDs) to the parent user namespace
+may be set by writing into
+.IR /proc/[pid]/uid_map
+.RI ( /proc/[pid]/gid_map );
+see below.
+
+The first process in a user namespace starts out with a complete set
+of capabilities with respect to the new user namespace.
+
+System calls that return user IDs (group IDs) will return
+either the user ID (group ID) mapped into the current
+user namespace if there is a mapping, or the overflow user ID (group ID);
+the default value for the overflow user ID (group ID) is 65534.
+See the descriptions of
+.IR /proc/sys/kernel/overflowuid
+and
+.IR /proc/sys/kernel/overflowgid
+in
+.BR proc (5).
+
+Starting in Linux 3.8, unprivileged processes can create user namespaces,
+and mount, PID, IPC, network, and UTS namespaces can be created with just the
+.B CAP_SYS_ADMIN
+capability in the caller's user namespace.
+
+If
+.BR CLONE_NEWUSER
+is specified along with other
+.B CLONE_NEW*
+flags in a single
+.BR clone (2)
+or
+.BR unshare (2)
+call, the user namespace is guaranteed to be created first,
+giving the caller privileges over the remaining
+namespaces created by the call.
+Thus, it is possible for an unprivileged caller to specify this combination
+of flags.
+
+Use of user namespaces requires a kernel that is configured with the
+.B CONFIG_USER_NS
+option.
+
+Over the years, many features have been added
+to the Linux kernel that are available only to privileged users
+because of their potential to confuse set-user-ID-root applications.
+In general, it becomes safe to allow the root user in a user namespace to
+use those features because it is impossible, while in a user namespace,
+to gain more privilege than the root user of a user namespace has.

 The
 .IR /proc/[pid]/uid_map