clone.2: Rework Eric's CLONE_NEWUSER patch

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Michael Kerrisk 2012-12-27 11:20:30 +01:00
parent 57ef8c39e7
commit 06b3045839
1 changed file with 42 additions and 28 deletions

@@ -39,9 +39,6 @@
.\" 2008-11-19, mtk, document CLONE_NEWIPC
.\" 2008-11-19, Jens Axboe, mtk, document CLONE_IO
.\"
.\" FIXME Document CLONE_NEWUSER, which is new in 2.6.23
.\" (also supported for unshare()?)
.\"
.TH CLONE 2 2014-08-19 "Linux" "Linux Programmer's Manual"
.SH NAME
clone, __clone2 \- create a child process
@@ -283,7 +280,7 @@ If
.B CLONE_NEWIPC
is set, then create the process in a new IPC namespace.
If this flag is not set, then (as with
.BR fork (2))
.BR fork (2)),
the process is created in the same IPC namespace as
the calling process.
This flag is intended for the implementation of containers.
@@ -398,42 +395,59 @@ in the same
.BR clone ()
call.
.TP
.BR CLONE_NEWUSER " (since Linux 3.6)"
.BR CLONE_NEWUSER
(This flag first became meaningful for
.BR clone ()
in Linux 2.6.29,
but the implementation of user namespaces was only completed in Linux 3.8.)
If
.B CLONE_NEWUSER
is set, the create the process in a new user namespace. If this flag is not set, then (as with
is set, then create the process in a new user namespace.
If this flag is not set, then (as with
.BR fork (2))
the process is created in the same user namespace as the calling process.
A user namespace provides an isolated environment for security related identifiers in particular
uids, gids, keys (see
A user namespace provides an isolated environment for
security related identifiers, in particular,
user IDs, group IDs, keys (see
.BR keyctl (2)),
and capabilities.
When a user namespace is created it initially starts out without a mapping of uids and gids
to the parent user namespace. The desired mapping of uids to the parent user namespace
may be set by writting into
.IR /proc/[pid]/uid_map.
The desired mapping of gids to the parent user namespace may be set by writinng into
.IR /proc/[pid]/gid_map.
When a user namespace is created,
it starts out without a mapping of user IDs (group IDs)
to the parent user namespace.
The desired mapping of user IDs (group IDs) to the parent user namespace
may be set by writing into
.IR /proc/[pid]/uid_map
.RI ( /proc/[pid]/gid_map );
see
.BR proc (5).
The first process in a user namespace starts out with a complete set of capabilities with
respect to the new user namespace.
The first process in a user namespace starts out with a complete set
of capabilities with respect to the new user namespace.
syscalls that return uids and gids will either return the uid or gid mapped into the current
user namespace if there is a mapping or depending on the context will return either
the overflowuid (default 65534) or the overflowgid (default 65534). See
.IR /proc/sys/kernel/overflowuid, /proc/sys/kernel/overflowgid
System calls that return user IDs (group IDs) will return
either the user ID (group ID) mapped into the current
user namespace if there is a mapping, or the overflow user ID (group ID);
the default value for the overflow user ID (group ID) is 65534.
See the descriptions of
.IR /proc/sys/kernel/overflowuid
and
.IR /proc/sys/kernel/overflowgid
in
.BR proc (5).
As of Linux 3.8 no priviliges are needed to create a user namespace,
and mount, pid, ipc, net, uts namespaces can be created with just
CAP_SYS_ADMIN privileges in your current user namespace.
Starting with Linux 3.8,
no privileges are needed to create a user namespace,
and mount, PID, IPC, network, and UTS namespaces can be created with just the
.B CAP_SYS_ADMIN
capability in the caller's user namespace.
Over the years there have been a lot of features that have been added
to the linux kernel that are only available to privileged users
because of their potential to confuse setuid root applications. In
general it becomes safe to allow the root user in a user namespace to
use those features because it is impossible while in a user namespace
Over the years, there have been a lot of features that have been added
to the Linux kernel that are only available to privileged users
because of their potential to confuse set-user-ID-root applications.
In general, it becomes safe to allow the root user in a user namespace to
use those features because it is impossible, while in a user namespace,
to gain more privilege than the root user of a user namespace has.
.TP
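
Illustration (not part of the patch): the reworked text says that system calls return either the ID mapped into the current user namespace or the overflow ID (default 65534). The sketch below models that lookup in plain Python. It assumes each map line has three fields (first ID inside the namespace, first ID outside the namespace, length of the range), as described for uid_map and gid_map in proc(5). The names parse_id_map and id_seen_inside are hypothetical helpers invented for this example; they do not read /proc and are not a kernel or library API.

```python
# Default overflow ID, per /proc/sys/kernel/overflowuid and overflowgid.
OVERFLOW_ID = 65534


def parse_id_map(text):
    """Parse uid_map/gid_map-style text into (inside, outside, length) tuples.

    Assumption: each non-empty line holds three integers:
    first ID inside the namespace, first ID outside it, range length.
    """
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        inside_start, outside_start, length = (int(f) for f in fields)
        entries.append((inside_start, outside_start, length))
    return entries


def id_seen_inside(outside_id, entries, overflow_id=OVERFLOW_ID):
    """Return the ID a process inside the namespace would see.

    If outside_id falls in a mapped range, translate it; otherwise
    return the overflow ID, mirroring the behavior the text describes.
    """
    for inside_start, outside_start, length in entries:
        if outside_start <= outside_id < outside_start + length:
            return inside_start + (outside_id - outside_start)
    return overflow_id


# Hypothetical mapping: ID 1000 outside the namespace appears as 0 inside.
example_map = parse_id_map("0 1000 1\n")
print(id_seen_inside(1000, example_map))  # mapped ID
print(id_seen_inside(1001, example_map))  # unmapped, so the overflow ID
```

With the single-entry map above, only ID 1000 is translated; every other ID is reported as the overflow value, which is also what a process sees before any mapping has been written.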