After comments from Miklos, and further digging in the kernel
source that showed that chroot() can also result in "hidden"
parent-IDs in mountinfo, I've revised the description of
mountinfo.
In fs/proc_namespace.cs::how_mountinfo() there is:
/* mountpoints outside of chroot jail will give SEQ_SKIP on this */
err = seq_path_root(m, &mnt_path, &p->root, " \t\n\\");
if (err)
goto out;
I instrumented the 'if (err)' code path with printk()
to show that there is indeed a record corresponding to the
parent-ID for the process root that is being skipped.
Reported-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
I do not have an exact handle on the details, but I can see
roughly what is going on. Internally, there seems to be one
("hidden") mount ID reserved to each mount namespace, and that ID
is the parent of the root mount point.
Looking through the (4.14) kernel source, mount IDs are allocated
by a kernel function called mnt_alloc_id() (in fs/namespace.c),
which is in turn called by alloc_vfsmnt() which is in turn called
by clone_mnt().
A new mount namespace is created by the kernel function
copy_mnt_ns() (in fs/namespace.c, called by
create_new_namespaces() in kernel/nsproxy.c). The copy_mnt_ns()
function calls copy_tree() (in fs/namespace.c), and copy_tree()
calls clone_mnt() in *two* places. The first of these is the call
that creates the "hidden" mount ID that becomes the parent of the
root mount point. (I verified this by instrumenting the kernel
with a few printk() calls to display the IDs.) The second place
where copy_tree() calls clone_mnt() is in a loop that replicates
each of the mount points (including the root mount point) in the
source mount namespace.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
After Linux 2.6.36, the heuristic calculation of oom_score
has changed to only consider used memory and CAP_SYS_ADMIN.
See kernel commit a63d83f427fbce97a6cea0db2e64b0eb8435cd10.
Signed-off-by: Marcus Folkesson <marcus.folkesson@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Document the seccomp /proc interfaces in Linux 4.14:
/proc/sys/kernel/seccomp/actions_avail and
/proc/sys/kernel/seccomp/actions_logged.
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Since the symbolic links for pipes and sockets do not refer to real
files in the file system tree, it can be hard to discover that they
still have mode and ownership information (revealed e.g. by `stat -L`),
so let's point this out in the manpage.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
When referring to the architecture, consistently use "x86-64",
not "x86_64". Hitherto, there was a mixture of usages, with
"x86-64" predominant.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
See Linux commit 56873f43abdcd574b25105867a990f067747b2f4
and Linux commit f074a8f49eb87cde95ac9d040ad5e7ea4f029738
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>