Logically, this section should follow the section that
describes cgroup.subtree_control.
No content changes in this patch.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
When the user creates an unprivileged mount namespace, the Linux
kernel sets the MNT_LOCKED flag [1] on any submounts to prevent
such mounts from being unmounted inside the mount namespace. Such
an unmount would reveal the filesystem tree behind the mount,
which is not otherwise possible from an unprivileged vantage
point.
Attempting to unmount such a mount fails with EINVAL. A less
obvious implication is that attempting a bind mount without
MS_REC, where the tree being bound contains locked submounts,
also fails with EINVAL because, without MS_REC, such submounts
are effectively being unmounted.
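The behavior described above can be sketched in C as follows. This is a
minimal illustration, not code from the man page: the helper names
bind_mount() and bind_mount_with_fallback() are hypothetical, and the
fallback assumes that a recursive bind mount is acceptable to the caller.
EINVAL from mount(2) can have other causes, so the retry is only a heuristic.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mount.h>

/* Hypothetical helper: bind-mount src onto dst.
   Returns 0 on success, or the errno value reported by mount(2). */
int bind_mount(const char *src, const char *dst, int recursive)
{
    unsigned long flags = MS_BIND | (recursive ? MS_REC : 0);

    assert(src != NULL && dst != NULL);
    if (mount(src, dst, NULL, flags, NULL) == -1)
        return errno;
    return 0;
}

/* Hypothetical helper: try a plain bind mount first. If the kernel
   rejects it with EINVAL (which may mean that src contains locked
   submounts), retry with MS_REC so that the submounts are carried
   along instead of being dropped. */
int bind_mount_with_fallback(const char *src, const char *dst)
{
    int err = bind_mount(src, dst, 0);

    if (err == EINVAL)
        err = bind_mount(src, dst, 1);
    return err;
}
```

A caller inside an unprivileged mount namespace might call
bind_mount_with_fallback() on a source and target directory and report
strerror() of a nonzero return value.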
Cursory googling shows several instances of people running into
this problem, so I felt it advantageous to have it documented in
the man page.
[1] 4fbd8d194f/fs/namespace.c (L1110-L1113)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
As noted by Pradeep (and I also tested the code on OpenBSD):
The man page for the fts library calls states both of the following:
short fts_pathlen; /* strlen(fts_path) */
fts_pathlen The length of the string referenced by fts_path
However, for the structures returned from the fts_children() function,
fts_pathlen is strlen(fts_path) + strlen(fts_name), which contradicts
the man page statement. So I believe that there is a bug in either the
man page or the fts_children() library call.
The following program can be used for verification:
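    #include <fts.h>
    #include <pwd.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/types.h>
    #include <unistd.h>

    int main(void) {
        struct passwd *pwd_entry = getpwuid(getuid());
        if (pwd_entry == NULL)
            return 1;
        char *paths[] = {pwd_entry->pw_dir, NULL};
        FTS *fts = fts_open(paths, FTS_LOGICAL, NULL);
        /* Read the first entry (the home directory) so that
           fts_children() lists its contents */
        if (fts == NULL || fts_read(fts) == NULL)
            return 1;
        /* Print fts_pathlen next to the actual length of fts_path */
        for (FTSENT *child = fts_children(fts, 0); child != NULL;
             child = child->fts_link)
            printf("%s %s %d %zu\n", child->fts_path, child->fts_name,
                   child->fts_pathlen, strlen(child->fts_path));
        fts_close(fts);
        return 0;
    }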
Reported-by: Pradeep Kumar <pradeepsixer@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
If an advisory lock is lost, then read/write requests on any
affected file descriptor can return EIO - for NFSv4 at least.
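As a rough sketch of how an application might react to this, the
hypothetical wrapper below (read_checking_lock() is not an existing API)
flags an EIO result from read(2) so that the caller can consider
reacquiring the lock. EIO has other causes as well, so the flag is only a
hint that a lock may have been lost.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper around read(2). Sets *lock_maybe_lost to 1 if
   the read failed with EIO, which on NFSv4 (at least) can indicate
   that an advisory lock on the file was lost. Returns what read(2)
   returned; errno is left as set by read(2). */
ssize_t read_checking_lock(int fd, void *buf, size_t count,
                           int *lock_maybe_lost)
{
    ssize_t n;

    assert(buf != NULL && lock_maybe_lost != NULL);
    *lock_maybe_lost = 0;

    /* Retry if interrupted by a signal before any data was read */
    do {
        n = read(fd, buf, count);
    } while (n == -1 && errno == EINTR);

    if (n == -1 && errno == EIO)
        *lock_maybe_lost = 1;
    return n;
}
```

A caller that sees the flag set would typically stop trusting its cached
view of the file and decide whether to take the lock again before
continuing.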
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
One last thing: reading through this, I think it might need a
wording fix (this is my fault), in order to avoid implying that
brk() or malloc() use dlopen().
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
-- Expand the documentation to discuss the hazards in
enough detail to allow avoiding them.
-- Mention the upcoming MAP_FIXED_SAFE flag.
-- Enhance the alignment requirement slightly.
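One way to avoid the main MAP_FIXED hazard (silently replacing an
existing mapping) is to use MAP_FIXED only inside a range that the
program has already reserved itself. The sketch below illustrates that
pattern; map_fixed_in_reservation() is a hypothetical name, and the code
does not use or depend on the upcoming MAP_FIXED_SAFE flag.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: reserve address space with an inaccessible
   (PROT_NONE) mapping chosen by the kernel, then place a usable
   mapping over exactly that range with MAP_FIXED. Since the range is
   already owned by this reservation, MAP_FIXED cannot clobber an
   unrelated mapping. Returns NULL on failure. */
void *map_fixed_in_reservation(size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t size;
    void *reserved, *p;

    assert(page > 0 && len > 0);
    /* Round the length up to a multiple of the page size */
    size = (len + (size_t)page - 1) / (size_t)page * (size_t)page;

    reserved = mmap(NULL, size, PROT_NONE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (reserved == MAP_FAILED)
        return NULL;

    /* The address returned by mmap() is page-aligned, which satisfies
       the alignment requirement for MAP_FIXED */
    p = mmap(reserved, size, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (p == MAP_FAILED) {
        munmap(reserved, size);
        return NULL;
    }
    return p;
}
```

The returned region can be released with munmap() using the rounded-up
size.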
CC: Michael Ellerman <mpe@ellerman.id.au>
CC: Jann Horn <jannh@google.com>
CC: Matthew Wilcox <willy@infradead.org>
CC: Michal Hocko <mhocko@kernel.org>
CC: Mike Rapoport <rppt@linux.vnet.ibm.com>
CC: Cyril Hrubis <chrubis@suse.cz>
CC: Michal Hocko <mhocko@suse.com>
CC: Pavel Machek <pavel@ucw.cz>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>