Commit Graph

96 Commits

Author SHA1 Message Date
Michael Kerrisk bffbb22fda iconv.1, locale.1, memusage.1, memusagestat.1, pldd.1, sprof.1, _syscall.2, add_key.2, adjtimex.2, bind.2, bpf.2, chown.2, clone.2, close.2, copy_file_range.2, eventfd.2, fanotify_init.2, fanotify_mark.2, fork.2, fsync.2, futex.2, getdents.2, getrlimit.2, getxattr.2, io_cancel.2, io_destroy.2, io_getevents.2, io_setup.2, ioctl_fat.2, ioctl_getfsmap.2, ioctl_ns.2, ioctl_tty.2, ioctl_userfaultfd.2, kcmp.2, keyctl.2, listen.2, listxattr.2, mbind.2, membarrier.2, memfd_create.2, mkdir.2, move_pages.2, mremap.2, msync.2, nfsservctl.2, open.2, perf_event_open.2, pidfd_send_signal.2, pipe.2, pivot_root.2, pkey_alloc.2, process_vm_readv.2, ptrace.2, readlink.2, readv.2, recv.2, recvmmsg.2, rename.2, request_key.2, s390_runtime_instr.2, sched_setaffinity.2, seccomp.2, send.2, sendmmsg.2, sigaltstack.2, signalfd.2, socket.2, socketpair.2, splice.2, spu_create.2, spu_run.2, statfs.2, syscall.2, sysctl.2, sysfs.2, tee.2, timer_getoverrun.2, timer_settime.2, umount.2, userfaultfd.2, utimensat.2, wait4.2, INFINITY.3, __ppc_get_timebase.3, __setfpucw.3, abort.3, aio_cancel.3, aio_error.3, aio_read.3, aio_return.3, atexit.3, backtrace.3, basename.3, bsearch.3, bswap.3, cacos.3, cacosh.3, catan.3, catanh.3, cexp2.3, clock_getcpuclockid.3, clog2.3, cmsg.3, confstr.3, div.3, dl_iterate_phdr.3, dlerror.3, dlinfo.3, dlopen.3, dlsym.3, duplocale.3, encrypt.3, end.3, endian.3, envz_add.3, err.3, expm1.3, fdim.3, flockfile.3, fmtmsg.3, frexp.3, ftw.3, get_nprocs_conf.3, get_phys_pages.3, getaddrinfo_a.3, getauxval.3, getdate.3, getdtablesize.3, getgrent_r.3, getgrouplist.3, gethostbyname.3, getline.3, getnameinfo.3, getopt.3, getprotoent_r.3, getpwent_r.3, getpwnam.3, getservent_r.3, getsubopt.3, getutent.3, glob.3, gnu_get_libc_version.3, hsearch.3, if_nameindex.3, inet.3, inet_net_pton.3, inet_ntop.3, inet_pton.3, insque.3, killpg.3, makecontext.3, mallinfo.3, malloc.3, malloc_hook.3, malloc_info.3, mallopt.3, matherr.3, mbsnrtowcs.3, mbstowcs.3, mcheck.3, mempcpy.3, mq_getattr.3, mq_notify.3, mtrace.3, newlocale.3, nextafter.3, ntp_gettime.3, offsetof.3, open_memstream.3, pow.3, printf.3, pthread_attr_init.3, pthread_attr_setdetachstate.3, pthread_attr_setguardsize.3, pthread_attr_setinheritsched.3, pthread_attr_setschedparam.3, pthread_attr_setschedpolicy.3, pthread_attr_setstack.3, pthread_attr_setstacksize.3, pthread_cancel.3, pthread_cleanup_push.3, pthread_create.3, pthread_detach.3, pthread_getattr_default_np.3, pthread_getattr_np.3, pthread_getcpuclockid.3, pthread_join.3, pthread_mutex_consistent.3, pthread_mutexattr_setrobust.3, pthread_setaffinity_np.3, pthread_setcancelstate.3, pthread_setname_np.3, pthread_setschedparam.3, pthread_sigmask.3, pthread_spin_init.3, pthread_testcancel.3, pthread_tryjoin_np.3, ptsname.3, qsort.3, rand.3, random.3, remainder.3, rpmatch.3, rtime.3, rtnetlink.3, scalb.3, scalbln.3, scandir.3, sem_getvalue.3, sem_wait.3, setaliasent.3, setlogmask.3, sigwait.3, sincos.3, sockatmark.3, stdarg.3, stpcpy.3, strcat.3, strfmon.3, strptime.3, strtod.3, strtok.3, strtol.3, strtoul.3, strverscmp.3, tsearch.3, uselocale.3, wcstok.3, wcstombs.3, wordexp.3, y0.3, loop.4, vcs.4, veth.4, charmap.5, core.5, filesystems.5, gai.conf.5, hosts.5, hosts.equiv.5, locale.5, nss.5, repertoiremap.5, securetty.5, shells.5, ttytype.5, ascii.7, complex.7, cpuset.7, credentials.7, fanotify.7, hier.7, inotify.7, ip.7, mount_namespaces.7, mq_overview.7, netlink.7, network_namespaces.7, pid_namespaces.7, pkeys.7, rtld-audit.7, rtnetlink.7, sem_overview.7, signal-safety.7, sock_diag.7, spufs.7, standards.7, symlink.7, tcp.7, time_namespaces.7, unix.7, user_namespaces.7, xattr.7, ldconfig.8: tstamp
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-06-09 14:48:40 +02:00
Michael Kerrisk a14af333d6 Various pages: retitle EXAMPLE section heading to EXAMPLES
EXAMPLES appears to be the wider majority usage across various
projects' manual pages, and is also what is used in the POSIX
manual pages.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2020-05-31 10:45:27 +02:00
Michael Kerrisk a5409de92c clone.2, fallocate.2, ioctl_iflags.2, ioctl_list.2, pidfd_open.2, pivot_root.2, quotactl.2, seccomp.2, select.2, wait.2, proc.5, cgroups.7, netdevice.7, uts_namespaces.7: tstamp
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-19 15:31:20 +01:00
Michael Kerrisk 1b54731692 pivot_root.2: EXAMPLE: allocate stack using mmap() MAP_STACK rather than malloc()
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-11-14 22:24:45 +01:00
Michael Kerrisk a2dd6388a7 pivot_root.2: Update the copyright and license
After my rewriting, almost nothing of the original page remains,
so update the copyright. As the author, I'm relicensing to the
"verbatim" license most commonly used in man pages.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-10 12:24:28 +02:00
Michael Kerrisk 875298005d pivot_root.2: Minor wording tweaks
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-10 12:24:28 +02:00
Michael Kerrisk ba4b07c30f pivot_root.2: Another couple of s/filesystem/mount/
This is consistent with some earlier changes suggested by
Eric Biederman.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-10 12:24:28 +02:00
Michael Kerrisk 542175d8e4 pivot_root.2: Tweak text of an EINVAL error to correspond to DESCRIPTION
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-10 12:24:28 +02:00
Michael Kerrisk 01c64c3b4b pivot_root.2: Relegate text about what pivot_root() may or may not do to NOTES
The text stating that "pivot_root() may or may not change the
current root and the current working directory of any processes
or threads which use the old root directory" was written 19 years
ago, before the system call itself was even finalized in the
kernel. The implementation has never changed, and it won't
change in the future, since that would cause user-space breakage.
The existence of that text in DESCRIPTION, followed by qualifying
text stating what the implementation actually does (and has always
done) makes for confusing reading. Therefore, relegate this text
to a historical note in NOTES (so that readers with long memories
can see why the manual page was changed) and rework the text in
DESCRIPTION accordingly.

Reported-by: Philipp Wendler <ml@philippwendler.de>
Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Reported-by: Reid Priedhorsky <reidpr@lanl.gov>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-10 12:24:28 +02:00
Michael Kerrisk 3db820fe18 pivot_root.2: Add a subsection header for the pivot_root(".", ".") discussion
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-10 12:24:28 +02:00
Michael Kerrisk 97076c5a0b pivot_root.2: Minor change: relocate a paragraph in NOTES
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-10 12:24:28 +02:00
Michael Kerrisk 0843016c9b pivot_root.2: s/root filesystem/root mount/
As suggested by Eric Biederman.

Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-10 11:37:02 +02:00
Michael Kerrisk 666373fc08 pivot_root.2: Reword one of the restrictions on 'new_root'
A suggested by Eric Biederman

Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-09 23:26:54 +02:00
Michael Kerrisk 33313a260c pivot_root.2: Change "filesystem" to "mount" in various places
Quoting Eric:

    If we are going to be pedantic "filesystem" is really the
    wrong concept here.  The section about bind mount clarifies
    it, but I wonder if there is a better term.

    I think I would say: "new_root and put_old must not be on
    the same mount as the current root."

    I think using "mount" instead of "filesystem" keeps the
    concepts less confusing.

    As I am reading through this email and seeing text that is
    trying to be precise and clear then hitting the term
    "filesystem" is a bit jarring.  pivot_root doesn't care a
    thing for file systems.  pivot_root only cares about mounts.

    And by a "mount" I mean the thing that you get when you
    create a bind mount or you call mount normally.

Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-09 12:14:35 +02:00
Michael Kerrisk 9f3af6b8c8 pivot_root.2: Simplify discussion of restrictions for 'new_root'
Philipp Wendler noted that the text on the restrictions for
'new_root' was slightly contradictory, and things could be
clarified and simplified by describing the restrictions on
'new_root' in one place.

Reported-by: Philipp Wendler <ml@philippwendler.de>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-09 09:40:15 +02:00
Michael Kerrisk b27d444f34 pivot_root.2: Remove an imprecision in description
Remove the text that suggests that pivot_root() changes the root
directory and CWD of process that have directory and CWD on the
old root *filesystem*. Change "filesystem" to "directory".

Reported-by: Philipp Wendler <ml@philippwendler.de>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-09 09:40:09 +02:00
Michael Kerrisk 47b69a37cf pivot_root.2: srcfix: FIXME
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-08 21:29:28 +02:00
Michael Kerrisk 8f2a9129e6 pivot_root.2: Remove the term 'old_root'
Reid noted a confusion between 'old_root' (my attempt at a
shorthand for the old root point) and 'put_old. Eliminate the
confusion by replacing the shorthand with "old root mount point".

Reported-by: Reid Priedhorsky <reidpr@lanl.gov>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-08 20:57:55 +02:00
Michael Kerrisk 93cc3b3827 pivot_root.2: Simplify pivot_root(".", ".") example
Eric Biederman notes that the change in commit f646ac88ef was
not strictly necessary for this example, since one of the already
documented requirements is that various mount points must not have
shared propagation, or else pivot_root() will fail. So, simplify
the example.

Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-10-07 14:02:42 +03:00
Jakub Wilk bf421740d4 pivot_root.2: tfix
Remove duplicated words.

Signed-off-by: Jakub Wilk <jwilk@jwilk.net>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-25 20:41:48 +02:00
Michael Kerrisk 9d33e03b95 pivot_root.2: Explain why various mount points can't have shared propagation
Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:19 +02:00
Michael Kerrisk d4b2104ae5 pivot_root.2: Correct the list of mount points that can't be MS_SHARED
Eric Biederman noted that my list of directories that could not
have shared propagation was incorrect.  I had written that
new_root could not be shared; rather it should be: the parent of
the current root mount point.

Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:19 +02:00
Michael Kerrisk f646ac88ef pivot_root.2: Tweak pivot_root(".", ".") example
Quoting Eric Biederman:

    The concern from our conversation at the container
    mini-summit was that there is a pathology if in your initial
    mount namespace all of the mounts are marked MS_SHARED like
    systemd does (and is almost necessary if you are going to
    use mount propagation), that if new_root itself is MS_SHARED
    then unmounting the old_root could propagate.

    So I believe the desired sequence is:

    >>>            chdir(new_root);
    +++            mount("", ".", MS_SLAVE | MS_REC, NULL);
    >>>            pivot_root(".", ".");
    >>>            umount2(".", MNT_DETACH);

    The change to new new_root could be either MS_SLAVE or
    MS_PRIVATE.  So long as it is not MS_SHARED the mount won't
    propagate back to the parent mount namespace.

Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:19 +02:00
Michael Kerrisk 57bab66a92 pivot_root.2: pivot_root(".", ".") really is a thing
LXC uses this [1]. I tested, to double-check, and it works.

The fchdir() dance done by LXC is not needed though:

fchdir(old_root); umount(".", MNT_DETACH); fchdir(new_root);

As far as I can see, just the umount() is sufficient, since,
after pivot_root(), oldi_root is at the top of the stack
of mounts at "/" and thus (so long as CWD is at "/")
the umount will remove the mount at the top of the stack.
Eric Biederman confirmed my understanding by mail, and
Philipp Wendler verified my results by experiment.

[1] See the following commit in LXC:

    commit 2d489f9e87fa0cccd8a1762680a43eeff2fe1b6e
    Author: Serge Hallyn <serge.hallyn@ubuntu.com>
    Date:   Sat Sep 20 03:15:44 2014 +0000

        pivot_root: switch to a new mechanism (v2)

Helped-by: Eric W. Biederman <ebiederm@xmission.com>
Helped-by: Philipp Wendler <ml@philippwendler.de>
Helped-by: Aleksa Sarai <asarai@suse.de>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:19 +02:00
Michael Kerrisk 682e1329f9 pivot_root.2: Eliminate text suggesting that behavior may change in the future
After around 19 years, the behavior of pivot_root() has not been
changed, and will almost certainly not change in the future.
So, reword to remove the suggestion that the behavior may change.
Also, more clearly document the effect of pivot_root() on
the calling process's current working directory.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:19 +02:00
Michael Kerrisk 4a8b7d7b13 pivot_root.2: Rework a "hanging" description into an earlier paragraph
The reference of "Note that this also applies" was vague. So
combine this paragraph with an earlier one to make the linkage
clearer.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:19 +02:00
Michael Kerrisk aff78c76f7 pivot_root.2: Remove a note about a historical idea/expectation
The idea that there might one day be a mechanism for kernel
threads to explicitly relinquish access to the filesystem never
came to pass (after 20 years), and the presence of text
describing this idea is, IMO, a distraction. So, remove it.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:19 +02:00
Michael Kerrisk c4bf33331b pivot_root.2: ffix (break up a paragraph)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:19 +02:00
Michael Kerrisk eb9078a7a9 pivot_root.2: Remove text describing case where current root is not a mount point
One kernel printk() later, my suspicions seem confirmed: the text
describing the situation where the current root is not a mount
point (because of a chroot()) seems to be bogus. (Perhaps it was
true once upon a time.) In my testing, if the current root is not
a mount point, an EINVAL error results.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:19 +02:00
Michael Kerrisk fc17fc6502 pivot_root.2: srcfix: FIXME
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:19 +02:00
Michael Kerrisk d761305516 pivot_root.2: Fix a technical detail
In this text:

        If the current root is not a mount point (e.g., after an
        earlier chroot(2) or pivot_root())...

mention of pivot_root() makes no sense, since (as noted in an
earlier commit message for this page) 'new_root' in a previous
pivot_root() must (since Linux 2.4.5) have been a mount point.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:19 +02:00
Michael Kerrisk 14caaed2c1 pivot_root.2: Minor change: rewrite the reference to pivot_root(8)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:19 +02:00
Michael Kerrisk bbae63c580 pivot_root.2: Remove BUGS section
One of these "bugs" is a philosophical point already covered
elsewhere in the page, while the other is a somewhat obscure joke.
Both pieces are a bit of a distraction, really.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:19 +02:00
Michael Kerrisk 41d4557c09 pivot_root.2: Minor wording fix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:19 +02:00
Michael Kerrisk fc2f474d77 pivot_root.2: Relocate details about kernel threads to NOTES
This text is a side point that somewhat distracts from the
flow in DESCRIPTION.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:19 +02:00
Michael Kerrisk b647c4c93a pivot_root.2: Add some more detail to the remaining EBUSY error
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:19 +02:00
Michael Kerrisk 071505e9fb pivot_root.2: Remove bogus a bogus EBUSY error case
The note that EBUSY is given if a filesystem is already mounted
on 'Iput_old' was never really true. That restriction was in
Linux 2.3.14, but removed in Linux 2.3.99-pre6 so it never made
it to mainline.

The relevant diff in pivot_root() was:

        error = -EBUSY;
-       if (d_new_root->d_sb == root->d_sb || d_put_old->d_sb == root->d_sb)
+       if (new_nd.mnt == root_mnt || old_nd.mnt == root_mnt)
                goto out2; /* loop */
-       if (d_put_old != d_put_old->d_covers)
-               goto out2; /* mount point is busy */
        error = -EINVAL;

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:19 +02:00
Michael Kerrisk 2f2e1a2296 pivot_root.2: Add an example program
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:19 +02:00
Michael Kerrisk 0c2329cdbe pivot_root.2: Minor fix: add a reference to a relevant piece in NOTES
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:19 +02:00
Michael Kerrisk 422e36b7f2 pivot_root.2: Relocate text on use cases and add text on purpose of pivot_root(2)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:19 +02:00
Michael Kerrisk a94f69d6db pivot_root.2: Rework the text on "future changes" to reflect that 20 years have passed
Some of the text was written long ago, and hinted that things
might change in the future. However, 20 years have passed
and these details have not changed, so rework the text to
hint at that fact.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:19 +02:00
Michael Kerrisk 3afc97b20b pivot_root.2: Mention containers as a use case for pivot_root()
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:19 +02:00
Michael Kerrisk 0ac6f9008e pivot_root.2: srcfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:19 +02:00
Michael Kerrisk b16dd3037d pivot_root.2: There is no restriction against 'put_old' being a mount point
As far as I can see from the source code, the statement that
"No other filesystem may be mounted on 'put_old'" is incorrect.
Even looking at the 2.4.0 source code, there I can't see such
a restriction. In addition, some testing on a 5.0 kernel
(mounting 'put_old' in the new mount namespace just before
pivot_root()) did not result in an error for this case when
calling pivot_root().

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:19 +02:00
Michael Kerrisk 83cc245d6d pivot_root.2: srcfix: add self to copyright
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:18 +02:00
Michael Kerrisk ac2eb791b3 pivot_root.2: pivot_root() affects only other processes in the same mount namespace
pivot_root() only affects the current working directory and root
directory of other processes in the same mount namespace as the
caller.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:18 +02:00
Michael Kerrisk 7cc1a16df6 pivot_root.2: Introduce mount namespaces in the very first sentence
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:18 +02:00
Michael Kerrisk fdc558bda9 pivot_root.2: Note capability requirements
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:18 +02:00
Michael Kerrisk 81b24320d8 pivot_root.2: Mention mount namespaces
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:18 +02:00
Michael Kerrisk f42778c4e5 pivot_root.2: SEE ALSO: add mount_namespaces(7)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-09-23 13:11:18 +02:00