mount_setattr.2: Minor tweaks to Christian's patch

- Fix SYNOPSIS to fit in 78 columns

  Also, we don't show when an include is included for a specific type,
  unless that header is included _only_ for the type,
  or there might be confusion (e.g., termios).
  Instead, that type should be documented in system_data_types(7),
  with a link page mount_attr-struct(3).

- Fix references to mount_setattr().  See man-pages(7):

       Any reference to the subject of the current manual page should be writ‐
       ten with the name in bold followed by a pair of  parentheses  in  Roman
       (normal)  font.   For  example, in the fcntl(2) man page, references to
       the subject of the page would be written as:  fcntl().   The  preferred
       way to write this in the source file is:

           .BR fcntl ()

- Fix line breaks according to semantic newline rules (and add some commas)
- Fix wrong usage of .IR when .RI should have been used
- Fix formatting of variable part in FOO<number>:
  - Make italic the variable part (as groff_man(7) recommends)
  - Remove <>
  - Use syntax recommended by G. Branden Robinson (groff)

- Fix unnecessary uses of .BR or .IR when .B or .I would suffice
- Fix formatting of punctuation

  In some cases, it was in italics or bold, and it should always be in roman.

- Use uppercase to begin text, even in bullet points, since those were
  multi-sentence.

- Simplify usage of .RS/.RE in combination with .IP
- s/fat/FAT/ as fs(7) does
- Slightly reword some sentences for consistency
- Use Linux-specific for consistency with other pages (in VERSIONS)
- EXAMPLES: Place the return type in a line of its own (as in other pages)
- Fix alignment of code
- Replace unnecessary use of the GNU extension ({}) by do {} while (0)

  In that case, there was no return value (moreover, it's a noreturn).
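
  As a rough sketch of why the portable form is enough here (the macro and
  function names below are hypothetical, and are not the ones used in the
  page's EXAMPLES), a multi-statement macro that yields no value can be
  wrapped in do {} while (0) and still be used as a single statement:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical macro, for illustration only.
 * It yields no value, so the GNU statement expression ({ ... })
 * is not needed; do { ... } while (0) is standard C and expands
 * to exactly one statement.
 */
#define log_and_count(counter, msg)          \
    do {                                     \
        fprintf(stderr, "%s\n", (msg));      \
        (counter)++;                         \
    } while (0)

/* Returns the number of messages logged for the given value. */
static int
classify(int value)
{
    int logged = 0;

    /* Safe in an unbraced if/else, since each use is one statement. */
    if (value < 0)
        log_and_count(logged, "negative");
    else
        log_and_count(logged, "non-negative");

    /* Exactly one branch, and so one macro expansion, has run. */
    assert(logged == 1);

    return logged;
}
```

  The statement expression would only be needed if the macro had to
  produce a value for the caller to use.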

- Break complex declaration lines into a line for each variable

  The variables were being initialized, some to non-zero values,
  so for clarity, a line for each one seems more appropriate.

- Add const to pointers when possible
- s/\\/\e/
- Remove unmatched groff commands

Cc: Christian Brauner <brauner@kernel.org>
Signed-off-by: Alejandro Colomar <alx.manpages@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Alejandro Colomar 2021-08-08 10:41:31 +02:00 committed by Michael Kerrisk
parent f3a5ba3f01
commit 63097cb7be
1 changed file with 225 additions and 221 deletions

@ -30,13 +30,13 @@ mount_setattr \- change mount properties of a mount or mount tree
.PP .PP
.BR "#include <linux/fcntl.h>" " /* Definition of " AT_* " constants */" .BR "#include <linux/fcntl.h>" " /* Definition of " AT_* " constants */"
.BR "#include <linux/mount.h>" " /* Definition of struct mount_attr and MOUNT_ATTR_* constants */" .BR "#include <linux/mount.h>" " /* Definition of " MOUNT_ATTR_* " constants */"
.BR "#include <sys/syscall.h>" " /* Definition of " SYS_* " constants */" .BR "#include <sys/syscall.h>" " /* Definition of " SYS_* " constants */"
.B #include <unistd.h> .B #include <unistd.h>
.PP .PP
.BI "int syscall(SYS_mount_setattr, int " dfd ", const char *" path \ .BI "int syscall(SYS_mount_setattr, int " dfd ", const char *" path ,
", unsigned int " flags \ .BI " unsigned int " flags ", struct mount_attr *" attr \
", struct mount_attr *" attr ", size_t " size ); ", size_t " size );
.fi .fi
.PP .PP
.IR Note : .IR Note :
@ -46,13 +46,13 @@ necessitating the use of
.BR syscall (2). .BR syscall (2).
.SH DESCRIPTION .SH DESCRIPTION
The The
.BR mount_setattr (2) .BR mount_setattr ()
system call changes the mount properties of a mount or entire mount tree. system call changes the mount properties of a mount or entire mount tree.
If If
.I path .I path
is a relative pathname, is a relative pathname,
then it is interpreted relative to the directory referred to by the file then it is interpreted relative to
descriptor the directory referred to by the file descriptor
.IR dfd . .IR dfd .
If If
.I dfd .I dfd
@ -60,24 +60,25 @@ is the special value
.B AT_FDCWD .B AT_FDCWD
then then
.I path .I path
is taken to be relative to the current working directory of the calling process. is interpreted relative to
the current working directory of the calling process.
If If
.I path .I path
is the empty string and is the empty string and
.BR AT_EMPTY_PATH .B AT_EMPTY_PATH
is specified in is specified in
.I flags .IR flags ,
then the mount properties of the mount identified by then the mount properties of the mount identified by
.I dfd .I dfd
are changed. are changed.
.PP .PP
The The
.BR mount_setattr (2) .BR mount_setattr ()
system call uses an extensible structure system call uses an extensible structure
.IR ( "struct mount_attr" ) .RI ( "struct mount_attr" )
to allow for future extensions. to allow for future extensions.
Any non-flag extensions to Any non-flag extensions to
.BR mount_setattr (2) .BR mount_setattr ()
will be implemented as new fields appended to the above structure, will be implemented as new fields appended to the above structure,
with a zero value in a new field resulting in the kernel behaving with a zero value in a new field resulting in the kernel behaving
as though that extension field was not present. as though that extension field was not present.
@ -94,17 +95,18 @@ The
argument should usually be specified as argument should usually be specified as
.IR "sizeof(struct mount_attr)" . .IR "sizeof(struct mount_attr)" .
However, However,
if the caller does not intend to make use of features that got if the caller does not intend to make use of features that
introduced after the initial version of got introduced after the initial version of
.I struct mount_attr .I struct mount_attr
they are free to pass the size of the initial struct together with the larger they are free to pass
struct. the size of the initial struct together with the larger struct.
This allows the kernel to not copy later parts of the struct that aren't used This allows the kernel to not copy later parts of the struct
anyway. that aren't used anyway.
With each extension that changes the size of With each extension that changes the size of
.I struct mount_attr .I struct mount_attr
the kernel will expose a define of the form the kernel will expose a define of the form
.BR MOUNT_ATTR_SIZE_VER<number> . .BI MOUNT_ATTR_SIZE_VER number\c
\&.
For example the macro for the size of the initial version of For example the macro for the size of the initial version of
.I struct mount_attr .I struct mount_attr
is is
@ -118,7 +120,8 @@ The supported values are:
.B AT_EMPTY_PATH .B AT_EMPTY_PATH
If If
.I path .I path
is the empty string change the mount properties on is the empty string,
change the mount properties on
.I dfd .I dfd
itself. itself.
.TP .TP
@ -134,7 +137,7 @@ Don't trigger automounts.
The The
.I attr .I attr
argument of argument of
.BR mount_setattr (2) .BR mount_setattr ()
is a structure of the following form: is a structure of the following form:
.PP .PP
.in +4n .in +4n
@ -152,18 +155,21 @@ The
.I attr_set .I attr_set
and and
.I attr_clr .I attr_clr
members are used to specify the mount properties that are supposed to be set or members are used to specify the mount properties that
cleared for a mount or mount tree. are supposed to be set or cleared for a mount or mount tree.
Flags set in Flags set in
.I attr_set .I attr_set
enable a property on a mount or mount tree and flags set in enable a property on a mount or mount tree,
and flags set in
.I attr_clr .I attr_clr
remove a property from a mount or mount tree. remove a property from a mount or mount tree.
.PP .PP
When changing mount properties the kernel will first clear the flags specified When changing mount properties,
the kernel will first clear the flags specified
in the in the
.I attr_clr .I attr_clr
field and then set the flags specified in the field,
and then set the flags specified in the
.I attr_set .I attr_set
field: field:
.PP .PP
@ -192,8 +198,8 @@ mnt->mnt_flags = current_mnt_flags;
.in .in
.PP .PP
The effect of this change will be a mount or mount tree that is read-only, The effect of this change will be a mount or mount tree that is read-only,
blocks the execution of set-user-ID and set-group-ID binaries but does allow to blocks the execution of set-user-ID and set-group-ID binaries,
execute programs and access to devices nodes. but does allow to execute programs and access to devices nodes.
Multiple changes with the same set of flags requested Multiple changes with the same set of flags requested
in in
.I attr_clr .I attr_clr
@ -210,7 +216,8 @@ fields:
.B MOUNT_ATTR_RDONLY .B MOUNT_ATTR_RDONLY
If set in If set in
.I attr_set .I attr_set
makes the mount read-only and if set in makes the mount read-only,
and if set in
.I attr_clr .I attr_clr
removes the read-only setting if set on the mount. removes the read-only setting if set on the mount.
.TP .TP
@ -227,46 +234,50 @@ and file capability restriction if set on this mount.
.B MOUNT_ATTR_NODEV .B MOUNT_ATTR_NODEV
If set in If set in
.I attr_set .I attr_set
prevents access to devices on this mount and if set in prevents access to devices on this mount,
and if set in
.I attr_clr .I attr_clr
removes the device access restriction if set on this mount. removes the restriction that prevented accesing devices on this mount.
.TP .TP
.BR MOUNT_ATTR_NOEXEC .B MOUNT_ATTR_NOEXEC
If set in If set in
.I attr_set .I attr_set
prevents executing programs on this mount and if set in prevents executing programs on this mount,
and if set in
.I attr_clr .I attr_clr
removes the restriction to execute programs on this mount. removes the restriction that prevented executing programs on this mount.
.TP .TP
.BR MOUNT_ATTR_NOSYMFOLLOW .B MOUNT_ATTR_NOSYMFOLLOW
If set in If set in
.I attr_set .I attr_set
prevents following symlinks on this mount and if set in prevents following symlinks on this mount,
and if set in
.I attr_clr .I attr_clr
removes the restriction to not follow symlinks on this mount. removes the restriction that prevented following symlinks on this mount.
.TP .TP
.B MOUNT_ATTR_NODIRATIME .B MOUNT_ATTR_NODIRATIME
If set in If set in
.I attr_set .I attr_set
prevents updating access time for directories on this mount and if set in prevents updating access time for directories on this mount,
and if set in
.I attr_clr .I attr_clr
removes access time restriction for directories. removes the restriction that prevented updating access time for directories.
Note that Note that
.BR MOUNT_ATTR_NODIRATIME .B MOUNT_ATTR_NODIRATIME
can be combined with other access time settings and is implied can be combined with other access time settings
by the noatime setting. and is implied by the noatime setting.
All other access time settings are mutually exclusive. All other access time settings are mutually exclusive.
.TP .TP
.BR MOUNT_ATTR__ATIME " - Changing access time settings .BR MOUNT_ATTR__ATIME " - Changing access time settings
In the new mount api the access time values are an enum starting from 0. In the new mount API the access time values are an enum starting from 0.
Even though they are an enum in contrast to the other mount flags such as Even though they are an enum in contrast to the other mount flags such as
.BR MOUNT_ATTR_NOEXEC .BR MOUNT_ATTR_NOEXEC ,
they are nonetheless passed in they are nonetheless passed in
.I attr_set .I attr_set
and and
.I attr_clr .I attr_clr
for consistency with for consistency with
.BR fsmount (2) .BR fsmount (2),
which introduced this behavior. which introduced this behavior.
.IP .IP
Note, Note,
@ -281,68 +292,67 @@ in the
.I attr_clr .I attr_clr
field. field.
The kernel will verify that The kernel will verify that
.BR MOUNT_ATTR__ATIME .B MOUNT_ATTR__ATIME
isn't partially set in isn't partially set in
.I attr_clr .IR attr_clr ,
and that and that
.I attr_set .I attr_set
doesn't have any access time bits set if doesn't have any access time bits set if
.BR MOUNT_ATTR__ATIME .B MOUNT_ATTR__ATIME
isn't set in isn't set in
.IR attr_clr . .IR attr_clr .
.RS .RS
.TP .TP
.B MOUNT_ATTR_RELATIME .B MOUNT_ATTR_RELATIME
When a file is accessed via this mount, When a file is accessed via this mount,
update the file's last access time update the file's last access time (atime)
(atime) only if the current value of atime is less than or equal to
only if the current value of atime is less than or equal to the file's the file's last modification time (mtime) or last status change time (ctime).
last modification time (mtime) or last status change time (ctime).
.IP .IP
To enable this access time setting on a mount or mount tree To enable this access time setting on a mount or mount tree,
.BR MOUNT_ATTR_RELATIME .B MOUNT_ATTR_RELATIME
must be set in must be set in
.I attr_set .I attr_set
and and
.BR MOUNT_ATTR__ATIME .B MOUNT_ATTR__ATIME
must be set in the must be set in the
.I attr_clr .I attr_clr
field. field.
.TP .TP
.BR MOUNT_ATTR_NOATIME .B MOUNT_ATTR_NOATIME
Do not update access times for (all types of) files on this mount. Do not update access times for (all types of) files on this mount.
.IP .IP
To enable this access time setting on a mount or mount tree To enable this access time setting on a mount or mount tree,
.BR MOUNT_ATTR_NOATIME .B MOUNT_ATTR_NOATIME
must be set in must be set in
.I attr_set .I attr_set
and and
.BR MOUNT_ATTR__ATIME .B MOUNT_ATTR__ATIME
must be set in the must be set in the
.I attr_clr .I attr_clr
field. field.
.TP .TP
.BR MOUNT_ATTR_STRICTATIME .B MOUNT_ATTR_STRICTATIME
Always update the last access time (atime) when files are accessed on this Always update the last access time (atime)
mount. when files are accessed on this mount.
.IP .IP
To enable this access time setting on a mount or mount tree To enable this access time setting on a mount or mount tree,
.BR MOUNT_ATTR_STRICTATIME .B MOUNT_ATTR_STRICTATIME
must be set in must be set in
.I attr_set .I attr_set
and and
.BR MOUNT_ATTR__ATIME .B MOUNT_ATTR__ATIME
must be set in the must be set in the
.I attr_clr .I attr_clr
field. field.
.RE .RE
.TP .TP
.BR MOUNT_ATTR_IDMAP .B MOUNT_ATTR_IDMAP
If set in If set in
.I attr_set .I attr_set
creates an idmapped mount. creates an idmapped mount.
Since it is not supported to change the idmapping of a mount after it has been Since it is not supported to
idmapped, change the idmapping of a mount after it has been idmapped,
it is invalid to specify it is invalid to specify
.B MOUNT_ATTR_IDMAP .B MOUNT_ATTR_IDMAP
in in
@ -350,54 +360,51 @@ in
The idmapping is taken from the user namespace specified in The idmapping is taken from the user namespace specified in
.I userns_fd .I userns_fd
and attached to the mount. and attached to the mount.
More details can be found in subsequent paragraphs.
.IP .IP
Creating an idmapped mount allows to change the ownership of all files located Creating an idmapped mount allows to
under a mount. change the ownership of all files located under a mount.
Thus, idmapped mounts make it possible to change ownership in a temporary and Thus, idmapped mounts make it possible to
localized way. change ownership in a temporary and localized way.
It is a localized change because ownership changes are restricted to a specific It is a localized change because
mount. ownership changes are restricted to a specific mount.
All other users and locations where the filesystem is exposed are unaffected. All other users and locations where the filesystem is exposed are unaffected.
And it is a temporary change because ownership changes are tied to the lifetime And it is a temporary change because
of the mount. ownership changes are tied to the lifetime of the mount.
.IP .IP
Whenever callers interact with the filesystem through an idmapped mount the Whenever callers interact with the filesystem through an idmapped mount,
idmapping of the mount will be applied to user and group IDs associated with the idmapping of the mount will be applied to
filesystem objects. user and group IDs associated with filesystem objects.
This encompasses the user and group IDs associated with inodes and also This encompasses the user and group IDs associated with inodes
the following and also the following
.BR xattr (7) .BR xattr (7)
keys: keys:
.RS .RS
.RS .IP \(bu
.IP \(bu 2 .IR security.capability ,
.IR security.capability
whenever filesystem whenever filesystem
.BR capabilities (7) .BR capabilities (7)
are stored or returned in the are stored or returned in the
.I VFS_CAP_REVISION_3 .I VFS_CAP_REVISION_3
format which stores a rootid alongside the capabilities. format,
.IP \(bu 2 which stores a rootid alongside the capabilities.
.IP \(bu
.I system.posix_acl_access .I system.posix_acl_access
and and
.I system.posix_acl_default .IR system.posix_acl_default ,
whenever user IDs or group IDs are stored in whenever user IDs or group IDs are stored in
.BR ACL_USER .B ACL_USER
and or
.BR ACL_GROUP .B ACL_GROUP
entries. entries.
.RE .RE
.RE
.IP .IP
The following conditions must be met in order to create an idmapped mount: The following conditions must be met in order to create an idmapped mount:
.RS .RS
.RS .IP \(bu
.IP \(bu 2
The caller must have The caller must have
.I CAP_SYS_ADMIN .I CAP_SYS_ADMIN
in the initial user namespace. in the initial user namespace.
.IP \(bu 2 .IP \(bu
The filesystem must be mounted in the initial user namespace. The filesystem must be mounted in the initial user namespace.
.IP \(bu .IP \(bu
The underlying filesystem must support idmapped mounts. The underlying filesystem must support idmapped mounts.
@ -405,9 +412,9 @@ Currently
.BR xfs (5), .BR xfs (5),
.BR ext4 (5) .BR ext4 (5)
and and
.BR fat .B FAT
filesystems support idmapped mounts with more filesystems being actively worked filesystems support idmapped mounts
on. with more filesystems being actively worked on.
.IP \(bu .IP \(bu
The mount must not already be idmapped. The mount must not already be idmapped.
This also implies that the idmapping of a mount cannot be altered. This also implies that the idmapping of a mount cannot be altered.
@ -420,24 +427,24 @@ with the
.I OPEN_TREE_CLONE .I OPEN_TREE_CLONE
flag and it must not already have been visible in the filesystem. flag and it must not already have been visible in the filesystem.
.RE .RE
.RE
.IP .IP
Idmappings can be created for user IDs, group IDs, and project IDs. Idmappings can be created for user IDs, group IDs, and project IDs.
An idmapping is essentially a mapping of a range of user or group IDs into An idmapping is essentially a mapping of a range of user or group IDs into
another or the same range of user or group IDs. another or the same range of user or group IDs.
Idmappings are usually written as three numbers either separated by white space Idmappings are usually written as three numbers
or a full stop. either separated by white space or a full stop.
The first two numbers specify the starting user or group ID in each of the two The first two numbers specify the starting user or group ID
user namespaces. in each of the two user namespaces.
The third number specifies the range of the idmapping. The third number specifies the range of the idmapping.
For example, a mapping for user IDs such as 1000:1001:1 would indicate that For example, a mapping for user IDs such as 1000:1001:1 would indicate that
user ID 1000 in the caller's user namespace is mapped to user ID 1001 in its user ID 1000 in the caller's user namespace is mapped to
ancestor user namespace. user ID 1001 in its ancestor user namespace.
Since the map range is 1 only user ID 1000 is mapped. Since the map range is 1,
only user ID 1000 is mapped.
It is possible to specify up to 340 idmappings for each idmapping type. It is possible to specify up to 340 idmappings for each idmapping type.
If any user IDs or group IDs are not mapped all files owned by that unmapped If any user IDs or group IDs are not mapped,
user or group ID will appear as being owned by the overflow user ID or overflow all files owned by that unmapped user or group ID will appear as
group ID respectively. being owned by the overflow user ID or overflow group ID respectively.
Further details and instructions for setting up idmappings can be found in the Further details and instructions for setting up idmappings can be found in the
.BR user_namespaces (7) .BR user_namespaces (7)
man page. man page.
@ -445,69 +452,70 @@ man page.
In the common case the user namespace passed in In the common case the user namespace passed in
.I userns_fd .I userns_fd
together with together with
.BR MOUNT_ATTR_IDMAP .B MOUNT_ATTR_IDMAP
in in
.I attr_set .I attr_set
to create an idmapped mount will be the user namespace of a container. to create an idmapped mount will be the user namespace of a container.
In other scenarios it will be a dedicated user namespace associated with a In other scenarios it will be a dedicated user namespace associated with
user's login session as is the case for portable home directories in a user's login session as is the case for portable home directories in
.BR systemd-homed.service (8) ). .BR systemd-homed.service (8) ).
It is also perfectly fine to create a dedicated user namespace for the sake of It is also perfectly fine to create a dedicated user namespace
idmapping a mount. for the sake of idmapping a mount.
.IP .IP
Idmapped mounts can be useful in the following and a variety of other Idmapped mounts can be useful in the following
scenarios: and a variety of other scenarios:
.RS .RS
.RS .IP \(bu
.IP \(bu 2 Sharing files between multiple users or multiple machines,
sharing files between multiple users or multiple machines especially in especially in complex scenarios.
complex scenarios.
For example, For example,
idmapped mounts are used to implement portable home directories in idmapped mounts are used to implement portable home directories in
.BR systemd-homed.service (8) .BR systemd-homed.service (8)
where they allow users to move their home directory to an external storage where they allow users to move their home directory
device and use it on multiple computers where they are assigned different user IDs to an external storage device
and group IDs. and use it on multiple computers
This effectively makes it possible to assign random user IDs and group IDs at login time. where they are assigned different user IDs and group IDs.
This effectively makes it possible to
assign random user IDs and group IDs at login time.
.IP \(bu .IP \(bu
sharing files from the host with unprivileged containers. Sharing files from the host with unprivileged containers.
This allows user to avoid having to change ownership permanently through This allows a user to avoid having to change ownership permanently through
.BR chown (2) . .BR chown (2) .
.IP \(bu .IP \(bu
idmapping a container's root filesystem. Idmapping a container's root filesystem.
Users don't need to change ownership Users don't need to change ownership permanently through
permanently through
.BR chown (2) . .BR chown (2) .
Especially for large root filesystems using Especially for large root filesystems, using
.BR chown (2) .BR chown (2)
can be prohibitively expensive. can be prohibitively expensive.
.IP \(bu .IP \(bu
sharing files between containers with non-overlapping Sharing files between containers with non-overlapping idmappings.
idmappings.
.IP \(bu .IP \(bu
implementing discretionary access (DAC) permission checking for fileystems Implementing discretionary access (DAC) permission checking
lacking a concept of ownership. for fileystems lacking a concept of ownership.
.IP \(bu .IP \(bu
efficiently change ownership on a per-mount basis. Efficiently change ownership on a per-mount basis.
In contrast to In contrast to
.BR chown (2) .BR chown (2),
changing ownership of large sets of files is instantenous with idmapped mounts. changing ownership of large sets of files is instantenous with idmapped mounts.
This is especially useful when ownership of an entire root filesystem of a This is especially useful when ownership of
virtual machine or container is to be changed as we've mentioned above. an entire root filesystem of a virtual machine or container
With idmapped mounts a single is to be changed as we've mentioned above.
.BR mount_setattr (2) With idmapped mounts,
a single
.BR mount_setattr ()
system call will be sufficient to change the ownership of all files. system call will be sufficient to change the ownership of all files.
.IP \(bu .IP \(bu
taking the current ownership into account. Taking the current ownership into account.
Idmappings specify precisely what a user or group ID is supposed to be Idmappings specify precisely
mapped to. what a user or group ID is supposed to be mapped to.
This contrasts with the This contrasts with the
.BR chown (2) .BR chown (2)
system call which cannot by itself take the current ownership of the files it system call which cannot by itself
changes into account. take the current ownership of the files it changes into account.
It simply changes the ownership to the specified user ID and group ID. It simply changes the ownership to the specified user ID and group ID.
.IP \(bu .IP \(bu
locally and temporarily restricted ownership changes. Locally and temporarily restricted ownership changes.
Idmapped mounts allow to change ownership locally, Idmapped mounts allow to change ownership locally,
restricting it to specific mounts, restricting it to specific mounts,
and temporarily as the ownership changes only apply as long as the mount exists. and temporarily as the ownership changes only apply as long as the mount exists.
@ -516,7 +524,6 @@ changing ownership via the
.BR chown (2) .BR chown (2)
system call changes the ownership globally and permanently. system call changes the ownership globally and permanently.
.RE .RE
.RE
.PP .PP
The The
.I propagation .I propagation
@ -538,13 +545,13 @@ will propagate to the other mount points that are members of the peer group.
Propagation here means that the same mount or unmount will automatically occur Propagation here means that the same mount or unmount will automatically occur
under all of the other mount points in the peer group. under all of the other mount points in the peer group.
Conversely, Conversely,
mount and unmount events that take place under peer mount points will propagate mount and unmount events that take place under peer mount points
to this mount point. will propagate to this mount point.
.TP .TP
.B MS_SLAVE .B MS_SLAVE
Turn all mounts into dependent mounts. Turn all mounts into dependent mounts.
Mount and unmount events propagate into this mount point from a shared peer Mount and unmount events propagate into this mount point
group. from a shared peer group.
Mount and unmount events under this mount point do not propagate to any peer. Mount and unmount events under this mount point do not propagate to any peer.
.TP .TP
.B MS_UNBINDABLE .B MS_UNBINDABLE
@ -558,7 +565,7 @@ when replicating that subtree to produce the target subtree.
.PP .PP
.SH RETURN VALUE .SH RETURN VALUE
On success, On success,
.BR mount_setattr (2) .BR mount_setattr ()
returns zero. returns zero.
On error, On error,
\-1 is returned and \-1 is returned and
@ -576,8 +583,8 @@ is not a valid file descriptor.
.TP .TP
.B EBUSY .B EBUSY
The caller tried to change the mount to The caller tried to change the mount to
.BR MOUNT_ATTR_RDONLY .B MOUNT_ATTR_RDONLY
but the mount still has files open for writing. but the mount still holds files open for writing.
.TP .TP
.B EINVAL .B EINVAL
The path specified via the The path specified via the
@ -585,7 +592,7 @@ The path specified via the
and and
.I path .I path
arguments to arguments to
.BR mount_setattr (2) .BR mount_setattr ()
isn't a mountpoint. isn't a mountpoint.
.TP .TP
.B EINVAL .B EINVAL
@ -612,11 +619,11 @@ field of
.TP .TP
.B EINVAL .B EINVAL
More than one of More than one of
.BR MS_SHARED, .BR MS_SHARED ,
.BR MS_SLAVE, .BR MS_SLAVE ,
.BR MS_PRIVATE, .BR MS_PRIVATE ,
or or
.BR MS_UNBINDABLE .B MS_UNBINDABLE
was set in was set in
.I propagation .I propagation
field of field of
@ -626,13 +633,13 @@ field of
An access time setting was specified in the An access time setting was specified in the
.I attr_set .I attr_set
field without field without
.BR MOUNT_ATTR__ATIME .B MOUNT_ATTR__ATIME
being set in the being set in the
.I attr_clr .I attr_clr
field. field.
.TP .TP
.B EINVAL .B EINVAL
.BR MOUNT_ATTR_IDMAP .B MOUNT_ATTR_IDMAP
was specified in was specified in
.IR attr_clr . .IR attr_clr .
.TP .TP
@ -645,8 +652,8 @@ which exceeds
.B EINVAL .B EINVAL
A valid file descriptor value was specified in A valid file descriptor value was specified in
.I userns_fd .I userns_fd
but the file descriptor wasn't a namespace file descriptor or did not refer to but the file descriptor wasn't a namespace file descriptor
a user namespace. or did not refer to a user namespace.
.TP .TP
.B EINVAL .B EINVAL
The underlying filesystem does not support idmapped mounts. The underlying filesystem does not support idmapped mounts.
@ -660,7 +667,7 @@ the mount is already visible in the filesystem.
A partial access time setting was specified in A partial access time setting was specified in
.I attr_clr .I attr_clr
instead of instead of
.BR MOUNT_ATTR__ATIME .B MOUNT_ATTR__ATIME
being set. being set.
.TP .TP
.B EINVAL .B EINVAL
@ -674,14 +681,14 @@ A pathname was empty or had a nonexistent component.
.TP .TP
.B ENOMEM .B ENOMEM
When changing mount propagation to When changing mount propagation to
.BR MS_SHARED .B MS_SHARED
a new peer group id needs to be allocated for all mounts without a peer group a new peer group id needs to be allocated for all mounts without a peer group
id set. id set.
Allocation of this peer group id has failed. Allocation of this peer group id has failed.
.TP .TP
.B ENOSPC .B ENOSPC
When changing mount propagation to When changing mount propagation to
.BR MS_SHARED .B MS_SHARED
a new peer group id needs to be allocated for all mounts without a peer group a new peer group id needs to be allocated for all mounts without a peer group
id set. id set.
Allocation of this peer group id can fail. Allocation of this peer group id can fail.
@ -690,25 +697,25 @@ id allocation implementation used.
.TP .TP
.B EPERM .B EPERM
One of the mounts had at least one of One of the mounts had at least one of
.BR MOUNT_ATTR_NOATIME, .BR MOUNT_ATTR_NOATIME ,
.BR MOUNT_ATTR_NODEV, .BR MOUNT_ATTR_NODEV ,
.BR MOUNT_ATTR_NODIRATIME, .BR MOUNT_ATTR_NODIRATIME ,
.BR MOUNT_ATTR_NOEXEC, .BR MOUNT_ATTR_NOEXEC ,
.BR MOUNT_ATTR_NOSUID, .BR MOUNT_ATTR_NOSUID ,
or or
.BR MOUNT_ATTR_RDONLY .B MOUNT_ATTR_RDONLY
set and the flag is locked. set and the flag is locked.
Mount attributes become locked on a mount if: Mount attributes become locked on a mount if:
.RS .RS
.IP \(bu 2 .IP \(bu
a new mount or mount tree is created causing mount propagation across user A new mount or mount tree is created causing mount propagation across user
namespaces. namespaces.
The kernel will lock the aforementioned flags to protect these sensitive The kernel will lock the aforementioned flags to protect these sensitive
properties from being altered. properties from being altered.
.IP \(bu .IP \(bu
a new mount and user namespace pair is created. A new mount and user namespace pair is created.
This happens for example when specifying This happens for example when specifying
.BR CLONE_NEWUSER | CLONE_NEWNS .B CLONE_NEWUSER | CLONE_NEWNS
in in
.BR unshare (2), .BR unshare (2),
.BR clone (2), .BR clone (2),
@ -731,18 +738,18 @@ The caller does not have
.I CAP_SYS_ADMIN .I CAP_SYS_ADMIN
in the initial user namespace. in the initial user namespace.
.SH VERSIONS .SH VERSIONS
.BR mount_setattr (2) .BR mount_setattr ()
first appeared in Linux 5.12. first appeared in Linux 5.12.
.\" commit 7d6beb71da3cc033649d641e1e608713b8220290 .\" commit 7d6beb71da3cc033649d641e1e608713b8220290
.\" commit 2a1867219c7b27f928e2545782b86daaf9ad50bd .\" commit 2a1867219c7b27f928e2545782b86daaf9ad50bd
.\" commit 9caccd41541a6f7d6279928d9f971f6642c361af .\" commit 9caccd41541a6f7d6279928d9f971f6642c361af
.SH CONFORMING TO .SH CONFORMING TO
.BR mount_setattr (2) .BR mount_setattr ()
is Linux specific. is Linux-specific.
.SH NOTES .SH NOTES
.SS Extensibility .SS Extensibility
In order to allow for future extensibility, In order to allow for future extensibility,
.BR mount_setattr (2) .BR mount_setattr ()
along with other system calls such as along with other system calls such as
.BR openat2 (2) .BR openat2 (2)
and and
@@ -751,7 +758,7 @@ requires the user-space application to specify the size of the
.I mount_attr .I mount_attr
structure that it is passing. structure that it is passing.
By providing this information, it is possible for By providing this information, it is possible for
.BR mount_setattr (2) .BR mount_setattr ()
to provide both forwards- and backwards-compatibility, with to provide both forwards- and backwards-compatibility, with
.I size .I size
acting as an implicit version number. acting as an implicit version number.
@@ -771,10 +778,9 @@ and let
.I ksize .I ksize
be the size of the structure which the kernel supports, be the size of the structure which the kernel supports,
then there are three cases to consider: then there are three cases to consider:
.RS .IP \(bu
.IP \(bu 2
If If
.IR ksize .I ksize
equals equals
.IR usize , .IR usize ,
then there is no version mismatch and then there is no version mismatch and
@@ -782,31 +788,32 @@ then there is no version mismatch and
can be used verbatim. can be used verbatim.
.IP \(bu .IP \(bu
If If
.IR ksize .I ksize
is larger than is larger than
.IR usize , .IR usize ,
then there are some extension fields that the kernel supports which the then there are some extension fields that the kernel supports
user-space application is unaware of. which the user-space application is unaware of.
Because a zero value in any added extension field signifies a no-op, Because a zero value in any added extension field signifies a no-op,
the kernel treats all of the extension fields not provided by the user-space the kernel treats all of the extension fields
application as having zero values. not provided by the user-space application
as having zero values.
This provides backwards-compatibility. This provides backwards-compatibility.
.IP \(bu .IP \(bu
If If
.IR ksize .I ksize
is smaller than is smaller than
.IR usize , .IR usize ,
then there are some extension fields which the user-space application is aware then there are some extension fields which the user-space application is aware
of but which the kernel does not support. of but which the kernel does not support.
Because any extension field must have its zero values signify a no-op, Because any extension field must have its zero values signify a no-op,
the kernel can safely ignore the unsupported extension fields if they are the kernel can safely ignore the unsupported extension fields
all zero. if they are all zero.
If any unsupported extension fields are non-zero, then \-1 is returned and If any unsupported extension fields are non-zero,
then \-1 is returned and
.I errno .I errno
is set to is set to
.BR E2BIG . .BR E2BIG .
This provides forwards-compatibility. This provides forwards-compatibility.
.RE
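The three cases above can be sketched in user space as follows.
This is only an illustrative model, not the kernel's implementation:
the function name copy_struct_model() and its parameters are hypothetical,
and plain memory copies stand in for the user/kernel boundary.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model of the size negotiation described above.
 * dst has room for ksize bytes (what the "kernel" supports);
 * src holds usize bytes (what the "application" passed).
 * Returns 0 on success, or -E2BIG if the caller set a byte
 * in a field that this "kernel" does not know about.
 */
static int
copy_struct_model(void *dst, size_t ksize, const void *src, size_t usize)
{
    const unsigned char *p = src;
    size_t common = usize < ksize ? usize : ksize;

    /* ksize < usize: unsupported trailing fields must be all zero. */
    for (size_t i = ksize; i < usize; i++) {
        if (p[i] != 0)
            return -E2BIG;
    }

    /* ksize > usize: fields the caller did not supply read as zero. */
    memset(dst, 0, ksize);

    /* ksize == usize falls through to a verbatim copy. */
    memcpy(dst, src, common);
    return 0;
}
```

The zero-fill before the copy is what makes a zero value act as a no-op
for every field the caller did not supply.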
.PP .PP
Because the definition of Because the definition of
.I struct mount_attr .I struct mount_attr
@@ -842,7 +849,7 @@ attr.attr_clr = MOUNT_ATTR_NODEV;
.PP .PP
A user-space application that wishes to determine which extensions the running A user-space application that wishes to determine which extensions the running
kernel supports can do so by conducting a binary search on kernel supports can do so by conducting a binary search on
.IR size .I size
with a structure which has every byte nonzero with a structure which has every byte nonzero
(to find the largest value which doesn't produce an error of (to find the largest value which doesn't produce an error of
.BR E2BIG ) . .BR E2BIG ) .
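A minimal sketch of such a binary search, under stated assumptions:
the callback type too_big_fn and the helper largest_supported_size()
are hypothetical names, and fake_too_big() merely stands in for a real probe
that would call mount_setattr() with an all-nonzero buffer of the given size
and report whether errno was E2BIG.
The search assumes that the lower bound is known to be accepted
and that every size above the kernel's limit is rejected.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical probe: true if a call with this size fails with E2BIG. */
typedef bool (*too_big_fn)(size_t size);

/*
 * Return the largest size in [lo, hi] for which too_big() is false.
 * Assumes too_big(lo) is false and that too_big() is monotonic.
 */
static size_t
largest_supported_size(too_big_fn too_big, size_t lo, size_t hi)
{
    while (lo < hi) {
        /* Round the midpoint up so that the range always shrinks. */
        size_t mid = lo + (hi - lo + 1) / 2;

        if (too_big(mid))
            hi = mid - 1;
        else
            lo = mid;
    }
    return lo;
}

/* Stand-in probe for illustration: pretends the kernel supports 32 bytes. */
static bool
fake_too_big(size_t size)
{
    return size > 32;
}
```

With the stand-in probe, largest_supported_size(fake_too_big, 8, 4096)
converges on 32; a real probe would replace fake_too_big().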
@@ -865,30 +872,26 @@ with a structure which has every byte nonzero
#include <sys/syscall.h> #include <sys/syscall.h>
#include <unistd.h> #include <unistd.h>
static inline int mount_setattr(int dfd, static inline int
const char *path, mount_setattr(int dfd, const char *path, unsigned int flags,
unsigned int flags, struct mount_attr *attr, size_t size)
struct mount_attr *attr,
size_t size)
{ {
return syscall(SYS_mount_setattr, dfd, path, return syscall(SYS_mount_setattr, dfd, path, flags, attr, size);
flags, attr, size);
} }
static inline int open_tree(int dfd, const char *filename, static inline int
open_tree(int dfd, const char *filename,
unsigned int flags) unsigned int flags)
{ {
return syscall(SYS_open_tree, dfd, filename, flags); return syscall(SYS_open_tree, dfd, filename, flags);
} }
static inline int move_mount(int from_dfd, static inline int
const char *from_pathname, move_mount(int from_dfd, const char *from_pathname,
int to_dfd, int to_dfd, const char *to_pathname, unsigned int flags)
const char *to_pathname,
unsigned int flags)
{ {
return syscall(SYS_move_mount, from_dfd, return syscall(SYS_move_mount, from_dfd, from_pathname,
from_pathname, to_dfd, to_pathname, flags); to_dfd, to_pathname, flags);
} }
static const struct option longopts[] = { static const struct option longopts[] = {
@@ -902,23 +905,25 @@ static const struct option longopts[] = {
{ NULL, 0, NULL, 0 }, { NULL, 0, NULL, 0 },
}; };
#define exit_log(format, ...) \\ #define exit_log(format, ...) do \e
({ \\ { \e
fprintf(stderr, format, ##__VA_ARGS__); \\ fprintf(stderr, format, ##__VA_ARGS__); \e
exit(EXIT_FAILURE); \\ exit(EXIT_FAILURE); \e
}) } while (0)
int main(int argc, char *argv[]) int
main(int argc, char *argv[])
{ {
int fd_userns = \-EBADF, index = 0; int fd_userns = \-EBADF;
int index = 0;
bool recursive = false; bool recursive = false;
struct mount_attr *attr = &(struct mount_attr){}; struct mount_attr *attr = &(struct mount_attr){};
const char *source, *target; const char *source, *target;
int fd_tree, new_argc, ret; int fd_tree, new_argc, ret;
char *const *new_argv; const char *const *new_argv;
while ((ret = getopt_long_only(argc, argv, "", while ((ret = getopt_long_only(argc, argv, "",
longopts, &index)) != \-1) { longopts, &index)) != \-1) {
switch (ret) { switch (ret) {
case 'a': case 'a':
fd_userns = open(optarg, O_RDONLY | O_CLOEXEC); fd_userns = open(optarg, O_RDONLY | O_CLOEXEC);
@@ -985,7 +990,6 @@ int main(int argc, char *argv[])
exit(EXIT_SUCCESS); exit(EXIT_SUCCESS);
} }
.EE .EE
.fi
.SH SEE ALSO .SH SEE ALSO
.BR capabilities (7), .BR capabilities (7),
.BR clone (2), .BR clone (2),