.\" Copyright (C) 2014 David Herrmann <dh.herrmann@gmail.com>
.\" and Copyright (C) 2014 Michael Kerrisk <mtk.manpages@gmail.com>
.\"
.\" %%%LICENSE_START(GPLv2+_SW_3_PARA)
.\"
.\" FIXME What is _SW_3_PARA?
.\"
.\" This program is free software; you can redistribute it and/or modify
.\" it under the terms of the GNU General Public License as published by
.\" the Free Software Foundation; either version 2 of the License, or
.\" (at your option) any later version.
.\"
.\" This program is distributed in the hope that it will be useful,
.\" but WITHOUT ANY WARRANTY; without even the implied warranty of
.\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
.\" GNU General Public License for more details.
.\"
.\" You should have received a copy of the GNU General Public
.\" License along with this manual; if not, see
.\" <http://www.gnu.org/licenses/>.
.\" %%%LICENSE_END
.\"
.TH MEMFD_CREATE 2 2014-07-08 Linux "Linux Programmer's Manual"
.SH NAME
memfd_create \- create an anonymous file
.SH SYNOPSIS
.B #include <sys/memfd.h>
.sp
.BI "int memfd_create(const char *" name ", unsigned int " flags ");"
.SH DESCRIPTION
.BR memfd_create ()
creates an anonymous file and returns a file descriptor that refers to it.
The file behaves like a regular file, and so can be modified,
truncated, memory-mapped, and so on.
However, unlike a regular file,
it lives in RAM and has volatile backing storage.
.\" FIXME In the following sentence I changed "released" to
.\" "destroyed". Okay?
Once all references to the file are dropped, it is automatically released.
Anonymous memory is used for all backing pages of the file.
.\" FIXME In the following sentence I changed "they" to
.\" "files created by memfd_create()". Okay?
Therefore, files created by
.BR memfd_create ()
are subject to the same restrictions as other anonymous
.\" FIXME Can you give some examples of some of the restrictions please.
memory allocations such as those allocated using
.BR mmap (2)
with the
.BR MAP_ANONYMOUS
flag.

The initial size of the file is set to 0.
.\" FIXME I added the following sentence. Please review.
Following the call, the file size should be set using
.BR ftruncate (2).
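.PP
The create, size, and map sequence described above can be sketched as follows.
This is a minimal illustration, not a reference implementation:
memfd_create_raw() and memfd_roundtrip() are hypothetical helper names,
and the sketch assumes that the system call is reached through
.BR syscall (2)
because a C library wrapper may not be available.
.nf
```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: invoke the system call directly. */
static int
memfd_create_raw(const char *name, unsigned int flags)
{
    return (int) syscall(SYS_memfd_create, name, flags);
}

/* Create an anonymous file, size it to len bytes, store msg through a
   shared mapping, and read it back through the file descriptor.
   Returns 0 if the data read back matches, -1 otherwise. */
static int
memfd_roundtrip(const char *msg, size_t len)
{
    char buf[64];
    char *p;
    ssize_t n;
    int fd, ok;

    assert(len > 0 && len <= sizeof(buf));

    fd = memfd_create_raw("roundtrip", 0);
    if (fd == -1)
        return -1;

    /* The new file has size 0, so it must be sized before mapping. */
    if (ftruncate(fd, (off_t) len) == -1) {
        close(fd);
        return -1;
    }

    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }
    memcpy(p, msg, len);

    /* The same bytes are visible through ordinary file I/O. */
    n = pread(fd, buf, len, 0);
    ok = (n == (ssize_t) len && memcmp(buf, msg, len) == 0);

    munmap(p, len);
    close(fd);
    return ok ? 0 : -1;
}
```
.fi
A caller might invoke memfd_roundtrip("hello", 6) and treat a return
value of 0 as confirmation that the mapping and the descriptor refer to
the same anonymous file.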

The name supplied in
.I name
is used as an internal filename and will be displayed
.\" FIXME What does "internal" in the previous line mean?
as the target of the corresponding symbolic link in the directory
.\" FIXME I added the previous line. Is it correct?
.IR /proc/self/fd/ .
.\" FIXME In the next line, I added "as displayed in that
The displayed name is always prefixed with
.IR memfd:
and serves only for debugging purposes.
Names do not affect the behavior of the memfd,
.\" FIXME The term "memfd" appears here without having previously been
.\" defined. Would the correct definition of "the memfd" be
.\" "the file descriptor created by memfd_create"?
and as such multiple files can have the same name without any side effects.
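.PP
One way to observe the displayed name is to read the symbolic link under
.IR /proc/self/fd/ .
The sketch below is illustrative only:
memfd_create_raw(), fd_link_target(), and same_name_twice() are
hypothetical helper names, and the check looks only for the
.I memfd:
prefix followed by the name,
since the exact text of the link target is not specified here.
.nf
```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: invoke the system call directly. */
static int
memfd_create_raw(const char *name, unsigned int flags)
{
    return (int) syscall(SYS_memfd_create, name, flags);
}

/* Copy the target of the /proc/self/fd/<fd> symbolic link into buf
   as a null-terminated string.  Returns 0 on success, -1 on error. */
static int
fd_link_target(int fd, char *buf, size_t size)
{
    char path[64];
    ssize_t n;

    assert(size > 1);
    snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);
    n = readlink(path, buf, size - 1);
    if (n == -1)
        return -1;
    buf[n] = 0;
    return 0;
}

/* Create two files with the same name.  Returns 1 if both calls
   succeed, yield distinct descriptors, and both link targets
   contain "memfd:demo"; returns 0 otherwise. */
static int
same_name_twice(void)
{
    char a[128], b[128];
    int fd1 = memfd_create_raw("demo", 0);
    int fd2 = memfd_create_raw("demo", 0);
    int ok = fd1 != -1 && fd2 != -1 && fd1 != fd2
             && fd_link_target(fd1, a, sizeof(a)) == 0
             && fd_link_target(fd2, b, sizeof(b)) == 0
             && strstr(a, "memfd:demo") != NULL
             && strstr(b, "memfd:demo") != NULL;

    if (fd1 != -1)
        close(fd1);
    if (fd2 != -1)
        close(fd2);
    return ok;
}
```
.fi
The two descriptors refer to independent files even though the names
are identical, which is why the name is useful only for debugging.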

The following values may be bitwise ORed in
.IR flags
to change the behavior of
.BR memfd_create ():
.TP
.BR MFD_CLOEXEC
Set the close-on-exec
.RB ( FD_CLOEXEC )
flag on the new file descriptor.
See the description of the
.B O_CLOEXEC
flag in
.BR open (2)
for reasons why this may be useful.
.TP
.BR MFD_ALLOW_SEALING
Allow sealing operations on this file.
See
.BR fcntl (2)
with
.B F_ADD_SEALS
and
.BR F_GET_SEALS ,
and also NOTES, below.
The initial set of seals is empty.
If this flag is not set, the initial set of seals will be
.BR F_SEAL_SEAL ,
meaning that no other seals can be set on the file.
.\" FIXME Why is the MFD_ALLOW_SEALING behavior not simply the default?
.\" Is it worth adding some text explaining this?
.PP
Unused bits in
.I flags
must be 0.
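.PP
The effect of the two flags can be inspected with
.BR fcntl (2),
as in the following sketch.
It is an illustration under stated assumptions:
memfd_create_raw(), initial_seals(), and cloexec_is_set() are
hypothetical helper names,
and the guarded constant definitions are fallbacks, assumed to match
the kernel headers, for systems whose C library does not yet
provide them.
.nf
```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef MFD_CLOEXEC             /* fallback if headers lack these */
#define MFD_CLOEXEC       0x0001U
#define MFD_ALLOW_SEALING 0x0002U
#endif
#ifndef F_ADD_SEALS             /* fallback if headers lack these */
#define F_ADD_SEALS 1033
#define F_GET_SEALS 1034
#define F_SEAL_SEAL 0x0001
#endif

/* Hypothetical helper: invoke the system call directly. */
static int
memfd_create_raw(const char *name, unsigned int flags)
{
    return (int) syscall(SYS_memfd_create, name, flags);
}

/* Return the seal set of a file freshly created with the given
   flags, or -1 on error. */
static int
initial_seals(unsigned int flags)
{
    int seals;
    int fd = memfd_create_raw("seals", flags);

    if (fd == -1)
        return -1;
    seals = fcntl(fd, F_GET_SEALS);
    close(fd);
    return seals;
}

/* Return 1 if a descriptor created with MFD_CLOEXEC has the
   FD_CLOEXEC flag set, 0 if it does not, -1 on error. */
static int
cloexec_is_set(void)
{
    int fdflags;
    int fd = memfd_create_raw("cloexec", MFD_CLOEXEC);

    if (fd == -1)
        return -1;
    fdflags = fcntl(fd, F_GETFD);
    close(fd);
    if (fdflags == -1)
        return -1;
    return (fdflags & FD_CLOEXEC) != 0;
}
```
.fi
With MFD_ALLOW_SEALING the returned seal set should not contain
F_SEAL_SEAL; without it, F_SEAL_SEAL is expected to be present,
so that later F_ADD_SEALS operations fail.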

As its return value,
.BR memfd_create ()
returns a new file descriptor that can be used to refer to the file.
This file descriptor is opened for both reading and writing
.RB ( O_RDWR )
and
.B O_LARGEFILE
is set for the descriptor.

With respect to
.BR fork (2)
and
.BR execve (2),
the usual semantics apply for the file descriptor created by
.BR memfd_create ().
A copy of the file descriptor is inherited by the child produced by
.BR fork (2)
and refers to the same file.
The file descriptor is preserved across
.BR execve (2),
unless the close-on-exec flag has been set.
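.PP
The
.BR fork (2)
semantics described above can be demonstrated with a short sketch in
which a child writes through the inherited descriptor and the parent
reads the data back.
The helper names memfd_create_raw() and child_write_parent_read() are
hypothetical, and the system call is invoked through
.BR syscall (2)
on the assumption that no C library wrapper is available.
.nf
```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: invoke the system call directly. */
static int
memfd_create_raw(const char *name, unsigned int flags)
{
    return (int) syscall(SYS_memfd_create, name, flags);
}

/* The child writes four bytes via its inherited copy of the file
   descriptor; the parent then reads them from offset 0.
   Returns 0 if the parent sees the child's data, -1 otherwise. */
static int
child_write_parent_read(void)
{
    char buf[4];
    int status, ok;
    pid_t pid;
    int fd = memfd_create_raw("forkdemo", 0);

    if (fd == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: the descriptor refers to the same anonymous file. */
        _exit(write(fd, "ping", 4) == 4 ? 0 : 1);
    }

    /* Parent: wait for the child, then read what it wrote. */
    ok = waitpid(pid, &status, 0) == pid
         && WIFEXITED(status) && WEXITSTATUS(status) == 0
         && pread(fd, buf, sizeof(buf), 0) == 4
         && memcmp(buf, "ping", 4) == 0;

    close(fd);
    return ok ? 0 : -1;
}
```
.fi
The parent uses
.BR pread (2)
with an explicit offset because the file offset is shared with the
child and has been advanced by the child's write.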
.SH RETURN VALUE
On success,
.BR memfd_create ()
returns a new file descriptor.
On error, \-1 is returned and
.I errno
is set to indicate the error.
.SH ERRORS
.TP
.B EFAULT
The address in
.IR name
points to invalid memory.
.TP
.B EINVAL
An unsupported value was specified in one of the arguments:
.I flags
included unknown bits, or
.I name
was too long.
.TP
.B EMFILE
The per-process limit on open file descriptors has been reached.
.TP
.B ENFILE
The system-wide limit on the total number of open files has been reached.
.TP
.B ENOMEM
There was insufficient memory to create a new anonymous file.
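.PP
The two EINVAL cases listed above can be provoked deliberately,
as in the following sketch.
The helper names memfd_create_raw(), memfd_errno(), and
long_name_errno() are hypothetical.
The 1000-byte name is an arbitrary choice assumed to exceed whatever
limit the kernel applies; this page does not state the limit.
.nf
```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: invoke the system call directly. */
static int
memfd_create_raw(const char *name, unsigned int flags)
{
    return (int) syscall(SYS_memfd_create, name, flags);
}

/* Return 0 if the call succeeds, otherwise the errno value
   that it produced. */
static int
memfd_errno(const char *name, unsigned int flags)
{
    int err;
    int fd = memfd_create_raw(name, flags);

    if (fd == -1) {
        err = errno;    /* save before any other call can change it */
        return err;
    }
    close(fd);
    return 0;
}

/* Attempt to create a file whose name is 999 characters long. */
static int
long_name_errno(void)
{
    char name[1000];

    memset(name, 'a', sizeof(name) - 1);
    name[sizeof(name) - 1] = 0;
    return memfd_errno(name, 0);
}
```
.fi
Passing a
.I flags
value with every bit set, such as ~0U, exercises the "unknown bits"
case, while long_name_errno() exercises the "too long" case.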
.SH VERSIONS
The
.BR memfd_create ()
system call first appeared in Linux 3.17.
.\" FIXME . When glibc support appears, update the following sentence:
Support in the GNU C library is pending.
.SH CONFORMING TO
The
.BR memfd_create ()
system call is Linux-specific.
.\" FIXME I added the NOTES section below. Please review.
.SH NOTES
.\" See also http://lwn.net/Articles/593918/
.\" and http://lwn.net/Articles/594919/ and http://lwn.net/Articles/591108/
The
.BR memfd_create ()
system call provides a simple alternative to manually mounting a
.I tmpfs
filesystem and creating and opening a file in that filesystem.
The primary purpose of
.BR memfd_create ()
is to create files and associated file descriptors that are
used with the file-sealing APIs provided by
.BR fcntl (2).
.SS File sealing
In the absence of file sealing,
processes that communicate via shared memory must either trust each other,
or take measures to deal with the possibility that an untrusted peer
may manipulate the shared memory region in problematic ways.
For example, an untrusted peer might modify the contents of the
shared memory at any time, or shrink the shared memory region.
The former possibility leaves the local process vulnerable to
time-of-check-to-time-of-use race conditions
(typically dealt with by copying data from
the shared memory region before checking and using it).
The latter possibility leaves the local process vulnerable to
.BR SIGBUS
signals when an attempt is made to access a now-nonexistent
location in the shared memory region.
(Dealing with this possibility necessitates the use of a handler for the
.BR SIGBUS
signal.)

Dealing with untrusted peers imposes extra complexity on
code that employs shared memory.
Memory sealing enables that extra complexity to be eliminated,
by allowing a process to operate secure in the knowledge that
its peer can't modify the shared memory in an undesired fashion.

An example of the usage of the sealing mechanism is as follows:

.IP 1. 3
The first process creates a
.I tmpfs
file using
.BR memfd_create ().
The call yields a file descriptor used in subsequent steps.
.IP 2.
The first process
sizes the file created in the previous step using
.BR ftruncate (2),
maps it using
.BR mmap (2),
and populates the shared memory with the desired data.
.IP 3.
The first process uses the
.BR fcntl (2)
.B F_ADD_SEALS
operation to place one or more seals on the file,
in order to restrict further modifications on the file.
(If the
.B F_SEAL_WRITE
seal is to be placed,
it is first necessary to unmap the shared writable mapping
created in the previous step.)
.IP 4.
A second process obtains a file descriptor for the
.I tmpfs
file and maps it.
This could happen in one of two ways:
.RS
.IP * 3
The second process is created via
.BR fork (2)
and thus automatically inherits the file descriptor and mapping.
.IP *
The second process opens the file
.IR /proc/<pid>/fd/<fd> ,
where
.I <pid>
is the PID of the first process (the one that called
.BR memfd_create ()),
and
.I <fd>
is the number of the file descriptor returned by the call to
.BR memfd_create ()
in that process.
The second process then maps the file using
.BR mmap (2).
.RE
.IP 5.
The second process uses the
.BR fcntl (2)
.B F_GET_SEALS
operation to retrieve the bit mask of seals
that have been applied to the file.
This bit mask can be inspected in order to determine
what kinds of restrictions have been placed on file modifications.
If desired, the second process can apply further seals
to impose additional restrictions (so long as the
.BR F_SEAL_SEAL
seal has not yet been applied).
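.PP
The steps above can be sketched in C as shown below.
This is a minimal single-process illustration, not a complete program:
make_sealed() plays the role of the first process (steps 1 to 3) and
is_fully_sealed() the check made by the second process (step 5).
The helper names, the REQUIRED_SEALS macro, and the particular choice of
seals are assumptions made for the example,
and the guarded constant definitions are fallbacks, assumed to match
the kernel headers, for C libraries that do not yet provide them.
.nf
```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef MFD_CLOEXEC             /* fallback if headers lack these */
#define MFD_CLOEXEC       0x0001U
#define MFD_ALLOW_SEALING 0x0002U
#endif
#ifndef F_ADD_SEALS             /* fallback if headers lack these */
#define F_ADD_SEALS   1033
#define F_GET_SEALS   1034
#define F_SEAL_SEAL   0x0001
#define F_SEAL_SHRINK 0x0002
#define F_SEAL_GROW   0x0004
#define F_SEAL_WRITE  0x0008
#endif

/* Seals chosen for this example: fixed size, fixed contents,
   and no further changes to the seal set. */
#define REQUIRED_SEALS \
    (F_SEAL_SHRINK | F_SEAL_GROW | F_SEAL_WRITE | F_SEAL_SEAL)

/* Hypothetical helper: invoke the system call directly. */
static int
memfd_create_raw(const char *name, unsigned int flags)
{
    return (int) syscall(SYS_memfd_create, name, flags);
}

/* Steps 1-3: create the file, size and populate it, then seal it.
   Returns the file descriptor, or -1 on error. */
static int
make_sealed(const char *data, size_t len)
{
    char *p;
    int fd = memfd_create_raw("sealed", MFD_ALLOW_SEALING);

    if (fd == -1)
        return -1;
    if (ftruncate(fd, (off_t) len) == -1)
        goto fail;

    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED)
        goto fail;
    memcpy(p, data, len);

    /* The writable shared mapping must be removed before
       F_SEAL_WRITE can be placed. */
    munmap(p, len);

    if (fcntl(fd, F_ADD_SEALS, REQUIRED_SEALS) == -1)
        goto fail;
    return fd;

fail:
    close(fd);
    return -1;
}

/* Step 5: return 1 only if every required seal is present. */
static int
is_fully_sealed(int fd)
{
    int seals = fcntl(fd, F_GET_SEALS);

    return seals != -1 && (seals & REQUIRED_SEALS) == REQUIRED_SEALS;
}
```
.fi
A receiving process would typically call is_fully_sealed() before
mapping the file read-only, and refuse the descriptor if the check
fails.
Once the seals are in place, attempts to write to the file or to change
its size are expected to fail.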
.\"
.\" FIXME Do we have any nice example program that could go in the man page?
.SH SEE ALSO
.BR fcntl (2),
.BR ftruncate (2),
.BR mmap (2),
.\" FIXME Why the reference to shmget(2) in particular (and not,
.\" e.g., shm_open(3))?
.BR shmget (2)