mirror of https://github.com/mkerrisk/man-pages
userfaultfd.2: Start documenting non-cooperative events
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
This commit is contained in:
parent
78cba5ac49
commit
5b1c4a1ed7
|
@ -75,7 +75,7 @@ flag in
|
|||
.PP
|
||||
When the last file descriptor referring to a userfaultfd object is closed,
|
||||
all memory ranges that were registered with the object are unregistered
|
||||
and unread page-fault events are flushed.
|
||||
and unread events are flushed.
|
||||
.\"
|
||||
.SS Usage
|
||||
The userfaultfd mechanism is designed to allow a thread in a multithreaded
|
||||
|
@ -99,6 +99,20 @@ In such non-cooperative mode,
|
|||
the process that monitors userfaultfd and handles page faults
|
||||
needs to be aware of the changes in the virtual memory layout
|
||||
of the faulting process to avoid memory corruption.
|
||||
|
||||
Starting from Linux 4.11,
|
||||
userfaultfd may notify the fault-handling threads about changes
|
||||
in the virtual memory layout of the faulting process.
|
||||
In addition, if the faulting process invokes
|
||||
.BR fork (2)
|
||||
system call,
|
||||
the userfaultfd objects associated with the parent may be duplicated
|
||||
into the child process and the userfaultfd monitor will be notified
|
||||
about the file descriptor associated with the userfault objects
|
||||
created for the child process,
|
||||
which allows userfaultfd monitor to perform user-space paging
|
||||
for the child process.
|
||||
|
||||
.\" FIXME elaborate about non-cooperating mode, describe its limitations
|
||||
.\" for kernels before 4.11, features added in 4.11
|
||||
.\" and limitations remaining in 4.11
|
||||
|
@ -144,6 +158,10 @@ Details of the various
|
|||
operations can be found in
|
||||
.BR ioctl_userfaultfd (2).
|
||||
|
||||
Since Linux 4.11, events other than page-fault may enabled during
|
||||
.B UFFDIO_API
|
||||
operation.
|
||||
|
||||
Up to Linux 4.11,
|
||||
userfaultfd can be used only with anonymous private memory mappings.
|
||||
|
||||
|
@ -156,7 +174,8 @@ Each
|
|||
.BR read (2)
|
||||
from the userfaultfd file descriptor returns one or more
|
||||
.I uffd_msg
|
||||
structures, each of which describes a page-fault event:
|
||||
structures, each of which describes a page-fault event
|
||||
or an event required for the non-cooperative userfaultfd usage:
|
||||
|
||||
.nf
|
||||
.in +4n
|
||||
|
@ -168,6 +187,23 @@ struct uffd_msg {
|
|||
__u64 flags; /* Flags describing fault */
|
||||
__u64 address; /* Faulting address */
|
||||
} pagefault;
|
||||
struct {
|
||||
__u32 ufd; /* userfault file descriptor
|
||||
of the child process */
|
||||
} fork; /* since Linux 4.11 */
|
||||
struct {
|
||||
__u64 from; /* old address of the
|
||||
remapped area */
|
||||
__u64 to; /* new address of the
|
||||
remapped area */
|
||||
__u64 len; /* original mapping length */
|
||||
} remap; /* since Linux 4.11 */
|
||||
struct {
|
||||
__u64 start; /* start address of the
|
||||
removed area */
|
||||
__u64 end; /* end address of the
|
||||
removed area */
|
||||
} remove; /* since Linux 4.11 */
|
||||
...
|
||||
} arg;
|
||||
|
||||
|
@ -194,14 +230,73 @@ structure are as follows:
|
|||
.TP
|
||||
.I event
|
||||
The type of event.
|
||||
Currently, only one value can appear in this field:
|
||||
.BR UFFD_EVENT_PAGEFAULT ,
|
||||
which indicates a page-fault event.
|
||||
Depending of the event type,
|
||||
different fields of the
|
||||
.I arg
|
||||
union represent details required for the event processing.
|
||||
The non-page-fault events are generated only when appropriate feature
|
||||
is enabled during API handshake with
|
||||
.B UFFDIO_API
|
||||
.BR ioctl (2).
|
||||
|
||||
The following values can appear in the
|
||||
.I event
|
||||
field:
|
||||
.RS
|
||||
.TP
|
||||
.I address
|
||||
.B UFFD_EVENT_PAGEFAULT
|
||||
A page-fault event.
|
||||
The page-fault details are available in the
|
||||
.I pagefault
|
||||
field.
|
||||
.TP
|
||||
.B UFFD_EVENT_FORK
|
||||
Generated when the faulting process invokes
|
||||
.BR fork (2)
|
||||
system call.
|
||||
The event details are available in the
|
||||
.I fork
|
||||
field.
|
||||
.\" FIXME descirbe duplication of userfault file descriptor during fork
|
||||
.TP
|
||||
.B UFFD_EVENT_REMAP
|
||||
Generated when the faulting process invokes
|
||||
.BR mremap (2)
|
||||
system call.
|
||||
The event details are available in the
|
||||
.I remap
|
||||
field.
|
||||
.TP
|
||||
.B UFFD_EVENT_REMOVE
|
||||
Generated when the faulting process invokes
|
||||
.BR madvise (2)
|
||||
system call with
|
||||
.BR MADV_DONTNEED
|
||||
or
|
||||
.BR MADV_REMOVE
|
||||
advice.
|
||||
The event details are available in the
|
||||
.I remove
|
||||
field.
|
||||
.TP
|
||||
.B UFFD_EVENT_UNMAP
|
||||
Generated when the faulting process unmaps a memory range,
|
||||
either explicitly using
|
||||
.BR munmap (2)
|
||||
system call or implicitly during
|
||||
.BR mmap (2)
|
||||
or
|
||||
.BR mremap (2)
|
||||
system calls.
|
||||
The event details are available in the
|
||||
.I remove
|
||||
field.
|
||||
.RE
|
||||
.TP
|
||||
.I pagefault.address
|
||||
The address that triggered the page fault.
|
||||
.TP
|
||||
.I flags
|
||||
.I pagefault.flags
|
||||
A bit mask of flags that describe the event.
|
||||
For
|
||||
.BR UFFD_EVENT_PAGEFAULT ,
|
||||
|
@ -218,6 +313,32 @@ otherwise it is a read fault.
|
|||
.\"
|
||||
.\" UFFD_PAGEFAULT_FLAG_WP is not yet supported.
|
||||
.RE
|
||||
.TP
|
||||
.I fork.ufd
|
||||
The file descriptor associated with the userfault object
|
||||
created for the child process
|
||||
.TP
|
||||
.I remap.from
|
||||
The original address of the memory range that was remapped using
|
||||
.BR mremap (2).
|
||||
.TP
|
||||
.I remap.to
|
||||
The new address of the memory range that was remapped using
|
||||
.BR mremap (2).
|
||||
.TP
|
||||
.I remap.len
|
||||
The original length of the the memory range that was remapped using
|
||||
.BR mremap (2).
|
||||
.TP
|
||||
.I remove.start
|
||||
The start address of the memory range that was freed using
|
||||
.BR madvise (2)
|
||||
or unmapped
|
||||
.TP
|
||||
.I remove.end
|
||||
The end address of the memory range that was freed using
|
||||
.BR madvise (2)
|
||||
or unmapped
|
||||
.PP
|
||||
A
|
||||
.BR read (2)
|
||||
|
|
Loading…
Reference in New Issue