mirror of https://github.com/mkerrisk/man-pages
+ changed the "policy" parameter to "mode" through out the
descriptions in an attempt to promote the concept that the memory policy is a tuple consisting of a mode and optional set of nodes. + added requirement to link '-lnuma' to synopsis + rewrite portions of description for clarification. ++ clarify interaction of policy with mmap()'d files. ++ defined how "empty set of nodes" specified and what this means for MPOL_PREFERRED. ++ mention what happens if local/target node contains no free memory. ++ clarify semantics of multiple nodes to BIND policy. Note: subject to change. We'll fix the man pages when/if this happens. + added all errors currently returned by sys call. + added mmap(2) to See Also list.
This commit is contained in:
parent
9f5682a8ed
commit
73ae0a09ae
|
@ -1,4 +1,5 @@
|
|||
.\" Copyright 2003,2004 Andi Kleen, SuSE Labs.
|
||||
.\" and Copyright 2007 Lee Schermerhorn, Hewlett Packard
|
||||
.\"
|
||||
.\" Permission is granted to make and distribute verbatim copies of this
|
||||
.\" manual provided the copyright notice and this permission notice are
|
||||
|
@ -18,88 +19,150 @@
|
|||
.\" the source, must acknowledge the copyright and authors of this work.
|
||||
.\"
|
||||
.\" 2006-02-03, mtk, substantial wording changes and other improvements
|
||||
.\" 2007-06-01, lts, more precise specification of behavior.
|
||||
.\"
|
||||
.TH SET_MEMPOLICY 2 2006-02-07 "Linux" "Linux Programmer's Manual"
|
||||
.TH SET_MEMPOLICY 2 2006-02-07 Linux "Linux Programmer's Manual"
|
||||
.SH NAME
|
||||
set_mempolicy \- set default NUMA memory policy for a process and its children.
|
||||
set_mempolicy \- set default NUMA memory policy for a process and its children
|
||||
.SH SYNOPSIS
|
||||
.nf
|
||||
.B "#include <numaif.h>"
|
||||
.sp
|
||||
.BI "int set_mempolicy(int " policy ", unsigned long *" nodemask ,
|
||||
.BI "int set_mempolicy(int " mode ", unsigned long *" nodemask ,
|
||||
.BI " unsigned long " maxnode );
|
||||
.sp
|
||||
.BI "cc ... \-lnuma"
|
||||
.fi
|
||||
.SH DESCRIPTION
|
||||
.BR set_mempolicy ()
|
||||
sets the NUMA memory policy of the calling process to
|
||||
.IR policy .
|
||||
sets the NUMA memory policy of the calling process,
|
||||
which consists of a policy mode and zero or more nodes,
|
||||
to the values specified by the
|
||||
.IR mode ,
|
||||
.I nodemask
|
||||
and
|
||||
.IR maxnode
|
||||
arguments.
|
||||
|
||||
A NUMA machine has different
|
||||
memory controllers with different distances to specific CPUs.
|
||||
The memory policy defines in which node memory is allocated for
|
||||
The memory policy defines from which node memory is allocated for
|
||||
the process.
|
||||
|
||||
This system call defines the default policy for the process;
|
||||
in addition a policy can be set for specific memory ranges using
|
||||
This system call defines the default policy for the process.
|
||||
The process policy governs allocation of pages in the process'
|
||||
address space outside of memory ranges
|
||||
controlled by a more specific policy set by
|
||||
.BR mbind (2).
|
||||
The process default policy also controls allocation of any pages for
|
||||
memory mapped files mapped using the
|
||||
.BR mmap (2)
|
||||
call with the
|
||||
.B MAP_PRIVATE
|
||||
flag and that are only read [loaded] from by the task
|
||||
and of memory mapped files mapped using the
|
||||
.BR mmap (2)
|
||||
call with the
|
||||
.B MAP_SHARED
|
||||
flag, regardless of the access type.
|
||||
The policy is only applied when a new page is allocated
|
||||
for the process.
|
||||
For anonymous memory this is when the page is first
|
||||
touched by the application.
|
||||
|
||||
Available policies are
|
||||
The
|
||||
.I mode
|
||||
argument must specify one of
|
||||
.BR MPOL_DEFAULT ,
|
||||
.BR MPOL_BIND ,
|
||||
.BR MPOL_INTERLEAVE ,
|
||||
.B MPOL_INTERLEAVE
|
||||
or
|
||||
.BR MPOL_PREFERRED .
|
||||
All policies except
|
||||
All modes except
|
||||
.B MPOL_DEFAULT
|
||||
require the caller to specify the nodes to which the policy applies in the
|
||||
require the caller to specify via the
|
||||
.I nodemask
|
||||
parameter.
|
||||
parameter
|
||||
one or more nodes.
|
||||
|
||||
.I nodemask
|
||||
is pointer to a bit field of nodes that contains up to
|
||||
points to a bit mask of node ids that contains up to
|
||||
.I maxnode
|
||||
bits.
|
||||
The bit field size is rounded to the next multiple of
|
||||
The bit mask size is rounded to the next multiple of
|
||||
.IR "sizeof(unsigned long)" ,
|
||||
but the kernel will only use bits up to
|
||||
.IR maxnode .
|
||||
A NULL value of
|
||||
.I nodemask
|
||||
or a
|
||||
.I maxnode
|
||||
value of zero specifies the empty set of nodes.
|
||||
If the value of
|
||||
.I maxnode
|
||||
is zero,
|
||||
the
|
||||
.I nodemask
|
||||
argument is ignored.
|
||||
|
||||
The
|
||||
.B MPOL_DEFAULT
|
||||
policy is the default and means to allocate memory locally,
|
||||
mode is the default and means to allocate memory locally,
|
||||
i.e., on the node of the CPU that triggered the allocation.
|
||||
.I nodemask
|
||||
should be specified as NULL.
|
||||
must be specified as NULL.
|
||||
If the "local node" contains no free memory, the system will
|
||||
attempt to allocate memory from a "near by" node.
|
||||
|
||||
The
|
||||
.B MPOL_BIND
|
||||
policy is a strict policy that restricts memory allocation to the
|
||||
mode defines a strict policy that restricts memory allocation to the
|
||||
nodes specified in
|
||||
.IR nodemask .
|
||||
There won't be allocations on other nodes.
|
||||
If
|
||||
.I nodemask
|
||||
specifies more than one node, page allocations will come from
|
||||
the node with the lowest numeric node id first, until that node
|
||||
contains no free memory.
|
||||
Allocations will then come from the node with the next highest
|
||||
node id specified in
|
||||
.I nodemask
|
||||
and so forth, until none of the specified nodes contain free memory.
|
||||
Pages will not be allocated from any node not specified in the
|
||||
.IR nodemask .
|
||||
|
||||
.B MPOL_INTERLEAVE
|
||||
interleaves allocations to the nodes specified in
|
||||
.IR nodemask .
|
||||
This optimizes for bandwidth instead of latency.
|
||||
To be effective the memory area should be fairly large,
|
||||
at least 1MB or bigger.
|
||||
interleaves page allocations across the nodes specified in
|
||||
.I nodemask
|
||||
in numeric node id order.
|
||||
This optimizes for bandwidth instead of latency
|
||||
by spreading out pages and memory accesses to those pages across
|
||||
multiple nodes.
|
||||
However, accesses to a single page will still be limited to
|
||||
the memory bandwidth of a single node.
|
||||
.\" NOTE: the following sentence doesn't make sense in the context
|
||||
.\" of set_mempolicy() -- no memory area specified.
|
||||
.\" To be effective the memory area should be fairly large,
|
||||
.\" at least 1MB or bigger.
|
||||
|
||||
.B MPOL_PREFERRED
|
||||
sets the preferred node for allocation.
|
||||
The kernel will try to allocate in this
|
||||
node first and fall back to other nodes if the preferred node is low on free
|
||||
The kernel will try to allocate pages from this node first
|
||||
and fall back to "near by" nodes if the preferred node is low on free
|
||||
memory.
|
||||
Only the first node in the
|
||||
If
|
||||
.I nodemask
|
||||
is used.
|
||||
If no node is set in the mask, then the memory is allocated on
|
||||
specifies more than one node id, the first node in the
|
||||
mask will be selected as the preferred node.
|
||||
If the
|
||||
.I nodemask
|
||||
and
|
||||
.I maxnode
|
||||
arguments specify the empty set, then the memory is allocated on
|
||||
the node of the CPU that triggered the allocation (like
|
||||
.BR MPOL_DEFAULT ).
|
||||
|
||||
The memory policy is preserved across an
|
||||
The process memory policy is preserved across an
|
||||
.BR execve (2),
|
||||
and is inherited by child processes created using
|
||||
.BR fork (2)
|
||||
|
@ -112,21 +175,62 @@ returns 0;
|
|||
on error, \-1 is returned and
|
||||
.I errno
|
||||
is set to indicate the error.
|
||||
.\" .SH ERRORS
|
||||
.\" FIXME no errors are listed on this page
|
||||
.\" .
|
||||
.\" .TP
|
||||
.\" .B EINVAL
|
||||
.\" .I mode is invalid.
|
||||
.SH ERRORS
|
||||
.TP
|
||||
.B EINVAL
|
||||
.I mode is invalid.
|
||||
Or,
|
||||
.I mode
|
||||
is
|
||||
.I MPOL_DEFAULT
|
||||
and
|
||||
.I nodemask
|
||||
is non-empty,
|
||||
or
|
||||
.I mode
|
||||
is
|
||||
.I MPOL_BIND
|
||||
or
|
||||
.I MPOL_INTERLEAVE
|
||||
and
|
||||
.I nodemask
|
||||
is empty.
|
||||
Or,
|
||||
.I maxnode
|
||||
specifies more than a page worth of bits.
|
||||
Or,
|
||||
.I nodemask
|
||||
specifies one or more node ids that are
|
||||
greater than the maximum supported node id,
|
||||
or are not allowed in the calling task's context.
|
||||
.\" "calling task's context" refers to cpusets. No man page avail to ref. --lts
|
||||
Or, none of the node ids specified by
|
||||
.I nodemask
|
||||
are on-line, or none of the specified nodes contain memory.
|
||||
.TP
|
||||
.B EFAULT
|
||||
Part of all of the memory range specified by
|
||||
.I nodemask
|
||||
and
|
||||
.I maxnode
|
||||
points outside your accessible address space.
|
||||
.TP
|
||||
.B ENOMEM
|
||||
Insufficient kernel memory was available.
|
||||
|
||||
.SH CONFORMING TO
|
||||
This system call is Linux specific.
|
||||
.SH NOTES
|
||||
Process policy is not remembered if the page is swapped out.
|
||||
When such a page is paged back in, it will use the policy of
|
||||
the process or memory range that is in effect at the time the
|
||||
page is allocated.
|
||||
.SS "Versions and Library Support"
|
||||
See
|
||||
.BR mbind (2).
|
||||
.SH SEE ALSO
|
||||
.BR mbind (2),
|
||||
.BR mmap (2),
|
||||
.BR get_mempolicy (2),
|
||||
.BR numactl (8),
|
||||
.BR numa (3)
|
||||
|
|
Loading…
Reference in New Issue