.\" Copyright 2003,2004 Andi Kleen, SuSE Labs.
.\" and Copyright 2007 Lee Schermerhorn, Hewlett Packard
.\"
.\" Permission is granted to make and distribute verbatim copies of this
.\" manual provided the copyright notice and this permission notice are
.\" preserved on all copies.
.\"
.\" Permission is granted to copy and distribute modified versions of this
.\" manual under the conditions for verbatim copying, provided that the
.\" entire resulting derived work is distributed under the terms of a
.\" permission notice identical to this one.
.\"
.\" Since the Linux kernel and libraries are constantly changing, this
.\" manual page may be incorrect or out-of-date. The author(s) assume no
.\" responsibility for errors or omissions, or for damages resulting from
.\" the use of the information contained herein.
.\"
.\" Formatted or processed versions of this manual, if unaccompanied by
.\" the source, must acknowledge the copyright and authors of this work.
.\"
.\" 2006-02-03, mtk, substantial wording changes and other improvements
.\" 2007-08-27, Lee Schermerhorn <Lee.Schermerhorn@hp.com>
.\" more precise specification of behavior.
.\"
.TH SET_MEMPOLICY 2 2008-08-15 Linux "Linux Programmer's Manual"
.SH NAME
set_mempolicy \- set default NUMA memory policy for a process and its children
.SH SYNOPSIS
.nf
.B "#include <numaif.h>"
.sp
.BI "int set_mempolicy(int " mode ", unsigned long *" nodemask ,
.BI "                  unsigned long " maxnode );
.sp
Link with \fI\-lnuma\fP.
.fi
.SH DESCRIPTION
.BR set_mempolicy ()
sets the NUMA memory policy of the calling process,
which consists of a policy mode and zero or more nodes,
to the values specified by the
.IR mode ,
.I nodemask
and
.I maxnode
arguments.
.PP
A NUMA machine has different
memory controllers with different distances to specific CPUs.
The memory policy defines from which node memory is allocated for
the process.
.PP
This system call defines the default policy for the process.
The process policy governs allocation of pages in the process's
address space outside of memory ranges
controlled by a more specific policy set by
.BR mbind (2).
The process default policy also controls allocation of any pages for
memory-mapped files mapped using the
.BR mmap (2)
call with the
.B MAP_PRIVATE
flag and only read (loaded) by the process,
and of memory-mapped files mapped using the
.BR mmap (2)
call with the
.B MAP_SHARED
flag, regardless of the access type.
The policy is only applied when a new page is allocated
for the process.
For anonymous memory this is when the page is first
touched by the application.
.PP
The
.I mode
argument must specify one of
.BR MPOL_DEFAULT ,
.BR MPOL_BIND ,
.B MPOL_INTERLEAVE
or
.BR MPOL_PREFERRED .
All modes except
.B MPOL_DEFAULT
require the caller to specify via the
.I nodemask
argument one or more nodes.
.PP
The
.I mode
argument may also include an optional
.IR "mode flag" .
The supported
.I "mode flags"
are:
.TP
.BR MPOL_F_STATIC_NODES " (since Linux 2.6.26)"
A nonempty
.I nodemask
specifies physical node IDs.
Linux will not remap the
.I nodemask
when the process moves to a different cpuset context,
nor when the set of nodes allowed by the process's
current cpuset context changes.
.TP
.BR MPOL_F_RELATIVE_NODES " (since Linux 2.6.26)"
A nonempty
.I nodemask
specifies node IDs that are relative to the set of
node IDs allowed by the process's current cpuset.
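.PP
As a minimal sketch (not taken from the kernel or libnuma sources),
the following C fragment shows how a mode flag can be ORed into
.I mode
and how the mutually exclusive flag pair listed under ERRORS
can be rejected before making the call.
The helper names
.IR make_mode ()
and
.IR mode_flags_ok ()
are hypothetical, and the fallback constant values are assumed to match
.IR <linux/mempolicy.h> .
.nf
```c
#include <stdbool.h>

/* Fallback definitions for when <numaif.h> is not included.
   The values are assumed to match <linux/mempolicy.h>. */
#ifndef MPOL_BIND
#define MPOL_BIND 2
#endif
#ifndef MPOL_F_STATIC_NODES
#define MPOL_F_STATIC_NODES (1 << 15)
#endif
#ifndef MPOL_F_RELATIVE_NODES
#define MPOL_F_RELATIVE_NODES (1 << 14)
#endif

/* Hypothetical helper: combine a policy mode with optional mode flags. */
static int make_mode(int policy, int flags)
{
    return policy | flags;
}

/* Hypothetical helper: the two mode flags must not be set together. */
static bool mode_flags_ok(int mode)
{
    int both = MPOL_F_STATIC_NODES | MPOL_F_RELATIVE_NODES;

    return (mode & both) != both;
}
```
.fi
A caller would pass the result of
.IR make_mode ()
as the
.I mode
argument only when
.IR mode_flags_ok ()
returns true.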
.PP
.I nodemask
points to a bit mask of node IDs that contains up to
.I maxnode
bits.
The bit mask size is rounded to the next multiple of
.IR "sizeof(unsigned long)" ,
but the kernel will only use bits up to
.IR maxnode .
A NULL value of
.I nodemask
or a
.I maxnode
value of zero specifies the empty set of nodes.
If the value of
.I maxnode
is zero,
the
.I nodemask
argument is ignored.
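.PP
The bit mask layout described above can be sketched in C as follows.
This is an illustration only: the helper names
.IR nodemask_words ()
and
.IR nodemask_set ()
are hypothetical and are not part of
.I <numaif.h>
or libnuma.
.nf
```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: number of unsigned long words needed
   to hold a mask of maxnode bits. */
static size_t nodemask_words(unsigned long maxnode)
{
    return (maxnode + BITS_PER_LONG - 1) / BITS_PER_LONG;
}

/* Hypothetical helper: set the bit for node in mask.
   Returns 0 on success, or -1 if node does not fit in maxnode bits. */
static int nodemask_set(unsigned long *mask, unsigned long maxnode,
                        unsigned long node)
{
    if (node >= maxnode)
        return -1;
    mask[node / BITS_PER_LONG] |= 1UL << (node % BITS_PER_LONG);
    return 0;
}
```
.fi
The caller is expected to zero an array of
.IR nodemask_words (maxnode)
words before setting bits in it.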
.PP
Where a
.I nodemask
is required, it must contain at least one node that is on-line,
allowed by the process's current cpuset context
(unless the
.B MPOL_F_STATIC_NODES
mode flag is specified),
and contains memory.
If the
.B MPOL_F_STATIC_NODES
flag is set in
.I mode
and a required
.I nodemask
contains no nodes that are allowed by the process's current cpuset context,
the memory policy reverts to
.IR "local allocation" .
This effectively overrides the specified policy until the process's
cpuset context includes one or more of the nodes specified by
.IR nodemask .
.PP
The
.B MPOL_DEFAULT
mode specifies that any nondefault process memory policy be removed,
so that the memory policy "falls back" to the system default policy.
The system default policy is "local allocation",
that is, memory is allocated on the node of the CPU that triggered the allocation.
.I nodemask
must be specified as NULL.
If the "local node" contains no free memory, the system will
attempt to allocate memory from a "nearby" node.
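.PP
Restoring the system default policy might look like the following sketch.
It assumes that
.I <sys/syscall.h>
provides
.B SYS_set_mempolicy
and invokes the system call through
.BR syscall (2)
rather than through the libnuma wrapper declared in
.IR <numaif.h> ;
the function name
.IR reset_to_default_policy ()
is hypothetical.
.nf
```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Fallback for when <numaif.h> is not included; the value is
   assumed to match <linux/mempolicy.h>. */
#ifndef MPOL_DEFAULT
#define MPOL_DEFAULT 0
#endif

/* Hypothetical helper: remove any nondefault process policy.
   MPOL_DEFAULT requires an empty node set, so nodemask is NULL
   and maxnode is 0.
   Returns 0 on success, or -1 with errno set on failure
   (for example, on a kernel built without NUMA support). */
static int reset_to_default_policy(void)
{
    return (int) syscall(SYS_set_mempolicy, MPOL_DEFAULT, NULL, 0UL);
}
```
.fi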
.PP
The
.B MPOL_BIND
mode defines a strict policy that restricts memory allocation to the
nodes specified in
.IR nodemask .
If
.I nodemask
specifies more than one node, page allocations will come from
the node with the lowest numeric node ID first, until that node
contains no free memory.
Allocations will then come from the node with the next highest
node ID specified in
.I nodemask
and so forth, until none of the specified nodes contain free memory.
Pages will not be allocated from any node not specified in the
.IR nodemask .
.PP
.B MPOL_INTERLEAVE
interleaves page allocations across the nodes specified in
.I nodemask
in numeric node ID order.
This optimizes for bandwidth instead of latency
by spreading out pages and memory accesses to those pages across
multiple nodes.
However, accesses to a single page will still be limited to
the memory bandwidth of a single node.
.\" NOTE: the following sentence doesn't make sense in the context
.\" of set_mempolicy() -- no memory area specified.
.\" To be effective the memory area should be fairly large,
.\" at least 1MB or bigger.
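.PP
The interleaving order can be modeled in user space as a simple
round-robin over the set bits of
.IR nodemask .
The sketch below is only such a model: the function name
.IR interleave_node ()
is hypothetical, and the kernel's actual starting node and bookkeeping
may differ.
.nf
```c
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Test whether node n is set in mask. */
static int node_isset(const unsigned long *mask, unsigned long n)
{
    return (int) ((mask[n / BITS_PER_LONG] >> (n % BITS_PER_LONG)) & 1UL);
}

/* Hypothetical model: return the node on which the k-th page
   (counting from 0) would be placed by an idealized round-robin
   over the nodes in mask, in increasing node ID order.
   Returns -1 if the mask is empty. */
static long interleave_node(const unsigned long *mask,
                            unsigned long maxnode, unsigned long k)
{
    unsigned long n, count = 0;

    for (n = 0; n < maxnode; n++)
        if (node_isset(mask, n))
            count++;
    if (count == 0)
        return -1;

    k %= count;
    for (n = 0; n < maxnode; n++) {
        if (node_isset(mask, n)) {
            if (k == 0)
                return (long) n;
            k--;
        }
    }
    return -1;    /* not reached */
}
```
.fi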
.PP
.B MPOL_PREFERRED
sets the preferred node for allocation.
The kernel will try to allocate pages from this node first
and fall back to "nearby" nodes if the preferred node is low on free
memory.
If
.I nodemask
specifies more than one node ID, the first node in the
mask will be selected as the preferred node.
If the
.I nodemask
and
.I maxnode
arguments specify the empty set, then the policy
specifies "local allocation"
(like the system default policy discussed above).
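.PP
The selection rule above can be sketched as follows.
The sketch assumes that "the first node in the mask" means the
lowest-numbered node whose bit is set; the function name
.IR preferred_node ()
is hypothetical.
.nf
```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: return the node that would be treated as
   the preferred node, assuming it is the lowest-numbered set bit.
   Returns -1 for the empty set (NULL mask, zero maxnode, or no
   bits set), which corresponds to "local allocation". */
static long preferred_node(const unsigned long *mask,
                           unsigned long maxnode)
{
    unsigned long n;

    if (mask == NULL)
        return -1;
    for (n = 0; n < maxnode; n++)
        if ((mask[n / BITS_PER_LONG] >> (n % BITS_PER_LONG)) & 1UL)
            return (long) n;
    return -1;
}
```
.fi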
.PP
The process memory policy is preserved across an
.BR execve (2),
and is inherited by child processes created using
.BR fork (2)
or
.BR clone (2).
.SH RETURN VALUE
On success,
.BR set_mempolicy ()
returns 0;
on error, \-1 is returned and
.I errno
is set to indicate the error.
.SH ERRORS
.TP
.B EFAULT
Part or all of the memory range specified by
.I nodemask
and
.I maxnode
points outside your accessible address space.
.TP
.B EINVAL
.I mode
is invalid.
Or,
.I mode
is
.B MPOL_DEFAULT
and
.I nodemask
is nonempty,
or
.I mode
is
.B MPOL_BIND
or
.B MPOL_INTERLEAVE
and
.I nodemask
is empty.
Or,
.I maxnode
specifies more than a page worth of bits.
Or,
.I nodemask
specifies one or more node IDs that are
greater than the maximum supported node ID.
Or, none of the node IDs specified by
.I nodemask
are on-line and allowed by the process's current cpuset context,
or none of the specified nodes contain memory.
Or, the
.I mode
argument specified both
.B MPOL_F_STATIC_NODES
and
.BR MPOL_F_RELATIVE_NODES .
.TP
.B ENOMEM
Insufficient kernel memory was available.
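.PP
A caller might translate these error numbers into diagnostics along the
following lines.
This is a sketch only: the function name
.IR mempolicy_strerror ()
and the message texts are hypothetical summaries of the list above,
not strings produced by the kernel or by libnuma.
.nf
```c
#include <errno.h>

/* Hypothetical helper: map an errno value left by a failed
   set_mempolicy() call to a short description. */
static const char *mempolicy_strerror(int err)
{
    switch (err) {
    case EFAULT:
        return "nodemask range is outside the accessible address space";
    case EINVAL:
        return "invalid mode, nodemask, or maxnode";
    case ENOMEM:
        return "insufficient kernel memory";
    default:
        return "unexpected error";
    }
}
```
.fi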
.SH VERSIONS
The
.BR set_mempolicy ()
system call was added to the Linux kernel in version 2.6.7.
.SH CONFORMING TO
This system call is Linux-specific.
.SH NOTES
Process policy is not remembered if the page is swapped out.
When such a page is paged back in, it will use the policy of
the process or memory range that is in effect at the time the
page is allocated.
.PP
For information on library support, see
.BR numa (7).
.SH SEE ALSO
.BR get_mempolicy (2),
.BR getcpu (2),
.BR mbind (2),
.BR mmap (2),
.BR numa (3),
.BR cpuset (7),
.BR numa (7),
.BR numactl (8)