.\" Copyright 2003,2004 Andi Kleen, SuSE Labs.
.\" and Copyright 2007 Lee Schermerhorn, Hewlett Packard
.\"
.\" Permission is granted to make and distribute verbatim copies of this
.\" manual provided the copyright notice and this permission notice are
.\" preserved on all copies.
.\"
.\" Permission is granted to copy and distribute modified versions of this
.\" manual under the conditions for verbatim copying, provided that the
.\" entire resulting derived work is distributed under the terms of a
.\" permission notice identical to this one.
.\"
.\" Since the Linux kernel and libraries are constantly changing, this
.\" manual page may be incorrect or out-of-date. The author(s) assume no
.\" responsibility for errors or omissions, or for damages resulting from
.\" the use of the information contained herein.
.\"
.\" Formatted or processed versions of this manual, if unaccompanied by
.\" the source, must acknowledge the copyright and authors of this work.
.\"
.\" 2006-02-03, mtk, substantial wording changes and other improvements
.\" 2007-08-27, Lee Schermerhorn <Lee.Schermerhorn@hp.com>
.\" more precise specification of behavior.
.\"
.TH SET_MEMPOLICY 2 2008-08-07 Linux "Linux Programmer's Manual"
.SH NAME
set_mempolicy \- set default NUMA memory policy for a process and its children
.SH SYNOPSIS
.nf
.B "#include <numaif.h>"
.sp
.BI "int set_mempolicy(int " mode ", unsigned long *" nodemask ,
.BI " unsigned long " maxnode );
.sp
Link with \fI\-lnuma\fP
.fi
.SH DESCRIPTION
.BR set_mempolicy ()
sets the NUMA memory policy of the calling process,
which consists of a policy mode and zero or more nodes,
to the values specified by the
.IR mode ,
.I nodemask
and
.I maxnode
arguments.
.PP
A NUMA machine has different
memory controllers with different distances to specific CPUs.
The memory policy defines from which node memory is allocated for
the process.
This system call defines the default policy for the process.
.PP
The process policy governs allocation of pages in the process's
address space outside of memory ranges
controlled by a more specific policy set by
.BR mbind (2).
The process default policy also controls allocation of any pages for
memory mapped files mapped using the
.BR mmap (2)
call with the
.B MAP_PRIVATE
flag and that the process only reads (loads) from,
and of memory mapped files mapped using the
.BR mmap (2)
call with the
.B MAP_SHARED
flag, regardless of the access type.
.PP
The policy is only applied when a new page is allocated
for the process.
For anonymous memory this is when the page is first
touched by the application.
.PP
The
.I mode
argument must specify one of
.BR MPOL_DEFAULT ,
.BR MPOL_BIND ,
.B MPOL_INTERLEAVE
or
.BR MPOL_PREFERRED .
All modes except
.B MPOL_DEFAULT
require the caller to specify via the
.I nodemask
argument one or more nodes.
.PP
.I nodemask
points to a bit mask of node IDs that contains up to
.I maxnode
bits.
The bit mask size is rounded up to a whole number of
.I "unsigned long"
words,
but the kernel will only use bits up to
.IR maxnode .
A NULL value of
.I nodemask
or a
.I maxnode
value of zero specifies the empty set of nodes.
If the value of
.I maxnode
is zero,
the
.I nodemask
argument is ignored.
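.PP
As a minimal sketch of the layout described above,
the following C helpers size a mask for
.I maxnode
bits and set individual node IDs in it.
The helper names
.RI ( nodemask_words ,
.IR nodemask_alloc ,
.IR nodemask_set )
and the
.B BITS_PER_ULONG
macro are hypothetical and are not part of any library.
.PP
.nf
```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Bits held by one unsigned long word of the mask. */
#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Words needed to hold maxnode bits, rounded up to whole words. */
static size_t nodemask_words(unsigned long maxnode)
{
    return (maxnode + BITS_PER_ULONG - 1) / BITS_PER_ULONG;
}

/* Allocate a zeroed mask; returns NULL for the empty set or on failure. */
static unsigned long *nodemask_alloc(unsigned long maxnode)
{
    if (maxnode == 0)
        return NULL;
    return calloc(nodemask_words(maxnode), sizeof(unsigned long));
}

/* Mark one node ID (hypothetical helper) as a member of the mask. */
static void nodemask_set(unsigned long *mask, unsigned long maxnode,
                         unsigned long node)
{
    assert(node < maxnode);
    mask[node / BITS_PER_ULONG] |= 1UL << (node % BITS_PER_ULONG);
}
```
.fi
.PP
A caller would typically pass the resulting mask, together with the same
.I maxnode
value, as the second and third arguments of
.BR set_mempolicy ().
.PP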
Where a
.I nodemask
is required, it must contain at least one node that is on-line,
is allowed by the process's current cpuset context,
and contains memory.
.PP
The
.B MPOL_DEFAULT
mode is the default and means to allocate memory locally,
i.e., on the node of the CPU that triggered the allocation.
.I nodemask
must be specified as NULL.
If the "local node" contains no free memory, the system will
attempt to allocate memory from a "nearby" node.
.PP
The
.B MPOL_BIND
mode defines a strict policy that restricts memory allocation to the
nodes specified in
.IR nodemask .
If
.I nodemask
specifies more than one node, page allocations will come from
the node with the lowest numeric node ID first, until that node
contains no free memory.
Allocations will then come from the node with the next highest
node ID specified in
.I nodemask
and so forth, until none of the specified nodes contain free memory.
Pages will not be allocated from any node not specified in the
.IR nodemask .
.PP
.B MPOL_INTERLEAVE
interleaves page allocations across the nodes specified in
.I nodemask
in numeric node ID order.
This optimizes for bandwidth instead of latency
by spreading out pages and memory accesses to those pages across
multiple nodes.
However, accesses to a single page will still be limited to
the memory bandwidth of a single node.
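.PP
The ordering can be pictured with a small user-space model.
This is an illustration only, not the kernel's implementation:
it assumes a strict round-robin over the bits set in
.IR nodemask ,
and the names
.I node_isset
and
.I next_interleave_node
are hypothetical.
.PP
.nf
```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Test whether a node ID is set in the mask. */
static int node_isset(const unsigned long *mask, unsigned long node)
{
    return (mask[node / BITS_PER_ULONG] >> (node % BITS_PER_ULONG)) & 1UL;
}

/*
 * Assumed round-robin model: return the next node after prev in
 * numeric node ID order, wrapping around to the lowest node in the
 * mask. Pass prev = -1 for the first allocation.
 * Returns -1 if no node below maxnode is set.
 */
static long next_interleave_node(const unsigned long *mask,
                                 unsigned long maxnode, long prev)
{
    for (unsigned long step = 1; step <= maxnode; step++) {
        unsigned long node = (unsigned long)(prev + (long)step) % maxnode;

        if (node_isset(mask, node))
            return (long)node;
    }
    return -1;
}
```
.fi
.PP
Under this model, a mask containing nodes 0, 2 and 3 would place
successive pages on nodes 0, 2, 3, 0, 2, 3 and so on.
.PP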
.\" NOTE: the following sentence doesn't make sense in the context
.\" of set_mempolicy() -- no memory area specified.
.\" To be effective the memory area should be fairly large,
.\" at least 1MB or bigger.
.B MPOL_PREFERRED
sets the preferred node for allocation.
The kernel will try to allocate pages from this node first
and fall back to "nearby" nodes if the preferred node is low on free
memory.
If
.I nodemask
specifies more than one node ID, the first node in the
mask will be selected as the preferred node.
If the
.I nodemask
and
.I maxnode
arguments specify the empty set, then the memory is allocated on
the node of the CPU that triggered the allocation (like
.BR MPOL_DEFAULT ).
.PP
The process memory policy is preserved across an
.BR execve (2),
and is inherited by child processes created using
.BR fork (2)
or
.BR clone (2).
.SH RETURN VALUE
On success,
.BR set_mempolicy ()
returns 0;
on error, \-1 is returned and
.I errno
is set to indicate the error.
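.PP
The following sketch shows one way a caller might check the result.
It makes two assumptions that are not taken from this page:
it invokes the raw system call through
.BR syscall (2)
rather than the wrapper declared in
.IR <numaif.h> ,
so that it builds without libnuma, and it defines the mode value 0 for
.B MPOL_DEFAULT
locally under the hypothetical name
.BR SKETCH_MPOL_DEFAULT .
The function name
.I reset_to_default_policy
is likewise hypothetical.
.PP
.nf
```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumed value of MPOL_DEFAULT; normally provided by <numaif.h>. */
#define SKETCH_MPOL_DEFAULT 0

/*
 * Reset the calling process to the default policy.
 * MPOL_DEFAULT takes an empty node set: NULL nodemask, maxnode 0.
 * Returns 0 on success, or -1 with errno describing the failure.
 */
static int reset_to_default_policy(void)
{
    long ret = syscall(SYS_set_mempolicy, (long)SKETCH_MPOL_DEFAULT,
                       NULL, 0UL);

    if (ret == -1) {
        int saved = errno;      /* perror() may change errno */

        perror("set_mempolicy");
        errno = saved;
    }
    return (int)ret;
}
```
.fi
.PP
Programs that already link with
.I \-lnuma
would normally call
.BR set_mempolicy ()
directly and test its return value in the same way.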
.SH ERRORS
.TP
.B EFAULT
Part or all of the memory range specified by
.I nodemask
and
.I maxnode
points outside your accessible address space.
.TP
.B EINVAL
.I mode
is invalid.
Or,
.I mode
is
.B MPOL_DEFAULT
and
.I nodemask
is non-empty,
or
.I mode
is
.B MPOL_BIND
or
.B MPOL_INTERLEAVE
and
.I nodemask
is empty.
Or,
.I maxnode
specifies more than a page worth of bits.
Or,
.I nodemask
specifies one or more node IDs that are
greater than the maximum supported node ID.
Or, none of the node IDs specified by
.I nodemask
are on-line and allowed by the process's current cpuset context,
or none of the specified nodes contain memory.
.TP
.B ENOMEM
Insufficient kernel memory was available.
.SH CONFORMING TO
This system call is Linux-specific.
.SH NOTES
Process policy is not remembered if the page is swapped out.
When such a page is paged back in, it will use the policy of
the process or memory range that is in effect at the time the
page is allocated.
.SS "Versions and Library Support"
See
.BR mbind (2).
.SH SEE ALSO
.BR get_mempolicy (2),
.BR getcpu (2),
.BR mbind (2),
.BR mmap (2),
.BR numa (3),
.BR cpuset (7),
.BR numactl (8)