_syscall.2, bpf.2, cacheflush.2, capget.2, chdir.2, chmod.2, chroot.2, clock_getres.2, clock_nanosleep.2, clone.2, close.2, connect.2, copy_file_range.2, create_module.2, delete_module.2, dup.2, epoll_create.2, epoll_ctl.2, epoll_wait.2, eventfd.2, execve.2, execveat.2, fallocate.2, flock.2, fork.2, fsync.2, futex.2, futimesat.2, get_kernel_syms.2, get_mempolicy.2, get_robust_list.2, getcpu.2, getdents.2, getdomainname.2, getgid.2, getgroups.2, gethostname.2, getitimer.2, getpagesize.2, getpeername.2, getpriority.2, getrandom.2, getresuid.2, getrlimit.2, getrusage.2, getsid.2, getsockname.2, getsockopt.2, gettid.2, gettimeofday.2, getuid.2, getunwind.2, init_module.2, inotify_add_watch.2, inotify_init.2, inotify_rm_watch.2, intro.2, io_cancel.2, io_destroy.2, io_getevents.2, io_setup.2, io_submit.2, ioctl_fat.2, ioctl_ficlonerange.2, ioctl_fideduperange.2, ioctl_tty.2, ioctl_userfaultfd.2, ioperm.2, iopl.2, ioprio_set.2, kcmp.2, kexec_load.2, keyctl.2, kill.2, link.2, listen.2, listxattr.2, llseek.2, lookup_dcookie.2, lseek.2, madvise.2, mbind.2, membarrier.2, memfd_create.2, migrate_pages.2, mincore.2, mkdir.2, mknod.2, mlock.2, mmap.2, mmap2.2, modify_ldt.2, move_pages.2, mprotect.2, mq_getsetattr.2, mremap.2, msgctl.2, msgget.2, msgop.2, msync.2, nanosleep.2, nfsservctl.2, nice.2, open_by_handle_at.2, outb.2, perf_event_open.2, perfmonctl.2, personality.2, pivot_root.2, pkey_alloc.2, poll.2, posix_fadvise.2, prctl.2, pread.2, process_vm_readv.2, ptrace.2, query_module.2, quotactl.2, read.2, readahead.2, readdir.2, readv.2, reboot.2, recv.2, recvmmsg.2, remap_file_pages.2, rename.2, request_key.2, restart_syscall.2, rt_sigqueueinfo.2, s390_pci_mmio_write.2, s390_runtime_instr.2, sched_get_priority_max.2, sched_rr_get_interval.2, sched_setaffinity.2, sched_setattr.2, sched_setparam.2, sched_setscheduler.2, sched_yield.2, seccomp.2, select.2, select_tut.2, semctl.2, semget.2, semop.2, send.2, sendfile.2, sendmmsg.2, set_mempolicy.2, set_thread_area.2, 
set_tid_address.2, seteuid.2, setfsgid.2, setfsuid.2, setgid.2, setns.2, setpgid.2, setresuid.2, setreuid.2, setsid.2, setuid.2, sgetmask.2, shmctl.2, shmget.2, shmop.2, sigaction.2, sigaltstack.2, sigpending.2, sigprocmask.2, sigreturn.2, sigsuspend.2, sigwaitinfo.2, socket.2, socketcall.2, socketpair.2, splice.2, spu_create.2, spu_run.2, stat.2, statfs.2, statx.2, subpage_prot.2, swapon.2, symlink.2, sync.2, sync_file_range.2, syscalls.2, sysctl.2, sysinfo.2, syslog.2, tee.2, time.2, timer_create.2, timer_getoverrun.2, timer_settime.2, timerfd_create.2, times.2, tkill.2, truncate.2, umask.2, umount.2, unimplemented.2, unlink.2, unshare.2, uselib.2, userfaultfd.2, utime.2, utimensat.2, vfork.2, vmsplice.2, wait.2, wait4.2, write.2: Formatting fix: replace blank lines with .PP/.IP

Blank lines shouldn't generally appear in *roff source (other
than in code examples), since they create large vertical
spaces between text blocks.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Michael Kerrisk 2017-08-16 09:30:51 +02:00
parent e3ec129351
commit efeece0465
213 changed files with 1937 additions and 1937 deletions

View File

@ -94,13 +94,13 @@ instead.
on those architectures,
.BR syscall (2)
was always required.)
.PP
The _syscall() macros
.I "do not"
produce a prototype.
You may have to
create one, especially for C++ users.
.PP
System calls are not required to return only positive or negative error
codes.
You need to read the source to be sure how it will return errors.
@ -121,7 +121,7 @@ when
is negative.
For the error codes, see
.BR errno (3).
.PP
When defining a system call, the argument types
.I must
be

View File

@ -146,7 +146,7 @@ The
.I size
argument is the size of the union pointed to by
.IR attr .
.PP
The value provided in
.IR cmd
is one of the following:
@ -919,7 +919,7 @@ to a perf event file descriptor,
.IR event_fd ,
that was created by a previous call to
.BR perf_event_open (2):
.PP
.in +4n
.nf
ioctl(event_fd, PERF_EVENT_IOC_SET_BPF, prog_fd);

View File

@ -93,7 +93,7 @@ and
.I nbytes
arguments, making this function fairly expensive.
Therefore, the whole cache is always flushed.
.PP
This function always behaves as if
.BR BCACHE
has been passed for the

View File

@ -101,7 +101,7 @@ To define the structures for passing to the system call, you have to use the
and
.I struct __user_cap_data_struct
names because the typedefs are only pointers.
.PP
Kernels prior to 2.6.25 prefer
32-bit capabilities with version
.BR _LINUX_CAPABILITY_VERSION_1 .
@ -110,19 +110,19 @@ Linux 2.6.25 added 64-bit capability sets, with version
There was, however, an API glitch, and Linux 2.6.26 added
.BR _LINUX_CAPABILITY_VERSION_3
to fix the problem.
.PP
Note that 64-bit capabilities use
.IR datap [0]
and
.IR datap [1],
whereas 32-bit capabilities use only
.IR datap [0].
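
As a minimal sketch of the two-element layout described above (not part of the man page), the following C helper reads the caller's effective set with the raw system call and version 3 of the header. The helper name get_effective_caps is hypothetical, and the code assumes a kernel and headers that provide _LINUX_CAPABILITY_VERSION_3.

```c
#include <linux/capability.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper (not from the man page): fetch the caller's
   effective capability set as a single 64-bit mask.
   Returns 0 on success, or -1 on error with errno set. */
int get_effective_caps(unsigned long long *mask)
{
    struct __user_cap_header_struct hdr;
    struct __user_cap_data_struct data[2];  /* 64-bit sets use two elements */

    memset(&hdr, 0, sizeof(hdr));
    memset(data, 0, sizeof(data));
    hdr.version = _LINUX_CAPABILITY_VERSION_3;
    hdr.pid = 0;                            /* 0 refers to the caller */

    if (syscall(SYS_capget, &hdr, data) == -1)
        return -1;

    /* data[0] holds the low 32 bits, data[1] the high 32 bits */
    *mask = ((unsigned long long) data[1].effective << 32) |
            data[0].effective;
    return 0;
}
```

With version 1 of the header, only data[0] would be filled in, so a single-element array would suffice there.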
.PP
On kernels that support file capabilities (VFS capability support),
these system calls behave slightly differently.
This support was added as an option in Linux 2.6.24,
and became fixed (nonoptional) in Linux 2.6.33.
.PP
For
.BR capget ()
calls, one can probe the capabilities of any process by specifying its
@ -167,7 +167,7 @@ caller and
.BR init (1);
or a value less than \-1, in which case the change is applied
to all members of the process group whose ID is \-\fIpid\fP.
.PP
For details on the data, see
.BR capabilities (7).
.SH RETURN VALUE
@ -175,7 +175,7 @@ On success, zero is returned.
On error, \-1 is returned, and
.I errno
is set appropriately.
.PP
The calls will fail with the error
.BR EINVAL ,
and set the

View File

@ -129,7 +129,7 @@ POSIX.1-2001, POSIX.1-2008, SVr4, 4.4BSD.
.SH NOTES
The current working directory is the starting point for interpreting
relative pathnames (those not starting with \(aq/\(aq).
.PP
A child process created via
.BR fork (2)
inherits its parent's current working directory.
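
The inheritance across fork(2) can be observed with a small sketch such as the following. The helper name child_inherits_cwd is hypothetical; the code assumes that "/" is accessible, and note that it changes the caller's own working directory as a side effect.

```c
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from the man page): change to "/", fork,
   and let the child report through its exit status whether its
   current working directory is also "/".
   Returns 1 if inherited, 0 if not, -1 on error. */
int child_inherits_cwd(void)
{
    char buf[4096];
    int status;
    pid_t pid;

    if (chdir("/") == -1)
        return -1;

    pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {                     /* child */
        if (getcwd(buf, sizeof(buf)) == NULL)
            _exit(2);
        _exit(strcmp(buf, "/") == 0 ? 0 : 1);
    }

    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    if (WEXITSTATUS(status) == 2)
        return -1;
    return WEXITSTATUS(status) == 0;
}
```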

View File

@ -164,7 +164,7 @@ The effective UID of the calling process must match the owner of the file,
or the process must be privileged (Linux: it must have the
.B CAP_FOWNER
capability).
.PP
If the calling process is not privileged (Linux: does not have the
.B CAP_FSETID
capability), and the group of the file does not match
@ -173,7 +173,7 @@ supplementary group IDs, the
.B S_ISGID
bit will be turned off,
but this will not cause an error to be returned.
.PP
As a security measure, depending on the filesystem,
the set-user-ID and set-group-ID execution bits
may be turned off if a file is written.
@ -185,7 +185,7 @@ which may have a special meaning.
For the sticky bit, and for set-user-ID and set-group-ID bits on
directories, see
.BR inode (7).
.PP
On NFS filesystems, restricting the permissions will immediately influence
already open files, because the access control is done on the server, but
open files are maintained by the client.
@ -199,7 +199,7 @@ The
system call operates in exactly the same way as
.BR chmod (),
except for the differences described here.
.PP
If the pathname given in
.I pathname
is relative, then it is interpreted relative to the directory
@ -209,7 +209,7 @@ referred to by the file descriptor
the calling process, as is done by
.BR chmod ()
for a relative pathname).
.PP
If
.I pathname
is relative and
@ -221,13 +221,13 @@ then
is interpreted relative to the current working
directory of the calling process (like
.BR chmod ()).
.PP
If
.I pathname
is absolute, then
.I dirfd
is ignored.
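
A sketch of the absolute-pathname case follows: it passes a deliberately invalid dirfd together with an absolute pathname and checks that the mode change still takes effect. The helper name and the temporary file location under /tmp are assumptions made for illustration only.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical demo (not from the man page): create a temporary file
   under /tmp (assumed writable), change its mode through an absolute
   pathname while passing an invalid dirfd, and return the resulting
   permission bits, or -1 on error. */
int chmod_absolute_ignores_dirfd(void)
{
    char path[] = "/tmp/fchmodat-demo-XXXXXX";
    struct stat sb;
    int fd, result = -1;

    fd = mkstemp(path);
    if (fd == -1)
        return -1;

    /* dirfd is -1 here; it is not consulted for an absolute pathname */
    if (fchmodat(-1, path, 0640, 0) == 0 && stat(path, &sb) == 0)
        result = sb.st_mode & 07777;

    close(fd);
    unlink(path);
    return result;
}
```

For a relative pathname, the same call would instead need a valid directory file descriptor or AT_FDCWD.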
.PP
.I flags
can either be 0, or include the following flag:
.TP
@ -250,7 +250,7 @@ is set appropriately.
.SH ERRORS
Depending on the filesystem,
errors other than those listed below can be returned.
.PP
The more general errors for
.BR chmod ()
are listed below:
@ -350,7 +350,7 @@ library support was added to glibc in version 2.4.
.BR chmod (),
.BR fchmod ():
4.4BSD, SVr4, POSIX.1-2001, POSIX.1-2008.
.PP
.BR fchmodat ():
POSIX.1-2008.
.SH NOTES

View File

@ -65,12 +65,12 @@ changes the root directory of the calling process to that specified in
.IR path .
This directory will be used for pathnames beginning with \fI/\fP.
The root directory is inherited by all children of the calling process.
.PP
Only a privileged process (Linux: one with the
.B CAP_SYS_CHROOT
capability in its user namespace) may call
.BR chroot ().
.PP
This call changes an ingredient in the pathname resolution process
and does nothing else.
In particular, it is not intended to be used
@ -87,7 +87,7 @@ The easiest way to do that is to
.BR chdir (2)
to the to-be-moved directory, wait for it to be moved out, then open a
path like ../../../etc/passwd.
.PP
.\" This is how the "slightly trickier variation" works:
.\" https://github.com/QubesOS/qubes-secpack/blob/master/QSBs/qsb-014-2015.txt#L142
A slightly
@ -98,7 +98,7 @@ If a daemon allows a "chroot directory" to be specified,
that usually means that if you want to prevent remote users from accessing
files outside the chroot directory, you must ensure that folders are never
moved out of it.
.PP
This call does not change the current working directory,
so that after the call \(aq\fI.\fP\(aq can
be outside the tree rooted at \(aq\fI/\fP\(aq.
@ -108,7 +108,7 @@ by doing:
mkdir foo; chroot foo; cd ..
.fi
.PP
This call does not close open file descriptors, and such file
descriptors may allow access to files outside the chroot tree.
.SH RETURN VALUE
@ -166,7 +166,7 @@ A child process created via
inherits its parent's root directory.
The root directory is left unchanged by
.BR execve (2).
.PP
FreeBSD has a stronger
.BR jail ()
system call.

View File

@ -224,7 +224,7 @@ T{
.BR clock_settime ()
T} Thread safety MT-Safe
.TE
.sp 1
.SH CONFORMING TO
POSIX.1-2001, POSIX.1-2008, SUSv2.
.SH AVAILABILITY
@ -289,7 +289,7 @@ Glibc contains no provisions to deal with these offsets (unlike the Linux
Kernel).
Typically these offsets are small and therefore the effects may be
negligible in most cases.
.PP
Since glibc 2.4,
the wrapper functions for the system calls described in this page avoid
the abovementioned problems by employing the kernel implementation of

View File

@ -58,7 +58,7 @@ It differs in allowing the caller to select the clock against
which the sleep interval is to be measured,
and in allowing the sleep interval to be specified as
either an absolute or a relative value.
.PP
The time values passed to and returned by this call are specified using
.I timespec
structures, defined as follows:
@ -71,7 +71,7 @@ struct timespec {
};
.fi
.in
.PP
The
.I clock_id
argument specifies the clock against which the sleep interval
@ -102,7 +102,7 @@ and
.BR pthread_getcpuclockid (3)
can also be passed in
.IR clock_id .
.PP
If
.I flags
is 0, then the value specified in
@ -110,7 +110,7 @@ is 0, then the value specified in
is interpreted as an interval relative to the current
value of the clock specified by
.IR clock_id .
.PP
If
.I flags
is
@ -125,7 +125,7 @@ is less than or equal to the current value of the clock,
then
.BR clock_nanosleep ()
returns immediately without suspending the calling thread.
.PP
.BR clock_nanosleep ()
suspends the execution of the calling thread
until either at least the time specified by
@ -133,7 +133,7 @@ until either at least the time specified by
has elapsed,
or a signal is delivered that causes a signal handler to be called or
that terminates the process.
.PP
If the call is interrupted by a signal handler,
.BR clock_nanosleep ()
fails with the error
@ -195,7 +195,7 @@ is not an exact multiple of the granularity underlying clock (see
then the interval will be rounded up to the next multiple.
Furthermore, after the sleep completes, there may still be a delay before
the CPU becomes free to once again execute the calling thread.
.PP
Using an absolute timer is useful for preventing
timer drift problems of the type described in
.BR nanosleep (2).
@ -210,14 +210,14 @@ and then call
with the
.B TIMER_ABSTIME
flag.
.PP
.BR clock_nanosleep ()
is never restarted after being interrupted by a signal handler,
regardless of the use of the
.BR sigaction (2)
.B SA_RESTART
flag.
.PP
The
.I remain
argument is unused, and unnecessary, when
@ -227,11 +227,11 @@ is
(An absolute sleep can be restarted using the same
.I request
argument.)
.PP
POSIX.1 specifies that
.BR clock_nanosleep ()
has no effect on signal dispositions or the signal mask.
.PP
POSIX.1 specifies that after changing the value of the
.B CLOCK_REALTIME
clock via
@ -243,7 +243,7 @@ will wake up;
if the new clock value falls past the end of the sleep interval, then the
.BR clock_nanosleep ()
call will return immediately.
.PP
POSIX.1 specifies that
changing the value of the
.B CLOCK_REALTIME

View File

@ -60,14 +60,14 @@ clone, __clone2 \- create a child process
.BR clone ()
creates a new process, in a manner similar to
.BR fork (2).
.PP
This page describes both the glibc
.BR clone ()
wrapper function and the underlying system call on which it is based.
The main text describes the wrapper function;
the differences for the raw system call
are described toward the end of this page.
.PP
Unlike
.BR fork (2),
.BR clone ()
@ -79,12 +79,12 @@ page, "calling process" normally corresponds to "parent process".
But see the description of
.B CLONE_PARENT
below.)
.PP
One use of
.BR clone ()
is to implement threads: multiple threads of control in a program that
run concurrently in a shared memory space.
.PP
When the child process is created with
.BR clone (),
it executes the function
@ -104,7 +104,7 @@ The
argument is passed to the
.I fn
function.
.PP
When the
.IR fn ( arg )
function application returns, the child process terminates.
@ -114,7 +114,7 @@ is the exit code for the child process.
The child process may also terminate explicitly by calling
.BR exit (2)
or after receiving a fatal signal.
.PP
The
.I child_stack
argument specifies the location of the stack used by the child process.
@ -130,7 +130,7 @@ Stacks grow downward on all processors that run Linux
.I child_stack
usually points to the topmost address of the memory space set up for
the child stack.
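
The points above (the fn and arg arguments, the exit code, and a child_stack that points to the top of the stack area) can be sketched as follows. This is an illustrative example rather than text from the man page: the names child_fn and run_clone_child and the 1 MiB stack size are arbitrary choices, and only SIGCHLD is passed in flags, so nothing is shared with the child.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/wait.h>

#define STACK_SIZE (1024 * 1024)    /* arbitrary size for this sketch */

/* Runs in the child; its return value becomes the child's exit code */
static int child_fn(void *arg)
{
    return *(int *) arg;
}

/* Hypothetical helper: create a child with clone() that exits with
   `code`, wait for it, and return its exit status (-1 on error). */
int run_clone_child(int code)
{
    char *stack;
    int status;
    pid_t pid;

    stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* Stacks grow downward here, so pass the topmost address */
    pid = clone(child_fn, stack + STACK_SIZE, SIGCHLD, &code);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    if (waitpid(pid, &status, 0) == -1) {
        free(stack);
        return -1;
    }

    free(stack);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because SIGCHLD is given as the termination signal, the child can be reaped with an ordinary waitpid(2) call.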
.PP
The low byte of
.I flags
contains the number of the
@ -146,7 +146,7 @@ options when waiting for the child with
.BR wait (2).
If no signal is specified, then the parent process is not signaled
when the child terminates.
.PP
.I flags
may also be bitwise-or'ed with zero or more of the following constants,
in order to specify what is shared between the calling process
@ -185,7 +185,7 @@ operation), the other process is also affected.
If a process sharing a file descriptor table calls
.BR execve (2),
its file descriptor table is duplicated (unshared).
.IP
If
.B CLONE_FILES
is not set, the child process inherits a copy of all file descriptors
@ -215,7 +215,7 @@ or
.BR umask (2)
performed by the calling process or the child process also affects the
other process.
.IP
If
.B CLONE_FS
is not set, the child process works on a copy of the filesystem
@ -236,7 +236,7 @@ the calling process.
If this flag is not set, then (as with
.BR fork (2))
the new process has its own I/O context.
.IP
.\" The following based on text from Jens Axboe
The I/O context is the I/O scope of the disk scheduler (i.e.,
what the I/O scheduler uses to model scheduling of a process's I/O).
@ -253,7 +253,7 @@ for instance), they should employ
.BR CLONE_IO
to get better I/O performance.
.\" with CFQ and AS.
.IP
If the kernel is not configured with the
.B CONFIG_BLOCK
option, this flag is a no-op.
@ -264,10 +264,10 @@ If this flag is not set, then (as with
.BR fork (2))
the process is created in the same cgroup namespaces as the calling process.
This flag is intended for the implementation of containers.
.IP
For further information on cgroup namespaces, see
.BR cgroup_namespaces (7).
.IP
Only a privileged process
.RB ( CAP_SYS_ADMIN )
can employ
@ -283,7 +283,7 @@ If this flag is not set, then (as with
the process is created in the same IPC namespace as
the calling process.
This flag is intended for the implementation of containers.
.IP
An IPC namespace provides an isolated view of System\ V IPC objects (see
.BR svipc (7))
and (since Linux 2.6.30)
@ -295,29 +295,29 @@ POSIX message queues
The common characteristic of these IPC mechanisms is that IPC
objects are identified by mechanisms other than filesystem
pathnames.
.IP
Objects created in an IPC namespace are visible to all other processes
that are members of that namespace,
but are not visible to processes in other IPC namespaces.
.IP
When an IPC namespace is destroyed
(i.e., when the last process that is a member of the namespace terminates),
all IPC objects in the namespace are automatically destroyed.
.IP
Only a privileged process
.RB ( CAP_SYS_ADMIN )
can employ
.BR CLONE_NEWIPC .
This flag can't be specified in conjunction with
.BR CLONE_SYSVSEM .
.IP
For further information on IPC namespaces, see
.BR namespaces (7).
.TP
.BR CLONE_NEWNET " (since Linux 2.6.24)"
(The implementation of this flag was completed only
by about kernel version 2.6.29.)
.IP
If
.B CLONE_NEWNET
is set, then create the process in a new network namespace.
@ -326,7 +326,7 @@ If this flag is not set, then (as with
the process is created in the same network namespace as
the calling process.
This flag is intended for the implementation of containers.
.IP
A network namespace provides an isolated view of the networking stack
(network device interfaces, IPv4 and IPv6 protocol stacks,
IP routing tables, firewall rules, the
@ -341,14 +341,14 @@ A virtual network device ("veth") pair provides a pipe-like abstraction
that can be used to create tunnels between network namespaces,
and can be used to create a bridge to a physical network device
in another namespace.
.IP
When a network namespace is freed
(i.e., when the last process in the namespace terminates),
its physical network devices are moved back to the
initial network namespace (not to the parent of the process).
For further information on network namespaces, see
.BR namespaces (7).
.IP
Only a privileged process
.RB ( CAP_SYS_ADMIN )
can employ
@ -363,7 +363,7 @@ If
.B CLONE_NEWNS
is not set, the child lives in the same mount
namespace as the parent.
.IP
Only a privileged process
.RB ( CAP_SYS_ADMIN )
can employ
@ -376,7 +376,7 @@ and
in the same
.BR clone ()
call.
.IP
For further information on mount namespaces, see
.BR namespaces (7)
and
@ -398,12 +398,12 @@ If this flag is not set, then (as with
the process is created in the same PID namespace as
the calling process.
This flag is intended for the implementation of containers.
.IP
For further information on PID namespaces, see
.BR namespaces (7)
and
.BR pid_namespaces (7).
.IP
Only a privileged process
.RB ( CAP_SYS_ADMIN )
can employ
@ -422,19 +422,19 @@ the current
semantics were merged in Linux 3.5,
and the final pieces to make the user namespaces completely usable were
merged in Linux 3.8.)
.IP
If
.B CLONE_NEWUSER
is set, then create the process in a new user namespace.
If this flag is not set, then (as with
.BR fork (2))
the process is created in the same user namespace as the calling process.
.IP
For further information on user namespaces, see
.BR namespaces (7)
and
.BR user_namespaces (7).
.IP
Before Linux 3.8, use of
.BR CLONE_NEWUSER
required that the caller have three capabilities:
@ -445,7 +445,7 @@ and
.\" Before Linux 2.6.29, it appears that only CAP_SYS_ADMIN was needed
Starting with Linux 3.8,
no privileges are needed to create a user namespace.
.IP
This flag can't be specified in conjunction with
.BR CLONE_THREAD
or
@ -459,7 +459,7 @@ For security reasons,
.BR CLONE_NEWUSER
cannot be specified in conjunction with
.BR CLONE_FS .
.IP
For further information on user namespaces, see
.BR user_namespaces (7).
.TP
@ -474,7 +474,7 @@ If this flag is not set, then (as with
the process is created in the same UTS namespace as
the calling process.
This flag is intended for the implementation of containers.
.IP
A UTS namespace is the set of identifiers returned by
.BR uname (2);
among these, the domain name and the hostname can be modified by
@ -485,12 +485,12 @@ respectively.
Changes made to the identifiers in a UTS namespace
are visible to all other processes in the same namespace,
but are not visible to processes in other UTS namespaces.
.IP
Only a privileged process
.RB ( CAP_SYS_ADMIN )
can employ
.BR CLONE_NEWUTS .
.IP
For further information on UTS namespaces, see
.BR namespaces (7).
.TP
@ -500,13 +500,13 @@ If
is set, then the parent of the new child (as returned by
.BR getppid (2))
will be the same as that of the calling process.
.IP
If
.B CLONE_PARENT
is not set, then (as with
.BR fork (2))
the child's parent is the calling process.
.IP
Note that it is the parent process, as returned by
.BR getppid (2),
which is signaled when the child terminates, so that
@ -548,7 +548,7 @@ then trace the child also (see
.BR CLONE_SETTLS " (since Linux 2.5.32)"
The TLS (Thread Local Storage) descriptor is set to
.IR newtls .
.IP
The interpretation of
.I newtls
and the resulting effect is architecture dependent.
@ -581,7 +581,7 @@ signals.
So, one of them may block or unblock some signals using
.BR sigprocmask (2)
without affecting the other process.
.IP
If
.B CLONE_SIGHAND
is not set, the child process inherits a copy of the signal handlers
@ -592,7 +592,7 @@ Calls to
.BR sigaction (2)
performed later by one of the processes have no effect on the other
process.
.IP
Since Linux 2.6.0-test6,
.I flags
must also include
@ -609,7 +609,7 @@ is set, then the child is initially stopped (as though it was sent a
signal), and must be resumed by sending it a
.B SIGCONT
signal.
.IP
This flag was
.I deprecated
from Linux 2.6.25 onward,
@ -648,7 +648,7 @@ To make the remainder of the discussion of
.B CLONE_THREAD
more readable, the term "thread" is used to refer to the
processes within a thread group.
.IP
Thread groups were a feature added in Linux 2.4 to support the
POSIX threads notion of a set of threads that share a single PID.
Internally, this shared PID is the so-called
@ -656,7 +656,7 @@ thread group identifier (TGID) for the thread group.
Since Linux 2.4, calls to
.BR getpid (2)
return the TGID of the caller.
.IP
The threads within a group can be distinguished by their (system-wide)
unique thread IDs (TID).
A new thread's TID is available as the function result
@ -665,7 +665,7 @@ returned to the caller of
and a thread can obtain
its own TID using
.BR gettid (2).
.IP
When a call is made to
.BR clone ()
without specifying
@ -675,7 +675,7 @@ whose TGID is the same as the thread's TID.
This thread is the
.I leader
of the new thread group.
.IP
A new thread created with
.B CLONE_THREAD
has the same parent process as the caller of
@ -697,23 +697,23 @@ using
.BR wait (2).
(The thread is said to be
.IR detached .)
.IP
After all of the threads in a thread group terminate
the parent process of the thread group is sent a
.B SIGCHLD
(or other termination) signal.
.IP
If any of the threads in a thread group performs an
.BR execve (2),
then all threads other than the thread group leader are terminated,
and the new program is executed in the thread group leader.
.IP
If one of the threads in a thread group creates a child using
.BR fork (2),
then any thread in the group can
.BR wait (2)
for that child.
.IP
Since Linux 2.5.35,
.I flags
must also include
@ -726,17 +726,17 @@ is specified
also requires
.BR CLONE_VM
to be included).
.IP
Signals may be sent to a thread group as a whole (i.e., a TGID) using
.BR kill (2),
or to a specific thread (i.e., TID) using
.BR tgkill (2).
.IP
Signal dispositions and actions are process-wide:
if an unhandled signal is delivered to a thread, then
it will affect (terminate, stop, continue, be ignored in)
all members of the thread group.
.IP
Each thread has its own signal mask, as set by
.BR sigprocmask (2),
but signals can be pending either: for the whole process
@ -749,7 +749,7 @@ A call to
.BR sigpending (2)
returns a signal set that is the union of the signals pending for the
whole process and the signals that are pending for the calling thread.
.IP
If
.BR kill (2)
is used to send a signal to a thread group,
@ -780,7 +780,7 @@ or
.BR _exit (2)
(as with
.BR vfork (2)).
.IP
If
.B CLONE_VFORK
is not set, then both the calling process and the child are schedulable
@ -799,7 +799,7 @@ Moreover, any memory mapping or unmapping performed with
or
.BR munmap (2)
by the child or calling process also affects the other process.
.IP
If
.B CLONE_VM
is not set, the child process runs in a separate copy of the memory
@ -824,10 +824,10 @@ arguments of the
wrapper function are omitted.
Furthermore, the argument order changes.
In addition, there are variations across architectures.
.PP
The raw system call interface on x86-64 and some other architectures
(including sh, tile, and alpha) is roughly:
.PP
.in +4
.nf
.BI "long clone(unsigned long " flags ", void *" child_stack ,
@ -835,13 +835,13 @@ The raw system call interface on x86-64 and some other architectures
.BI " unsigned long " newtls );
.fi
.in
.PP
On x86-32, and several other common architectures
(including score, ARM, ARM 64, PA-RISC, arc, Power PC, xtensa,
and MIPS),
.\" CONFIG_CLONE_BACKWARDS
the order of the last two arguments is reversed:
.PP
.in +4
.nf
.BI "long clone(unsigned long " flags ", void *" child_stack ,
@ -849,11 +849,11 @@ the order of the last two arguments is reversed:
.BI " int *" ctid );
.fi
.in
.PP
On the cris and s390 architectures,
.\" CONFIG_CLONE_BACKWARDS2
the order of the first two arguments is reversed:
.PP
.in +4
.nf
.BI "long clone(void *" child_stack ", unsigned long " flags ,
@ -861,11 +861,11 @@ the order of the first two arguments is reversed:
.BI " unsigned long " newtls );
.fi
.in
.PP
On the microblaze architecture,
.\" CONFIG_CLONE_BACKWARDS3
an additional argument is supplied:
.PP
.in +4
.nf
.BI "long clone(unsigned long " flags ", void *" child_stack ,
@ -874,7 +874,7 @@ an additional argument is supplied:
.BI " unsigned long " newtls );
.fi
.in
.PP
Another difference for the raw system call is that the
.I child_stack
argument may be zero, in which case copy-on-write semantics ensure that the
@ -1076,7 +1076,7 @@ and the call would cause the limit on the number of
nested user namespaces to be exceeded.
See
.BR user_namespaces (7).
.IP
From Linux 3.11 to Linux 4.8, the error diagnosed in this case was
.BR EUSERS .
.TP
@ -1152,13 +1152,13 @@ The
system call can be used to test whether two processes share various
resources such as a file descriptor table,
System V semaphore undo operations, or a virtual address space.
.PP
Handlers registered using
.BR pthread_atfork (3)
are not executed during a call to
.BR clone ().
.PP
In the Linux 2.4.x series,
.B CLONE_THREAD
generally does not make the parent of the new thread the same
@ -1168,7 +1168,7 @@ However, for kernel versions 2.4.7 to 2.4.18 the
flag implied the
.B CLONE_PARENT
flag (as in Linux 2.6.0 and later).
.PP
For a while there was
.B CLONE_DETACHED
(introduced in 2.5.32):
@ -1177,7 +1177,7 @@ In Linux 2.6.2, the need to give this flag together with
.B CLONE_THREAD
disappeared.
This flag is still defined, but has no effect.
.PP
On i386,
.BR clone ()
should not be called through vsyscall, but directly through

View File

@ -132,13 +132,13 @@ Failing to check the return value when closing a file may lead to
.I silent
loss of data.
This can especially be observed with NFS and with disk quota.
.PP
Note, however, that a failure return should be used only for
diagnostic purposes (i.e., a warning to the application that there
may still be I/O pending or there may have been failed I/O)
or remedial purposes
(e.g., writing the file once more or creating a backup).
.PP
Retrying the
.BR close ()
after a failure return is the wrong thing to do,
@ -159,7 +159,7 @@ the steps that may return an error,
.\" filp_close()
such as flushing data to the filesystem or device,
occur only later in the close operation.
.PP
Many other implementations similarly always close the file descriptor
.\" FreeBSD documents this explicitly. From the look of the source code
.\" SVR4, ancient SunOS, later Solaris, and AIX all do this.
@ -172,19 +172,19 @@ POSIX.1 is currently silent on this point,
but there are plans to mandate this behavior in the next major release
.\" Issue 8
of the standard.
.PP
A careful programmer who wants to know about I/O errors may precede
.BR close ()
with a call to
.BR fsync (2).
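
A minimal sketch of that approach: flush first, then close exactly once and report the first error seen, without retrying the close. The helper name flush_and_close is hypothetical.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper (not from the man page): flush pending data with
   fsync(), then close the descriptor exactly once.
   Returns 0 on success, or the first error number encountered.
   close() is never retried, whatever it returns. */
int flush_and_close(int fd)
{
    int err = 0;

    if (fsync(fd) == -1)
        err = errno;            /* I/O errors tend to surface here */

    if (close(fd) == -1 && err == 0)
        err = errno;            /* diagnostic only; fd is gone anyway */

    return err;
}
```

A caller might log the returned error number or fall back to writing the data elsewhere, in line with the diagnostic and remedial uses described above.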
.PP
The
.B EINTR
error is a somewhat special case.
Regarding the
.B EINTR
error, POSIX.1-2013 says:
.PP
.RS
If
.BR close ()
@ -196,7 +196,7 @@ and the state of
.I fildes
is unspecified.
.RE
.PP
This permits the behavior that occurs on Linux and
many other implementations, where,
as with other errors that may be reported by

View File

@ -94,7 +94,7 @@ is determined by the address space of the socket
see
.BR socket (2)
for further details.
.PP
If the socket
.I sockfd
is of type
@ -259,12 +259,12 @@ POSIX.1 does not require the inclusion of
and this header file is not required on Linux.
However, some historical (BSD) implementations required this header
file, and portable applications are probably wise to include it.
.PP
For background on the
.I socklen_t
type, see
.BR accept (2).
.PP
If
.BR connect ()
fails, consider the state of the socket as unspecified.

View File

@ -47,7 +47,7 @@ bytes of data from file descriptor
to file descriptor
.IR fd_out ,
overwriting any data that exists within the requested range of the target file.
.PP
The following semantics apply for
.IR off_in ,
and similar statements apply to
@ -74,7 +74,7 @@ is not changed, but
.I off_in
is adjusted appropriately.
.PP
The
.I flags
argument is provided to allow for future extensions
@ -84,7 +84,7 @@ Upon successful completion,
.BR copy_file_range ()
will return the number of bytes copied between files.
This could be less than the length originally requested.
.PP
On error,
.BR copy_file_range ()
returns \-1 and
@ -143,7 +143,7 @@ in a loop, and using the
and
.BR SEEK_HOLE
operations to find the locations of data segments.
.PP
.BR copy_file_range ()
gives filesystems an opportunity to implement "copy acceleration" techniques,
such as the use of reflinks (i.e., two or more i-nodes that share

View File

@ -22,7 +22,7 @@ No declaration of this system call is provided in glibc headers; see NOTES.
.SH DESCRIPTION
.IR Note :
This system call is present only in kernels before Linux 2.6.
.PP
.BR create_module ()
attempts to create a loadable module entry and reserve the kernel memory
that will be needed to hold the module.

View File

@ -46,7 +46,7 @@ The
argument is used to modify the behavior of the system call,
as described below.
This system call requires privilege.
.PP
Module removal is attempted according to the following rules:
.IP 1. 4
If there are other loaded modules that depend on
@ -67,7 +67,7 @@ flag is always specified, and the
flag may additionally be specified.
.\" O_TRUNC == KMOD_REMOVE_FORCE in kmod library
.\" O_NONBLOCK == KMOD_REMOVE_NOWAIT in kmod library
.IP
The various combinations for
.I flags
have the following effect:
@ -183,7 +183,7 @@ it is (before glibc 2.23) sufficient to
manually declare the interface in your code;
alternatively, you can invoke the system call using
.BR syscall (2).
.PP
The uninterruptible sleep that may occur if
.BR O_NONBLOCK
is omitted from
@ -195,13 +195,13 @@ As at Linux 3.7, specifying
is optional, but in future kernels it is likely to become mandatory.
.SS Linux 2.4 and earlier
In Linux 2.4 and earlier, the system call took only one argument:
.PP
.BI " int delete_module(const char *" name );
.PP
If
.I name
is NULL, all unused modules marked auto-clean are removed.
.PP
Some further details of differences in the behavior of
.BR delete_module ()
in Linux 2.4 and earlier are

View File

@ -56,7 +56,7 @@ The
system call creates a copy of the file descriptor
.IR oldfd ,
using the lowest-numbered unused file descriptor for the new descriptor.
.PP
After a successful return,
the old and new file descriptors may be used interchangeably.
They refer to the same open file description (see
@ -65,7 +65,7 @@ and thus share file offset and file status flags;
for example, if the file offset is modified by using
.BR lseek (2)
on one of the file descriptors, the offset is also changed for the other.
.PP
The two file descriptors do not share file descriptor flags
(the close-on-exec flag).
The close-on-exec flag
@ -85,7 +85,7 @@ it uses the file descriptor number specified in
If the file descriptor
.IR newfd
was previously open, it is silently closed before being reused.
.PP
The steps of closing and reusing the file descriptor
.IR newfd
are performed
@ -101,7 +101,7 @@ might be reused between the two steps.
Such reuse could happen because the main program is interrupted
by a signal handler that allocates a file descriptor,
or because a parallel thread allocates a file descriptor.
.PP
Note the following points:
.IP * 3
If
@ -210,7 +210,7 @@ version 2.9.
.BR dup (),
.BR dup2 ():
POSIX.1-2001, POSIX.1-2008, SVr4, 4.3BSD.
.PP
.BR dup3 ()
is Linux-specific.
.\" SVr4 documents additional
@ -230,7 +230,7 @@ also sometimes returns
.B EINVAL
like
.BR F_DUPFD .
.PP
If
.I newfd
was open, any errors that would have been reported at
@ -246,7 +246,7 @@ before calling
.BR dup2 (),
because of the race condition described above.
Instead, code something like the following could be used:
.PP
.nf
/* Obtain a duplicate of 'newfd' that can subsequently
be used to check for close() errors; an EBADF error

@ -39,7 +39,7 @@ instance.
Since Linux 2.6.8, the
.I size
argument is ignored, but must be greater than zero; see NOTES below.
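The requirement that size be greater than zero can be observed directly.
This is a hedged sketch; the two helper names are made up for illustration,
and the EINVAL check reflects the behavior described for Linux 2.6.8 and later.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Returns 1 if epoll_create(0) is rejected with EINVAL, otherwise 0. */
int epoll_size_zero_rejected(void)
{
    int fd;

    errno = 0;
    fd = epoll_create(0);
    if (fd != -1) {         /* unexpected success: clean up and report 0 */
        close(fd);
        return 0;
    }
    return errno == EINVAL;
}

/* Returns 1 if epoll_create(1) yields a usable descriptor, otherwise 0. */
int epoll_size_one_accepted(void)
{
    int fd = epoll_create(1);

    if (fd == -1)
        return 0;
    close(fd);
    return 1;
}
```

Any positive value works equally well, since the value itself is ignored.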
.PP
.BR epoll_create ()
returns a file descriptor referring to the new epoll instance.
This file descriptor is used for all the subsequent calls to the
@ -112,7 +112,7 @@ There was insufficient memory to create the kernel object.
.BR epoll_create ()
was added to the kernel in version 2.6.
Library support is provided in glibc starting with version 2.3.2.
.PP
.\" To be precise: kernel 2.5.44.
.\" The interface should be finalized by Linux kernel 2.5.66.
.BR epoll_create1 ()

@ -35,7 +35,7 @@ It requests that the operation
.I op
be performed for the target file descriptor,
.IR fd .
.PP
Valid values for the
.I op
argument are:
@ -92,7 +92,7 @@ struct epoll_event {
};
.fi
.in
.PP
The
.I events
member is a bit mask composed by ORing together zero or more of
@ -134,7 +134,7 @@ Hang up happened on the associated file descriptor.
.BR epoll_wait (2)
will always wait for this event; it is not necessary to set it in
.IR events .
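As a rough illustration of EPOLLHUP being reported without being requested,
the sketch below registers the read end of a pipe for EPOLLIN only and then closes the write end.
The helper name events_after_writer_close and the one-second timeout are assumptions made for this example.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Registers the read end of a pipe for EPOLLIN only, closes the write
 * end, and returns the event mask that epoll_wait() reports.
 * Returns 0 if any step fails or no event arrives within one second.
 */
uint32_t events_after_writer_close(void)
{
    int pfd[2];
    int epfd;
    uint32_t got = 0;
    struct epoll_event ev, out;

    if (pipe(pfd) == -1)
        return 0;

    epfd = epoll_create1(0);
    if (epfd == -1) {
        close(pfd[0]);
        close(pfd[1]);
        return 0;
    }

    ev.events = EPOLLIN;        /* EPOLLHUP is deliberately not requested */
    ev.data.fd = pfd[0];

    if (epoll_ctl(epfd, EPOLL_CTL_ADD, pfd[0], &ev) == 0) {
        close(pfd[1]);          /* the peer closes its end of the channel */
        pfd[1] = -1;
        if (epoll_wait(epfd, &out, 1, 1000) == 1)
            got = out.events;
    }

    if (pfd[1] != -1)
        close(pfd[1]);
    close(pfd[0]);
    close(epfd);
    return got;
}
```

The returned mask is expected to contain EPOLLHUP even though only EPOLLIN was set in events.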
.IP
Note that when reading from a channel such as a pipe or a stream socket,
this event merely indicates that the peer closed its end of the channel.
Subsequent reads from the channel will return 0 (end of file)
@ -206,7 +206,7 @@ The default in this scenario (when
is not set) is for all epoll file descriptors to receive an event.
.BR EPOLLEXCLUSIVE
is thus useful for avoiding thundering herd problems in certain scenarios.
.IP
If the same file descriptor is in multiple epoll instances,
some with the
.BR EPOLLEXCLUSIVE
@ -215,7 +215,7 @@ instances that did not specify
.BR EPOLLEXCLUSIVE ,
and at least one of the epoll instances that did specify
.BR EPOLLEXCLUSIVE .
.IP
The following values may be specified in conjunction with
.BR EPOLLEXCLUSIVE :
.BR EPOLLIN ,
@ -398,7 +398,7 @@ when using
Applications that need to be portable to kernels before 2.6.9
should specify a non-null pointer in
.IR event .
.PP
If
.B EPOLLWAKEUP
is specified in

@ -100,7 +100,7 @@ struct epoll_event {
};
.fi
.in
.PP
The
.I data
field of each returned structure contains the same data as was specified

@ -36,7 +36,7 @@ The object contains an unsigned 64-bit integer
counter that is maintained by the kernel.
This counter is initialized with the value specified in the argument
.IR initval .
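A minimal sketch of how initval seeds the counter follows.
It relies on the usual eventfd semantics (a write adds to the counter, and a read without
EFD_SEMAPHORE returns the counter and resets it), which are described elsewhere in the page rather than in this excerpt.
The helper name eventfd_initial_plus is hypothetical.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Creates an eventfd whose counter starts at 'initval', adds 'add' to it
 * with write(), and returns the value obtained by read().
 * Returns UINT64_MAX on error.  The sum must be nonzero; otherwise the
 * read() would block, because no flags are passed to eventfd().
 */
uint64_t eventfd_initial_plus(unsigned int initval, uint64_t add)
{
    uint64_t value = UINT64_MAX;
    int efd = eventfd(initval, 0);

    if (efd == -1)
        return UINT64_MAX;

    if (write(efd, &add, sizeof(add)) == (ssize_t) sizeof(add)) {
        if (read(efd, &value, sizeof(value)) != (ssize_t) sizeof(value))
            value = UINT64_MAX;
    }

    close(efd);
    return value;
}
```

With initval 3 and a written value of 4, the read is expected to return 7.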
.PP
The following values may be bitwise ORed in
.IR flags
to change the behavior of
@ -67,7 +67,7 @@ See below.
In Linux up to version 2.6.26, the
.I flags
argument is unused, and must be specified as zero.
.PP
As its return value,
.BR eventfd ()
returns a new file descriptor that can be used to refer to the
@ -275,7 +275,7 @@ T{
.BR eventfd ()
T} Thread safety MT-Safe
.TE
.sp 1
.SH CONFORMING TO
.BR eventfd ()
and
@ -289,13 +289,13 @@ The kernel overhead of an eventfd file descriptor
is much lower than that of a pipe,
and only one file descriptor is
required (versus the two required for a pipe).
.PP
When used in the kernel, an eventfd
file descriptor can provide a bridge from kernel to user space, allowing,
for example, functionalities like KAIO (kernel AIO)
.\" or eventually syslets/threadlets
to signal to a file descriptor that some operation is complete.
.PP
A key point about an eventfd file descriptor is that it can be
monitored just like any other file descriptor using
.BR select (2),
@ -312,7 +312,7 @@ interface, these mechanisms could not be multiplexed via
.BR poll (2),
or
.BR epoll (7).)
.PP
The current value of an eventfd counter can be viewed
via the entry for the corresponding file descriptor in the process's
.IR /proc/[pid]/fdinfo
@ -348,7 +348,7 @@ int eventfd_read(int fd, eventfd_t *value);
int eventfd_write(int fd, eventfd_t value);
.fi
.in
.PP
The functions perform the read and write operations on an
eventfd file descriptor,
returning 0 if the correct number of bytes was transferred,
@ -362,7 +362,7 @@ the child writes each of the integers supplied in the program's
command-line arguments to the eventfd file descriptor.
When the parent has finished sleeping,
it reads from the eventfd file descriptor.
.PP
The following shell session shows a sample run of the program:
.in +4n
.nf

@ -48,15 +48,15 @@ execve \- execute program
executes the program pointed to by \fIfilename\fP.
\fIfilename\fP must be either a binary executable, or a script
starting with a line of the form:
.PP
.in +4n
.nf
\fB#!\fP \fIinterpreter \fP[optional-arg]
.fi
.in
.PP
For details of the latter case, see "Interpreter scripts" below.
.PP
\fIargv\fP is an array of argument strings passed to the new program.
By convention, the first of these strings (i.e.,
.IR argv[0] )
@ -65,31 +65,31 @@ should contain the filename associated with the file being executed.
\fBkey=value\fP, which are passed as environment to the new program.
The \fIargv\fP and \fIenvp\fP arrays must each include a null pointer
at the end of the array.
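The null-terminated argv and envp arrays might be built as in the following sketch.
It assumes that /bin/sh exists; the helper name run_sh_exit_code and the variable CODE are made up for the example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Forks a child that calls execve() on /bin/sh with an explicit argument
 * vector and environment, and returns the child's exit status
 * (or -1 on error).  /bin/sh is assumed to exist.
 */
int run_sh_exit_code(void)
{
    /* Both arrays end with a null pointer, as execve() requires. */
    char *argv[] = { "sh", "-c", "exit $CODE", NULL };
    char *envp[] = { "CODE=7", NULL };
    int status;
    pid_t pid;

    pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        execve("/bin/sh", argv, envp);
        _exit(127);             /* reached only if execve() failed */
    }

    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The shell sees only the environment passed in envp, so the exit status reflects the CODE value supplied there.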
.PP
The argument vector and environment can be accessed by the
called program's main function, when it is defined as:
.PP
.in +4n
.nf
int main(int argc, char *argv[], char *envp[])
.fi
.in
.PP
Note, however, that the use of a third argument to the main function
is not specified in POSIX.1;
according to POSIX.1,
the environment should be accessed via the external variable
.BR environ (7).
.PP
.BR execve ()
does not return on success, and the text, initialized data,
uninitialized data (bss), and stack of the calling process are overwritten
according to the contents of the newly loaded program.
.PP
If the current program is being ptraced, a \fBSIGTRAP\fP signal is sent to it
after a successful
.BR execve ().
.PP
If the set-user-ID bit is set on the program file pointed to by
\fIfilename\fP,
then the effective user ID of the calling process is changed
@ -97,7 +97,7 @@ to that of the owner of the program file.
Similarly, when the set-group-ID
bit of the program file is set the effective group ID of the calling
process is set to the group of the program file.
.PP
The aforementioned transformations of the effective IDs are
.I not
performed (i.e., the set-user-ID and set-group-ID bits are ignored)
@ -126,11 +126,11 @@ The effective user ID of the process is copied to the saved set-user-ID;
similarly, the effective group ID is copied to the saved set-group-ID.
This copying takes place after any effective ID changes that occur
because of the set-user-ID and set-group-ID mode bits.
.PP
The process's real UID and real GID, as well as its supplementary group IDs,
are unchanged by a call to
.BR execve ().
.PP
If the executable is an a.out dynamically linked
binary executable containing
shared-library stubs, the Linux dynamic linker
@ -138,7 +138,7 @@ shared-library stubs, the Linux dynamic linker
is called at the start of execution to bring
needed shared objects into memory
and link the executable with them.
.PP
If the executable is a dynamically linked ELF executable, the
interpreter named in the PT_INTERP segment is used to load the needed
shared objects.
@ -146,7 +146,7 @@ This interpreter is typically
.I /lib/ld-linux.so.2
for binaries linked with glibc (see
.BR ld-linux.so (8)).
.PP
All process attributes are preserved during an
.BR execve (),
except the following:
@ -294,13 +294,13 @@ closed across an
.SS Interpreter scripts
An interpreter script is a text file that has execute
permission enabled and whose first line is of the form:
.PP
.in +4n
.nf
\fB#!\fP \fIinterpreter \fP[optional-arg]
.fi
.in
.PP
The
.I interpreter
must be a valid pathname for an executable file.
@ -311,13 +311,13 @@ argument of
specifies an interpreter script, then
.I interpreter
will be invoked with the following arguments:
.PP
.in +4n
.nf
\fIinterpreter\fP [optional-arg] \fIfilename\fP arg...
.fi
.in
.PP
where
.I arg...
is the series of words pointed to by the
@ -326,12 +326,12 @@ argument of
.BR execve (),
starting at
.IR argv [1].
.PP
For portable use,
.I optional-arg
should either be absent, or be specified as a single word (i.e., it
should not contain white space); see NOTES below.
.PP
Since Linux 2.6.28,
.\" commit bf2a9a39639b8b51377905397a5005f444e9a892
the kernel permits the interpreter of a script to itself be a script.
@ -351,14 +351,14 @@ constant (either defined in
.I <limits.h>
or available at run time using the call
.IR "sysconf(_SC_ARG_MAX)" ).
.PP
On Linux prior to kernel 2.6.23, the memory used to store the
environment and argument strings was limited to 32 pages
(defined by the kernel constant
.BR MAX_ARG_PAGES ).
On architectures with a 4-kB page size,
this yields a maximum size of 128 kB.
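The arithmetic behind the 128 kB figure, and a run-time query of the current limit, can be sketched as follows.
The macro LEGACY_MAX_ARG_PAGES and both helper names are illustrative; only the value 32 comes from the text above.

```c
#include <unistd.h>

/* The pre-2.6.23 limit of 32 pages described above (name is illustrative). */
#define LEGACY_MAX_ARG_PAGES 32

/* Size limit in bytes under the old scheme for a given page size. */
long legacy_arg_limit(long page_size)
{
    return LEGACY_MAX_ARG_PAGES * page_size;
}

/* Limit reported at run time for the calling process. */
long current_arg_max(void)
{
    return sysconf(_SC_ARG_MAX);
}
```

For a 4-kB page, legacy_arg_limit(4096) gives 131072 bytes, that is, 128 kB.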
.PP
On kernel 2.6.23 and later, most architectures support a size limit
derived from the soft
.B RLIMIT_STACK
@ -529,7 +529,7 @@ POSIX does not document the #! behavior, but it exists
.SH NOTES
Set-user-ID and set-group-ID processes can not be
.BR ptrace (2)d.
.PP
The result of mounting a filesystem
.I nosuid
varies across Linux kernel versions:
@ -540,7 +540,7 @@ give the user powers she did not have already (and return
some will just ignore the set-user-ID and set-group-ID bits and
.BR exec ()
successfully.
.PP
On Linux,
.I argv
and
@ -562,7 +562,7 @@ case the same as Linux.
.\" Bug filed 30 Apr 2007: http://bugzilla.kernel.org/show_bug.cgi?id=8408
.\" Bug rejected (because fix would constitute an ABI change).
.\"
.PP
POSIX.1 says that values returned by
.BR sysconf (3)
should be invariant over the lifetime of a process.
@ -573,7 +573,7 @@ resource limit changes, then the value reported by
will also change,
to reflect the fact that the limit on space for holding
command-line arguments and environment variables has changed.
.PP
In most cases where
.BR execve ()
fails, control returns to the original executable image,
@ -591,7 +591,7 @@ signal.
.SS Interpreter scripts
A maximum line length of 127 characters is allowed for the first line in
an interpreter script.
.PP
The semantics of the
.I optional-arg
argument of an interpreter script vary across implementations.
@ -610,7 +610,7 @@ an interpreter script can have multiple arguments,
and white spaces in
.I optional-arg
are used to delimit the arguments.
.PP
Linux ignores the set-user-ID and set-group-ID bits on scripts.
.\"
.\" .SH BUGS
@ -627,7 +627,7 @@ A more detailed explanation of the
error that can occur (since Linux 3.1) when calling
.BR execve ()
is as follows.
.PP
The
.BR EAGAIN
error can occur when a
@ -649,7 +649,7 @@ call to fail.
.\" commit 909cc4ae86f3380152a18e2a3c44523893ee11c4
the resource limit was not imposed on processes that
changed their user IDs.)
.PP
Since Linux 3.1, the scenario just described no longer causes the
.BR set*uid ()
call to fail,
@ -680,7 +680,7 @@ common privileged daemon workflow\(emnamely,
.BR set*uid ()
+
.BR execve ().
.PP
If the resource limit was not still exceeded at the time of the
.BR execve ()
call
@ -719,7 +719,7 @@ Since UNIX\ V7, both are NULL.
.SH EXAMPLE
The following program is designed to be execed by the second program below.
It just echoes its command-line arguments, one per line.
.PP
.in +4n
.nf
/* myecho.c */
@ -739,7 +739,7 @@ main(int argc, char *argv[])
}
.fi
.in
.PP
This program can be used to exec the program named in its command-line
argument:
.in +4n
@ -770,9 +770,9 @@ main(int argc, char *argv[])
}
.fi
.in
.PP
We can use the second program to exec the first as follows:
.PP
.in +4n
.nf
.RB "$" " cc myecho.c \-o myecho"
@ -783,13 +783,13 @@ argv[1]: hello
argv[2]: world
.fi
.in
.PP
We can also use these programs to demonstrate the use of a script
interpreter.
To do this we create a script whose "interpreter" is our
.I myecho
program:
.PP
.in +4n
.nf
.RB "$" " cat > script"
@ -798,9 +798,9 @@ program:
.RB "$" " chmod +x script"
.fi
.in
.PP
We can then use our program to exec the script:
.PP
.in +4n
.nf
.RB "$" " ./execve ./script"

@ -45,7 +45,7 @@ and
It operates in exactly the same way as
.BR execve (2),
except for the differences described in this manual page.
.PP
If the pathname given in
.I pathname
is relative, then it is interpreted relative to the directory
@ -55,7 +55,7 @@ referred to by the file descriptor
the calling process, as is done by
.BR execve (2)
for a relative pathname).
.PP
If
.I pathname
is relative and
@ -67,13 +67,13 @@ then
is interpreted relative to the current working
directory of the calling process (like
.BR execve (2)).
.PP
If
.I pathname
is absolute, then
.I dirfd
is ignored.
.PP
If
.I pathname
is an empty string and the
@ -83,7 +83,7 @@ flag is specified, then the file descriptor
specifies the file to be executed (i.e.,
.IR dirfd
refers to an executable file, rather than a directory).
.PP
The
.I flags
argument is a bit mask that can include zero or more of the following flags:
@ -176,7 +176,7 @@ system call is also needed to allow
to be implemented on systems that do not have the
.I /proc
filesystem mounted.
.PP
When asked to execute a script file, the
.IR argv[0]
that is passed to the script interpreter is a string of the form
@ -199,7 +199,7 @@ in this case,
.IR P
is the value given in
.IR pathname .
.PP
For the same reasons described in
.BR fexecve (3),
the natural idiom when using
@ -212,9 +212,9 @@ The
.B ENOENT
error described above means that it is not possible to set the
close-on-exec flag on the file descriptor given to a call of the form:
.PP
execveat(fd, "", argv, envp, AT_EMPTY_PATH);
.PP
However, the inability to set the close-on-exec flag means that a file
descriptor referring to the script leaks through to the script itself.
As well as wasting a file descriptor,

@ -24,7 +24,7 @@ This is a nonportable, Linux-specific system call.
For the portable, POSIX.1-specified method of ensuring that space
is allocated for a file, see
.BR posix_fallocate (3).
.PP
.BR fallocate ()
allows the caller to directly manipulate the allocated disk space
for the file referred to by
@ -34,7 +34,7 @@ for the byte range starting at
and continuing for
.I len
bytes.
.PP
The
.I mode
argument determines the operation to be performed on the given range.
@ -62,13 +62,13 @@ This default behavior closely resembles the behavior of the
.BR posix_fallocate (3)
library function,
and is intended as a method of optimally implementing that function.
.PP
After a successful call, subsequent writes into the range specified by
.IR offset
and
.IR len
are guaranteed not to fail because of lack of disk space.
.PP
If the
.B FALLOC_FL_KEEP_SIZE
flag is specified in
@ -79,7 +79,7 @@ but the file size will not be changed even if
is greater than the file size.
Preallocating zeroed blocks beyond the end of the file in this manner
is useful for optimizing append workloads.
.PP
If the
.B FALLOC_FL_UNSHARE
flag is specified in
@ -108,7 +108,7 @@ Within the specified range, partial filesystem blocks are zeroed,
and whole filesystem blocks are removed from the file.
After a successful call,
subsequent reads from this range will return zeroes.
.PP
The
.BR FALLOC_FL_PUNCH_HOLE
flag must be ORed with
@ -119,7 +119,7 @@ in other words, even when punching off the end of the file, the file size
(as reported by
.BR stat (2))
does not change.
.PP
Not all filesystems support
.BR FALLOC_FL_PUNCH_HOLE ;
if a filesystem doesn't support the operation, an error is returned.
@ -154,7 +154,7 @@ will be appended at the location
and the file will be
.I len
bytes smaller.
.PP
A filesystem may place limitations on the granularity of the operation,
in order to ensure efficient implementation.
Typically,
@ -168,7 +168,7 @@ If a filesystem has such a requirement,
will fail with the error
.BR EINVAL
if this requirement is violated.
.PP
If the region specified by
.I offset
plus
@ -177,12 +177,12 @@ reaches or passes the end of file, an error is returned;
instead, use
.BR ftruncate (2)
to truncate a file.
.PP
No other flags may be specified in
.IR mode
in conjunction with
.BR FALLOC_FL_COLLAPSE_RANGE .
.PP
As at Linux 3.15,
.B FALLOC_FL_COLLAPSE_RANGE
is supported by
@ -206,13 +206,13 @@ Within the specified range, blocks are preallocated for the regions
that span the holes in the file.
After a successful call, subsequent
reads from this range will return zeroes.
.PP
Zeroing is done within the filesystem preferably by converting the range into
unwritten extents.
This approach means that the specified range will not be physically zeroed
out on the device (except for partial blocks at either end of the range),
and I/O is (otherwise) required only to update metadata.
.PP
If the
.B FALLOC_FL_KEEP_SIZE
flag is additionally specified in
@ -224,7 +224,7 @@ is greater than the file size.
This behavior is the same as when preallocating space with
.B FALLOC_FL_KEEP_SIZE
specified.
.PP
Not all filesystems support
.BR FALLOC_FL_ZERO_RANGE ;
if a filesystem doesn't support the operation, an error is returned.
@ -261,7 +261,7 @@ bytes.
Inserting a hole inside a file increases the file size by
.I len
bytes.
.PP
This mode has the same limitations as
.BR FALLOC_FL_COLLAPSE_RANGE
regarding the granularity of the operation.
@ -275,12 +275,12 @@ is equal to or greater than the end of file, an error is returned.
For such operations (i.e., inserting a hole at the end of file),
.BR ftruncate (2)
should be used.
.PP
No other flags may be specified in
.IR mode
in conjunction with
.BR FALLOC_FL_INSERT_RANGE .
.PP
.B FALLOC_FL_INSERT_RANGE
requires filesystem support.
Filesystems that support this operation include

@ -68,9 +68,9 @@ To make a nonblocking request, include
.B LOCK_NB
(by ORing)
with any of the above operations.
.PP
A single file may not simultaneously have both shared and exclusive locks.
.PP
Locks created by
.BR flock ()
are associated with an open file description (see
@ -85,7 +85,7 @@ Furthermore, the lock is released either by an explicit
.B LOCK_UN
operation on any of these duplicate file descriptors, or when all
such file descriptors have been closed.
.PP
If a process uses
.BR open (2)
(or similar) to obtain more than one file descriptor for the same file,
@ -94,19 +94,19 @@ these file descriptors are treated independently by
An attempt to lock the file using one of these file descriptors
may be denied by a lock that the calling process has
already placed via another file descriptor.
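The independence of separately opened descriptors can be demonstrated within a single process.
The sketch below is an assumption-laden illustration: the helper name second_open_is_denied and the
temporary-file template are not from the manual page.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Opens the same file twice, takes an exclusive lock through the first
 * descriptor, and tries a nonblocking exclusive lock through the second.
 * Returns 1 if the second attempt is denied with EWOULDBLOCK, 0 if it
 * is not, and -1 on error.
 */
int second_open_is_denied(void)
{
    char path[] = "/tmp/flock_demo_XXXXXX";
    int fd1, fd2, result = -1;

    fd1 = mkstemp(path);
    if (fd1 == -1)
        return -1;

    fd2 = open(path, O_RDWR);   /* a second, independent open file description */
    unlink(path);
    if (fd2 == -1) {
        close(fd1);
        return -1;
    }

    if (flock(fd1, LOCK_EX) == 0) {
        errno = 0;
        result = (flock(fd2, LOCK_EX | LOCK_NB) == -1 &&
                  errno == EWOULDBLOCK);
    }

    close(fd1);
    close(fd2);
    return result;
}
```

Had fd2 been obtained with dup(2) instead of open(2), both descriptors would refer to the same lock and the second call would succeed.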
.PP
A process may hold only one type of lock (shared or exclusive)
on a file.
Subsequent
.BR flock ()
calls on an already locked file will convert an existing lock to the new
lock mode.
.PP
Locks created by
.BR flock ()
are preserved across an
.BR execve (2).
.PP
A shared or exclusive lock can be placed on a file regardless of the
mode in which the file was opened.
.SH RETURN VALUE
@ -241,7 +241,7 @@ and occurs on many other implementations.)
.BR open (2),
.BR lockf (3),
.BR lslocks (8)
.PP
.I Documentation/filesystems/locks.txt
in the Linux kernel source tree
.RI ( Documentation/locks.txt

@ -52,7 +52,7 @@ process.
The calling process is referred to as the
.I parent
process.
.PP
The child process and the parent process run in separate memory spaces.
At the time of
.BR fork ()
@ -62,7 +62,7 @@ Memory writes, file mappings
and unmappings
.RB ( munmap (2))
performed by one of the processes do not affect the other.
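A short sketch of the separate memory spaces: the child changes a variable and reports it through its exit status,
while the parent's copy stays as it was.
The helper name fork_memory_is_separate is hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * The child sets its copy of 'value' to 2 and exits with that value.
 * The parent returns (its own value * 10) + (child's exit status),
 * or -1 on error.
 */
int fork_memory_is_separate(void)
{
    volatile int value = 1;
    int status;
    pid_t pid;

    pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        value = 2;              /* modifies only the child's copy */
        _exit(value);
    }

    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;

    return value * 10 + WEXITSTATUS(status);
}
```

The parent still sees 1 and the child reports 2, so the helper is expected to return 12.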
.PP
The child process is an exact duplicate of the parent
process except for the following points:
.IP * 3

@ -72,7 +72,7 @@ This includes writing through or flushing a disk cache if present.
The call blocks until the device reports that the transfer has completed.
It also flushes metadata information associated with the file (see
.BR inode (7)).
.PP
Calling
.BR fsync ()
does not necessarily ensure
@ -80,7 +80,7 @@ that the entry in the directory containing the file has also reached disk.
For that an explicit
.BR fsync ()
on a file descriptor for the directory is also needed.
.PP
.BR fdatasync ()
is similar to
.BR fsync (),
@ -101,7 +101,7 @@ On the other hand, a change to the file size
as made by say
.BR ftruncate (2)),
would require a metadata flush.
.PP
The aim of
.BR fdatasync ()
is to reduce disk activity for applications that do not
@ -145,13 +145,13 @@ On some UNIX systems (but not Linux),
must be a
.I writable
file descriptor.
.PP
In Linux 2.2 and earlier,
.BR fdatasync ()
is equivalent to
.BR fsync (),
and so has no performance advantage.
.PP
The
.BR fsync ()
implementations in older kernels and lesser used filesystems

@ -54,7 +54,7 @@ Other
.BR futex ()
operations can be used to wake any processes or threads waiting
for a particular condition.
.PP
A futex is a 32-bit value\(emreferred to below as a
.IR "futex word" \(emwhose
address is supplied to the
@ -73,7 +73,7 @@ virtual addresses in different processes,
but these addresses all refer to the same location in physical memory.)
In a multithreaded program, it is sufficient to place the futex word
in a global variable shared by all threads.
.PP
When executing a futex operation that requests to block a thread,
the kernel will block only if the futex word has the value that the
calling thread supplied (as one of the arguments of the
@ -106,7 +106,7 @@ blocking via a futex is an atomic compare-and-block operation.
.\" the reference in the following sentence
.\" See NOTES for a detailed specification of
.\" the synchronization semantics.
.PP
One use of futexes is for implementing locks.
The state of the lock (i.e., acquired or not acquired)
can be represented as an atomically accessed flag in shared memory.
@ -133,10 +133,10 @@ operation that wakes threads blocked on the lock flag used as a futex word
See
.BR futex (7)
for more detail on how to use futexes.
.PP
Besides the basic wait and wake-up futex functionality, there are further
futex operations aimed at supporting more complex use cases.
.PP
Note that
no explicit initialization or destruction is necessary to use futexes;
the kernel maintains a futex
@ -157,7 +157,7 @@ argument;
.IR val
is a value whose meaning and purpose depends on
.IR futex_op .
.PP
The remaining arguments
.RI ( timeout ,
.IR uaddr2 ,
@ -165,7 +165,7 @@ and
.IR val3 )
are required only for certain of the futex operations described below.
Where one of these arguments is not required, it is ignored.
.PP
For several blocking operations, the
.I timeout
argument is a pointer to a
@ -183,12 +183,12 @@ then to
and in the remainder of this page, this argument is referred to as
.I val2
when interpreted in this fashion.
.PP
Where it is required, the
.IR uaddr2
argument is a pointer to a second futex word that is employed
by the operation.
.PP
The interpretation of the final integer argument,
.IR val3 ,
depends on the operation.
@ -216,7 +216,7 @@ This allows the kernel to make some additional performance optimizations.
.\" I.e., It allows the kernel choose the fast path for validating
.\" the user-space address and avoids expensive VMA lookups,
.\" taking reference counts on file backing store, and so on.
.IP
As a convenience,
.IR <linux/futex.h>
defines a set of constants with the suffix
@ -242,13 +242,13 @@ and
.\" commit 337f13046ff03717a9e99675284a817527440a49
.BR FUTEX_WAIT
operations.
.IP
If this option is set, the kernel measures the
.I timeout
against the
.BR CLOCK_REALTIME
clock.
.IP
If this option is not set, the kernel measures the
.I timeout
against the
@ -287,7 +287,7 @@ If the futex value does not match
.IR val ,
then the call fails immediately with the error
.BR EAGAIN .
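The immediate EAGAIN failure can be reproduced without any second thread.
The sketch below invokes the system call through syscall(2); the wrapper futex_wait and the helper
stale_wait_fails_with_eagain are names invented for this example.

```c
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrapper: FUTEX_WAIT with no timeout (name is illustrative). */
static long futex_wait(uint32_t *uaddr, uint32_t expected)
{
    return syscall(SYS_futex, uaddr, FUTEX_WAIT, expected, NULL, NULL, 0);
}

/*
 * The futex word holds 1, but the caller claims to have seen 0.
 * Returns 1 if the call fails at once with EAGAIN, otherwise 0.
 */
int stale_wait_fails_with_eagain(void)
{
    uint32_t word = 1;
    long r;

    errno = 0;
    r = futex_wait(&word, 0);   /* expected value does not match */
    return r == -1 && errno == EAGAIN;
}
```

Because the mismatch is detected before the thread is put to sleep, the call returns immediately instead of blocking.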
.IP
The purpose of the comparison with the expected value is to prevent lost
wake-ups.
If another thread changed the value of the futex word after the
@ -298,7 +298,7 @@ operation (or similar wake-up) after the value change and before this
.BR FUTEX_WAIT
operation, then the calling thread will observe the
value change and will not start to sleep.
.IP
If the
.I timeout
is not NULL, the structure it points to specifies a
@ -316,7 +316,7 @@ in
If
.I timeout
is NULL, the call blocks indefinitely.
.IP
.IR Note :
for
.BR FUTEX_WAIT ,
@ -335,7 +335,7 @@ with
.IR val3
specified as
.BR FUTEX_BITSET_MATCH_ANY .
.IP
The arguments
.I uaddr2
and
@ -372,7 +372,7 @@ is specified as either 1 (wake up a single waiter) or
No guarantee is provided about which waiters are awoken
(e.g., a waiter with a higher scheduling priority is not guaranteed
to be awoken in preference to a waiter with a lower priority).
.IP
The arguments
.IR timeout ,
.IR uaddr2 ,
@ -407,21 +407,21 @@ on the futex word, the file descriptor indicates as being readable with
.BR poll (2),
and
.BR epoll (7).
.IP
The file descriptor can be used to obtain asynchronous notifications: if
.I val
is nonzero, then, when another process or thread executes a
.BR FUTEX_WAKE ,
the caller will receive the signal number that was passed in
.IR val .
.IP
The arguments
.IR timeout ,
.I uaddr2
and
.I val3
are ignored.
.IP
Because it was inherently racy,
.B FUTEX_FD
has been removed
@ -466,7 +466,7 @@ The
argument specifies an upper limit on the number of waiters
that are requeued to the futex at
.IR uaddr2 .
.IP
.\" FIXME(Torvald) Is the following correct? Or is just the decision
.\" which threads to wake or requeue part of the atomic operation?
The load from
@ -482,7 +482,7 @@ ordered with respect to other operations on the same futex word.
.\" source and target futex. No other waiter can enqueue itself
.\" for waiting and no other waiter can dequeue itself because of
.\" a timeout or signal.
.IP
Typical values to specify for
.I val
are 0 or 1.
@ -500,7 +500,7 @@ is typically either 1 or
.BR FUTEX_CMP_REQUEUE
operation equivalent to
.BR FUTEX_WAIT .)
.IP
The
.B FUTEX_CMP_REQUEUE
operation was added as a replacement for the earlier
@ -517,7 +517,7 @@ conditions, which allows race conditions to be avoided in certain use cases.
.\" To: Darren Hart <dvhart@infradead.org>
.\" CC: libc-alpha@sourceware.org, ...
.\" Subject: Re: Add futex wrapper to glibc?
.IP
Both
.BR FUTEX_REQUEUE
and
@ -529,7 +529,7 @@ another futex.
Consider the following scenario,
where multiple waiter threads are waiting on B,
a wait queue implemented using a futex:
.IP
.in +4n
.nf
lock(A)
@ -541,7 +541,7 @@ while (!check_value(V)) {
unlock(A);
.fi
.in
.IP
If a waker thread used
.BR FUTEX_WAKE ,
then all waiters waiting on B would be woken up,
@ -573,13 +573,13 @@ of the wait queue associated with the condition variable.
.BR FUTEX_WAKE_OP
allows such cases to be implemented without leading to
high rates of contention and context switching.
.IP
The
.BR FUTEX_WAKE_OP
operation is equivalent to executing the following code atomically
and totally ordered with respect to other futex operations on
either of the two supplied futex words:
.IP
.in +4n
.nf
int oldval = *(int *) uaddr2;
@ -589,7 +589,7 @@ if (oldval \fIcmp\fP \fIcmparg\fP)
futex(uaddr2, FUTEX_WAKE, val2, 0, 0, 0);
.fi
.in
.IP
In other words,
.BR FUTEX_WAKE_OP
does the following:
@ -621,7 +621,7 @@ The operation and comparison that are to be performed are encoded
in the bits of the argument
.IR val3 .
Pictorially, the encoding is:
.IP
.in +8n
.nf
+---+---+-----------+-----------+
@ -630,9 +630,9 @@ Pictorially, the encoding is:
4 4 12 12 <== # of bits
.fi
.in
.IP
Expressed in code, the encoding is:
.IP
.in +4n
.nf
#define FUTEX_OP(op, oparg, cmp, cmparg) \\
@ -642,7 +642,7 @@ Expressed in code, the encoding is:
(cmparg & 0xfff))
.fi
.in
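To make the bit layout concrete, the following sketch decodes a val3 value back into its four fields.
The structure futex_op_fields and the two helpers are hypothetical; FUTEX_OP and the FUTEX_OP_* constants are taken from linux/futex.h.

```c
#include <linux/futex.h>
#include <stdint.h>

/* Decoded view of val3 (structure and field names are illustrative). */
struct futex_op_fields {
    unsigned int op;        /* bits 28-31 */
    unsigned int cmp;       /* bits 24-27 */
    unsigned int oparg;     /* bits 12-23 */
    unsigned int cmparg;    /* bits 0-11 */
};

/* Splits val3 according to the 4/4/12/12-bit layout shown above. */
struct futex_op_fields futex_op_decode(uint32_t val3)
{
    struct futex_op_fields f;

    f.op = (val3 >> 28) & 0xf;
    f.cmp = (val3 >> 24) & 0xf;
    f.oparg = (val3 >> 12) & 0xfff;
    f.cmparg = val3 & 0xfff;
    return f;
}

/* Encodes "add 1 to the word; wake if the old value was > 0". */
uint32_t example_val3(void)
{
    return FUTEX_OP(FUTEX_OP_ADD, 1, FUTEX_OP_CMP_GT, 0);
}
```

For this example the encoded value is 0x14001000, and decoding it yields the original four components.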
.IP
In the above,
.I op
and
@ -653,11 +653,11 @@ The
and
.I cmparg
components are literal numeric values, except as noted below.
.IP
The
.I op
component has one of the following values:
.IP
.in +4n
.nf
FUTEX_OP_SET 0 /* uaddr2 = oparg; */
@ -667,23 +667,23 @@ FUTEX_OP_ANDN 3 /* uaddr2 &= ~oparg; */
FUTEX_OP_XOR 4 /* uaddr2 ^= oparg; */
.fi
.in
.IP
In addition, bit-wise ORing the following value into
.I op
causes
.IR "(1\ <<\ oparg)"
to be used as the operand:
.IP
.in +4n
.nf
FUTEX_OP_ARG_SHIFT 8 /* Use (1 << oparg) as operand */
.fi
.in
.IP
The
.I cmp
field is one of the following:
.IP
.in +4n
.nf
FUTEX_OP_CMP_EQ 0 /* if (oldval == cmparg) wake */
@ -694,7 +694,7 @@ FUTEX_OP_CMP_GT 4 /* if (oldval > cmparg) wake */
FUTEX_OP_CMP_GE 5 /* if (oldval >= cmparg) wake */
.fi
.in
.IP
The return value of
.BR FUTEX_WAKE_OP
is the sum of the number of waiters woken on the futex
@ -717,7 +717,7 @@ is stored in the kernel-internal state of the waiter.
See the description of
.BR FUTEX_WAKE_BITSET
for further details.
.IP
If
.I timeout
is not NULL, the structure it points to specifies
@ -725,8 +725,8 @@ an absolute timeout for the wait operation.
If
.I timeout
is NULL, the operation can block indefinitely.
.IP
.IP
The
.I uaddr2
argument is ignored.
@ -751,7 +751,7 @@ state of the waiter (the "wait" bit mask that is set using
.BR FUTEX_WAIT_BITSET ).
All of the waiters for which the result of the AND is nonzero are woken up;
the remaining waiters are left sleeping.
.IP
The effect of
.BR FUTEX_WAIT_BITSET
and
@ -778,7 +778,7 @@ including those that are not interested in being woken up
.\" obtain the absolute timeout functionality that is useful
.\" for efficiently implementing Pthreads APIs (which use absolute
.\" timeouts); FUTEX_WAIT provides only relative timeouts.
.IP
The constant
.BR FUTEX_BITSET_MATCH_ANY ,
which corresponds to all 32 bits set in the bit mask, can be used as the
@ -807,7 +807,7 @@ with
specified as
.BR FUTEX_BITSET_MATCH_ANY ;
that is, wake up any waiter(s).
.IP
The
.I uaddr2
and
@ -826,7 +826,7 @@ while tasks at an intermediate priority continuously preempt
the low-priority task from the CPU.
Consequently, the low-priority task makes no progress toward
releasing the lock, and the high-priority task remains blocked.
.PP
Priority inheritance is a mechanism for dealing with
the priority-inversion problem.
With this mechanism, when a high-priority task becomes blocked
@ -843,7 +843,7 @@ held by another intermediate-priority task
then both of those tasks
(or more generally, all of the tasks in a lock chain)
have their priorities raised to be the same as the high-priority task.
.PP
From a user-space perspective,
what makes a futex PI-aware is a policy agreement (described below)
between user space and the kernel about the value of the futex word,
@ -859,7 +859,7 @@ for the implementation of very specific IPC mechanisms.)
.\" talk about a PI aware pthread_mutex, than a PI aware futex, since
.\" there is a lot of policy and scaffolding that has to be built up
.\" around it to use it properly (this is what a PI pthread_mutex is).
.PP
.\" mtk: The following text is drawn from the Hart/Guniguntala paper
.\" (listed in SEE ALSO), but I have reworded some pieces
.\" significantly.
@ -880,7 +880,7 @@ If the lock is owned and there are threads contending for the lock,
then the
.B FUTEX_WAITERS
bit shall be set in the futex word's value; in other words, this value is:
.IP
FUTEX_WAITERS | TID
.IP
(Note that it is invalid for a PI futex word to have no owner and
@ -897,7 +897,7 @@ Acquiring a lock simply consists of using compare-and-swap to atomically
set the futex word's value to the caller's TID if its previous value was 0.
Releasing a lock requires using compare-and-swap to set the futex word's
value to 0 if the previous value was the expected TID.
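.PP
As an illustration of the fast path just described,
the following minimal sketch uses C11 atomics.
The helper names
.IR pi_trylock_fast ()
and
.IR pi_unlock_fast ()
are hypothetical and are not part of any library;
a caller for which either helper fails is assumed to fall back to
.B FUTEX_LOCK_PI
or
.BR FUTEX_UNLOCK_PI .
.PP
.in +4n
.nf
```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
/* Hypothetical helper: try to take an unowned PI lock in user space.
   Succeeds only if the futex word was 0; on success the word holds tid. */
static bool pi_trylock_fast(_Atomic uint32_t *word, uint32_t tid)
{
    uint32_t expected = 0;
    return atomic_compare_exchange_strong(word, &expected, tid);
}
/* Hypothetical helper: release an uncontended PI lock in user space.
   Fails if the word is not exactly tid (for example, because the
   FUTEX_WAITERS bit is set), in which case the kernel must be asked. */
static bool pi_unlock_fast(_Atomic uint32_t *word, uint32_t tid)
{
    uint32_t expected = tid;
    return atomic_compare_exchange_strong(word, &expected, 0);
}
```
.fi
.in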
.PP
If a futex is already acquired (i.e., has a nonzero value),
waiters must employ the
.B FUTEX_LOCK_PI
@ -908,7 +908,7 @@ bit is set in the futex value;
in this case, the lock owner must employ the
.B FUTEX_UNLOCK_PI
operation to release the lock.
.PP
In the cases where callers are forced into the kernel
(i.e., required to perform a
.BR futex ()
@ -918,7 +918,7 @@ a kernel locking mechanism which implements the required
priority-inheritance semantics.
After the RT-mutex is acquired, the futex value is updated accordingly,
before the calling thread returns to user space.
.PP
It is important to note
.\" tglx (July 2015):
.\" If there are multiple waiters on a pi futex then a wake pi operation
@ -935,7 +935,7 @@ up in an invalid state, such as having an owner but the value being 0,
or having waiters but not having the
.B FUTEX_WAITERS
bit set.)
.PP
If a futex has an associated RT-mutex in the kernel
(i.e., there are blocked waiters)
and the owner of the futex/RT-mutex dies unexpectedly,
@ -954,7 +954,7 @@ the dead owner.
.\" mechanism. In that case the futex value will be set to
.\" FUTEX_OWNER_DIED. The robust futex mechanism is also available for non
.\" PI futexes.
.PP
PI futexes are operated on by specifying one of the values listed below in
.IR futex_op .
Note that the PI futex operations must be used as paired operations
@ -996,7 +996,7 @@ This operation is used after an attempt to acquire
the lock via an atomic user-mode instruction failed
because the futex word has a nonzero value\(emspecifically,
because it contained the (PID-namespace-specific) TID of the lock owner.
.IP
The operation checks the value of the futex word at the address
.IR uaddr .
If the value is 0, then the kernel tries to atomically set
@ -1079,7 +1079,7 @@ This inheritance follows the lock chain in the case of nested locking
.\" (i.e., task 1 blocks on lock A, held by task 2,
.\" while task 2 blocks on lock B, held by task 3)
and performs deadlock detection.
.IP
The
.I timeout
argument provides a timeout for the lock attempt.
@ -1108,7 +1108,7 @@ clock.
If
.I timeout
is NULL, the operation will block indefinitely.
.IP
The
.IR uaddr2 ,
.IR val ,
@ -1125,7 +1125,7 @@ This operation tries to acquire the lock at
.IR uaddr .
It is invoked when a user-space atomic acquire did not
succeed because the futex word was not 0.
.IP
Because the kernel has access to more state information than user space,
acquisition of the lock might succeed if performed by the
kernel in cases where the futex word
@ -1149,7 +1149,7 @@ but the kernel can fix this up and acquire the futex.
.\" Darren Hart (Oct 2015):
.\" The trylock in the kernel has more state, so it can independently
.\" verify the flags that userspace must trust implicitly.
.IP
The
.IR uaddr2 ,
.IR val ,
@ -1168,11 +1168,11 @@ This operation wakes the top priority waiter that is waiting in
on the futex address provided by the
.I uaddr
argument.
.IP
This is called when the user-space value at
.I uaddr
cannot be changed atomically from a TID (of the owner) to 0.
.IP
The
.IR uaddr2 ,
.IR val ,
@ -1196,7 +1196,7 @@ from a non-PI source futex
.RI ( uaddr )
to a PI target futex
.RI ( uaddr2 ).
.IP
As with
.BR FUTEX_CMP_REQUEUE ,
this operation wakes up a maximum of
@ -1212,7 +1212,7 @@ The remaining waiters are removed from the wait queue of the source futex at
.I uaddr
and added to the wait queue of the target futex at
.IR uaddr2 .
.IP
The
.I val2
.\" val2 is the cap on the number of requeued waiters.
@ -1246,7 +1246,7 @@ The wait operation on
.I uaddr
is the same as for
.BR FUTEX_WAIT .
.IP
The waiter can be removed from the wait on
.I uaddr
without requeueing on
@ -1258,7 +1258,7 @@ In this case, the
.BR FUTEX_WAIT_REQUEUE_PI
operation fails with the error
.BR EAGAIN .
.IP
If
.I timeout
is not NULL, the structure it points to specifies
@ -1266,11 +1266,11 @@ an absolute timeout for the wait operation.
If
.I timeout
is NULL, the operation can block indefinitely.
.IP
The
.I val3
argument is ignored.
.IP
The
.BR FUTEX_WAIT_REQUEUE_PI
and
@ -1323,7 +1323,7 @@ was invoked via
all operations return \-1 and set
.I errno
to indicate the cause of the error.
.PP
The return value on success depends on the operation,
as described in the following list:
.TP
@ -1414,7 +1414,7 @@ The value pointed to by
was not equal to the expected value
.I val
at the time of the call.
.IP
.BR Note :
on Linux, the symbolic names
.B EAGAIN
@ -1688,7 +1688,7 @@ and the timeout expired before the operation completed.
.PP
Futexes were first made available in a stable kernel release
with Linux 2.6.0.
.PP
Initial futex support was merged in Linux 2.5.7 but with different
semantics from what was described above.
A four-argument system call with the semantics
@ -1700,7 +1700,7 @@ This system call is Linux-specific.
.SH NOTES
Glibc does not provide a wrapper for this system call; call it using
.BR syscall (2).
.PP
Several higher-level programming abstractions are implemented via futexes,
including POSIX semaphores and
various POSIX threads synchronization mechanisms
@ -1722,7 +1722,7 @@ The two processes each write
messages to the terminal and employ a synchronization protocol
that ensures that they alternate in writing messages.
Upon running this program we see output such as the following:
.PP
.in +4n
.nf
$ \fB./futex_demo\fP
@ -1912,17 +1912,17 @@ Franke, H., Russell, R., and Kirkwood, M., 2002.
.br
.UR http://kernel.org\:/doc\:/ols\:/2002\:/ols2002\-pages\-479\-495.pdf
.UE
.PP
Hart, D., 2009. \fIA futex overview and update\fP,
.UR http://lwn.net/Articles/360699/
.UE
.PP
Hart, D. and Guniguntala, D., 2009.
\fIRequeue-PI: Making Glibc Condvars PI-Aware\fP
(from proceedings of the 2009 Real-Time Linux Workshop),
.UR http://lwn.net/images/conf/rtlws11/papers/proc/p10.pdf
.UE
.PP
Drepper, U., 2011. \fIFutexes Are Tricky\fP,
.UR http://www.akkadia.org/drepper/futex.pdf
.UE

@ -47,13 +47,13 @@ This system call is obsolete.
Use
.BR utimensat (2)
instead.
.PP
The
.BR futimesat ()
system call operates in exactly the same way as
.BR utimes (2),
except for the differences described in this manual page.
.PP
If the pathname given in
.I pathname
is relative, then it is interpreted relative to the directory
@ -63,7 +63,7 @@ referred to by the file descriptor
the calling process, as is done by
.BR utimes (2)
for a relative pathname).
.PP
If
.I pathname
is relative and
@ -75,7 +75,7 @@ then
is interpreted relative to the current working
directory of the calling process (like
.BR utimes (2)).
.PP
If
.I pathname
is absolute, then
@ -114,7 +114,7 @@ This system call is nonstandard.
It was implemented from a specification that was proposed for POSIX.1,
but that specification was replaced by the one for
.BR utimensat (2).
.PP
A similar system call exists on Solaris.
.SH NOTES
.SS Glibc notes

@ -22,7 +22,7 @@ No declaration of this system call is provided in glibc headers; see NOTES.
.SH DESCRIPTION
.BR Note :
This system call is present only in kernels before Linux 2.6.
.PP
If
.I table
is NULL,

@ -42,12 +42,12 @@ Link with \fI\-lnuma\fP.
retrieves the NUMA policy of the calling thread or of a memory address,
depending on the setting of
.IR flags .
.PP
A NUMA machine has different
memory controllers with different distances to specific CPUs.
The memory policy defines from which node memory is allocated for
the thread.
.PP
If
.I flags
is specified as 0,
@ -69,7 +69,7 @@ When
is 0,
.I addr
must be specified as NULL.
.PP
If
.I flags
specifies
@ -91,7 +91,7 @@ with either
.B MPOL_F_ADDR
or
.BR MPOL_F_NODE .
.PP
If
.I flags
specifies
@ -105,7 +105,7 @@ or one of the helper functions described in
.BR numa (3)
has been used to establish a policy for the memory range containing
.IR addr .
.PP
If the
.I mode
argument is not NULL, then
@ -126,7 +126,7 @@ The value specified by
.I maxnode
is always rounded to a multiple of
.IR "sizeof(unsigned\ long)*8" .
.PP
If
.I flags
specifies both
@ -143,7 +143,7 @@ If no page has yet been allocated for the specified address,
will allocate a page as if the thread had performed a read
(load) access to that address, and return the ID of the node
where that page was allocated.
.PP
If
.I flags
specifies
@ -168,9 +168,9 @@ call with the
flag for read accesses, and in memory ranges mapped with the
.B MAP_SHARED
flag for all accesses.
.PP
Other flag values are reserved.
.PP
For an overview of the possible policies see
.BR set_mempolicy (2).
.SH RETURN VALUE

@ -47,7 +47,7 @@ The robust futex implementation needs to maintain per-thread lists of
the robust futexes which are to be unlocked when the thread exits.
These lists are managed in user space; the kernel is notified about only
the location of the head of the list.
.PP
The
.BR get_robust_list ()
system call returns the head of the robust futex list of the thread
@ -63,14 +63,14 @@ The size of the object pointed to by
.I **head_ptr
is stored in
.IR len_ptr .
.PP
Permission to employ
.BR get_robust_list ()
is governed by a ptrace access mode
.B PTRACE_MODE_READ_REALCREDS
check; see
.BR ptrace (2).
.PP
The
.BR set_robust_list ()
system call requests the kernel to record the head of the list of
@ -126,14 +126,14 @@ These system calls are not needed by normal applications.
No support for them is provided in glibc.
In the unlikely event that you want to call them directly, use
.BR syscall (2).
.PP
A thread can have only one robust futex list;
therefore applications that wish
to use this functionality should use the robust mutexes provided by glibc.
.SH SEE ALSO
.BR futex (2)
.\" .BR pthread_mutexattr_setrobust_np (3)
.PP
.IR Documentation/robust-futexes.txt
and
.IR Documentation/robust-futex-ABI.txt

@ -39,11 +39,11 @@ When either
or
.I node
is NULL nothing is written to the respective pointer.
.PP
The third argument to this system call is nowadays unused,
and should be specified as NULL
unless portability to Linux 2.6.23 or earlier is required (see NOTES).
.PP
The information placed in
.I cpu
is guaranteed to be current only at the time of the call:
@ -79,13 +79,13 @@ The intention of
.BR getcpu ()
is to allow programs to make optimizations with per-CPU data
or for NUMA optimization.
.PP
Glibc does not provide a wrapper for this system call; call it using
.BR syscall (2);
or use
.BR sched_getcpu (3)
instead.
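.PP
A minimal sketch of the direct invocation mentioned above follows.
The helper name
.IR current_cpu_and_node ()
is hypothetical;
the third argument is passed as NULL, as recommended in DESCRIPTION.
.PP
.in +4n
.nf
```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>
/* Hypothetical helper: store the CPU and NUMA node on which the caller
   was running at the time of the call. Returns 0 on success, -1 on error. */
static int current_cpu_and_node(unsigned int *cpu, unsigned int *node)
{
    return (int) syscall(SYS_getcpu, cpu, node, NULL);
}
```
.fi
.in
.PP
The values returned may already be stale by the time the caller
inspects them, so they are suitable only as an optimization hint.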
.PP
The
.I tcache
argument is unused since Linux 2.6.24.

@ -93,7 +93,7 @@ is the size of this entire
.IR linux_dirent .
.I d_name
is a null-terminated filename.
.PP
.I d_type
is a byte at the end of the structure that indicates the file type.
It contains one of the following values (defined in
@ -157,14 +157,14 @@ In addition,
supports an explicit
.I d_type
field.
.PP
The
.BR getdents64 ()
system call is like
.BR getdents (),
except that its second argument is a pointer to a buffer containing
structures of the following type:
.PP
.nf
.in +4n
struct linux_dirent64 {
@ -213,7 +213,7 @@ structure yourself.
However, you probably want to use
.BR readdir (3)
instead.
.PP
These calls supersede
.BR readdir (2).
.SH EXAMPLE
@ -223,7 +223,7 @@ The program below demonstrates the use of
.BR getdents ().
The following output shows an example of what we see when running this
program on an ext2 directory:
.PP
.in +4n
.nf
.RB "$" " ./a.out /testfs/"

@ -57,7 +57,7 @@ Feature Test Macro Requirements for glibc (see
.SH DESCRIPTION
These functions are used to access or to change the NIS domain name of the
host system.
.PP
.BR setdomainname ()
sets the domain name to the value given in the character array
.IR name .
@ -68,7 +68,7 @@ argument specifies the number of bytes in
(Thus,
.I name
does not require a terminating null byte.)
.PP
.BR getdomainname ()
returns the null-terminated domain name in the character array
.IR name ,
@ -121,7 +121,7 @@ POSIX does not specify these calls.
Since Linux 1.0, the limit on the length of a domain name,
including the terminating null byte, is 64 bytes.
In older kernels, it was 8 bytes.
.PP
On most Linux architectures (including x86),
there is no
.BR getdomainname ()

@ -36,7 +36,7 @@ getgid, getegid \- get group identity
.SH DESCRIPTION
.BR getgid ()
returns the real group ID of the calling process.
.PP
.BR getegid ()
returns the effective group ID of the calling process.
.SH ERRORS

@ -71,7 +71,7 @@ is included in the returned list.
(Thus, an application should also call
.BR getegid (2)
and add or remove the resulting value.)
.PP
If
.I size
is zero,
@ -100,7 +100,7 @@ returns the number of supplementary group IDs.
On error, \-1 is returned, and
.I errno
is set appropriately.
.PP
On success,
.BR setgroups ()
returns 0.
@ -166,7 +166,7 @@ is defined in
The set of supplementary group IDs
is inherited from the parent process, and preserved across an
.BR execve (2).
.PP
The maximum number of supplementary group IDs can be found at run time using
.BR sysconf (3):
.nf
@ -181,7 +181,7 @@ cannot be larger than one more than this value.
Since Linux 2.6.4, the maximum number of supplementary group IDs is also
exposed via the Linux-specific read-only file,
.IR /proc/sys/kernel/ngroups_max .
.PP
The original Linux
.BR getgroups ()
system call supported only 16-bit group IDs.

@ -69,7 +69,7 @@ _BSD_SOURCE || _XOPEN_SOURCE\ >=\ 500
.SH DESCRIPTION
These system calls are used to access or to change the hostname of the
current processor.
.PP
.BR sethostname ()
sets the hostname to the value given in the character array
.IR name .
@ -80,7 +80,7 @@ argument specifies the number of bytes in
(Thus,
.I name
does not require a terminating null byte.)
.PP
.BR gethostname ()
returns the null-terminated hostname in the character array
.IR name ,
@ -167,7 +167,7 @@ set to
.BR ENAMETOOLONG ;
in this case, a terminating null byte is not included in the returned
.IR name .
.PP
Versions of glibc before 2.2
.\" At least glibc 2.0 and 2.1, older versions not checked
handle the case where the length of the

@ -29,7 +29,7 @@ and (optionally) at regular intervals after that.
When a timer expires, a signal is generated for the calling process,
and the timer is reset to the specified interval
(if the interval is nonzero).
.PP
Three types of timers\(emspecified via the
.IR which
argument\(emare provided,
@ -56,14 +56,14 @@ CPU time consumed by the process.
At each expiration, a
.B SIGPROF
signal is generated.
.IP
In conjunction with
.BR ITIMER_VIRTUAL ,
this timer can be used to profile user and system CPU time
consumed by the process.
.LP
A process has only one of each of the three types of timers.
.PP
Timer values are defined by the following structures:
.PD 0
.in +4n
@ -88,7 +88,7 @@ places the current value of the timer specified by
.IR which
in the buffer pointed to by
.IR curr_value .
.PP
The
.IR it_value
substructure is populated with the amount of time remaining until
@ -99,7 +99,7 @@ when the timer expires.
If both fields of
.IR it_value
are zero, then this timer is currently disarmed (inactive).
.PP
The
.IR it_interval
substructure is populated with the timer interval.
@ -119,7 +119,7 @@ is non-NULL,
the buffer it points to is used to return the previous value of the timer
(i.e., the same information that is returned by
.BR getitimer ()).
.PP
If either field in
.IR new_value.it_value
is nonzero,
@ -127,7 +127,7 @@ then the timer is armed to initially expire at the specified time.
If both fields in
.IR new_value.it_value
are zero, then the timer is disarmed.
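.PP
The arming and disarming rules above can be sketched as follows.
The three helper names are hypothetical, and
.B ITIMER_REAL
is chosen only for illustration.
.PP
.in +4n
.nf
```c
#include <stddef.h>
#include <sys/time.h>
/* Hypothetical helper: arm ITIMER_REAL as a one-shot timer that
   expires after secs seconds. Returns 0 on success, -1 on error. */
static int arm_real_timer(long secs)
{
    struct itimerval nv;
    nv.it_value.tv_sec = secs;      /* nonzero it_value: timer is armed */
    nv.it_value.tv_usec = 0;
    nv.it_interval.tv_sec = 0;      /* zero it_interval: no reload */
    nv.it_interval.tv_usec = 0;
    return setitimer(ITIMER_REAL, &nv, NULL);
}
/* Hypothetical helper: disarm ITIMER_REAL by zeroing it_value. */
static int disarm_real_timer(void)
{
    struct itimerval nv = { {0, 0}, {0, 0} };
    return setitimer(ITIMER_REAL, &nv, NULL);
}
/* Hypothetical helper: 1 if ITIMER_REAL is armed, 0 if disarmed,
   -1 on error. */
static int real_timer_is_armed(void)
{
    struct itimerval cur;
    if (getitimer(ITIMER_REAL, &cur) == -1)
        return -1;
    return cur.it_value.tv_sec != 0 || cur.it_value.tv_usec != 0;
}
```
.fi
.in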
.PP
The
.IR new_value.it_interval
field specifies the new interval for the timer;
@ -177,13 +177,13 @@ on the system timer resolution and on the system load; see
If the timer expires while the process is active (always true for
.BR ITIMER_VIRTUAL ),
the signal will be delivered immediately when generated.
.PP
A child created via
.BR fork (2)
does not inherit its parent's interval timers.
Interval timers are preserved across an
.BR execve (2).
.PP
POSIX.1 leaves the
interaction between
.BR setitimer ()
@ -193,16 +193,16 @@ and the three interfaces
and
.BR usleep (3)
unspecified.
.PP
The standards are silent on the meaning of the call:
.PP
setitimer(which, NULL, &old_value);
.PP
Many systems (Solaris, the BSDs, and perhaps others)
treat this as equivalent to:
.PP
getitimer(which, &old_value);
.PP
In Linux, this is treated as being equivalent to a call in which the
.I new_value
fields are zero; that is, the timer is disabled.
@ -217,7 +217,7 @@ Under very heavy loading, an
timer may expire before the signal from a previous expiration
has been delivered.
The second signal in such an event will be lost.
.PP
On Linux kernels before 2.6.16, timer values are represented in jiffies.
If a request is made to set a timer with a value whose jiffies
representation exceeds
@ -232,14 +232,14 @@ approximately 99.42 days.
Since Linux 2.6.16,
the kernel uses a different internal representation for times,
and this ceiling is removed.
.PP
On certain systems (including i386),
Linux kernels before version 2.6.12 have a bug which will produce
premature timer expirations of up to one jiffy under some circumstances.
This bug is fixed in kernel 2.6.12.
.\" 4 Jul 2005: It looks like this bug may remain in 2.4.x.
.\" http://lkml.org/lkml/2005/7/1/165
.PP
POSIX.1-2001 says that
.BR setitimer ()
should fail if a

@ -84,12 +84,12 @@ instead of
long sz = sysconf(_SC_PAGESIZE);
.fi
.in
.PP
(Most systems allow the synonym
.B _SC_PAGE_SIZE
for
.BR _SC_PAGESIZE .)
.PP
Whether
.BR getpagesize ()
is present as a Linux system call depends on the architecture.

@ -60,7 +60,7 @@ by
.IR addr .
On return it contains the actual size of the name returned (in bytes).
The name is truncated if the buffer provided is too small.
.PP
The returned address is truncated if the buffer provided is too small;
in this case,
.I addrlen
@ -107,7 +107,7 @@ For background on the
.I socklen_t
type, see
.BR accept (2).
.PP
For stream sockets, once a
.BR connect (2)
has been performed, either socket can call

@ -67,7 +67,7 @@ call.
The process attribute dealt with by these system calls is
the same attribute (also known as the "nice" value) that is dealt with by
.BR nice (2).
.PP
The value
.I which
is one of
@ -90,7 +90,7 @@ A zero value for
.I who
denotes (respectively) the calling process, the process group of the
calling process, or the real user ID of the calling process.
.PP
The
.I prio
argument is a value in the range \-20 to 19 (but see NOTES below).
@ -99,7 +99,7 @@ Attempts to set a priority outside this range
are silently clamped to the range.
The default priority is 0;
lower values give a process a higher scheduling priority.
.PP
The
.BR getpriority ()
call returns the highest priority (lowest numerical value)
@ -108,7 +108,7 @@ The
.BR setpriority ()
call sets the priorities of all of the specified processes
to the specified value.
.PP
Traditionally, only a privileged process could lower the nice value
(i.e., set a higher priority).
However, since Linux 2.6.12, an unprivileged process can decrease
@ -132,7 +132,7 @@ to clear the external variable
prior to the
call, then check it afterward to determine
if \-1 is an error or a legitimate value.
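.PP
The check described above can be sketched as follows.
The helper name
.IR get_own_nice ()
is hypothetical.
.PP
.in +4n
.nf
```c
#include <errno.h>
#include <sys/resource.h>
#include <sys/time.h>
/* Hypothetical helper: fetch the nice value of the calling process.
   Returns 0 and stores the value in *nice_out, or returns -1 on error. */
static int get_own_nice(int *nice_out)
{
    int prio;
    errno = 0;                              /* clear before the call */
    prio = getpriority(PRIO_PROCESS, 0);    /* who == 0: calling process */
    if (prio == -1 && errno != 0)
        return -1;                          /* genuine error */
    *nice_out = prio;                       /* -1 here is a legitimate value */
    return 0;
}
```
.fi
.in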
.PP
.BR setpriority ()
returns 0 on success.
On error, it returns \-1 and sets
@ -179,19 +179,19 @@ SVr4, 4.4BSD (these interfaces first appeared in 4.2BSD).
.SH NOTES
For further details on the nice value, see
.BR sched (7).
.PP
.IR Note :
the addition of the "autogroup" feature in Linux 2.6.38 means that
the nice value no longer has its traditional effect in many circumstances.
For details, see
.BR sched (7).
.PP
A child created by
.BR fork (2)
inherits its parent's nice value.
The nice value is preserved across
.BR execve (2).
.PP
The details on the condition for
.B EPERM
depend on the system.
@ -206,7 +206,7 @@ the real or effective user ID of the process \fIwho\fP.
All BSD-like systems (SunOS 4.1.3, Ultrix 4.2,
4.3BSD, FreeBSD 4.3, OpenBSD-2.5, ...) behave in the same
manner as Linux 2.6.12 and later.
.PP
Including
.I <sys/time.h>
is not required these days, but increases portability.
@ -247,6 +247,6 @@ which may be made standards conformant in the future.
.BR fork (2),
.BR capabilities (7),
.BR sched (7)
.PP
.I Documentation/scheduler/sched-nice-design.txt
in the Linux kernel source tree (since Linux 2.6.23)

@ -41,7 +41,7 @@ with up to
random bytes.
These bytes can be used to seed user-space random number generators
or for cryptographic purposes.
.PP
By default,
.BR getrandom ()
draws entropy from the
@ -52,7 +52,7 @@ device).
This behavior can be changed via the
.I flags
argument.
.PP
If the
.I urandom
source has been initialized,
@ -62,7 +62,7 @@ No such guarantees apply for larger buffer sizes.
For example, if the call is interrupted by a signal handler,
it may return a partially filled buffer, or fail with the error
.BR EINTR .
.PP
If the
.I urandom
source has not yet been initialized, then
@ -71,7 +71,7 @@ will block, unless
.B GRND_NONBLOCK
is specified in
.IR flags .
.PP
The
.I flags
argument is a bit mask that can contain zero or more of the following values
@ -174,7 +174,7 @@ This system call is Linux-specific.
For an overview and comparison of the various interfaces that
can be used to obtain randomness, see
.BR random (7).
.PP
Unlike
.IR /dev/random
and
@ -232,7 +232,7 @@ will block until some random bytes become available
(unless the
.BR GRND_NONBLOCK
flag was specified).
.PP
The behavior when a call to
.BR getrandom ()
that is blocked while reading from the
@ -257,19 +257,19 @@ then
will not fail with
.BR EINTR .
Instead, it will return all of the bytes that have been requested.
.PP
When reading from the
.IR random
source, blocking requests of any size can be interrupted by a signal handler
(the call fails with the error
.BR EINTR ).
.PP
Using
.BR getrandom ()
to read small buffers (<=\ 256 bytes) from the
.I urandom
source is the preferred mode of usage.
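.PP
A minimal sketch of this preferred usage follows.
The helper name
.IR fill_random_small ()
is hypothetical, and the system call is invoked via
.BR syscall (2)
on the assumption that the C library in use provides no wrapper.
.PP
.in +4n
.nf
```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>
/* Hypothetical helper: fill buf with len bytes from the urandom source
   (flags == 0). Requests larger than 256 bytes are rejected so that
   partially filled buffers need no handling here.
   Returns 0 on success, -1 on error. */
static int fill_random_small(void *buf, size_t len)
{
    long n;
    if (len > 256) {
        errno = EINVAL;
        return -1;
    }
    do {
        /* EINTR is possible only while waiting for initialization */
        n = syscall(SYS_getrandom, buf, len, 0);
    } while (n == -1 && errno == EINTR);
    return (n == (long) len) ? 0 : -1;
}
```
.fi
.in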
.PP
The special treatment of small values of
.I buflen
was designed for compatibility with

@ -59,7 +59,7 @@ One of the arguments specified an address outside the calling program's
address space.
.SH VERSIONS
These system calls appeared on Linux starting with kernel 2.1.44.
.PP
The prototypes are given by glibc since version 2.3.2,
provided
.B _GNU_SOURCE

@ -237,7 +237,7 @@ and
.BR MAP_LOCKED ;
a process can lock bytes up to this limit in each of these
two categories.
.IP
In Linux kernels before 2.6.9, this limit controlled the amount of
memory that could be locked by a privileged process.
Since Linux 2.6.9, no limits are placed on the amount of memory
@ -281,7 +281,7 @@ and the
and
.I posix_msg_tree_node
structures are kernel-internal structures.
.IP
The "overhead" addend in the formula accounts for overhead
bytes required by the implementation
and ensures that the user cannot
@ -320,7 +320,7 @@ to exceed this limit yield the error
(Historically, this limit was named
.B RLIMIT_OFILE
on BSD.)
.IP
Since Linux 4.5,
this limit also defines the maximum number of file descriptors that
an unprivileged process (one without the
@ -365,7 +365,7 @@ this process using
.BR sched_setscheduler (2)
and
.BR sched_setparam (2).
.IP
For further details on real-time scheduling policies, see
.BR sched (7).
.TP
@ -380,7 +380,7 @@ the count of its consumed CPU time is reset to zero.
The CPU time count is not reset if the process continues trying to
use the CPU but is preempted, its time slice expires, or it calls
.BR sched_yield (2).
.IP
Upon reaching the soft limit, the process is sent a
.B SIGXCPU
signal.
@ -391,10 +391,10 @@ will be generated once each second until the hard limit is reached,
at which point the process is sent a
.B SIGKILL
signal.
.IP
The intended use of this limit is to stop a runaway
real-time process from locking up the system.
.IP
For further details on real-time scheduling policies, see
.BR sched (7).
.TP
@ -419,7 +419,7 @@ Upon reaching this limit, a
signal is generated.
To handle this signal, a process must employ an alternate signal stack
.RB ( sigaltstack (2)).
.IP
Since Linux 2.6.23,
this limit also determines the amount of space used for the process's
command-line arguments and environment variables; for details, see
@ -444,14 +444,14 @@ system call combines and extends the functionality of
and
.BR getrlimit ().
It can be used to both set and get the resource limits of an arbitrary process.
.PP
The
.I resource
argument has the same meaning as for
.BR setrlimit ()
and
.BR getrlimit ().
.PP
If the
.IR new_limit
argument is not NULL, then the
@ -469,7 +469,7 @@ in the
.I rlimit
structure pointed to by
.IR old_limit .
.PP
The
.I pid
argument specifies the ID of the process on which the call is to operate.
@ -553,7 +553,7 @@ T{
.BR prlimit ()
T} Thread safety MT-Safe
.TE
.sp 1
.SH CONFORMING TO
.BR getrlimit (),
.BR setrlimit ():
@ -561,7 +561,7 @@ POSIX.1-2001, POSIX.1-2008, SVr4, 4.3BSD.
.br
.BR prlimit ():
Linux-specific.
.PP
.B RLIMIT_MEMLOCK
and
.B RLIMIT_NPROC
@ -583,12 +583,12 @@ A child process created via
inherits its parent's resource limits.
Resource limits are preserved across
.BR execve (2).
.PP
Lowering the soft limit for a resource below the process's
current consumption of that resource will succeed
(but will prevent the process from further increasing
its consumption of the resource).
.PP
One can set the resource limits of the shell using the built-in
.IR ulimit
command
@ -597,12 +597,12 @@ in
.BR csh (1)).
The shell's resource limits are inherited by the processes that
it creates to execute commands.
.PP
Since Linux 2.6.24, the resource limits of any process can be inspected via
.IR /proc/[pid]/limits ;
see
.BR proc (5).
.PP
Ancient systems provided a
.BR vlimit ()
function with a similar purpose to
@ -620,7 +620,7 @@ wrapper functions no longer invoke the corresponding system calls,
but instead employ
.BR prlimit (),
for the reasons described in BUGS.
.PP
The name of the glibc wrapper function is
.BR prlimit ();
the underlying system call is
@ -634,7 +634,7 @@ signals delivered when a process encountered the soft and hard
.B RLIMIT_CPU
limits were delivered one (CPU) second later than they should have been.
This was fixed in kernel 2.6.8.
.PP
In 2.6.x kernels before 2.6.17, a
.B RLIMIT_CPU
limit of 0 is wrongly treated as "no limit" (like
@ -642,12 +642,12 @@ limit of 0 is wrongly treated as "no limit" (like
Since Linux 2.6.17, setting a limit of 0 does have an effect,
but is actually treated as a limit of 1 second.
.\" see http://marc.theaimsgroup.com/?l=linux-kernel&m=114008066530167&w=2
.PP
A kernel bug means that
.\" See https://lwn.net/Articles/145008/
.B RLIMIT_RTPRIO
does not work in kernel 2.6.12; the problem is fixed in kernel 2.6.13.
.PP
In kernel 2.6.12, there was an off-by-one mismatch
between the priority ranges returned by
.BR getpriority (2)
@ -658,7 +658,7 @@ was calculated as
.IR "19\ \-\ rlim_cur" .
This was fixed in kernel 2.6.13.
.\" see http://marc.theaimsgroup.com/?l=linux-kernel&m=112256338703880&w=2
.PP
Since Linux 2.6.12,
.\" The relevant patch, sent to LKML, seems to be
.\" http://thread.gmane.org/gmane.linux.kernel/273462
@ -685,7 +685,7 @@ portable applications should avoid relying on this Linux-specific behavior.
The Linux-specific
.BR RLIMIT_RTTIME
limit exhibits the same behavior when the soft limit is encountered.
.PP
Kernels before 2.4.22 did not diagnose the error
.B EINVAL
for
@ -726,7 +726,7 @@ represent file offsets\(emthat is, as wide as a 64-bit
.BR off_t
(assuming a program compiled with
.IR _FILE_OFFSET_BITS=64 ).
.PP
To work around this kernel limitation,
if a program tried to set a resource limit to a value larger than
can be represented in a 32-bit
@ -736,7 +736,7 @@ then the glibc
wrapper function silently converted the limit value to
.BR RLIM_INFINITY .
In other words, the requested resource limit setting was silently ignored.
.PP
This problem was addressed in Linux 2.6.36 with two principal changes:
.IP * 3
the addition of a new kernel representation of resource limits that

@ -211,7 +211,7 @@ T{
.BR getrusage ()
T} Thread safety MT-Safe
.TE
.sp 1
.SH CONFORMING TO
POSIX.1-2001, POSIX.1-2008, SVr4, 4.3BSD.
POSIX.1 specifies
@ -220,13 +220,13 @@ but specifies only the fields
.I ru_utime
and
.IR ru_stime .
.PP
.B RUSAGE_THREAD
is Linux-specific.
.SH NOTES
Resource usage metrics are preserved across an
.BR execve (2).
.PP
Including
.I <sys/time.h>
is not required these days, but increases portability.
@ -249,7 +249,7 @@ This nonconformance is rectified in Linux 2.6.9 and later.
.LP
The structure definition shown at the start of this page
was taken from 4.3BSD Reno.
.PP
Ancient systems provided a
.BR vtimes ()
function with a similar purpose to
@ -258,7 +258,7 @@ For backward compatibility, glibc also provides
.BR vtimes ().
All new applications should be written using
.BR getrusage ().
.PP
See also the description of
.IR /proc/[pid]/stat
in

@ -85,7 +85,7 @@ POSIX.1-2001, POSIX.1-2008, SVr4.
.SH NOTES
Linux does not return
.BR EPERM .
.PP
See
.BR credentials (7)
for a description of sessions and session IDs.

@ -59,7 +59,7 @@ argument should be initialized to indicate
the amount of space (in bytes) pointed to by
.IR addr .
On return it contains the actual size of the socket address.
.PP
The returned address is truncated if the buffer provided is too small;
in this case,
.I addrlen

@ -64,7 +64,7 @@ manipulate options for the socket referred to by the file descriptor
Options may exist at multiple
protocol levels; they are always present at the uppermost
socket level.
.PP
When manipulating socket options, the level at which the
option resides and the name of the option must be specified.
To manipulate options at the sockets API level,
@ -83,7 +83,7 @@ should be set to the protocol number of
.BR TCP ;
see
.BR getprotoent (3).
.PP
The arguments
.I optval
and
@ -105,7 +105,7 @@ the value returned.
If no option value is to be supplied or returned,
.I optval
may be NULL.
.PP
.I Optname
and any specified options are passed uninterpreted to the appropriate
protocol module for interpretation.
@ -115,7 +115,7 @@ contains definitions for socket level options, described below.
Options at
other protocol levels vary in format and name; consult the appropriate
entries in section 4 of the manual.
.PP
Most socket-level options utilize an
.I int
argument for
@ -133,7 +133,7 @@ On success, zero is returned for the standard options.
On error, \-1 is returned, and
.I errno
is set appropriately.
.PP
Netfilter allows the programmer
to define custom socket options with associated handlers; for such
options, the return value on success is the value returned by the handler.
@ -185,7 +185,7 @@ POSIX.1 does not require the inclusion of
and this header file is not required on Linux.
However, some historical (BSD) implementations required this header
file, and portable applications are probably wise to include it.
.PP
For background on the
.I socklen_t
type, see

@ -64,11 +64,11 @@ Glibc does not provide a wrapper for this system call; call it using
.BR syscall (2).
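.PP
A minimal sketch of such a call follows.
The wrapper name
.IR my_gettid ()
is hypothetical, chosen so as not to clash with any
.IR gettid ()
declaration that a C library might provide.
.PP
.in +4n
.nf
```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>
/* Hypothetical wrapper: return the kernel thread ID of the caller. */
static pid_t my_gettid(void)
{
    return (pid_t) syscall(SYS_gettid);
}
```
.fi
.in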
.\" FIXME . See http://sourceware.org/bugzilla/show_bug.cgi?id=6399
.\" "gettid() should have a wrapper"
.PP
The thread ID returned by this call is not the same thing as a
POSIX thread ID (i.e., the opaque value returned by
.BR pthread_self (3)).
.PP
In a new thread group created by a
.BR clone (2)
call that does not specify the

@ -119,7 +119,7 @@ structure is obsolete; the
.I tz
argument should normally be specified as NULL.
(See NOTES below.)
.PP
Under Linux, there are some peculiar "warp clock" semantics associated
with the
.BR settimeofday ()
@ -182,12 +182,12 @@ affected by discontinuous jumps in the system time
(e.g., if the system administrator manually changes the system time).
If you need a monotonically increasing clock, see
.BR clock_gettime (2).
.PP
Macros for operating on
.I timeval
structures are described in
.BR timeradd (3).
.PP
Traditionally, the fields of
.I struct timeval
were of type
@ -218,7 +218,7 @@ or
.\" Each and every occurrence of this field in the kernel source
.\" (other than the declaration) is a bug.
Thus, the following is purely of historical interest.
.PP
On old systems, the field
.I tz_dsttime
contains a symbolic constant (values are given below)

@ -37,7 +37,7 @@ getuid, geteuid \- get user identity
.SH DESCRIPTION
.BR getuid ()
returns the real user ID of the calling process.
.PP
.BR geteuid ()
returns the effective user ID of the calling process.
.SH ERRORS
@ -54,7 +54,7 @@ UNIX\ V7 introduced separate calls
.BR getuid ()
and
.BR geteuid ().
.PP
The original Linux
.BR getuid ()
and

@ -39,7 +39,7 @@ getunwind \- copy the unwind data to caller's buffer
There is no glibc wrapper for this system call; see NOTES.
.SH DESCRIPTION
.I Note: this function is obsolete.
.PP
The
IA-64-specific
.BR getunwind ()
@ -49,7 +49,7 @@ unwind data into the buffer pointed to by
and returns the size of the unwind data;
this data describes the gate page (kernel code that
is mapped into user space).
.PP
The size of the buffer
.I buf
is specified in
@ -61,17 +61,17 @@ is greater than or equal to the size of the unwind data and
is not NULL;
otherwise, no data is copied, and the call succeeds,
returning the size that would be needed to store the unwind data.
.PP
The first part of the unwind data contains an unwind table.
The rest contains the associated unwind information, in no particular order.
The unwind table contains entries of the following form:
.PP
.nf
u64 start; (64-bit address of start of function)
u64 end; (64-bit address of end of function)
u64 info; (BUF-relative offset to unwind info)
.fi
.PP
An entry whose
.I start
value is zero indicates the end of the table.
@ -100,7 +100,7 @@ and is available only on the IA-64 architecture.
This system call has been deprecated.
The modern way to obtain the kernel's unwind data is via the
.BR vdso (7).
.PP
Glibc does not provide a wrapper for this system call;
in the unlikely event that you want to call it, use
.BR syscall (2).

@ -51,7 +51,7 @@ and then runs the module's
.I init
function.
This system call requires privilege.
.PP
The
.I module_image
argument points to a buffer containing the binary image
@ -59,7 +59,7 @@ to be loaded;
.I len
specifies the size of that buffer.
The module image should be a valid ELF image, built for the running kernel.
.PP
The
.I param_values
argument is a string containing space-delimited specifications of the
@ -70,12 +70,12 @@ and
The kernel parses this string and initializes the specified
parameters.
Each of the parameter specifications has the form:
.PP
.RI " " name [\c
.BI = value\c
.RB [ ,\c
.IR value ...]]
.PP
The parameter
.I name
is one of those defined within the module using
@ -108,7 +108,7 @@ The
.I param_values
argument is as for
.BR init_module ().
.PP
The
.I flags
argument modifies the operation of
@ -140,7 +140,7 @@ for the function named by the symbol.
In this case, the kernel version number within the
"vermagic" string is ignored,
as the symbol version hashes are assumed to be sufficiently reliable.
.PP
Using the
.B MODULE_INIT_IGNORE_VERMAGIC
flag indicates that the "vermagic" string is to be ignored, and the
@ -272,17 +272,17 @@ it is (before glibc 2.23) sufficient to
manually declare the interface in your code;
alternatively, you can invoke the system call using
.BR syscall (2).
.PP
Glibc does not provide a wrapper for
.BR finit_module ();
call it using
.BR syscall (2).
.PP
Information about currently loaded modules can be found in
.IR /proc/modules
and in the file trees under the per-module subdirectories under
.IR /sys/module .
.PP
See the Linux kernel source file
.I include/linux/module.h
for some useful background information.
@ -291,11 +291,11 @@ for some useful background information.
In Linux 2.4 and earlier, the
.BR init_module ()
system call was rather different:
.PP
.B " #include <linux/module.h>"
.PP
.BI " int init_module(const char *" name ", struct module *" image );
.PP
(User-space applications can detect which version of
.BR init_module ()
is available by calling
@ -303,7 +303,7 @@ is available by calling
the latter call fails with the error
.BR ENOSYS
on Linux 2.6 and later.)
.PP
The older version of the system call
loads the relocated module image pointed to by
.I image

@ -51,7 +51,7 @@ See
.BR inotify (7)
for a description of the bits that can be set in
.IR mask .
.PP
A successful call to
.BR inotify_add_watch ()
returns a unique watch descriptor for this inotify instance,
@ -63,7 +63,7 @@ then the watch descriptor is newly allocated.
If the filesystem object was already being watched
(perhaps via a different link to the same object), then the descriptor
for the existing watch is returned.
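.PP
The reuse of an existing watch descriptor can be observed with a sketch like the following. The helper name same_watch_descriptor is hypothetical, and the event masks are chosen arbitrarily for illustration.
```c
#include <sys/inotify.h>
#include <unistd.h>

/* Watch the same path twice on one inotify instance and report
   whether both calls yield the same watch descriptor.
   Returns 1 if equal, 0 if different, -1 on error.
   same_watch_descriptor is a hypothetical helper name. */
int same_watch_descriptor(const char *path)
{
    int fd = inotify_init();
    if (fd == -1)
        return -1;

    int wd1 = inotify_add_watch(fd, path, IN_CREATE);
    /* Second call on the same object: the existing watch is modified. */
    int wd2 = inotify_add_watch(fd, path, IN_CREATE | IN_DELETE);

    int result = (wd1 == -1 || wd2 == -1) ? -1 : (wd1 == wd2);
    close(fd);
    return result;
}
```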
.PP
The watch descriptor is returned by later
.BR read (2)s
from the inotify file descriptor.

@ -39,11 +39,11 @@ inotify_init, inotify_init1 \- initialize an inotify instance
.SH DESCRIPTION
For an overview of the inotify API, see
.BR inotify (7).
.PP
.BR inotify_init ()
initializes a new inotify instance and returns a file descriptor associated
with a new inotify event queue.
.PP
If
.I flags
is 0, then

@ -39,7 +39,7 @@ removes the watch associated with the watch descriptor
.I wd
from the inotify instance associated with the file descriptor
.IR fd .
.PP
Removing a watch causes an
.B IN_IGNORED
event to be generated for this watch descriptor.
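.PP
A sketch of how the IN_IGNORED event might be observed after removing a watch. The helper name rm_watch_reports_ignored and the 4096-byte buffer size are assumptions made for illustration; the descriptor is opened nonblocking so that the read cannot hang if no event is queued.
```c
#include <sys/inotify.h>
#include <unistd.h>

/* Add a watch on path, remove it again, and check whether an
   IN_IGNORED event for that watch descriptor can then be read.
   Returns 1 if found, 0 if not, -1 on error.
   rm_watch_reports_ignored is a hypothetical helper name. */
int rm_watch_reports_ignored(const char *path)
{
    /* Buffer aligned for struct inotify_event. */
    char buf[4096]
        __attribute__((aligned(__alignof__(struct inotify_event))));
    int found = 0;
    int fd = inotify_init1(IN_NONBLOCK);

    if (fd == -1)
        return -1;

    int wd = inotify_add_watch(fd, path, IN_CREATE);
    if (wd == -1 || inotify_rm_watch(fd, wd) == -1) {
        close(fd);
        return -1;
    }

    /* Walk every event returned by a single read(). */
    ssize_t len = read(fd, buf, sizeof(buf));
    for (char *p = buf; len > 0 && p < buf + len; ) {
        const struct inotify_event *ev = (const struct inotify_event *) p;
        if (ev->wd == wd && (ev->mask & IN_IGNORED))
            found = 1;
        p += sizeof(struct inotify_event) + ev->len;
    }

    close(fd);
    return found;
}
```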

@ -39,7 +39,7 @@ wrapper functions which perform the steps required
the system call.
Thus, making a system call looks the same as invoking a normal
library function.
.PP
In many cases, the C library wrapper function does nothing more than:
.IP * 3
copying arguments and the unique system call number to the
@ -62,7 +62,7 @@ try to note the details of both the (usually GNU) C library API
interface and the raw system call.
Most commonly, the main DESCRIPTION will focus on the C library interface,
and differences for the system call are covered in the NOTES section.
.PP
For a list of the Linux system calls, see
.BR syscalls (2).
.SH RETURN VALUE
@ -74,12 +74,12 @@ system call returns a negative value, the wrapper copies the
absolute value into the
.I errno
variable, and returns \-1 as the return value of the wrapper.
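.PP
The wrapper convention described above can be seen with any failing call. The sketch below uses close(2) on a descriptor supplied by the caller; the helper name close_errno is hypothetical.
```c
#include <errno.h>
#include <unistd.h>

/* Call close() on fd and report the errno value the wrapper stored.
   Returns 0 when the call succeeded.
   close_errno is a hypothetical helper name. */
int close_errno(int fd)
{
    if (close(fd) == -1)
        return errno;   /* wrapper returned -1 and set errno */
    return 0;
}
```
For example, passing \-1 as the descriptor is expected to yield EBADF.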
.PP
The value returned by a successful system call depends on the call.
Many system calls return 0 on success, but some can return nonzero
values from a successful call.
The details are described in the individual manual pages.
.PP
In some cases,
the programmer must define a feature test macro in order to obtain
the declaration of a system call from the header file specified

@ -70,7 +70,7 @@ But instead, you probably want to use the
wrapper function provided by
.\" http://git.fedorahosted.org/git/?p=libaio.git
.IR libaio .
.PP
Note that the
.I libaio
wrapper function uses a different type

@ -59,7 +59,7 @@ But instead, you probably want to use the
wrapper function provided by
.\" http://git.fedorahosted.org/git/?p=libaio.git
.IR libaio .
.PP
Note that the
.I libaio
wrapper function uses a different type

@ -27,10 +27,10 @@ system call
attempts to read at least \fImin_nr\fP events and
up to \fInr\fP events from the completion queue of the AIO context
specified by \fIctx_id\fP.
.PP
The \fItimeout\fP argument specifies the amount of time to wait for events,
and is specified as a relative timeout in a structure of the following form:
.PP
.in +4n
.nf
struct timespec {
@ -39,10 +39,10 @@ struct timespec {
};
.fi
.in
.PP
The specified time will be rounded up to the system clock granularity
and is guaranteed not to expire early.
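.PP
Since the timeout is relative, a caller typically fills the structure from a duration rather than from the current time. A minimal sketch, assuming the duration is given in milliseconds; the helper name timeout_from_ms is hypothetical.
```c
#include <time.h>

/* Build a relative timeout from a nonnegative millisecond count.
   timeout_from_ms is a hypothetical helper name. */
struct timespec timeout_from_ms(long ms)
{
    struct timespec ts;

    ts.tv_sec = ms / 1000;                  /* whole seconds */
    ts.tv_nsec = (ms % 1000) * 1000000L;    /* remainder in nanoseconds */
    return ts;
}
```
The address of the resulting structure could then be passed as the timeout argument.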
.PP
Specifying
.I timeout
as NULL means block indefinitely until at least
@ -60,7 +60,7 @@ expired.
It may also be a nonzero value less than
.IR min_nr ,
if the call was interrupted by a signal handler.
.PP
For the failure return, see NOTES.
.SH ERRORS
.TP
@ -96,7 +96,7 @@ But instead, you probably want to use the
wrapper function provided by
.\" http://git.fedorahosted.org/git/?p=libaio.git
.IR libaio .
.PP
Note that the
.I libaio
wrapper function uses a different type

@ -72,7 +72,7 @@ But instead, you probably want to use the
wrapper function provided by
.\" http://git.fedorahosted.org/git/?p=libaio.git
.IR libaio .
.PP
Note that the
.I libaio
wrapper function uses a different type

@ -74,7 +74,7 @@ But instead, you probably want to use the
wrapper function provided by
.\" http://git.fedorahosted.org/git/?p=libaio.git
.IR libaio .
.PP
Note that the
.I libaio
wrapper function uses a different type

@ -142,7 +142,7 @@ repeatedly.
The
.I entry
argument is a two-element array of the following structures:
.PP
.in +4n
.nf
struct __fat_dirent {
@ -229,14 +229,14 @@ For further error values, see
and
.B VFAT_IOCTL_READDIR_SHORT
first appeared in Linux 2.0.
.PP
.BR FAT_IOCTL_GET_ATTRIBUTES
and
.BR FAT_IOCTL_SET_ATTRIBUTES
first appeared
.\" just before we got Git history
in Linux 2.6.12.
.PP
.B FAT_IOCTL_GET_VOLUME_ID
was introduced in version 3.11
.\" commit 6e5b93ee55d401f1619092fb675b57c28c9ed7ec
@ -254,7 +254,7 @@ the program reads and displays the attribute again.
.PP
The following was recorded when applying the program for the file
.IR /mnt/user/foo :
.PP
.in +4n
.nf
# ./toggle_fat_archive_flag /mnt/user/foo
@ -355,7 +355,7 @@ to display the volume ID of a FAT filesystem.
The following output was recorded when applying the program for
directory
.IR /mnt/user :
.PP
.in +4n
.nf
$ ./display_fat_volume_id /mnt/user
@ -418,7 +418,7 @@ to list a directory.
.PP
The following was recorded when applying the program to the directory
.IR /mnt/user :
.PP
.in +4n
.nf
$ ./fat_dir /mnt/user

@ -47,7 +47,7 @@ If a file write should occur to a shared region,
the filesystem must ensure that the changes remain private to the file being
written.
This behavior is commonly referred to as "copy on write".
.PP
This ioctl reflinks up to
.IR src_length
bytes from file descriptor
@ -78,7 +78,7 @@ struct file_clone_range {
.in
Clones are atomic with regard to concurrent writes, so no locks need to be
taken to obtain a consistent cloned copy.
.PP
The
.B FICLONE
ioctl clones entire files.

@ -47,7 +47,7 @@ If a file write should occur to a shared
region, the filesystem must ensure that the changes remain private to the file
being written.
This behavior is commonly referred to as "copy on write".
.PP
This ioctl performs the "compare and share if identical" operation on up to
.IR src_length
bytes from file descriptor
@ -68,20 +68,20 @@ struct file_dedupe_range {
};
.fi
.in
.PP
Deduplication is atomic with regard to concurrent writes, so no locks need to
be taken to obtain a consistent deduplicated copy.
.PP
The fields
.IR reserved1 " and " reserved2
must be zero.
.PP
Destinations for the deduplication operation are conveyed in the array at the
end of the structure.
The number of destinations is given in
.IR dest_count ,
and the destination information is conveyed in the following form:
.PP
.in +4n
.nf
struct file_dedupe_range_info {
@ -94,7 +94,7 @@ struct file_dedupe_range_info {
.fi
.in
.PP
Each deduplication operation targets
.IR src_length
bytes in file descriptor
@ -125,7 +125,7 @@ is mapped into
and the previous contents in
.IR dest_fd
are freed.
.PP
Upon successful completion of this ioctl, the number of bytes successfully
deduplicated is returned in
.IR bytes_deduped
@ -143,7 +143,7 @@ code is set to
for success, a negative error code in case of error, or
.B FILE_DEDUPE_RANGE_DIFFERS
if the data did not match.
.PP
.SH RETURN VALUE
On error, \-1 is returned, and
.I errno
@ -208,7 +208,7 @@ Because a copy-on-write operation requires the allocation of new storage, the
.BR fallocate (2)
operation may unshare shared blocks to guarantee that subsequent writes will
not fail because of lack of disk space.
.PP
Some filesystems may limit the amount of data that can be deduplicated in a
single call.
.SH SEE ALSO

@ -98,7 +98,7 @@ Window sizes are kept in the kernel, but not used by the kernel
(except in the case of virtual consoles, where the kernel will
update the window size when the size of the virtual console changes,
for example, by loading a new font).
.PP
The following constants and structure are defined in
.IR <sys/ioctl.h> .
.TP
@ -109,7 +109,7 @@ Get window size.
Set window size.
.LP
The struct used by these ioctls is defined as
.PP
.in +4n
.nf
struct winsize {
@ -120,7 +120,7 @@ struct winsize {
};
.fi
.in
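.PP
A minimal sketch of querying the window size with TIOCGWINSZ. The helper name get_window_size is hypothetical; only the ws_row and ws_col fields of the structure are used here.
```c
#include <sys/ioctl.h>

/* Fetch the window size of the terminal referred to by fd.
   Returns 0 and fills *rows and *cols on success; returns -1 on
   error (for example, when fd does not refer to a terminal).
   get_window_size is a hypothetical helper name. */
int get_window_size(int fd, unsigned short *rows, unsigned short *cols)
{
    struct winsize ws;

    if (ioctl(fd, TIOCGWINSZ, &ws) == -1)
        return -1;
    *rows = ws.ws_row;
    *cols = ws.ws_col;
    return 0;
}
```
A program would usually call this on standard output at startup and again on receipt of SIGWINCH.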
.PP
When the window size changes, a
.B SIGWINCH
signal is sent to the
@ -141,7 +141,7 @@ returns without doing anything.
When
.I arg
is nonzero, nobody knows what will happen.
.IP
(SVr4, UnixWare, Solaris, Linux treat
.I "tcsendbreak(fd,arg)"
with nonzero
@ -244,7 +244,7 @@ controlling terminal already.
For this case,
.I arg
should be specified as zero.
.IP
If this terminal is already the controlling terminal
of a different session group, then the ioctl fails with
.BR EPERM ,
@ -334,7 +334,7 @@ If the first byte is not
.B TIOCPKT_DATA
(0), it is an OR of one
or more of the following bits:
.IP
.nf
TIOCPKT_FLUSHREAD The read queue for the terminal is flushed.
TIOCPKT_FLUSHWRITE The write queue for the terminal is flushed.
@ -343,7 +343,7 @@ TIOCPKT_START Output to the terminal is restarted.
TIOCPKT_DOSTOP The start and stop characters are \fB^S\fP/\fB^Q\fP.
TIOCPKT_NOSTOP The start and stop characters are not \fB^S\fP/\fB^Q\fP.
.fi
.IP
While this mode is in use, the presence
of control status information to be read
from the master side may be detected by a
@ -353,7 +353,7 @@ for exceptional conditions or a
for the
.I POLLPRI
event.
.IP
This mode is used by
.BR rlogin (1)
and
@ -395,7 +395,7 @@ pseudoterminal slave device.
This operation can be performed
regardless of whether the pathname of the slave device
is accessible through the calling process's mount namespaces.
.IP
Security-conscious programs interacting with namespaces may wish to use this
operation rather than
.BR open (2)
@ -424,7 +424,7 @@ Clear the indicated modem bits.
Set the indicated modem bits.
.LP
The following bits are used by the above ioctls:
.PP
.nf
TIOCM_LE DSR (data set ready/line enable)
TIOCM_DTR DTR (data terminal ready)
@ -459,7 +459,7 @@ The counts are written to the
.I serial_icounter_struct
structure pointed to by
.IR argp .
.IP
Note: both 1->0 and 0->1 transitions are counted, except for
RI, where only 0->1 transitions are counted.
.SS Marking a line as local
@ -547,7 +547,7 @@ Inappropriate
Insufficient permission.
.SH EXAMPLE
Check the condition of DTR on the serial port.
.PP
.nf
#include <termios.h>
#include <fcntl.h>

@ -55,7 +55,7 @@ is one of the commands listed below, and
.I argp
is a pointer to a data structure that is specific to
.IR cmd .
.PP
The various
.BR ioctl (2)
operations are described below.
@ -78,7 +78,7 @@ events.
.SS UFFDIO_API
(Since Linux 4.3.)
Enable operation of the userfaultfd and perform API handshake.
.PP
The
.I argp
argument is a pointer to a
@ -98,7 +98,7 @@ struct uffdio_api {
The
.I api
field denotes the API version requested by the application.
.PP
The kernel verifies that it can support the requested API version,
and sets the
.I features
@ -107,7 +107,7 @@ and
fields to bit masks representing all the available features and the generic
.BR ioctl (2)
operations available.
.PP
For Linux kernel versions before 4.11, the
.I features
field must be initialized to zero before the call to
@ -116,7 +116,7 @@ and zero (i.e., no feature bits) is placed in the
.I features
field by the kernel upon return from
.BR ioctl (2).
.PP
Starting from Linux 4.11, the
.I features
field can be used to ask whether particular features are supported
@ -124,7 +124,7 @@ and explicitly enable userfaultfd features that are disabled by default.
The kernel always reports all the available features in the
.I features
field.
.PP
To enable userfaultfd features the application should set
a bit corresponding to each feature it wants to enable in the
.I features
@ -135,7 +135,7 @@ Otherwise it will zero out the returned
structure and return
.BR EINVAL .
.\" FIXME add more details about feature negotiation and enablement
.PP
Since Linux 4.11, the following feature bits may be set:
.TP
.B UFFD_FEATURE_EVENT_FORK
@ -196,7 +196,7 @@ with the
flag set,
.BR memfd_create (2),
and so on.
.IP
The returned
.I ioctls
field can contain the following bits:
@ -255,15 +255,15 @@ by the current kernel version.
(Since Linux 4.3.)
Register a memory address range with the userfaultfd object.
The pages in the range must be "compatible".
.PP
Up to Linux kernel 4.11,
only private anonymous ranges are compatible for registering with
.BR UFFDIO_REGISTER .
.PP
Since Linux 4.11,
hugetlbfs and shared memory ranges are also compatible with
.BR UFFDIO_REGISTER .
.PP
The
.I argp
argument is a pointer to a
@ -285,7 +285,7 @@ struct uffdio_register {
.fi
.in
.PP
The
.I range
field defines a memory range starting at
@ -293,7 +293,7 @@ field defines a memory range starting at
and continuing for
.I len
bytes that should be handled by the userfaultfd.
.PP
The
.I mode
field defines the mode of operation desired for this memory region.
@ -316,7 +316,7 @@ bit-mask field to indicate which
operations are available for the specified range.
This returned bit mask is as for
.BR UFFDIO_API .
.PP
This
.BR ioctl (2)
operation returns 0 on success.
@ -364,12 +364,12 @@ There as an incompatible mapping in the specified address range.
Unregister a memory address range from userfaultfd.
The pages in the range must be "compatible" (see the description of
.BR UFFDIO_REGISTER ).
.PP
The address range to unregister is specified in the
.IR uffdio_range
structure pointed to by
.IR argp .
.PP
This
.BR ioctl (2)
operation returns 0 on success.
@ -406,7 +406,7 @@ fields of the
.I uffdio_copy
structure pointed to by
.IR argp :
.PP
.in +4n
.nf
struct uffdio_copy {
@ -448,7 +448,7 @@ field is output-only;
it is not read by the
.B UFFDIO_COPY
operation.
.PP
This
.BR ioctl (2)
operation returns 0 on success.
@ -505,14 +505,14 @@ operation.
.SS UFFDIO_ZEROPAGE
(Since Linux 4.3.)
Zero out a memory range registered with userfaultfd.
.PP
The requested range is specified by the
.I range
field of the
.I uffdio_zeropage
structure pointed to by
.IR argp :
.PP
.in +4n
.nf
struct uffdio_zeropage {
@ -552,7 +552,7 @@ field is output-only;
it is not read by the
.B UFFDIO_ZEROPAGE
operation.
.PP
This
.BR ioctl (2)
operation returns 0 on success.
@ -593,7 +593,7 @@ operation.
(Since Linux 4.3.)
Wake up the thread waiting for page-fault resolution on
a specified memory address range.
.PP
The
.B UFFDIO_WAKE
operation is used in conjunction with
@ -613,13 +613,13 @@ and
.BR UFFDIO_ZEROPAGE
operations in a batch and then explicitly wake up the faulting thread using
.BR UFFDIO_WAKE .
.PP
The
.I argp
argument is a pointer to a
.I uffdio_range
structure (shown above) that specifies the address range.
.PP
This
.BR ioctl (2)
operation returns 0 on success.
@ -675,6 +675,6 @@ operation that actually enables the desired features.
.BR ioctl (2),
.BR mmap (2),
.BR userfaultfd (2)
.PP
.IR Documentation/vm/userfaultfd.txt
in the Linux kernel source tree

@ -53,7 +53,7 @@ If
.I turn_on
is nonzero, the calling thread must be privileged
.RB ( CAP_SYS_RAWIO ).
.PP
Before Linux 2.6.8,
only the first 0x3ff I/O ports could be specified in this manner.
For more ports, the
@ -62,7 +62,7 @@ system call had to be used (with a
.I level
argument of 3).
Since Linux 2.6.8, 65,536 I/O ports can be specified.
.PP
Permissions are inherited by the child created by
.BR fork (2)
(but see NOTES).
@ -70,7 +70,7 @@ Permissions are preserved across
.BR execve (2);
this is useful for giving port access permissions to unprivileged
programs.
.PP
This call is mostly for the i386 architecture.
On many other architectures it does not exist or will always
return an error.
@ -104,11 +104,11 @@ intended to be portable.
The
.I /proc/ioports
file shows the I/O ports that are currently allocated on the system.
.PP
Before Linux 2.4,
permissions were not inherited by a child created by
.BR fork (2).
.PP
Glibc has an
.BR ioperm ()
prototype both in

@ -42,25 +42,25 @@ iopl \- change I/O privilege level
changes the I/O privilege level of the calling process,
as specified by the two least significant bits in
.IR level .
.PP
This call is necessary to allow 8514-compatible X servers to run under
Linux.
Since these X servers require access to all 65536 I/O ports, the
.BR ioperm (2)
call is not sufficient.
.PP
In addition to granting unrestricted I/O port access, running at a higher
I/O privilege level also allows the process to disable interrupts.
This will probably crash the system, and is not recommended.
.PP
Permissions are not inherited by the child process created by
.BR fork (2)
and are not preserved across
.BR execve (2)
(but see NOTES).
.PP
The I/O privilege level for a normal process is 0.
.PP
This call is mostly for the i386 architecture.
On many other architectures it does not exist or will always
return an error.
@ -98,7 +98,7 @@ Glibc2 has a prototype both in
and in
.IR <sys/perm.h> .
Avoid the latter; it is available on i386 only.
.PP
Prior to Linux 3.7,
on some architectures (such as i386), permissions
.I were

@ -39,7 +39,7 @@ and
.BR ioprio_set ()
system calls respectively get and set the I/O scheduling class and
priority of one or more threads.
.PP
The
.I which
and
@ -95,7 +95,7 @@ is the lowest)
or if it belongs to the same priority class as the other process but
has a higher priority level (a lower priority number means a
higher priority level).
.PP
The
.I ioprio
argument given to
@ -141,7 +141,7 @@ information on scheduling classes and priorities,
as well as the meaning of specifying
.I ioprio
as 0.
.PP
I/O priorities are supported for reads and for synchronous
.RB ( O_DIRECT ,
.BR O_SYNC )
@ -201,7 +201,7 @@ These system calls are Linux-specific.
.SH NOTES
Glibc does not provide a wrapper for these system calls; call them using
.BR syscall (2).
.PP
Two or more processes or threads can share an I/O context.
This will be the case when
.BR clone (2)
@ -220,12 +220,12 @@ is the one that is returned by
.BR gettid (2)
or
.BR clone (2).
.PP
These system calls have an effect only when used
in conjunction with an I/O scheduler that supports I/O priorities.
As of kernel 2.6.17, the only such scheduler is the Completely Fair Queuing
(CFQ) I/O scheduler.
.PP
If no I/O scheduler has been set for a thread,
then by default the I/O priority will follow the CPU nice value
.RB ( setpriority (2)).
@ -242,7 +242,7 @@ as 0 can be used to reset to the default I/O scheduling behavior.
I/O schedulers are selected on a per-device basis via the special
file
.IR /sys/block/<device>/queue/scheduler .
.PP
One can view the current I/O scheduler via the
.I /sys
filesystem.
@ -365,6 +365,6 @@ Suitable definitions can be found in
.BR open (2),
.BR capabilities (7),
.BR cgroups (7)
.PP
.I Documentation/block/ioprio.txt
in the Linux kernel source tree

@ -47,7 +47,7 @@ and
.I pid2
share a kernel resource such as virtual memory, file descriptors,
and so on.
.PP
Permission to employ
.BR kcmp ()
is governed by ptrace access mode
@ -58,7 +58,7 @@ and
.IR pid2 ;
see
.BR ptrace (2).
.PP
The
.I type
argument specifies which resource is to be compared in the two processes.
@ -210,7 +210,7 @@ The return value of a successful call to
is simply the result of arithmetic comparison
of kernel pointers (when the kernel compares resources, it uses their
memory addresses).
.PP
The easiest way to explain is to consider an example.
Suppose that
.I v1
@ -242,7 +242,7 @@ but ordering information is unavailable.
On error, \-1 is returned, and
.I errno
is set appropriately.
.PP
.BR kcmp ()
was designed to return values suitable for sorting.
This is particularly handy if one needs to compare
@ -304,7 +304,7 @@ is Linux-specific and should not be used in programs intended to be portable.
.SH NOTES
Glibc does not provide a wrapper for this system call; call it using
.BR syscall (2).
.PP
This system call is available only if the kernel was configured with
.BR CONFIG_CHECKPOINT_RESTORE .
The main use of the system call is for the
@ -313,7 +313,7 @@ The alternative to this system call would have been to expose suitable
process information via the
.BR proc (5)
filesystem; this was deemed to be unsuitable for security reasons.
.PP
See
.BR clone (2)
for some background information on the shared resources
@ -326,7 +326,7 @@ the same open file description.
The program tests different cases for the file descriptor pairs,
as described in the program output.
An example run of the program is as follows:
.PP
.nf
.in +4n
$ \fB./a.out\fP

@ -102,7 +102,7 @@ or one of the following architecture constants
and
.BR KEXEC_ARCH_MIPS_LE .
The architecture must be executable on the CPU of the system.
.PP
The
.I entry
argument is the physical entry address in the kernel image.
@ -178,13 +178,13 @@ and is moved to the final destination at kexec reboot time (e.g., when the
command is executed with the
.I \-e
option).
.PP
In case of kexec on panic (i.e., the
.BR KEXEC_ON_CRASH
flag is set), the segment data is
loaded to reserved memory at the time of the call, and, after a crash,
the kexec mechanism simply passes control to that kernel.
.PP
The
.BR kexec_load ()
system call is available only if the kernel was configured with
@ -209,7 +209,7 @@ The
.IR cmdline_len
argument specifies the size of the buffer.
The last byte in the buffer must be a null byte (\(aq\\0\(aq).
.PP
The
.IR flags
argument is a bit mask which modifies the behavior of the call.
@ -343,7 +343,7 @@ Call them using
.BR reboot (2),
.BR syscall (2),
.BR kexec (8)
.PP
The kernel source files
.IR Documentation/kdump/kdump.txt
and

File diff suppressed because it is too large

@ -85,7 +85,7 @@ If \fIsig\fP is 0, then no signal is sent,
but existence and permission checks are still performed;
this can be used to check for the existence of a process ID or
process group ID that the caller is permitted to signal.
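.PP
The existence check described above might be sketched as follows. The helper name process_exists is hypothetical, and treating EPERM as "exists" is a design choice of this sketch: that error indicates the target is present but the caller may not signal it.
```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/* Probe for a process with signal 0, which performs the existence
   and permission checks without delivering anything.
   Returns 1 if the process exists, 0 otherwise.
   process_exists is a hypothetical helper name. */
int process_exists(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return 1;               /* exists, and we may signal it */
    return errno == EPERM;      /* exists, but permission is lacking */
}
```
Note that the answer can be stale by the time it is used, since the process may terminate (and its ID be reused) immediately after the check.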
.PP
For a process to have permission to send a signal,
it must either be privileged (under Linux: have the
.B CAP_KILL

@ -66,13 +66,13 @@ _ATFILE_SOURCE
.SH DESCRIPTION
.BR link ()
creates a new link (also known as a hard link) to an existing file.
.PP
If
.I newpath
exists, it will
.I not
be overwritten.
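.PP
The no-overwrite behavior can be demonstrated with two temporary files. In this sketch the helper name link_to_existing_name and the /tmp path templates are assumptions made for illustration.
```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Create two temporary files and try to link the first onto the
   name of the second.  Returns the errno value from link(), 0 if
   the link unexpectedly succeeded, or -1 if setup failed.
   link_to_existing_name is a hypothetical helper name. */
int link_to_existing_name(void)
{
    char a[] = "/tmp/link_demo_a_XXXXXX";   /* assumed writable directory */
    char b[] = "/tmp/link_demo_b_XXXXXX";

    int fda = mkstemp(a);
    if (fda == -1)
        return -1;
    int fdb = mkstemp(b);
    if (fdb == -1) {
        close(fda);
        unlink(a);
        return -1;
    }

    /* newpath (b) already exists, so link() must fail. */
    int err = (link(a, b) == -1) ? errno : 0;

    close(fda);
    close(fdb);
    unlink(a);
    unlink(b);
    return err;
}
```
The expected error in this situation is EEXIST, and the existing file at the second name is left untouched.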
.PP
This new name may be used exactly as the old one for any operation;
both names refer to the same file (and so have the same permissions
and ownership) and it is impossible to tell which name was the
@ -83,7 +83,7 @@ The
system call operates in exactly the same way as
.BR link (),
except for the differences described here.
.PP
If the pathname given in
.I oldpath
is relative, then it is interpreted relative to the directory
@ -93,7 +93,7 @@ referred to by the file descriptor
the calling process, as is done by
.BR link ()
for a relative pathname).
.PP
If
.I oldpath
is relative and
@ -105,13 +105,13 @@ then
is interpreted relative to the current working
directory of the calling process (like
.BR link ()).
.PP
If
.I oldpath
is absolute, then
.I olddirfd
is ignored.
.PP
The interpretation of
.I newpath
is as for
@ -119,7 +119,7 @@ is as for
except that a relative pathname is interpreted relative
to the directory referred to by the file descriptor
.IR newdirfd .
.PP
The following values can be bitwise ORed in
.IR flags :
.TP
@ -168,7 +168,7 @@ If procfs is mounted,
this can be used as an alternative to
.BR AT_EMPTY_PATH ,
like this:
.IP
.nf
.in +4n
linkat(AT_FDCWD, "/proc/self/fd/<fd>", newdirfd,
@ -309,9 +309,9 @@ capability.
An attempt was made to link to the
.I /proc/self/fd/NN
file corresponding to a file descriptor created with
.IP
open(path, O_TMPFILE | O_EXCL, mode);
.IP
See
.BR open (2).
.TP
@ -354,7 +354,7 @@ SVr4, 4.3BSD, POSIX.1-2001 (but see NOTES), POSIX.1-2008.
.\" SVr4 documents additional ENOLINK and
.\" EMULTIHOP error conditions; POSIX.1 does not document ELOOP.
.\" X/OPEN does not document EFAULT, ENOMEM or EIO.
.PP
.BR linkat ():
POSIX.1-2008.
.SH NOTES
@ -364,7 +364,7 @@ cannot span filesystems.
Use
.BR symlink (2)
if this is required.
.PP
POSIX.1-2001 says that
.BR link ()
should dereference

@ -60,14 +60,14 @@ marks the socket referred to by
as a passive socket, that is, as a socket that will
be used to accept incoming connection requests using
.BR accept (2).
.PP
The
.I sockfd
argument is a file descriptor that refers to a socket of type
.B SOCK_STREAM
or
.BR SOCK_SEQPACKET .
.PP
The
.I backlog
argument defines the maximum length
@ -146,7 +146,7 @@ POSIX.1 does not require the inclusion of
and this header file is not required on Linux.
However, some historical (BSD) implementations required this header
file, and portable applications are probably wise to include it.
.PP
The behavior of the
.I backlog
argument on TCP sockets changed with Linux 2.2.
@ -162,7 +162,7 @@ length and this setting is ignored.
See
.BR tcp (7)
for more information.
.PP
If the
.I backlog
argument is greater than the value in

@ -88,7 +88,7 @@ A single extended attribute
is a null-terminated string.
The name includes a namespace prefix; there may be several, disjoint
namespaces associated with an individual inode.
.PP
If
.I size
is specified as zero, these calls return the current size of the
@ -182,7 +182,7 @@ and
.BR getxattr (2).
For the file whose pathname is provided as a command-line argument,
it lists all extended file attributes and their values.
.PP
To keep the code simple, the program assumes that attribute keys and
values are constant during the execution of the program.
A production program should expect and handle changes during
@ -199,7 +199,7 @@ with a larger buffer each time it fails with the error
Calls to
.BR getxattr (2)
could be handled similarly.
.PP
The following output was recorded by first creating a file, setting
some extended file attributes,
and then listing the attributes with the example program.

@ -59,7 +59,7 @@ or
respectively.
It returns the resulting file position in the argument
.IR result .
.PP
This system call exists on various 32-bit platforms to support
seeking to large file offsets.
.SH RETURN VALUE

@ -35,7 +35,7 @@ Look up the full path of the directory entry specified by the value
The cookie is an opaque identifier uniquely identifying a particular
directory entry.
The buffer given is filled in with the full path of the directory entry.
.PP
For
.BR lookup_dcookie ()
to return successfully,
@ -84,7 +84,7 @@ is a special-purpose system call, currently used only by the
.BR oprofile (1)
profiler.
It relies on a kernel driver to register cookies for directory entries.
.PP
The path returned may be suffixed by the string " (deleted)" if the directory
entry has been removed.
.SH SEE ALSO

@ -121,13 +121,13 @@ In both of the above cases,
fails if
.I offset
points past the end of the file.
.PP
These operations allow applications to map holes in a sparsely
allocated file.
This can be useful for applications such as file backup tools,
which can save space when creating backups and preserve holes,
if they have a mechanism for discovering holes.
.PP
For the purposes of these operations, a hole is a sequence of zeros that
(normally) has not been allocated in the underlying file storage.
However, a filesystem is not obliged to report holes,
@ -150,7 +150,7 @@ it can be considered to consist of data that is a sequence of zeros).
.\" https://lkml.org/lkml/2011/4/22/79
.\" http://lwn.net/Articles/440255/
.\" http://blogs.oracle.com/bonwick/entry/seek_hole_and_seek_data
.PP
The
.BR _GNU_SOURCE
feature test macro must be defined in order to obtain the definitions of
@ -159,7 +159,7 @@ and
.BR SEEK_HOLE
from
.IR <unistd.h> .
.PP
The
.BR SEEK_HOLE
and
@ -223,7 +223,7 @@ The resulting file offset cannot be represented in an
is associated with a pipe, socket, or FIFO.
.SH CONFORMING TO
POSIX.1-2001, POSIX.1-2008, SVr4, 4.3BSD.
.PP
.BR SEEK_DATA
and
.BR SEEK_HOLE
@ -236,7 +236,7 @@ See
.BR open (2)
for a discussion of the relationship between file descriptors,
open file descriptions, and files.
.PP
If the
.B O_APPEND
file status flag is set on the open file description,
@ -245,15 +245,15 @@ then a
.I always
moves the file offset to the end of the file, regardless of the use of
.BR lseek ().
.PP
The
.I off_t
data type is a signed integer data type specified by POSIX.1.
.PP
Some devices are incapable of seeking and POSIX does not specify which
devices must support
.BR lseek ().
.PP
On Linux, using
.BR lseek ()
on a terminal device fails with the error

@ -67,7 +67,7 @@ and with size
bytes.
In most cases,
the goal of such advice is to improve system or application performance.
.PP
Initially, the system call supported a set of "conventional"
.I advice
values, which are also available on several other implementations.
@ -125,7 +125,7 @@ Expect access in the near future.
Do not expect access in the near future.
(For the time being, the application is finished with the given range,
so the kernel can free resources associated with it.)
.IP
After a successful
.B MADV_DONTNEED
operation,
@ -136,14 +136,14 @@ up-to-date contents of the underlying mapped file
(for shared file mappings, shared anonymous mappings,
and shmem-based techniques such as System V shared memory segments)
or zero-fill-on-demand pages for anonymous private mappings.
.IP
Note that, when applied to shared mappings,
.BR MADV_DONTNEED
might not lead to immediate freeing of the pages in the range.
The kernel is free to delay freeing the pages until an appropriate moment.
The resident set size (RSS) of the calling process will be immediately
reduced however.
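.IP
The zero-fill-on-demand behavior for private anonymous mappings can be observed with a small sketch like the one below. The function name byte_after_dontneed() is made up for this example, and the sketch assumes Linux semantics for MADV_DONTNEED as described above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Write to a private anonymous page, discard it with MADV_DONTNEED,
   and return the byte read back afterward (0 is expected on Linux).
   Returns -1 if a system call fails. */
int
byte_after_dontneed(void)
{
    long page = sysconf(_SC_PAGESIZE);
    unsigned char *p;
    int value;

    assert(page > 0);

    p = mmap(NULL, page, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    p[0] = 0xAB;                /* dirty the page */

    if (madvise(p, page, MADV_DONTNEED) == -1) {
        munmap(p, page);
        return -1;
    }

    value = p[0];               /* faults in a fresh zero-filled page */
    munmap(p, page);
    return value;
}
```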
.IP
.B MADV_DONTNEED
cannot be applied to locked pages, Huge TLB pages, or
.BR VM_PFNMAP
@ -181,12 +181,12 @@ bytes containing zero.
.\" bufferpool (shared memory segments) - without writing back to
.\" disk/swap space. This feature is also useful for supporting
.\" hot-plug memory on UML.
.IP
The specified address range must be mapped shared and writable.
This flag cannot be applied to locked pages, Huge TLB pages, or
.BR VM_PFNMAP
pages.
.IP
In the initial implementation, only
.BR tmpfs (5)
is supported
@ -255,7 +255,7 @@ processes.
This operation may result in the calling process receiving a
.B SIGBUS
and the page being unmapped.
.IP
This feature is intended for testing of memory error-handling code;
it is available only if the kernel was configured with
.BR CONFIG_MEMORY_FAILURE .
@ -273,14 +273,14 @@ These are replaced by a single write-protected page (which is automatically
copied if a process later wants to update the content of the page).
KSM merges only private anonymous pages (see
.BR mmap (2)).
.IP
The KSM feature is intended for applications that generate many
instances of the same data (e.g., virtualization systems such as KVM).
It can consume a lot of processing power; use with care.
See the Linux kernel source file
.I Documentation/vm/ksm.txt
for more details.
.IP
The
.BR MADV_MERGEABLE
and
@ -312,7 +312,7 @@ The effect of the
.B MADV_SOFT_OFFLINE
operation is invisible to (i.e., does not change the semantics of)
the calling process.
.IP
This feature is intended for testing of memory error-handling code;
it is available only if the kernel was configured with
.BR CONFIG_MEMORY_FAILURE .
@ -332,7 +332,7 @@ to replace them with huge pages.
The kernel will also allocate huge pages directly when the region is
naturally aligned to the huge page size (see
.BR posix_memalign (3)).
.IP
This feature is primarily aimed at applications that use large mappings of
data and access large regions of that memory at a time (e.g., virtualization
systems such as QEMU).
@ -341,7 +341,7 @@ It can very easily waste memory (e.g., a 2MB mapping that only ever accesses
See the Linux kernel source file
.I Documentation/vm/transhuge.txt
for more details.
.IP
The
.BR MADV_HUGEPAGE
and
@ -397,7 +397,7 @@ If there is no subsequent write,
the kernel can free the pages at any time.
Once pages in the range have been freed, the caller will
see zero-fill-on-demand pages upon subsequent page references.
.IP
The
.B MADV_FREE
operation
@ -496,7 +496,7 @@ Other implementations typically implement at least the flags listed
above under
.IR "Conventional advice flags" ,
albeit with some variation in semantics.
.PP
POSIX.1-2001 describes
.BR posix_madvise (3)
with constants

@ -55,7 +55,7 @@ and continuing for
.I len
bytes.
The memory policy defines from which node memory is allocated.
.PP
If the memory range specified by the
.IR addr " and " len
arguments includes an "anonymous" region of memory\(emthat is
@ -77,7 +77,7 @@ an initial read access will allocate pages according to the
memory policy of the thread that causes the page to be allocated.
This may not be the thread that called
.BR mbind ().
.PP
The specified policy will be ignored for any
.B MAP_SHARED
mappings in the specified memory range.
@ -85,7 +85,7 @@ Rather the pages will be allocated according to the memory policy
of the thread that caused the page to be allocated.
Again, this may not be the thread that called
.BR mbind ().
.PP
If the specified memory range includes a shared memory region
created using the
.BR shmget (2)
@ -102,7 +102,7 @@ the huge pages will be allocated according to the policy specified
only if the page allocation is caused by the process that calls
.BR mbind ()
for that region.
.PP
By default,
.BR mbind ()
has an effect only for new allocations; if the pages inside
@ -113,7 +113,7 @@ This default behavior may be overridden by the
and
.B MPOL_MF_MOVE_ALL
flags described below.
.PP
The
.I mode
argument must specify one of
@ -130,7 +130,7 @@ require the caller to specify the node or nodes to which the mode applies,
via the
.I nodemask
argument.
.PP
The
.I mode
argument may also include an optional
@ -182,7 +182,7 @@ allowed by the thread's current cpuset context
.B MPOL_F_STATIC_NODES
mode flag is specified),
and contains memory.
.PP
The
.I mode
argument must include one of the following values:
@ -296,7 +296,7 @@ if the existing pages in the memory range don't follow the policy.
.\" --Lee Schermerhorn
.\" In 2.6.16 or later the kernel will also try to move pages
.\" to the requested node with this flag.
.PP
If
.B MPOL_MF_MOVE
is specified in
@ -309,7 +309,7 @@ If
is also specified, then the call will fail with the error
.B EIO
if some pages could not be moved.
.PP
If
.B MPOL_MF_MOVE_ALL
is passed in
@ -427,12 +427,12 @@ This system call is Linux-specific.
.SH NOTES
For information on library support, see
.BR numa (7).
.PP
NUMA policy is not supported on a memory-mapped file range
that was mapped with the
.B MAP_SHARED
flag.
.PP
The
.B MPOL_DEFAULT
mode can have different effects for
@ -466,14 +466,14 @@ with an empty set of nodes.
This method will work for
.BR set_mempolicy (2),
as well.
.PP
Support for huge page policy was added with 2.6.16.
For interleave policy to be effective on huge page mappings the
policied memory needs to be tens of megabytes or larger.
.PP
.B MPOL_MF_STRICT
is ignored on huge page mappings.
.PP
.B MPOL_MF_MOVE
and
.B MPOL_MF_MOVE_ALL

@ -39,12 +39,12 @@ effectively is
.I not
as simple as replacing memory barriers with this
system call, but requires understanding of the details below.
.PP
Use of memory barriers needs to be done taking into account that a
memory barrier always needs to be either matched with its memory barrier
counterparts, or that the architecture's memory model doesn't require the
matching barriers.
.PP
There are cases where one side of the matching barriers (which we will
refer to as "fast side") is executed much more often than the other
(which we will refer to as "slow side").
@ -53,18 +53,18 @@ This is a prime target for the use of
The key idea is to replace, for these matching
barriers, the fast-side memory barriers by simple compiler barriers,
for example:
.PP
asm volatile ("" : : : "memory")
.PP
and replace the slow-side memory barriers by calls to
.BR membarrier ().
.PP
This will add overhead to the slow side, and remove overhead from the
fast side, thus resulting in an overall performance increase as long as
the slow side is infrequent enough that the overhead of the
.BR membarrier ()
calls does not outweigh the performance gain on the fast side.
.PP
The
.I cmd
argument is one of the following:
@ -95,7 +95,7 @@ argument is currently unused and must be specified as 0.
All memory accesses performed in program order from each targeted thread
are guaranteed to be ordered with respect to
.BR membarrier ().
.PP
If we use the semantic
.I barrier()
to represent a compiler barrier forcing memory
@ -109,7 +109,7 @@ each pairing of
and
.IR smp_mb() .
The pair ordering is detailed as (O: ordered, X: not ordered):
.PP
                    barrier()   smp_mb() membarrier()
    barrier()          X           X            O
    smp_mb()           X           O            O
@ -124,7 +124,7 @@ On error, \-1 is returned,
and
.I errno
is set appropriately.
.PP
For a given command, with
.I flags
set to 0, this system call is
@ -171,10 +171,10 @@ matching barriers on other cores.
For instance, a load fence can order
loads prior to and following that fence with respect to stores ordered
by store fences.
.PP
Program order is the order in which instructions are ordered in the
program assembly code.
.PP
Examples where
.BR membarrier ()
can be useful include implementations
@ -184,7 +184,7 @@ Assuming a multithreaded application where "fast_path()" is executed
very frequently, and where "slow_path()" is executed infrequently, the
following code (x86) can be transformed using
.BR membarrier ():
.PP
.in +4n
.nf
#include <stdlib.h>
@ -230,11 +230,11 @@ main(int argc, char **argv)
}
.fi
.in
.PP
The code above transformed to use
.BR membarrier ()
becomes:
.PP
.in +4n
.nf
#define _GNU_SOURCE

@ -51,14 +51,14 @@ memory allocations such as those allocated using
with the
.BR MAP_ANONYMOUS
flag.
.PP
The initial size of the file is set to 0.
Following the call, the file size should be set using
.BR ftruncate (2).
(Alternatively, the file may be populated by calls to
.BR write (2)
or similar.)
.PP
The name supplied in
.I name
is used as a filename and will be displayed
@ -69,7 +69,7 @@ The displayed name is always prefixed with
and serves only for debugging purposes.
Names do not affect the behavior of the file descriptor,
and as such multiple files can have the same name without any side effects.
.PP
The following values may be bitwise ORed in
.IR flags
to change the behavior of
@ -104,7 +104,7 @@ meaning that no other seals can be set on the file.
Unused bits in
.I flags
must be 0.
.PP
As its return value,
.BR memfd_create ()
returns a new file descriptor that can be used to refer to the file.
@ -113,7 +113,7 @@ This file descriptor is opened for both reading and writing
and
.B O_LARGEFILE
is set for the file descriptor.
.PP
With respect to
.BR fork (2)
and
@ -166,7 +166,7 @@ system call is Linux-specific.
.SH NOTES
Glibc does not provide a wrapper for this system call; call it using
.BR syscall (2).
.PP
.\" See also http://lwn.net/Articles/593918/
.\" and http://lwn.net/Articles/594919/ and http://lwn.net/Articles/591108/
The
@ -179,7 +179,7 @@ The primary purpose of
is to create files and associated file descriptors that are
used with the file-sealing APIs provided by
.BR fcntl (2).
.PP
The
.BR memfd_create ()
system call also has uses without file sealing
@ -211,13 +211,13 @@ location in the shared memory region.
(Dealing with this possibility necessitates the use of a handler for the
.BR SIGBUS
signal.)
.PP
Dealing with untrusted peers imposes extra complexity on
code that employs shared memory.
Memory sealing enables that extra complexity to be eliminated,
by allowing a process to operate secure in the knowledge that
its peer can't modify the shared memory in an undesired fashion.
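.PP
A minimal sketch of that guarantee, under stated assumptions: the file is created with the MFD_ALLOW_SEALING flag, sealed with fcntl(2) F_ADD_SEALS, and a later write(2) is then expected to fail with EPERM. The function name write_fails_after_seal() is hypothetical, and the fallback constant values are copied from the Linux UAPI headers for systems whose C library headers lack them. Because glibc may not provide a wrapper, the call goes through syscall(2).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallbacks for older headers (values from the Linux UAPI headers). */
#ifndef MFD_ALLOW_SEALING
#define MFD_ALLOW_SEALING 0x0002U
#endif
#ifndef F_ADD_SEALS
#define F_ADD_SEALS 1033
#endif
#ifndef F_SEAL_SHRINK
#define F_SEAL_SHRINK 0x0002
#endif
#ifndef F_SEAL_WRITE
#define F_SEAL_WRITE 0x0008
#endif

/* Returns 1 if a write to a write-sealed memfd fails with EPERM,
   0 if the write unexpectedly succeeds or fails differently,
   and -1 if the setup itself fails. */
int
write_fails_after_seal(void)
{
    const char *name = "sealed_demo";   /* shown only for debugging */
    ssize_t n;
    int blocked;
    int fd;

    assert(name != NULL);

    fd = (int) syscall(SYS_memfd_create, name, MFD_ALLOW_SEALING);
    if (fd == -1)
        return -1;

    /* Size and fill the file, then forbid shrinking and writing. */
    if (ftruncate(fd, 4096) == -1 ||
        write(fd, "data", 4) != 4 ||
        fcntl(fd, F_ADD_SEALS, F_SEAL_SHRINK | F_SEAL_WRITE) == -1) {
        close(fd);
        return -1;
    }

    n = write(fd, "more", 4);
    blocked = (n == -1 && errno == EPERM);

    close(fd);
    return blocked;
}
```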
.PP
An example of the usage of the sealing mechanism is as follows:
.IP 1. 3
The first process creates a
@ -297,7 +297,7 @@ seal has not yet been applied).
Below are shown two example programs that demonstrate the use of
.BR memfd_create ()
and the file sealing API.
.PP
The first program,
.IR t_memfd_create.c ,
creates a
@ -312,18 +312,18 @@ The first argument is the name to associate with the file,
the second argument is the size to be set for the file,
and the optional third argument is a string of characters that specify
seals to be set on file.
.PP
The second program,
.IR t_get_seals.c ,
can be used to open an existing file that was created via
.BR memfd_create ()
and inspect the set of seals that have been applied to that file.
.PP
The following shell session demonstrates the use of these programs.
First we create a
.BR tmpfs (5)
file and set some seals on it:
.PP
.in +4n
.nf
$ \fB./t_memfd_create my_memfd_file 4096 sw &\fP
@ -331,7 +331,7 @@ $ \fB./t_memfd_create my_memfd_file 4096 sw &\fP
PID: 11775; fd: 3; /proc/11775/fd/3
.fi
.in
.PP
At this point, the
.I t_memfd_create
program continues to run in the background.
@ -347,7 +347,7 @@ Using that pathname, we inspect the content of the
symbolic link, and use our
.I t_get_seals
program to view the seals that have been placed on the file:
.PP
.in +4n
.nf
$ \fBreadlink /proc/11775/fd/3\fP

@ -44,7 +44,7 @@ the kernel maintains the relative topology relationship inside
.I old_nodes
during the migration to
.IR new_nodes .
.PP
The
.I old_nodes
and
@ -66,7 +66,7 @@ as in
.BR mbind (2),
but different from
.BR select (2)).
.PP
The
.I pid
argument is the ID of the process whose pages are to be moved.
@ -80,7 +80,7 @@ If
is 0, then
.BR migrate_pages ()
moves pages of the calling process.
.PP
Pages shared with another process will be moved only if the initiating
process has the
.B CAP_SYS_NICE
@ -142,7 +142,7 @@ This system call is Linux-specific.
.SH NOTES
For information on library support, see
.BR numa (7).
.PP
Use
.BR get_mempolicy (2)
with the
@ -151,7 +151,7 @@ flag to obtain the set of nodes that are allowed by
the calling process's cpuset.
Note that this information is subject to change at any
time by manual or automatic reconfiguration of the cpuset.
.PP
Use of
.BR migrate_pages ()
may result in pages whose location
@ -163,7 +163,7 @@ and/or the specified process (see
That is, memory policy does not constrain the destination
nodes used by
.BR migrate_pages ().
.PP
The
.I <numaif.h>
header is not included with glibc, but requires installing
@ -179,6 +179,6 @@ or a similar package.
.BR numa (7),
.BR migratepages (8),
.BR numastat (8)
.PP
.IR Documentation/vm/page_migration
in the Linux kernel source tree

@ -62,7 +62,7 @@ starting at the address
and continuing for
.I length
bytes.
.PP
The
.I addr
argument must be a multiple of the system page size.
@ -76,7 +76,7 @@ One may obtain the page size
.RB ( PAGE_SIZE )
using
.IR sysconf(_SC_PAGESIZE) .
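.PP
The following sketch shows one way to combine the page size with a residency query. It maps a single anonymous page, touches it, and asks mincore() whether the page is resident. The function name touched_page_is_resident() is an assumption of this example, not part of any API.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if a freshly touched anonymous page is reported as
   resident, 0 if not, and -1 if a system call fails. */
int
touched_page_is_resident(void)
{
    long page = sysconf(_SC_PAGESIZE);
    unsigned char vec[1];       /* one entry per page in the range */
    unsigned char *p;

    assert(page > 0);

    /* mmap() returns a page-aligned address, as mincore() requires. */
    p = mmap(NULL, page, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    p[0] = 1;                   /* fault the page in */

    if (mincore(p, page, vec) == -1) {
        munmap(p, page);
        return -1;
    }

    munmap(p, page);
    return vec[0] & 1;          /* least significant bit: resident */
}
```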
.PP
The
.I vec
argument must point to an array containing at least

@ -48,7 +48,7 @@ _ATFILE_SOURCE
.BR mkdir ()
attempts to create a directory named
.IR pathname .
.PP
The argument
.I mode
specifies the mode for the new directory (see
@ -62,7 +62,7 @@ Whether other
.I mode
bits are honored for the created directory depends on the operating system.
For Linux, see NOTES below.
.PP
The newly created directory will be owned by the effective user ID of the
process.
If the directory containing the file has the set-group-ID
@ -72,7 +72,7 @@ or, synonymously
.IR "mount -o grpid" ),
the new directory will inherit the group ownership from its parent;
otherwise it will be owned by the effective group ID of the process.
.PP
If the parent directory has the set-group-ID bit set, then so will the
newly created directory.
.\"
@ -83,7 +83,7 @@ The
system call operates in exactly the same way as
.BR mkdir (),
except for the differences described here.
.PP
If the pathname given in
.I pathname
is relative, then it is interpreted relative to the directory
@ -93,7 +93,7 @@ referred to by the file descriptor
the calling process, as is done by
.BR mkdir ()
for a relative pathname).
.PP
If
.I pathname
is relative and
@ -105,7 +105,7 @@ then
is interpreted relative to the current working
directory of the calling process (like
.BR mkdir ()).
.PP
If
.I pathname
is absolute, then
@ -209,7 +209,7 @@ library support was added to glibc in version 2.4.
.BR mkdir ():
SVr4, BSD, POSIX.1-2001, POSIX.1-2008.
.\" SVr4 documents additional EIO, EMULTIHOP
.PP
.BR mkdirat ():
POSIX.1-2008.
.SH NOTES

@ -55,7 +55,7 @@ with attributes specified by
.I mode
and
.IR dev .
.PP
The
.I mode
argument specifies both the file mode to use and the type of node
@ -63,13 +63,13 @@ to be created.
It should be a combination (using bitwise OR) of one of the file types
listed below and zero or more of the file mode bits listed in
.BR inode (7).
.PP
The file mode is modified by the process's
.I umask
in the usual way: in the absence of a default ACL, the permissions of the
created node are
.RI ( mode " & ~" umask ).
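.PP
To make the umask arithmetic concrete, the sketch below creates a FIFO with mode 0666 under a umask of 022 and reports the resulting permission bits, which would be 0644 in the absence of a default ACL. The function name fifo_mode_after_umask() is made up for this example. Note that umask() affects the whole process, so the sketch is not suitable for multithreaded code as written.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a FIFO at path with requested mode 0666 while the umask is
   022, then return its permission bits.  The node is removed again.
   Returns -1 on error. */
int
fifo_mode_after_umask(const char *path)
{
    struct stat st;
    mode_t old_mask;
    int rc;

    assert(path != NULL);

    unlink(path);               /* ignore errors: path may not exist */

    old_mask = umask(022);
    rc = mknod(path, S_IFIFO | 0666, 0);    /* dev is ignored here */
    umask(old_mask);            /* restore the caller's umask */

    if (rc == -1)
        return -1;
    if (stat(path, &st) == -1) {
        unlink(path);
        return -1;
    }
    unlink(path);

    if (!S_ISFIFO(st.st_mode))
        return -1;
    return (int) (st.st_mode & 0777);       /* 0666 & ~022 == 0644 */
}
```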
.PP
The file type must be one of
.BR S_IFREG ,
.BR S_IFCHR ,
@ -83,7 +83,7 @@ special file, block special file, FIFO (named pipe), or UNIX domain socket,
respectively.
(Zero file type is equivalent to type
.BR S_IFREG .)
.PP
If the file type is
.B S_IFCHR
or
@ -96,13 +96,13 @@ special file
may be useful to build the value for
.IR dev );
otherwise it is ignored.
.PP
If
.I pathname
already exists, or is a symbolic link, this call fails with an
.B EEXIST
error.
.PP
The newly created node will be owned by the effective user ID of the
process.
If the directory containing the node has the set-group-ID
@ -117,7 +117,7 @@ The
system call operates in exactly the same way as
.BR mknod (),
except for the differences described here.
.PP
If the pathname given in
.I pathname
is relative, then it is interpreted relative to the directory
@ -127,7 +127,7 @@ referred to by the file descriptor
the calling process, as is done by
.BR mknod ()
for a relative pathname).
.PP
If
.I pathname
is relative and
@ -139,7 +139,7 @@ then
is interpreted relative to the current working
directory of the calling process (like
.BR mknod ()).
.PP
If
.I pathname
is absolute, then
@ -251,7 +251,7 @@ SVr4, 4.4BSD, POSIX.1-2001 (but see below), POSIX.1-2008.
.\" The Linux version differs from the SVr4 version in that it
.\" does not require root permission to create pipes, also in that no
.\" EMULTIHOP, ENOLINK, or EINTR error is documented.
.PP
.BR mknodat ():
POSIX.1-2008.
.SH NOTES
@ -272,14 +272,14 @@ However, nowadays one should never use
for this purpose; one should use
.BR mkfifo (3),
a function especially defined for this purpose.
.PP
Under Linux,
.BR mknod ()
cannot be used to create directories.
One should make directories with
.BR mkdir (2).
.\" and one should make UNIX domain sockets with socket(2) and bind(2).
.PP
There are many infelicities in the protocol underlying NFS.
Some of these affect
.BR mknod ()

@ -45,7 +45,7 @@ and
lock part or all of the calling process's virtual address
space into RAM, preventing that memory from being paged to the
swap area.
.PP
.BR munlock ()
and
.BR munlockall ()
@ -53,7 +53,7 @@ perform the converse operation,
unlocking part or all of the calling process's virtual
address space, so that pages in the specified virtual address range may
once more be swapped out if required by the kernel memory manager.
.PP
Memory locking and unlocking are performed in units of whole pages.
.SS mlock(), mlock2(), and munlock()
.BR mlock ()
@ -65,7 +65,7 @@ bytes.
All pages that contain a part of the specified address range are
guaranteed to be resident in RAM when the call returns successfully;
the pages are guaranteed to stay in RAM until later unlocked.
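.PP
As a sketch of the lock and unlock cycle, the following function locks one page, uses it, and unlocks it again. The name lock_one_page() is hypothetical. The example assumes that a refusal caused by privileges or resource limits (EPERM, ENOMEM, or EAGAIN) should be reported separately from other failures.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if one page was locked and later unlocked,
   0 if the kernel refused the lock because of limits or privileges,
   and -1 on any other failure. */
int
lock_one_page(void)
{
    long page = sysconf(_SC_PAGESIZE);
    unsigned char *p;
    int result;

    assert(page > 0);

    p = mmap(NULL, page, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    if (mlock(p, page) == 0) {
        memset(p, 0x5A, page);  /* work with the locked memory */
        memset(p, 0, page);     /* clear it before unlocking */
        result = (munlock(p, page) == 0) ? 1 : -1;
    } else if (errno == EPERM || errno == ENOMEM || errno == EAGAIN) {
        result = 0;             /* refused: limit or privilege */
    } else {
        result = -1;
    }

    munmap(p, page);
    return result;
}
```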
.PP
.BR mlock2 ()
.\" commit a8ca5d0ecbdde5cc3d7accacbd69968b0c98764e
.\" commit de60f5f10c58d4f34b68622442c0e04180367f3f
@ -79,7 +79,7 @@ However, the state of the pages contained in that range after the call
returns successfully will depend on the value in the
.I flags
argument.
.PP
The
.I flags
argument can be either 0 or the following constant:
@ -88,19 +88,19 @@ argument can be either 0 or the following constant:
Lock pages that are currently resident and mark the entire range to have
pages locked when they are populated by the page fault.
.PP
If
.I flags
is 0,
.BR mlock2 ()
behaves exactly the same as
.BR mlock ().
.PP
Note: currently, there is not a glibc wrapper for
.BR mlock2 (),
so it will need to be invoked using
.BR syscall (2).
.PP
.BR munlock ()
unlocks pages in the address range starting at
.I addr
@ -119,7 +119,7 @@ memory, and memory-mapped files.
All mapped pages are guaranteed
to be resident in RAM when the call returns successfully;
the pages are guaranteed to stay in RAM until later unlocked.
.PP
The
.I flags
argument is constructed as the bitwise OR of one or more of the
@ -175,7 +175,7 @@ In the same circumstances, stack growth may likewise fail:
the kernel will deny stack expansion and deliver a
.B SIGSEGV
signal to the process.
.PP
.BR munlockall ()
unlocks all pages mapped into the address space of the
calling process.
@ -275,7 +275,7 @@ For
is available since Linux 4.4.
.SH CONFORMING TO
POSIX.1-2001, POSIX.1-2008, SVr4.
.PP
.BR mlock2 ()
is Linux-specific.
.SH AVAILABILITY
@ -290,7 +290,7 @@ can be determined from the constant
.B PAGESIZE
(if defined) in \fI<limits.h>\fP or by calling
.IR sysconf(_SC_PAGESIZE) .
.PP
On POSIX systems on which
.BR mlockall ()
and
@ -321,7 +321,7 @@ software has erased the secrets in RAM and terminated.
(But be aware that the suspend mode on laptops and some desktop
computers will save a copy of the system's RAM to disk, regardless
of memory locks.)
.PP
Real-time processes that are using
.BR mlockall ()
to prevent delays on page faults should reserve enough
@ -334,7 +334,7 @@ This way, enough pages will be mapped for the stack and can be
locked into RAM.
The dummy writes ensure that not even copy-on-write
page faults can occur in the critical section.
.PP
Memory locks are not inherited by a child created via
.BR fork (2)
and are automatically removed (unlocked) during an
@ -349,7 +349,7 @@ settings are not inherited by a child created via
.BR fork (2)
and are cleared during an
.BR execve (2).
.PP
Note that
.BR fork (2)
will prepare the address space for a copy-on-write operation.
@ -363,11 +363,11 @@ or
.BR mlock ()
operation\(emnot even from a thread which runs at a low priority within
a process which also has a thread running at elevated priority.
.PP
The memory lock on an address range is automatically removed
if the address range is unmapped via
.BR munmap (2).
.PP
Memory locks do not stack, that is, pages which have been locked several times
by calls to
.BR mlock (),
@ -381,7 +381,7 @@ for the corresponding range or by
Pages which are mapped to several locations or by several processes stay
locked into RAM as long as they are locked at least at one location or by
at least one process.
.PP
If a call to
.BR mlockall ()
which uses the
@ -390,7 +390,7 @@ flag is followed by another call that does not specify this flag, the
changes made by the
.B MCL_FUTURE
call will be lost.
.PP
The
.BR mlock2 ()
.B MLOCK_ONFAULT
@ -417,7 +417,7 @@ and
allows an implementation to require that
.I addr
is page aligned, so portable applications should ensure this.
.PP
The
.I VmLck
field of the Linux-specific
@ -438,7 +438,7 @@ a process must be privileged
in order to lock memory and the
.B RLIMIT_MEMLOCK
soft resource limit defines a limit on how much memory the process may lock.
.PP
Since Linux 2.6.9, no limits are placed on the amount of memory
that a privileged process can lock and the
.B RLIMIT_MEMLOCK
@ -467,7 +467,7 @@ would fail on requests that should have succeeded.
This bug was fixed
.\" commit 0cf2f6f6dc605e587d2c1120f295934c77e810e8
in Linux 4.9.
.PP
In the 2.4 series Linux kernels up to and including 2.4.17,
a bug caused the
.BR mlockall ()
@ -475,7 +475,7 @@ a bug caused the
flag to be inherited across a
.BR fork (2).
This was rectified in kernel 2.4.18.
.PP
Since kernel 2.6.9, if a privileged process calls
.I mlockall(MCL_FUTURE)
and later drops privileges (loses the

@ -60,7 +60,7 @@ The starting address for the new mapping is specified in
The
.I length
argument specifies the length of the mapping.
.PP
If
.I addr
is NULL,
@ -74,7 +74,7 @@ on Linux, the mapping will be created at a nearby page boundary.
.\" Before Linux 2.6.24, the address was rounded up to the next page
.\" boundary; since 2.6.24, it is rounded down!
The address of the new mapping is returned as the result of the call.
.PP
The contents of a file mapping (as opposed to an anonymous mapping; see
.B MAP_ANONYMOUS
below), are initialized using
@ -135,7 +135,7 @@ It is unspecified whether changes made to the file after the
call are visible in the mapped region.
.LP
Both of these flags are described in POSIX.1-2001 and POSIX.1-2008.
.PP
In addition, zero or more of the following values can be ORed in
.IR flags :
.TP
@ -251,7 +251,7 @@ Used in conjunction with
.B MAP_HUGETLB
to select alternative hugetlb page sizes (respectively, 2 MB and 1 GB)
on systems that support multiple hugetlb page sizes.
.IP
More generally, the desired huge page size can be configured by encoding
the base-2 logarithm of the desired page size in the six bits at the offset
.BR MAP_HUGE_SHIFT .
@ -261,14 +261,14 @@ the default huge page size can be discovered via the
field exposed by
.IR /proc/meminfo .)
Thus, the above two constants are defined as:
.IP
.nf
.in +4n
#define MAP_HUGE_2MB (21 << MAP_HUGE_SHIFT)
#define MAP_HUGE_1GB (30 << MAP_HUGE_SHIFT)
.in
.fi
.IP
The range of huge page sizes that are supported by the system
can be discovered by listing the subdirectories in
.IR /sys/kernel/mm/hugepages .
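.IP
The encoding can be sketched as a small helper that turns a power-of-two page size into the corresponding flag bits. The function name map_huge_flag() is an assumption of this example, and the fallback value 26 for MAP_HUGE_SHIFT is the one used by the Linux UAPI headers.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef MAP_HUGE_SHIFT
#define MAP_HUGE_SHIFT 26       /* value from the Linux UAPI headers */
#endif

/* Encode a power-of-two huge page size as the flag bits to be ORed
   into the mmap() flags together with MAP_HUGETLB. */
unsigned int
map_huge_flag(size_t page_size)
{
    unsigned int shift = 0;

    /* The size must be a nonzero power of two. */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    /* Compute the base-2 logarithm of the page size. */
    while ((page_size >> shift) != 1)
        shift++;

    return shift << MAP_HUGE_SHIFT;
}
```

For 2 MB pages the logarithm is 21, and for 1 GB pages it is 30, matching the two constants shown above.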
@ -411,7 +411,7 @@ On error, the value
is returned, and
.I errno
is set to indicate the cause of the error.
.PP
On success,
.BR munmap ()
returns 0.
@ -577,7 +577,7 @@ or not.
Portable programs should always set
.B PROT_EXEC
if they intend to execute code in the new mapping.
.PP
The portable way to create a mapping is to specify
.I addr
as 0 (NULL), and omit
@ -592,7 +592,7 @@ If the
flag is specified, and
.I addr
is 0 (NULL), then the mapped address will be 0 (NULL).
.PP
Certain
.I flags
constants are defined only if suitable feature test macros are defined
@ -625,7 +625,7 @@ The relevant flags are:
.BR MAP_POPULATE ,
and
.BR MAP_STACK .
.PP
An application can determine which pages of a mapping are
currently resident in the buffer/page cache using
.BR mincore (2).
@ -662,7 +662,7 @@ and
.BR munmap ()
differ somewhat from the requirements for mappings
that use the native system page size.
.PP
For
.BR mmap (),
.I offset
@ -670,7 +670,7 @@ must be a multiple of the underlying huge page size.
The system automatically aligns
.I length
to be a multiple of the underlying huge page size.
.PP
For
.BR munmap (),
.I addr
@ -698,14 +698,14 @@ On Linux, there are no guarantees like those suggested above under
.BR MAP_NORESERVE .
By default, any process can be killed
at any moment when the system runs out of memory.
.PP
In kernels before 2.6.7, the
.B MAP_POPULATE
flag has effect only if
.I prot
is specified as
.BR PROT_NONE .
.PP
SUSv3 specifies that
.BR mmap ()
should fail if
@ -720,7 +720,7 @@ Since kernel 2.6.12,
fails with the error
.B EINVAL
for this case.
.PP
POSIX specifies that the system shall always
zero fill any partial page at the end
of the object and that system will never write any modification of the
@ -837,14 +837,14 @@ main(int argc, char *argv[])
.BR userfaultfd (2),
.BR shm_open (3),
.BR shm_overview (7)
.PP
The descriptions of the following files in
.BR proc (5):
.IR /proc/[pid]/maps ,
.IR /proc/[pid]/map_files ,
and
.IR /proc/[pid]/smaps .
.PP
B.O. Gallmeister, POSIX.4, O'Reilly, pp. 128-129 and 389-391.
.\"
.\" Repeat after me: private read-only mappings are 100% equivalent to

@ -40,7 +40,7 @@ mmap2 \- map files or devices into memory
This is probably not the system call that you are interested in; instead, see
.BR mmap (2),
which describes the glibc wrapper function that invokes this system call.
.PP
The
.BR mmap2 ()
system call provides the same interface as
@ -83,9 +83,9 @@ the glibc
wrapper function invokes this system call rather than the
.BR mmap (2)
system call.
.PP
This system call does not exist on x86-64.
.PP
On ia64, the unit for
.I offset
is actually the system page size, rather than 4096 bytes.

@ -70,7 +70,7 @@ structure
and
.I bytecount
must equal the size of this structure.
.PP
The
.I user_desc
structure is defined in \fI<asm/ldt.h>\fP as:

@ -42,7 +42,7 @@ The result of the move is reflected in
The
.I flags
indicate constraints on the pages to be moved.
.PP
.I pid
is the ID of the process in which pages are to be moved.
To move pages in another process,
@ -55,7 +55,7 @@ If
is 0, then
.BR move_pages ()
moves pages of the calling process.
.PP
.I count
is the number of pages to move.
It defines the size of the three arrays
@ -63,7 +63,7 @@ It defines the size of the three arrays
.IR nodes ,
and
.IR status .
.PP
.I pages
is an array of pointers to the pages that should be moved.
These are pointers that should be aligned to page boundaries.
@ -71,7 +71,7 @@ These are pointers that should be aligned to page boundaries.
.\" not aligned to page boundaries
Addresses are specified as seen by the process specified by
.IR pid .
.PP
.I nodes
is an array of integers that specify the desired location for each page.
Each element in the array is a node number.
@ -84,13 +84,13 @@ where each page currently resides, in the
array.
Obtaining the status of each page may be necessary to determine
pages that need to be moved.
.PP
.I status
is an array of integers that return the status of each page.
The array contains valid values only if
.BR move_pages ()
did not return an error.
.PP
.I flags
specify what types of pages to move.
.B MPOL_MF_MOVE
@ -198,7 +198,7 @@ This system call is Linux-specific.
.SH NOTES
For information on library support, see
.BR numa (7).
.PP
Use
.BR get_mempolicy (2)
with the
@ -209,7 +209,7 @@ flag to obtain the set of nodes that are allowed by
the current cpuset.
Note that this information is subject to change at any
time by manual or automatic reconfiguration of the cpuset.
.PP
Use of this function may result in pages whose location
(node) violates the memory policy established for the
specified addresses (See
@ -219,7 +219,7 @@ and/or the specified process (See
That is, memory policy does not constrain the destination
nodes used by
.BR move_pages ().
.PP
The
.I <numaif.h>
header is not included with glibc, but requires installing

@ -47,7 +47,7 @@ containing any part of the address range in the
interval [\fIaddr\fP,\ \fIaddr\fP+\fIlen\fP\-1].
.I addr
must be aligned to a page boundary.
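.PP
Because the start address has to be page aligned, callers that hold an arbitrary pointer typically round it down first. The sketch below shows one way to do that; page_floor() and protect_range() are hypothetical helper names, and the code assumes the page size is a power of two.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round p down to the start of the page that contains it. */
void *
page_floor(const void *p)
{
    uintptr_t page = (uintptr_t) sysconf(_SC_PAGESIZE);

    return (void *) ((uintptr_t) p & ~(page - 1));
}

/* Apply prot to every page that overlaps [p, p+len).
   Returns the result of mprotect(): 0 on success, -1 on error. */
int
protect_range(void *p, size_t len, int prot)
{
    char *start = page_floor(p);
    size_t span = (size_t) ((char *) p - start) + len;

    assert(len > 0);

    return mprotect(start, span, prot);
}
```

Note that the protection change affects whole pages, so neighboring data on the same pages is affected as well.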
.PP
If the calling process tries to access memory in a manner
that violates the protections, then the kernel generates a
.B SIGSEGV
@ -218,7 +218,7 @@ POSIX says that the behavior of
is unspecified if it is applied to a region of memory that
was not obtained via
.BR mmap (2).
.PP
.BR pkey_mprotect ()
is a nonportable Linux extension.
.SH NOTES
@ -228,7 +228,7 @@ on any address in a process's address space (except for the
kernel vsyscall area).
In particular, it can be used
to change existing code mappings to be writable.
.PP
Whether
.B PROT_EXEC
has any effect different from
@ -242,12 +242,12 @@ specifying
.B PROT_READ
will implicitly add
.BR PROT_EXEC .
.PP
On some hardware architectures (e.g., i386),
.B PROT_WRITE
implies
.BR PROT_READ .
.PP
POSIX.1 says that an implementation may permit access
other than that specified in
.IR prot ,
@ -256,7 +256,7 @@ but at a minimum can allow write access only if
has been set, and must not allow any access if
.B PROT_NONE
has been set.
.PP
Applications should be careful when mixing use of
.BR mprotect ()
and
@ -269,7 +269,7 @@ set to
.B PROT_EXEC
a pkey may be allocated and set on the memory implicitly
by the kernel, but only when the pkey was 0 previously.
.PP
On systems that do not support protection keys in hardware,
.BR pkey_mprotect ()
may still be used, but
@ -287,10 +287,10 @@ The program below demonstrates the use of
The program allocates four pages of memory, makes the third
of these pages read-only, and then executes a loop that walks upward
through the allocated region modifying bytes.
.PP
An example of what we might see when running the program is the
following:
.PP
.in +4n
.nf
.RB "$" " ./a.out"

@ -39,7 +39,7 @@ mq_getsetattr \- get/set message queue attributes
There is no glibc wrapper for this system call; see NOTES.
.SH DESCRIPTION
Do not use this system call.
.PP
This is the low-level system call used to implement
.BR mq_getattr (3)
and

@ -44,7 +44,7 @@ mremap \- remap a virtual memory address
expands (or shrinks) an existing memory mapping, potentially
moving it at the same time (controlled by the \fIflags\fP argument and
the available virtual address space).
.PP
\fIold_address\fP is the old address of the virtual memory block that you
want to expand (or shrink).
Note that \fIold_address\fP has to be page
@ -58,7 +58,7 @@ An optional fifth argument,
may be provided; see the description of
.B MREMAP_FIXED
below.
.PP
In Linux the memory is divided into pages.
A user process has (one or)
several linear virtual memory segments.
@ -70,7 +70,7 @@ a segmentation violation if the memory is accessed incorrectly (e.g.,
writing to a read-only segment).
Accessing virtual memory outside of the
segments will also cause a segmentation violation.
.PP
.BR mremap ()
uses the Linux page table scheme.
.BR mremap ()
@ -78,7 +78,7 @@ changes the
mapping between virtual addresses and memory pages.
This can be used to implement a very efficient
.BR realloc (3).
.PP
The \fIflags\fP bit-mask argument may be 0, or include the following flag:
.TP
.B MREMAP_MAYMOVE
@ -196,7 +196,7 @@ and the prototype for
did not allow for the
.I new_address
argument.
.PP
If
.BR mremap ()
is used to move or expand an area locked with
@ -216,7 +216,7 @@ if the area cannot be populated.
.BR sbrk (2),
.BR malloc (3),
.BR realloc (3)
.PP
Your favorite text book on operating systems
for more information on paged memory
(e.g., \fIModern Operating Systems\fP by Andrew S. Tanenbaum,

@ -247,7 +247,7 @@ A successful
.B MSG_STAT
operation returns the identifier of the queue whose index was given in
.IR msqid .
.PP
On error, \-1 is returned with
.I errno
indicating the error.
@ -338,7 +338,7 @@ Applications intended to be portable to such old systems may need
to include these header files.
.\" Like Linux, the FreeBSD man pages still document
.\" the inclusion of these header files.
.PP
The
.BR IPC_INFO ,
.B MSG_STAT
@ -350,7 +350,7 @@ program to provide information on allocated resources.
In the future these may be modified or moved to a
.I /proc
filesystem interface.
.PP
Various fields in the \fIstruct msqid_ds\fP were
typed as
.I short

@ -194,7 +194,7 @@ Applications intended to be portable to such old systems may need
to include these header files.
.\" Like Linux, the FreeBSD man pages still document
.\" the inclusion of these header files.
.PP
.B IPC_PRIVATE
isn't a flag field but a
.I key_t

@ -138,7 +138,7 @@ is specified in
.IR msgflg ,
then the call instead fails with the error
.BR EAGAIN .
.PP
A blocked
.BR msgsnd ()
call may also fail if:
@ -262,7 +262,7 @@ Nondestructively fetch a copy of the message at the ordinal position
in the queue specified by
.I msgtyp
(messages are considered to be numbered starting at 0).
.IP
This flag must be specified in conjunction with
.BR IPC_NOWAIT ,
with the result that, if there is no message available at the given position,
@ -276,7 +276,7 @@ and
.BR MSG_EXCEPT
may not both be specified in
.IR msgflg .
.IP
The
.BR MSG_COPY
flag was added for the implementation of
@ -473,7 +473,7 @@ and this kernel was configured without
.BR CONFIG_CHECKPOINT_RESTORE .
.SH CONFORMING TO
POSIX.1-2001, POSIX.1-2008, SVr4.
.PP
The
.B MSG_EXCEPT
and
@ -496,14 +496,14 @@ Applications intended to be portable to such old systems may need
to include these header files.
.\" Like Linux, the FreeBSD man pages still document
.\" the inclusion of these header files.
.PP
The
.I msgp
argument is declared as \fIstruct msgbuf\ *\fP in
glibc 2.0 and 2.1.
It is declared as \fIvoid\ *\fP
in glibc 2.2 and later, as required by SUSv2 and SUSv3.
.PP
The following limits on message queue resources affect the
.BR msgsnd ()
call:
@ -554,7 +554,7 @@ of whether that message was at the ordinal position
This bug is fixed
.\" commit 4f87dac386cc43d5525da7a939d4b4e7edbea22c
in Linux 3.14.
.PP
Specifying both
.B MSG_COPY
and
@ -575,11 +575,11 @@ The program below demonstrates the use of
.BR msgsnd ()
and
.BR msgrcv ().
.PP
The example program is first run with the \fB\-s\fP option to send a
message and then run again with the \fB\-r\fP option to receive a
message.
.PP
The following shell session shows a sample run of the program:
.in +4n
.nf

@ -45,7 +45,7 @@ corresponds to the memory area starting at
and having length
.I length
is updated.
.PP
The
.I flags
argument should specify exactly one of
@ -98,7 +98,7 @@ are set in
The indicated memory (or part of it) was not mapped.
.SH CONFORMING TO
POSIX.1-2001, POSIX.1-2008.
.PP
This call was introduced in Linux 1.3.21, and then used
.B EFAULT
instead of
@ -149,5 +149,5 @@ in
.IR flags .
.SH SEE ALSO
.BR mmap (2)
.PP
B.O. Gallmeister, POSIX.4, O'Reilly, pp. 128-129 and 389-391.
