This sentence is left over from an earlier rewrite of the text,
and the relevant details are covered a few paragraphs later.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
The glibc wrapper for renameat2() was added in glibc 2.28 and
requires _GNU_SOURCE.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
The newly added IOCB_FLAG_IOPRIO aio_flag introduces new behaviors
and return values.
The details of this new feature are posted here:
https://lkml.org/lkml/2018/5/22/809
Signed-off-by: Adam Manzanares <adam.manzanares@wdc.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
The PERF_EVENT_IOC_QUERY_BPF ioctl was introduced in Linux 4.16.
Signed-off-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
The PERF_EVENT_IOC_MODIFY_ATTRIBUTES ioctl was introduced in
Linux 4.17. It currently only works on breakpoint events.
Signed-off-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
The PERF_EVENT_IOC_PAUSE_OUTPUT ioctl was introduced in Linux 4.7.
I've have this patch for a long time, I apologize for the delay
in getting it submitted. I've made some minor changes to the
original patch proposed by Wang Nan.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Reviewed-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
It turns out no one is really sure what the perf_event_open.2
exclude_idle field is supposed to do, and a recent thread on the
linux-kernel list:
[RFC] perf/core: what is exclude_idle supposed to do
did not really clarify things.
I think the following adjustment to the page clarifies things
at least a little.
Signed-off-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Some discussion on the linux-perf-users list has turned up that
the perf_event_open.2 description of how
PR_TASK_PERF_EVENTS_ENABLE / PR_TASK_PERF_EVENTS_DISABLE prctl()
works is misleading.
The descriptions were based on the tools/perf/design.txt document
which describes behavior that was removed in 082ff5a2767a06 (prior
to 2.6.31, the first release with perf_event_open support).
I have written some tests in my perf_event_tests testsuite that
verifies the behavior of prctl() in this case.
Signed-off-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
JIT support for x86-32 was during the Linux 4.18 release cycle.
Also correct the entry for MIPS (only MIPS64 is supported).
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
The man page notes that vmsplice() can splice pages from memory
to a pipe, but it can work in the other direction as well.
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
sys/memfd.h doesn't exist. memfd_create() is declared in
sys/mman.h and some flags are available only in linux/memfd.h.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
The EINVAL errors that can occur for clone(2) when it is called
with various CLONE_NEW* flags and the kernel was not configured
with support for the corresponding namespace can also occur for
unshare(2). (As far as I can see, these errors don't occur for
either clone(2) or unshare(2) when it comes to CLONE_NEWNS and
CLONE_NEWCGROUP.)
Reported-by: Shawn Landden <shawn@git.icu>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Note that EINVAL can occur with CLONE_NEWUSER if the kernel was
not configured with CONFIG_USER_NS.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
When it fails, inet_aton() does not set errno, so using
perror() is not appropriate.
Reported-by: Antonio Chirizzi <antonio.chirizzi@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
In the kernel, the type of the arguments to pkey_alloc() is
"unsigned long" and that's what the page documented until now.
Now that glibc support is added for pkey_alloc(), switch to the
glibc prototype, which uses "unsigned int".
Reported-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Direct writes can perform partial writes because large writes
can be broken into smaller chunks by the block layer. Part of
the I/O submitted can fail and the failure is returned to write
as an error in the return value. However, part of the write can
be successful which means that data at the offset is inconsistent.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
The wording is a little confusing, suggesting that this is
the primary use of O_NONBLOCK. Fix that.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
A process doesn't have a capability in a mount namespace, but
rather in the user namespace that owns the mount namespace.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Attempts (settimeofday(), clcok_settime(CLOCK_REALTIME)) to set
the real time clock to a value less than the current value of the
CLOCK_MONOTONIC clock result in EINVAL.
In the kernel source file kernel/time/timekeeping.c::do_settimeofday64(),
there is this check:
if (timespec64_compare(&tk->wall_to_monotonic, &ts_delta) > 0) {
ret = -EINVAL;
goto out;
}
It appears that the check was added in Linux 4.3:
commit e1d7ba8735551ed79c7a0463a042353574b96da3
Author: Wang YanQing <udknight@gmail.com>
Date: Tue Jun 23 18:38:54 2015 +0800
time: Always make sure wall_to_monotonic isn't positive
Reported-by: Jens Thoms Toerring <jt@toerring.de>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
I noticed that it was undocumented how inotify_add_watch(2)
behaves if IN_ONLYDIR is specified and the target is not a
directory.
I've included a patch that adds ENOTDIR as an additional error in
the inotify_add_watch(2) man page.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Remove a section that adds no benefit to the discussion of O_DIRECT.
Signed-off-by: Andrew Price <andy@andrewprice.me.uk>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
* man2/s390_sthyi.2
(.SH DESCRIPTION): Document the size of the resp_buffer when
function_code is 0.
(.SH NOTES): Document various aspects of the current
implementation (the lifted requirement for the response buffer
alignment, the presence of in-kernel cache), add description
for the documentation URL.
Coauthored-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Eugene Syromyatnikov <evgsyr@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
* man2/s390_runtime_instr.2 (.SH NOTES): Note the version of
the Linux kernel since which asm/runtime_inster.h header
is available.
Signed-off-by: Eugene Syromyatnikov <evgsyr@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
There are system calls of the same name present on the m86k and
MIPS architectures, but they simply allow setting some arbitrary
value which can be interpreted as a thread pointer by a threading
library.
* man2/set_thread_area.2 (.SH NAME): Rephrase in order to not
mention GDT.
(.SH SYNOPSIS): Add declarations for MIPS and m68k.
(.SH DESCRIPTION, .SH RETURN VALUE): Add description for MIPS
and m68k.
(.SH NOTES): Mention a way to get thread pointer on MIPS.
Signed-off-by: Eugene Syromyatnikov <evgsyr@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
This text has become rather long, making it it somewhat
unwieldy in the discussion of the mmap() flags. Therefore,
move it to NOTES, with a pointer in DESCRIPTION referring
the reader to NOTES.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Change "extremely hazardous" to "hazardous". The former phrasing
is a little overwrought; on its own "hazardous" is enough to
convey the sense of danger.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Clarify that MAP_FIXED is appropriate if the specified address
range has been reserved using an existing mapping, but shouldn't
be used otherwise.
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Document the following membarrier commands introduced in
Linux 4.16:
MEMBARRIER_CMD_GLOBAL_EXPEDITED
(the old enum label MEMBARRIER_CMD_SHARED is now an
alias to preserve header backward compatibility)
MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED
MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE
MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Michael Kerrisk <mtk.manpages@gmail.com>
CC: Ingo Molnar <mingo@redhat.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
The sentence is out of place, and probably doesn't really add to
the understanding already provided by the rest of the text
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Two new types kprobe and uprobe are being added to
perf_event_open(), which allow creating kprobe or
uprobe with perf_event_open. This patch adds
information about these types.
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
In Linux 2..0, do_mmap() had the following check:
if (flags & MAP_DENYWRITE) {
if (file->f_inode->i_writecount > 0)
return -ETXTBSY;
}
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
4.17+ kernels offer a new MAP_FIXED_NOREPLACE flag which allows
the caller to atomically probe for a given address range.
[wording heavily updated by John Hubbard]
Cowritten-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
These functions are nonstandard, but there is no replacement.
See https://bugzilla.kernel.org/show_bug.cgi?id=199215
Reported-by: Martin Mares <mj@ucw.cz>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
As far as I can tell, these EBUSY errors disappeared
with the addition of stackable mounts in Linux 2.4.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
On a multiarch/multi-ABI platform such as modern x86, each
architecture/ABI (x86-64, x32, i386)has its own syscall numbers,
which means a seccomp() filter may see different syscall numbers
over the life of the process if that process uses execve() to
execute programs that has a different architectures/ABIs.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
It is not possible to consecutively stack mounts of the
same source+target inside the same mount namespace.
For example, if procfs was already mounted against /proc in
this mount namespace:
$ sudo mount -t proc none /proc
mount: /proc: none already mounted or mount point busy.
See the following code in fs/namespace.c:
/* Refuse the same filesystem on the same mount point */
err = -EBUSY;
if (path->mnt->mnt_sb == newmnt->mnt.mnt_sb &&
path->mnt->mnt_root == path->dentry)
goto unlock;
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
The "RETURN VALUE" section made a claim that was incorrect for
PTRACE_SECCOMP_GET_FILTER. Explicitly describe the behavior of
PTRACE_SECCOMP_GET_FILTER in the "RETURN VALUE" section (as
usual), but leave the now duplicate description in the section
describing PTRACE_SECCOMP_GET_FILTER, since the
PTRACE_SECCOMP_GET_FILTER section would otherwise probably become
harder to understand.
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>