18786 Commits

Author SHA1 Message Date
Michael Kerrisk 55d59b9b1d io_submit.2: Add cross-reference to io_getevents(2)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-20 13:55:34 +01:00
Michael Kerrisk e9b96f1319 io_submit.2: Cross reference pwritev(2) in discussion of RWF_SYNC and RWF_DSYNC
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-20 13:55:34 +01:00
Michael Kerrisk 2be12b9eaa io_submit.2: Minor fixes to Goldwyn Rodrigues's patch
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-20 13:55:34 +01:00
Michael Kerrisk bfddbad031 io_submit.2: Rewrap source lines
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-20 13:55:34 +01:00
Goldwyn Rodrigues 7a62a0551b io_submit.2: Add iocb details to io_submit
Add more information about the iocb structure. It explains the
fields of the I/O control block structure which is passed to the
io_submit call.

The work also covers the nowait feature flags, which are currently
posted at http://marc.info/?l=linux-fsdevel&m=149664103900715&w=2

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-20 13:55:34 +01:00
Michael Kerrisk 4cee582147 socket.7: Correct the description of SO_RXQ_OVFL
Two reports that the description of SO_RXQ_OVFL was wrong.

======

Commentary from Tobias:

This bug pertains to the manpage as visible on man7.org right
now.

The socket(7) man page has this paragraph:

       SO_RXQ_OVFL (since Linux 2.6.33)
              Indicates that an unsigned 32-bit value ancillary
              message (cmsg) should be attached to received skbs
              indicating the number of packets dropped by the
              socket between the last received packet and this
              received packet.

The second half is wrong: the counter (internally,
SOCK_SKB_CB(skb)->dropcount) is *not* reset after every packet.
That is, it is a proper counter, not a gauge, in monitoring
parlance.

A better version of that paragraph:

       SO_RXQ_OVFL (since Linux 2.6.33)
              Indicates that an unsigned 32-bit value ancillary
              message (cmsg) should be attached to received skbs
              indicating the number of packets dropped by the
              socket since its creation.
======
Commentary from Petr

Generic SO_RXQ_OVFL helpers sock_skb_set_dropcount() and
sock_recv_drops() implement returning of sk->sk_drops (the total
number of dropped packets), although the documentation says the
number of dropped packets since the last received one should be
returned (quoting the current socket.7):

  SO_RXQ_OVFL (since Linux 2.6.33)
  Indicates that an unsigned 32-bit value ancillary message (cmsg)
  should be attached to received skbs indicating the number of packets
  dropped by the socket between the last received packet and this
  received packet.

I assume the documentation needs to be updated, as fixing this in
the code could break programs depending on the current behavior,
although the formerly planned functionality seems to be more
useful.

The problem can be revealed with the following program:

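#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int extract_drop(struct msghdr *msg)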
{
        struct cmsghdr *cmsg;
        int rtn;

        for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg,cmsg)) {
                if (cmsg->cmsg_level == SOL_SOCKET &&
                    cmsg->cmsg_type == SO_RXQ_OVFL) {
                        memcpy(&rtn, CMSG_DATA(cmsg), sizeof rtn);
                        return rtn;
                }
        }
        return -1;
}

int main(int argc, char *argv[])
{
        struct sockaddr_in addr = { .sin_family = AF_INET };
        char msg[48*1024], cmsgbuf[256];
        struct iovec iov = { .iov_base = msg, .iov_len = sizeof msg };
        int sk1, sk2, i, one = 1;

        sk1 = socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP);
        sk2 = socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP);

        inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);
        addr.sin_port = htons(53333);

        bind(sk1, (struct sockaddr*)&addr, sizeof addr);
        connect(sk2, (struct sockaddr*)&addr, sizeof addr);

        // The kernel doubles this limit, but it also accounts for the SKB
        // overhead; it keeps receiving as long as at least 1 byte is free.
        i = sizeof msg;
        setsockopt(sk1, SOL_SOCKET, SO_RCVBUF, &i, sizeof i);
        setsockopt(sk1, SOL_SOCKET, SO_RXQ_OVFL, &one, sizeof one);

        for (i = 0; i < 4; i++) {
                int rtn;

                send(sk2, msg, sizeof msg, 0);
                send(sk2, msg, sizeof msg, 0);
                send(sk2, msg, sizeof msg, 0);

                do {
                        struct msghdr msghdr = {
                                        .msg_iov = &iov, .msg_iovlen = 1,
                                        .msg_control = &cmsgbuf,
                                        .msg_controllen = sizeof cmsgbuf };
                        rtn = recvmsg(sk1, &msghdr, MSG_DONTWAIT);
                        if (rtn > 0) {
                                printf("rtn: %d drop %d\n", rtn,
                                                extract_drop(&msghdr));
                        } else {
                                printf("rtn: %d\n", rtn);
                        }
                } while (rtn > 0);
        }

        return 0;
}

which prints
  rtn: 49152 drop -1
  rtn: 49152 drop -1
  rtn: -1
  rtn: 49152 drop 1
  rtn: 49152 drop 1
  rtn: -1
  rtn: 49152 drop 2
  rtn: 49152 drop 2
  rtn: -1
  rtn: 49152 drop 3
  rtn: 49152 drop 3
  rtn: -1
although it should print (according to the documentation):
  rtn: 49152 drop 0
  rtn: 49152 drop 0
  rtn: -1
  rtn: 49152 drop 1
  rtn: 49152 drop 0
  rtn: -1
  rtn: 49152 drop 1
  rtn: 49152 drop 0
  rtn: -1
  rtn: 49152 drop 1
  rtn: 49152 drop 0
  rtn: -1

Reported-by: Petr Malat <oss@malat.biz>
Reported-by: Tobias Klausmann <klausman@schwarzvogel.de>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-20 13:54:28 +01:00
Breno Leitao 2f694223f0 pkey_alloc.2: Fix argument order
Currently, the pkey_alloc() syscall has two arguments, and the very
first argument is still not supported as of kernel 4.14-rc8 and
should be set to zero, as shown in the following syscall
implementation:

	SYSCALL_DEFINE2(pkey_alloc, unsigned long, flags, ...)
	{
		int pkey;
		int ret;

		/* No flags supported yet. */
		if (flags)
			return -EINVAL;

This behaviour is also documented correctly in the kernel
documentation, in Documentation/x86/protection-keys.txt.

The second argument is the one that should specify the page
access rights.

This patch fixes the manpage to describe how the code behaves.

Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-20 11:08:20 +01:00
Michael Kerrisk 73be834acb posixoptions.7: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-20 10:38:10 +01:00
Michael Kerrisk de2ea7d63d keyctl.2: ffix: add some soft hyphenation points to long URL
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-20 10:38:10 +01:00
Michael Kerrisk bcfa608f87 Makefile: Remove a redundant comment
Reported-by: Дилян Палаузов <dilyan.palauzov@aegee.org>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-20 10:38:10 +01:00
Michael Kerrisk 2927055043 proc.5: Rework the description of /proc/PID/mountinfo parent-ID field
After comments from Miklos, and further digging in the kernel
source that showed that chroot() can also result in "hidden"
parent-IDs in mountinfo, I've revised the description of
mountinfo.

In fs/proc_namespace.c::show_mountinfo() there is:

        /* mountpoints outside of chroot jail will give SEQ_SKIP on this */
        err = seq_path_root(m, &mnt_path, &p->root, " \t\n\\");
        if (err)
                goto out;

I instrumented the 'if (err)' code path with printk()
to show that there is indeed a record corresponding to the
parent-ID for the process root that is being skipped.

Reported-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-20 10:37:33 +01:00
Michael Kerrisk 35cf1b9397 proc.5: Correct the description of the parent mount ID for /proc/PID/mountinfo
I do not have an exact handle on the details, but I can see
roughly what is going on.  Internally, there seems to be one
("hidden") mount ID reserved to each mount namespace, and that ID
is the parent of the root mount point.

Looking through the (4.14) kernel source, mount IDs are allocated
by a kernel function called mnt_alloc_id() (in fs/namespace.c),
which is in turn called by alloc_vfsmnt() which is in turn called
by clone_mnt().

A new mount namespace is created by the kernel function
copy_mnt_ns() (in fs/namespace.c, called by
create_new_namespaces() in kernel/nsproxy.c). The copy_mnt_ns()
function calls copy_tree() (in fs/namespace.c), and copy_tree()
calls clone_mnt() in *two* places.  The first of these is the call
that creates the "hidden" mount ID that becomes the parent of the
root mount point. (I verified this by instrumenting the kernel
with a few printk() calls to display the IDs.)  The second place
where copy_tree() calls clone_mnt() is in a loop that replicates
each of the mount points (including the root mount point) in the
source mount namespace.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-20 10:13:00 +01:00
Michael Kerrisk faec2136ca seccomp.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-20 00:37:40 +01:00
Michael Kerrisk 9b0e3937a9 proc.5: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-19 20:27:37 +01:00
Marcus Folkesson 5753354a3a proc.5: Update description of /proc/<pid>/oom_score
After Linux 2.6.36, the heuristic calculation of oom_score
has changed to only consider used memory and CAP_SYS_ADMIN.

See kernel commit a63d83f427fbce97a6cea0db2e64b0eb8435cd10.

Signed-off-by: Marcus Folkesson <marcus.folkesson@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-19 19:52:55 +01:00
Michael Kerrisk 5115e06c26 ioctl_tty.2: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-19 13:21:53 +01:00
Michael Kerrisk 0823652975 membarrier.2: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-19 11:35:29 +01:00
Michael Kerrisk d1555345ef membarrier.2: Minor fixes to Mathieu's patch
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-19 11:26:01 +01:00
Michael Kerrisk 20fe250908 membarrier.2: srcfix: rewrap source lines
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-19 11:06:31 +01:00
Michael Kerrisk ee595da39c membarrier.2: srcfix FIXME
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-19 11:05:04 +01:00
Mathieu Desnoyers c50f154e6b membarrier.2: Update membarrier manpage for 4.14
Add documentation for these new membarrier() commands:
        MEMBARRIER_CMD_PRIVATE_EXPEDITED
        MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED

Adapt the MEMBARRIER_CMD_SHARED return value documentation to reflect
that it now returns -EINVAL when issued on a system configured for
nohz_full.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Paul Turner <pjt@google.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Andrew Hunter <ahh@google.com>
CC: Andy Lutomirski <luto@amacapital.net>
CC: Andi Kleen <andi@firstfloor.org>
CC: Dave Watson <davejwatson@fb.com>
CC: Chris Lameter <cl@linux.com>
CC: Ingo Molnar <mingo@redhat.com>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Ben Maurer <bmaurer@fb.com>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Josh Triplett <josh@joshtriplett.org>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Russell King <linux@arm.linux.org.uk>
CC: Catalin Marinas <catalin.marinas@arm.com>
CC: Will Deacon <will.deacon@arm.com>
CC: Michael Kerrisk <mtk.manpages@gmail.com>
CC: Boqun Feng <boqun.feng@gmail.com>
CC: linux-api@vger.kernel.org
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-19 11:03:59 +01:00
Michael Kerrisk 0771269c60 seccomp.2: Document the "default" filter return action
The kernel defaults to either SECCOMP_RET_KILL_PROCESS
or SECCOMP_RET_KILL_THREAD for unrecognized filter
return action values.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-19 10:36:59 +01:00
Grégory Vander Schueren b61f53a44e send.2: Add EALREADY to ERRORS
From linux/v4.14-rc6/source/net/ipv4/tcp.c:

    if (tp->fastopen_req)
        return -EALREADY; /* Another Fast Open is in progress */

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-19 10:05:35 +01:00
Michael Kerrisk f2c2c3083f user_namespaces.7: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-19 09:56:40 +01:00
Michael Kerrisk 2660d01041 user_namespaces.7: Restore historical details about UID maps
Christian Brauner's patch added the Linux 4.15 details,
but we need to retain the historical details.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-19 09:56:35 +01:00
Christian Brauner dc04b65274 user_namespaces.7: Document new 340 line idmap limit
This patch documents the following kernel commit:

    commit 6397fac4915ab3002dc15aae751455da1a852f25
    Author: Christian Brauner <christian.brauner@ubuntu.com>
    Date:   Wed Oct 25 00:04:41 2017 +0200

        userns: bump idmap limits to 340

Since Linux 4.15 the number of idmap lines has been bumped to 340.
The patch also removes the "(arbitrary)" in "There is an
(arbitrary) limit on the number of lines in the file." since the
340 line limit is well-explained by the current implementation.
The struct recording the idmaps is 12 bytes and quite a few proc
files only allow writes up to the size of a single page, which is
4096 bytes. This leaves room for 340 idmappings (340 * 12 = 4080
bytes).  The struct layout itself has been chosen very carefully
to allow for an implementation that limits the time-complexity for
the idmap codepaths to O(log n). However, I think it's unnecessary
to expose this much implementation detail to users in the man
page. So only mention this in the commit message.  Furthermore,
the comment about the page size restriction is misleading. The
kernel sources show that >= page size is considered an error.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-19 09:46:07 +01:00
Michael Kerrisk df5b5f9aa8 seccomp.2: Document the seccomp audit logging feature added in Linux 4.14
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-19 09:13:45 +01:00
Michael Kerrisk 0c43878057 seccomp.2: Change SECCOMP_RET_ACTION to SECCOMP_RET_ACTION_FULL
In Linux 4.14, the action component of the return value
switched from being 15 bits to being 16 bits. A new macro,
SECCOMP_RET_ACTION_FULL, that masks the 16 bits was added,
to replace the older SECCOMP_RET_ACTION.

Reported-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-18 23:17:56 +01:00
Michael Kerrisk 1d530819c5 seccomp.2: Minor wording change
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-18 23:17:56 +01:00
Michael Kerrisk b9c6742b0b seccomp.2: Consolidate some common text
Consolidate some common text for SECCOMP_RET_KILL_PROCESS
and SECCOMP_RET_KILL_THREAD.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-18 23:17:56 +01:00
Michael Kerrisk 51c58a6c11 seccomp.2: Add description of SECCOMP_RET_KILL_PROCESS
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-18 23:17:45 +01:00
Michael Kerrisk 5cfa062716 seccomp.2: Explicitly note that other threads survive SECCOMP_RET_KILL_THREAD
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-18 20:04:32 +01:00
Michael Kerrisk 6aa0baa439 seccomp.2: Add SECCOMP_RET_KILL_THREAD description and rework SECCOMP_RET_KILL text
Linux 4.14 added SECCOMP_RET_KILL_THREAD as a synonym for
SECCOMP_RET_KILL. Also remove the discussion of multithreaded
processes, since that will be addressed in the documentation
of SECCOMP_RET_KILL_PROCESS.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-18 20:04:32 +01:00
Michael Kerrisk 1f5ad3c846 seccomp.2: Minor consolidation/reworking of EINVAL descriptions
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-18 00:29:55 +01:00
Michael Kerrisk 865c9c8130 seccomp.2: Minor wording fix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-18 00:23:41 +01:00
Michael Kerrisk d8b6e735ee smartpqi.4: Add some details on how to find controller User Guide
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-17 20:25:44 +01:00
Michael Kerrisk a39a3f8d9b smartpqi.4: Add explanation of ioaccel
Based on text sent by Don Brace.

Reported-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-17 20:25:44 +01:00
Michael Kerrisk 8c5ea8ce8b smartpqi.4: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-17 20:25:44 +01:00
Michael Kerrisk b7097761de smartpqi.4: Add VERSIONS section
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-17 20:25:44 +01:00
Michael Kerrisk 25ee990c44 smartpqi.4: Reorder various pieces of text to follow usual conventions
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-17 20:25:44 +01:00
Michael Kerrisk 912f1e1ef2 smartpqi.4: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-17 20:25:44 +01:00
Michael Kerrisk 1408b135d9 smartpqi.4: Minor wording fixes
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-17 20:25:44 +01:00
Michael Kerrisk 813d39bce0 smartpqi.4: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-17 20:25:43 +01:00
Michael Kerrisk 702560ae27 smartpqi.4: srcfix: add some FIXME markers
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-17 20:25:43 +01:00
G. Branden Robinson b08671565d smartpqi.4: Various fixes, mostly formatting related
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-17 20:25:43 +01:00
Don Brace 484cb54f15 smartpqi: initial submit of smartpqi man page
This patch contains the initial submission of the
smartpqi man page.

Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-17 20:25:43 +01:00
Michael Kerrisk 1ec37705a4 chown.2: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-17 19:18:28 +01:00
Michael Kerrisk 1445a0ff3d seccomp.2: srcfix: Update copyright notice
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-17 18:36:55 +01:00
Michael Kerrisk 96a35a8352 connect.2: Clarify that ECONNREFUSED is for stream sockets
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-16 18:43:19 +01:00
Michael Kerrisk b5fff4eaee futex.2: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2017-11-16 18:27:50 +01:00