Commit Graph

3682 Commits

Author SHA1 Message Date
Michael Kerrisk 44449eb99f locale.7, user_namespaces.7: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-02-27 10:25:56 +01:00
Michael Kerrisk f711139679 epoll_ctl.2, ioctl_userfaultfd.2, keyctl.2, ptrace.2, socket.7: ffix
Reported-by: Bjarni Ingi Gislason <bjarniig@rhi.hi.is>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-02-27 10:06:04 +01:00
nixiaoming ebfb6feee6 fanotify_init.2, fanotify.7: Document FAN_REPORT_TID
fanotify_init.2: add new flag FAN_REPORT_TID
fanotify.7: update description of member pid in
    struct fanotify_event_metadata

Signed-off-by: nixiaoming <nixiaoming@huawei.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-02-26 17:39:00 +01:00
Michael Kerrisk 953d1e0792 fanotify_mark.2, fanotify.7: Minor tweaks to Amir Goldstein's patch
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-02-26 17:39:00 +01:00
Amir Goldstein b2f8214d47 fanotify_mark.2, fanotify.7: Document FAN_MARK_FILESYSTEM
Monitor fanotify events on the entire filesystem.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-02-26 17:39:00 +01:00
Michael Kerrisk fd1eb8a782 fanotify_mark.2, fanotify.7: Minor tweaks to Matthew Bobrowski's patch
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-02-26 17:39:00 +01:00
Matthew Bobrowski fc37d2f1c8 fanotify_mark.2, fanotify.7: Document FAN_OPEN_EXEC and FAN_OPEN_EXEC_PERM
New event masks have been added to the fanotify API. Documentation to
support the use and behaviour of these new masks has been added
accordingly.

Signed-off-by: Matthew Bobrowski <mbobrowski@mbobrowski.org>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-02-26 17:39:00 +01:00
Michael Kerrisk 5ef4a59dbf inotify.7: Minor tweaks
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-02-26 17:39:00 +01:00
Michael Kerrisk 859758b692 inotify.7: Minor fixes to Henry Wilson's patch
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-02-26 17:38:56 +01:00
Henry Wilson 381b7a9111 inotify.7: Document IN_MASK_CREATE
Add documentation for new flag IN_MASK_CREATE for inotify_add_watch()
which is used to only allow new watches to be created.

Information obtained from a patch I submitted to the linux kernel
https://marc.info/?l=linux-fsdevel&m=152775980422847&w=2

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-02-26 17:38:25 +01:00
Michael Kerrisk 29fa4cbc2e cgroups.7: Document the use of 'cgroup_no_v1=named' to disable v1 named hierarchies
This feature was added in Linux 5.0.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-02-26 17:38:19 +01:00
Eugene Syromyatnikov 5dfd2983f7 address_families.7: tfix
Signed-off-by: Eugene Syromyatnikov <evgsyr@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-02-25 11:13:37 +01:00
Eugene Syromyatnikov 22570de1e1 socket.2, address_families.7: Mention that address family names are Linux-specific
* man2/socket.2 (.SH DESCRIPTION): Mention that the list of
  address families is Linux-specific.
* man7/address_families.7 (.SH DESCRIPTION): Likewise.

Signed-off-by: Eugene Syromyatnikov <evgsyr@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-02-25 11:07:52 +01:00
bert hubert 2ca483cd4b ip.7: IP_RECVTTL error fixed
I need to get the TTL of UDP datagrams from userspace, so I set
the IP_RECVTTL socket option.  And as promised by ip.7, I then get
IP_TTL messages from recvfrom.  However, unlike what the manpage
promises, the TTL field gets passed as a 32 bit integer.

The following userspace code works:

  uint32_t ttl32;
  for (cmsg = CMSG_FIRSTHDR(msgh); cmsg != NULL; cmsg = CMSG_NXTHDR(msgh,cmsg)) {
    if ((cmsg->cmsg_level == IPPROTO_IP) && (cmsg->cmsg_type == IP_TTL) &&
        CMSG_LEN(sizeof(ttl32)) == cmsg->cmsg_len) {

      memcpy(&ttl32, CMSG_DATA(cmsg), sizeof(ttl32));
      *ttl=ttl32;
      return true;
    }
    else
      cerr<<"Saw something else "<<(cmsg->cmsg_type == IP_TTL) <<
		", "<<(int)cmsg->cmsg_level<<", "<<cmsg->cmsg_len<<", "<<
		CMSG_LEN(1)<<endl;
  }

The 'else' field was used to figure out I go the length wrong.

Note from mtk:

Reading the source code also seems to confirm this, from
net/ipv4/ip_sockglue.c:

[[
static void ip_cmsg_recv_ttl(struct msghdr *msg, struct sk_buff *skb)
{
        int ttl = ip_hdr(skb)->ttl;
        put_cmsg(msg, SOL_IP, IP_TTL, sizeof(int), &ttl);
}
]]

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-02-25 10:49:09 +01:00
Michael Kerrisk 9f92e4e1cb capabilities.7: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-02-23 22:03:20 +01:00
Michael Kerrisk 4312e0cb67 capabilities.7: CAP_SYS_CHROOT allows use of setns() to change the mount namespace
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-02-23 22:03:20 +01:00
Michael Kerrisk dd61e8a8f4 capabilities.7: srcfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-02-23 22:03:20 +01:00
Michael Kerrisk 9c5b11bf42 capabilities.7: Add a subsection on per-user-namespace "set-user-ID-root" programs
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-02-23 22:03:20 +01:00
Michael Kerrisk bcf7072dbd capabilities.7: Relocate the subsection "Interaction with user namespaces"
This best belongs at the end of the page, after the subsections
that already make some mention of user namespaces.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-02-23 22:03:20 +01:00
Michael Kerrisk 049d1a1534 capabilities.7: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-02-23 22:03:20 +01:00
Michael Kerrisk 33d0916f81 capabilities.7: Substantially rework "Capabilities and execution of programs by root"
Rework for improved clarity, and also to include missing details
on the case where (1) the binary that is being executed has
capabilities attached and (2) the real user ID of the process is
not 0 (root) and (3) the effective user ID of the process is 0
(root).

Kernel code analysis and some test code (GPLv3 licensed) below.

======

My analysis of security/commoncaps.c capabilities handling
(from Linux 4.20 source):

execve() eventually calls __do_execve_file():

__do_execve_file()
  |
  +-prepare_bprm_creds(&bprm)
  |  |
  |  +-prepare_exec_creds()
  |  |  |
  |  |  +-prepare_creds()
  |  |     |
  |  |     | // Returns copy of existing creds
  |  |     |
  |  |     +-security_prepare_creds()
  |  |        |
  |  |        +-cred_prepare() [via hook]
  |  |           // Seems to do nothing for commoncaps
  |  |
  |  // Returns creds provided by prepare_creds()
  |
  // Places creds returned by prepare_exec_creds() in bprm->creds
  |
  |
  +-prepare_binprm(&bprm) // bprm from prepare_bprm_creds()
     |
     +-bprm_fill_uid(&bprm)
     |
     |  // Places current credentials into bprm
     |
     |  // Performs set-UID & set-GID transitions if those file bits are set
     |
     +-security_bprm_set_creds(&bprm)
        |
        +-bprm_set_creds(&bprm) [via hook]
           |
           +-cap_bprm_set_creds(&bprm)
              |
              // effective = false
              |
              +-get_file_caps(&bprm, &effective, &has_fcap)
              |  |
              |  +-get_vfs_caps_from_disk(..., &vcaps)
              |  |
              |  |  // Fetches file capabilities from disk and places in vcaps
              |  |
              |  +-bprm_caps_from_vfs_caps(&vcaps, &bprm, &effective, &has_fcap)
              |
              |     // If file effective bit is set: effective = true
              |     //
              |     // If file has capabilities: has_fcap |= true
              |     //
              |     // Perform execve transformation:
              |     //     P'(perm) = F(inh) & P(Inh) | F(Perm) & P(bset)
              |
              +-handle_privileged_root(&bprm, has_fcap, &effective, root_uid)
              |
              |  // If has_fcap && (rUID != root && eUID == root) then
              |  //     return without doing anything
              |  //
              |  // If rUID == root || eUID == root then
              |  //    P'(perm) = P(inh) | P(bset)
              |  //
              |  // If eUID == root then
              |  //     effective = true
              |
              // Perform execve() transformation:
              //
              //     P'(Amb) = (privprog) ? 0 : P(Amb)
              //     P'(Perm) |= P'(Amb)
              //     P'(Eff) = effective ? P'(Perm) : P'(Amb)

Summary

1. Perform set-UID/set-GID transformations

2. P'(Amb) = (privprog) ? 0 : P(Amb)

3. If [process has nonzero UIDs] OR
   ([file has caps] && [rUID != root && eUID == root]), then

        P'(perm) = F(inh) & P(Inh) | F(Perm) & P(bset) | P'(Amb)

   else // ~ [process has rUID == root || eUID == root]

        P'(perm) = P(inh) | P(bset) | P'(Amb)

4. P'(Eff) = (F(eff) || eUID == root) ? P'(Perm) : P'(Amb)

======

$ cat show_creds_and_caps_long.c

int
main(int argc, char *argv[])
{
    uid_t ruid, euid, suid;
    gid_t rgid, egid, sgid;
    cap_t caps;
    char *s;

    if (getresuid(&ruid, &euid, &suid) == -1) {
        perror("getresuid");
        exit(EXIT_FAILURE);
    }

    if (getresgid(&rgid, &egid, &sgid) == -1) {
        perror("getresgid");
        exit(EXIT_FAILURE);
    }

    printf("UID: %5ld (real), %5ld (effective), %5ld (saved)\n",
            (long) ruid, (long) euid, (long) suid);
    printf("GID: %5ld (real), %5ld (effective), %5ld (saved)\n",
            (long) rgid, (long) egid, (long) sgid);

    caps = cap_get_proc();
    if (caps == NULL) {
        perror("cap_get_proc");
        exit(EXIT_FAILURE);
    }
    s = cap_to_text(caps, NULL);
    if (s == NULL) {
        perror("cap_to_text");
        exit(EXIT_FAILURE);
    }
    printf("Capabilities: %s\n", s);

    cap_free(caps);
    cap_free(s);

    exit(EXIT_SUCCESS);
}

$ cat cred_launcher.c

                        } while (0)

                        do { fprintf(stderr, "Usage: "); \
                             fprintf(stderr, msg, progName); \
                             exit(EXIT_FAILURE); } while (0)

int
main(int argc, char *argv[])
{
    uid_t r, e, s;

    if (argc != 5 || strcmp(argv[1], "--help") == 0)
        usageErr("%s rUID eUID sUID <prog>\n", argv[0]);

    r = atoi(argv[1]);
    e = atoi(argv[2]);
    s = atoi(argv[3]);

    if (setresuid(r, e, s) == -1)
        errExit("setresuid");

    if (getresuid(&r, &e, &s) == -1)
        errExit("getresuid");

    execv(argv[4], &argv[4]);
    errExit("execve");
}

$ cc -o cred_launcher cred_launcher.c
$ cc -o show_creds_and_caps_long show_creds_and_caps_long.c -lcap

$ sudo ./cred_launcher 1000 0 1000 ./show_creds_and_caps_long
UID:  1000 (real),     0 (effective),     0 (saved)
GID:     0 (real),     0 (effective),     0 (saved)
Capabilities: =ep

$ sudo setcap cap_kill=pe show_creds_and_caps_long
$ sudo ./cred_launcher 1000 0 1000 ./show_creds_and_caps_long
UID:  1000 (real),     0 (effective),     0 (saved)
GID:     0 (real),     0 (effective),     0 (saved)
Capabilities: = cap_kill+ep

The final program execution above shows the special casing
that occurs in handle_privileged_root() for the case where:

    rUID != root && eUID == root && [file has capabilities]

======

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-02-23 22:03:20 +01:00
Michael Kerrisk cc0fb214da capabilities.7: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-02-23 22:03:20 +01:00
Michael Kerrisk 1a9ed17c9e capabilities.7: Improve the discussion of when file capabilities are ignored
The text stated that the execve() capability transitions are not
performed for the same reasons that setuid and setgid mode bits
may be ignored (as described in execve(2)). But, that's not quite
correct: rather, the file capability sets are treated as empty
for the purpose of the capability transition calculations.

Also merge the new 'no_file_caps' kernel option text into the
same paragraph.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-02-23 22:03:20 +01:00
Michael Kerrisk f6acfeb8f8 capabilities.7: Document the 'no_file_caps' kernel command-line option
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-02-23 22:03:20 +01:00
Michael Kerrisk bc1950ac92 capabilities.7: Rework discussion of exec and UID 0, correcting a couple of details
Clarify the "Capabilities and execution of programs by root"
section, and correct a couple of details:

* If a process with rUID == 0 && eUID != 0 does an exec,
  the process will nevertheless gain effective capabilities
  if the file effective bit is set.
* Set-UID-root programs only confer a full set of capabilities
  if the binary does not also have attached capabilities.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-02-10 03:40:15 +01:00
Michael Kerrisk db18d67f21 capabilities.7: srcfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-02-07 11:40:25 +01:00
Michael Kerrisk 1873715c21 namespaces.7: srcfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-02-07 02:07:28 +01:00
Michael Kerrisk 619dbe1c6d cgroups.7: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-02-02 19:48:15 +01:00
Michael Kerrisk 4b1c2041f4 cgroups.7: Reframe the text on delegation to include more details about cgroups v1
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-01-23 22:17:17 +01:00
Michael Kerrisk 2b91ed4e5f cgroups.7: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-01-23 22:17:17 +01:00
Michael Kerrisk 51629a3000 cgroups.7: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-01-23 22:17:17 +01:00
Michael Kerrisk 87b18a8b63 cgroups.7: Soften the discussion about delegation in cgroups v1
Balbir pointed out that v1 delegation was not an accidental
feature.

Reported-by: Balbir Singh <bsingharora@gmail.com>
Reported-by: Marcus Gelderie <redmnic@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-01-23 22:17:17 +01:00
Michael Kerrisk e366c4d48d cgroups.7: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-01-23 22:17:17 +01:00
Jakub Wilk 6f25f547da man.7: tfix
Use \(aq for ASCII apostrophes and \(ga for backtick,
as recommended by groff_man(7).

Signed-off-by: Jakub Wilk <jwilk@jwilk.net>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-01-17 08:33:58 +13:00
Michael Kerrisk 1be4da28c5 feature_test_macros.7: Add more detail on why FTMs must be defined before including any header
Reported-by: Andreas Westfeld <andreas.westfeld@htw-dresden.de>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2019-01-10 13:01:03 +13:00
Michael Kerrisk 928c3e7c95 unix.7: wfix
Reported-by: Felipe Gasper <felipe@felipegasper.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-23 19:09:33 +01:00
Michael Kerrisk 421d34c632 unix.7: Clarify that SO_PASSCRED behavior
Clarify that SO_PASSCRED results in SCM_CREDENTIALS data in each
subsequently received message.

See https://bugzilla.kernel.org/show_bug.cgi?id=201805

Reported-by: Felipe Gasper <felipe@felipegasper.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-23 19:09:33 +01:00
Michael Kerrisk 3d3cddde94 unix.7: Rework SO_PEERCRED text for greater clarity
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-23 19:09:33 +01:00
Michael Kerrisk 88fedfa061 unix.7: Explicitly note that SO_PASSCRED provides SCM_CREDENTIALS messages
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-23 19:09:33 +01:00
Michael Kerrisk c02ed554e9 unix.7: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-23 19:09:33 +01:00
Michael Kerrisk 744c8fa8d2 unix.7: Improve wording describing socket option argument/return values
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-23 19:09:33 +01:00
Radostin Stoyanov 69fc6c6761 namespaces.7: tfix
Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-18 21:09:16 +01:00
Konst Mayer 081ec61f02 tcp.7: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-12-01 09:47:28 +01:00
Anthony Iliopoulos 6684e3e4ff fanotify.7: wfix
Use "FAN_OPEN_PERM" consistently rather than "FAN_PERM_OPEN".

Signed-off-by: Anthony Iliopoulos <ailiopoulos@suse.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-24 08:04:42 +01:00
Michael Kerrisk 4f1a13fe85 pid_namespaces.7: Clarify the semantics for the adoption of orphaned processes
Because of setns() semantics, the parent of a process may reside
in the outer PID namespace. If that parent terminates, then the
child is adopted by the "init" in the outer PID namespace (rather
than the "init" of the PID namespace of the child).

Thus, in a scenario such as the following, if process M
terminates, P is adopted by the init process in the initial
PID namespace, and if P terminates, Q is adopted by the init
process in the inner PID namespace.

    +---------------------------------------------+
    | Initial PID NS                              |
    |                           +---------------+ |
    |  +-+                      | inner PID NS  | |
    |  |1|                      |               | |
    |  +-+                      |    +-+        | |
    |                           |    |1|        | |
    |                           |    +-+        | |
    |                           |               | |
    |  +-+   setns(), fork()    |    +-+        | |
    |  |M|----------------------+--> |P|        | |
    |  +-+                      |    +-+        | |
    |                           |     | fork()  | |
    |                           |     v         | |
    |                           |    +-+        | |
    |                           |    |Q|        | |
    |                           |    +-+        | |
    |                           +---------------+ |
    +---------------------------------------------+

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-19 16:55:50 +01:00
Michael Kerrisk 1fa9fdb1e9 signal.7: Unify signal lists into a signal table that embeds standards info
Having the signals listed in three different tables reduces
readability, and would require more table splits if future
standards specify other signals.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-17 10:17:39 +01:00
Michael Kerrisk 6043ed9d54 signal.7: Insert standards info into tables
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-17 10:17:39 +01:00
Michael Kerrisk 9a10a14487 signal.7: Place signal numbers in a separate table
The current tables of signal information are unwieldy,
as they try to cram in too much information.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-17 10:17:39 +01:00
Michael Kerrisk bdbc9b4475 signal.7: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-17 08:59:02 +01:00
Michael Kerrisk d893df00d9 capabilities.7: Update URL for libcap tarballs
The previous location does not seem to be getting updated.
(For example, at the time of this commit, libcap-2.26
had been out for two months, but was not present at
http://www.kernel.org/pub/linux/libs/security/linux-privs.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-17 07:26:22 +01:00
Jakub Wilk b784b9d50f user_namespaces.7: tfix
Signed-off-by: Jakub Wilk <jwilk@jwilk.net>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-09 16:02:07 +01:00
Michael Kerrisk a13b92e5da signal.7: tfix
Reported-by: Helge Deller <deller@gmx.de>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-09 04:48:59 +01:00
Michael Kerrisk 4a501601a6 signal.7: Reorder the architectures in the signal number lists
x86 and ARM are the most common architectures, but currently
are in the second subfield in the signal number lists.
Instead, swap that info with subfield 1, so the most
common architectures are first in the list.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-07 22:35:50 +01:00
Helge Deller a42f9c51cb signal.7: Add signal numbers for parisc
This patch adds the signal numbers for parisc to the signal(7) man page.

Those parisc-specific values for the various signals are valid since the
Linux kernel upstream commit ("parisc: Reduce SIGRTMIN from 37 to 32 to
behave like other Linux architectures") during development of kernel 3.18:
http://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1f25df2eff5b25f52c139d3ff31bc883eee9a0ab

Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-07 22:35:45 +01:00
Michael Kerrisk aa2c362324 cgroups.7: Minor fix: bump kernel version to 4.19 in a couple of points
The stated points still hold true as at Linux 4.1.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-07 21:30:33 +01:00
Jakub Wilk 587ff4d5af vdso.7: tfix
Escape hyphens; use \(aq for ASCII apostrophes.

Signed-off-by: Jakub Wilk <jwilk@jwilk.net>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-05 17:00:05 +01:00
Michael Kerrisk 77eefc59bd cgroups.7: tfix
Reported-by: Alan Jenkins <alan.christopher.jenkins@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-04 11:29:06 +01:00
Michael Kerrisk c6c28d527d user_namespaces.7: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-02 13:52:24 +01:00
Michael Kerrisk 2c1608c23b namespaces.7: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-02 13:32:25 +01:00
Michael Kerrisk 2eb89baa0e capabilities.7: Minor fixes to Marcus Gelderie's patch
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-01 20:55:13 +01:00
Marcus Gelderie 35ecd12dd9 capabilities.7: Mention header for SECBIT constants
Mention that the named constants (SECBIT_KEEP_CAPS and others)
are available only if the linux/securebits.h user-space header
is included.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-01 20:55:13 +01:00
Michael Kerrisk 53666f6c30 bpf-helpers.7: Add new man page for eBPF helper functions
eBPF sub-system on Linux can use "helper functions", functions
implemented in the kernel that can be called from within a eBPF program
injected by a user on Linux. The kernel already supports a long list of
such helpers (sixty-seven at this time, new ones are under review).
Therefore, it is proposed to create a new manual page, separate from
bpf(2), to document those helpers for people willing to develop new eBPF
programs.

Additionally, in an effort to keep this documentation in synchronisation
with what is implemented in the kernel, it is further proposed to keep
the documentation itself in the kernel sources, as comments in file
"include/uapi/linux/bpf.h", and to generate the man page from there.

This patch adds the new man page, generated from kernel sources, to the
man-pages repository. For each eBPF helper function, a description of
the helper, of its arguments and of the return value is provided. The
idea is that all future changes for this page should be redirected to
the kernel file "include/uapi/linux/bpf.h", and the modified page
generated from there.

Generating the page itself is a two-step process. First, the
documentation is extracted from include/uapi/linux/bpf.h, and converted
to a RST (reStructuredText-formatted) page, with the relevant script
from Linux sources:

      $ ./scripts/bpf_helpers_doc.py > /tmp/bpf-helpers.rst

The second step consists in turning the RST document into the final man
page, with rst2man:

      $ rst2man /tmp/bpf-helpers.rst > bpf-helpers.7

The bpf.h file was taken as at kernel 4.19

Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-01 14:57:49 +01:00
Michael Kerrisk dd63e15948 capabilities.7: Correct the description of SECBIT_KEEP_CAPS
This just adds to the point made by Marcus Gelderie's patch.  Note
also that SECBIT_KEEP_CAPS provides the same functionality as the
prctl() PR_SET_KEEPCAPS flag, and the prctl(2) manual page has the
correct description of the semantics (i.e., that the flag affects
the treatment of onlt the permitted capability set).

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-01 14:40:49 +01:00
Michael Kerrisk ab7ef2a882 capabilities.7: Minor tweaks to the text added by Marcus Gelderie's patch
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-01 14:40:49 +01:00
Marcus Gelderie 7d32b135d6 capabilities.7: Add details about SECBIT_KEEP_CAPS
The description of SECBIT_KEEP_CAPS is misleading about the
effects on the effective capabilities of a process during a
switch to nonzero UIDs.  The effective set is cleared based on
the effective UID switching to a nonzero value, even if
SECBIT_KEEP_CAPS is set. However, with this bit set, the
effective and permitted sets are not cleared if the real and
saved set-user-ID are set to nonzero values.

This was tested using the following C code and reading the kernel
source at security/commoncap.c: cap_emulate_setxuid.

void print_caps(void) {
    cap_t current = cap_get_proc();
    if (!current) {
        perror("Current caps");
        return;
    }
    char *text = cap_to_text(current, NULL);
    if (!text) {
        perror("Converting caps to text");
        goto free_caps;
    }
    printf("Capabilities: %s\n", text);
    cap_free(text);
free_caps:
    cap_free(current);
}

void print_creds(void) {
    uid_t ruid, suid, euid;
    if (getresuid(&ruid, &euid, &suid)) {
        perror("Error getting UIDs");
        return;
    }
    printf("real = %d, effective = %d, saved set-user-ID = %d\n", ruid, euid, suid);
}

void set_caps(int size, const cap_value_t *caps) {
    cap_t current = cap_init();
    if (!current) {
        perror("Error getting current caps");
        return;
    }
    if (cap_clear(current)) {
        perror("Error clearing caps");
    }
    if (cap_set_flag(current, CAP_INHERITABLE, size, caps, CAP_SET)) {
        perror("setting caps");
        goto free_caps;
    }
    if (cap_set_flag(current, CAP_EFFECTIVE, size, caps, CAP_SET)) {
        perror("setting caps");
        goto free_caps;
    }
    if (cap_set_flag(current, CAP_PERMITTED, size, caps, CAP_SET)) {
        perror("setting caps");
        goto free_caps;
    }
    if (cap_set_proc(current)) {
        perror("Comitting caps");
        goto free_caps;
    }
free_caps:
    cap_free(current);
}

const cap_value_t caps[] = {CAP_SETUID, CAP_SETPCAP};
const size_t num_caps = sizeof(caps) / sizeof(cap_value_t);

int main(int argc, char **argv) {
    puts("[+] Dropping most capabilities to reduce amount of console output...");
    set_caps(num_caps, caps);
    puts("[+] Dropped capabilities. Starting with these credentials and capabilities:");

    print_caps();
    print_creds();

    if (argc >= 2 && 0 == strncmp(argv[1], "keep", 4)) {
        puts("[+] Setting SECBIT_KEEP_CAPS bit");
        if (prctl(PR_SET_SECUREBITS, SECBIT_KEEP_CAPS, 0, 0, 0)) {
            perror("Setting secure bits");
            return 1;
        }
    }

    puts("[+] Setting effective UID to 1000");
    if (seteuid(1000)) {
        perror("Error setting effective UID");
        return 2;
    }
    print_caps();
    print_creds();

    puts("[+] Raising caps again");
    set_caps(num_caps, caps);
    print_caps();
    print_creds();

    puts("[+] Setting all remaining UIDs to nonzero values");
    if (setreuid(1000, 1000)) {
        perror("Error setting all UIDs to 1000");
        return 3;
    }
    print_caps();
    print_creds();

    return 0;
}

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-11-01 14:39:25 +01:00
Michael Kerrisk 6e8a3b421b user_namespaces.7: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-10-31 08:47:02 +01:00
Michael Kerrisk 043aaa9427 namespaces.7: f
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-10-31 08:40:21 +01:00
Michael Kerrisk d45e85a94b namespaces.7: Briefly explain why CAP_SYS_ADMIN is needed to create nonuser namespaces
Reported-by: Tycho Kirchner <tychokirchner@mail.de>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-10-31 08:39:02 +01:00
Michael Kerrisk 29af6f1a59 user_namespaces.7: Rework terminology describing ownership of nonuser namespaces
Prefer the word "owns" rather than "associated with" when
describing the relationship between user namespaces and non-user
namespaces. The existing text used a mix of the two terms, with
"associated with" being predominant, but to my ear, describing the
relationship as "ownership" is more comprehensible.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-10-31 08:31:47 +01:00
Josh Triplett d63618d564 precedence.7: Add as a redirect to operator.7
Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-10-28 10:10:20 +01:00
Michael Kerrisk d7d7c8ea04 namespaces.7: SEE ALSO: add pam_namespace(8)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-10-25 10:19:45 +02:00
Jakub Wilk 29c8d172fd address_families.7: tfix
Signed-off-by: Jakub Wilk <jwilk@jwilk.net>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-10-21 19:58:12 +02:00
Michael Kerrisk e1b1b8985c inode.7: tfix
Reported-by: Burkhard Lück <lueck@hube-lueck.de>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-10-17 08:19:39 +02:00
Michael Kerrisk a5409af7ec socket.7: SEE ALSO: add address_families(7)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-10-16 10:46:49 +02:00
Michael Kerrisk a88c75c24b address_families.7: New page that contains details of socket address families
There is too much detail in socket(2). Move most of it into
a new page instead.

Cowritten-by: Eugene Syromyatnikov <evgsyr@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-10-16 10:46:16 +02:00
Michael Kerrisk a970e1f920 sched.7: In the kernel source SCHED_OTHER is actually called SCHED_NORMAL
Reported-by: Eugene Syromyatnikov <evgsyr@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-10-14 16:15:50 +02:00
Michael Kerrisk c9a35b01a1 cgroup_namespaces.7: Clarify
Clarify the example by making an implied detail more explicit.

Quoting the Troy Engel on the problem with the original text:

    The problem is "and a process in a sibling cgroup (sub2)"
    (shown as PID 20124 here) - how did this get here? How do I
    recreate this? Following this example, there's no mention of
    how, it's out of place when following the instructions.
    There is nothing in any of the cgroup files which contain
    this (# grep freezer /proc/*/cgroup) while at this stage.

    The intent is understood, however the man page seems to skip
    a step to create this in the teaching example. We should add
    whatever simple steps are needed to create the "process in a
    sibling cgroup" as outlined so it makes sense - as written,
    I have no clue where "sibling cgroup (sub2)" came from, it
    just appeared out of the blue in that step. Thanks!

See https://bugzilla.kernel.org/show_bug.cgi?id=201047

Reported-by: Troy Engel <troyengel@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-10-14 13:56:27 +02:00
Michael Kerrisk d190902bc2 cgroup_namespaces.7: Move a sentence from DESCRIPTION to NOTES
This sentence fits better in NOTES.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-10-14 13:40:47 +02:00
Michael Kerrisk e39f614f9f cgroup_namespaces.7: Remove redundant use of 'sh -c' in shell session
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-10-14 13:37:02 +02:00
Michael Kerrisk 4d9b3039d6 cgroup_namespaces.7: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-10-14 11:41:57 +02:00
Michael Kerrisk 44084d19bb cgroups.7: Complete partial sentence re kernel boot options and 'nsdelegate'
The intended text was hidden elsewhere in the source of the
page as a comment.

https://bugzilla.kernel.org/show_bug.cgi?id=201029

Reported-by: Mike Weilgart <mike.weilgart@verticalsysadmin.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-10-14 10:10:02 +02:00
Michael Kerrisk 2b3c0042d1 sched.7: SEE ALSO: add ps(1) and top(1)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-10-09 12:53:13 +02:00
Michael Kerrisk 17094a28ff cgroups.7: Minor wording fix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-10-09 11:48:45 +02:00
Michael Kerrisk edc90967b9 cgroups.7: wfix: use "threads" consistently
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-10-09 11:48:03 +02:00
Michael Kerrisk 0bef253ec5 cgroups.7: Add more detail on v2 'cpu' controller and realtime threads
Explicitly note the scheduling policies that are relevant for the
v2 'cpu' controller.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-10-09 11:45:43 +02:00
Michael Kerrisk 4644794c1e cgroups.7: Minor wording fix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-10-05 08:49:15 +02:00
Michael Kerrisk 6c9aa5ad5f cgroups.7: Rework discussion of writing to cgroup.type file
In particular, it is possible to write "threaded" to a
cgroup.type file if the current type is "domain threaded".
Previously, the text had implied that this was not possible.
Verified by experiment on Linux 4.15 and 4.19-rc.

Reported-by: Leah Hanson <lhanson@pivotal.io>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-10-05 08:22:10 +02:00
Michael Kerrisk df0a41dfe3 pid_namespaces.7: Note a detail of /proc/PID/ns/pid_for_children behavior
After clone(CLONE_NEWPID), /proc/PID/ns/pid_for_children is empty
until the first child is created. Verified by experiment.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-10-01 14:49:08 +02:00
Michael Kerrisk e5cd406d8e pid_namespaces.7: Note that a process can do unshare(CLONE_NEWPID) only once
(See the recent commit to the unshare(2) manual page.)

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-10-01 14:42:07 +02:00
Michael Kerrisk 3acd70581d capabilities.7: Update URL for location of POSIX.1e draft standard
Reported-by: Allison Randal <allison@lohutok.net>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-09-29 00:02:44 +02:00
Michael Kerrisk 37894e514e sched.7: SEE ALSO: add chcpu(1), lscpu(1)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-09-28 18:38:48 +02:00
Michael Kerrisk 396761eee3 cgroups.7: Minor clarification to remove possible ambiguity
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-09-20 12:25:00 +02:00
Michael Kerrisk 5367a9aba9 capabilities.7: Ambient capabilities do not trigger secure-execution mode
Reported-by: Pierre Chifflier <pollux@debian.org>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-09-13 11:41:08 +02:00
Michael Kerrisk 96123f413d signal.7: SEE ALSO: add clone(2)
Because of the discussion of trheads and signals in clone(2)/

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-09-10 11:18:06 +02:00
Michael Kerrisk c2df769494 cgroups.7: tfix
Reported-by: Mike Weilgart <mike.weilgart@verticalsysadmin.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-09-06 23:19:36 +02:00
Lucas Werkmeister 8bd6881ea9 user_namespaces.7: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-08-18 09:45:06 +02:00
Jakub Wilk 68bd4ad98c namespaces.7: tfix
Signed-off-by: Jakub Wilk <jwilk@jwilk.net>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-08-12 14:08:19 +02:00
Tobias Klauser 5a2ed9eebe namespaces.7: tfix
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-08-06 21:42:42 +02:00
Michael Kerrisk 0d59d0c8bf capabilities.7: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-08-03 16:07:59 +02:00
Michael Kerrisk 50c7074665 posixoptions.7: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-08-03 15:53:03 +02:00
Michael Kerrisk 3426f62cea namespaces.7: Mention ioctl(2) in discussion of namespaces APIs
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-08-03 07:36:48 +02:00
Michael Kerrisk 9a6d888cb6 namespaces.7: List factors that may pin a namespace into existence
Various factors may pin a namespace into existence, even when it
has no member processes.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-08-03 07:30:17 +02:00
Michael Kerrisk 7df0e773c7 unix.7: wfix: s/foreign process/peer process/
The more common parlance these days is, I think, "peer".

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-28 12:30:44 +02:00
Michael Kerrisk 94950b9a68 socket.7, unix.7: Move text describing SO_PEERCRED from socket(7) to unix(7)
This is, AFAIK, an option specific to UNIX domain sockets, so
place it in unix(7).

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-28 12:30:44 +02:00
Michael Kerrisk ffab8460c6 unix.7: Refer reader to socket(7) for information about SO_PEEK_OFF
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-28 12:30:44 +02:00
Michael Kerrisk 2fc7c74cc5 socket.7: Refer reader to unix(7) for information on SO_PASSSEC
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-28 12:30:44 +02:00
Michael Kerrisk 48c2b7065d tcp.7, udp.7: Add a reference to socket(7) noting existence of further socket options
Some other socket options that are applicable for TCP and UDP sockets
are documented in socket(7), so help the reader by pointing them at
that page.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-28 12:30:44 +02:00
Michael Kerrisk 670387c122 udp.7: srcfix: add FIXME
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-28 12:30:44 +02:00
Michael Kerrisk 1221abb60e unix.7: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-28 12:30:44 +02:00
Michael Kerrisk ffad6a017f unix.7: Document SCM_SECURITY ancillary data
And fix a wording error in the description of SO_PASSSEC.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-28 12:30:44 +02:00
Michael Kerrisk 366a9bffc8 unix.7: Document SO_PASSSEC
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-28 11:50:11 +02:00
Michael Kerrisk 5af0f223d1 unix.7: Ancillary data forms a barrier when receiving on a stream socket
Thanks to a tip from Keith Packard:
https://keithp.com/blogs/fd-passing/
(Also verified by experiment.)

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-17 09:39:56 +02:00
Michael Kerrisk 5219daec26 unix.7: One must send at least one byte of real data with ancillary data
When sending ancillary data, at least one byte of real data should
also be sent.  This is strictly necessary for stream sockets
(verified by experiment). It is not required for datagram sockets
on Linux (verified by experiment), but portable applications
should do so.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-15 10:33:42 +02:00
Michael Kerrisk c0e56ed687 unix.7: Clarify treatment of incoming ancillary data if 'msg_control' is NULL
If no buffer is supplied for incoming ancillary data, then
the data is lost.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-15 10:33:32 +02:00
Michael Kerrisk 4564dd1fee unix.7: If the buffer to receive SCM_RIGHTS FDs is too small, FDs are closed
If the ancillary data buffer for receiving SCM_RIGHTS file
descriptors is too small, then the excess file descriptors are
automatically closed in the receiving process. Verified by
experiment.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-15 10:16:49 +02:00
Michael Kerrisk b65f4c691d unix.7: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-15 10:16:49 +02:00
Michael Kerrisk 879962006f unix.7: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-15 09:50:30 +02:00
Michael Kerrisk 93f5b0f8f4 mount_namespaces.7: SEE ALSO: add findmnt(8)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-13 07:08:28 +02:00
Michael Kerrisk 5b5cb19580 unix.7: When sending ancillary data, only one item of each type may be sent
Verified by experiment and reading the source code (although
the SCM_RIGHTS case is not so clear to me in the source code).

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-10 07:14:50 +02:00
Michael Kerrisk 52900faab3 unix.7: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-10 07:14:50 +02:00
Michael Kerrisk 311bf2f694 unix.7: Minor wording fixes
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-10 07:14:50 +02:00
Michael Kerrisk 05bf3361a6 unix.7: grfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-10 07:14:50 +02:00
Michael Kerrisk c87721467e unix.7: Note behavior if buffer to receive ancillary data is too small
If the buffer supplied to recvmsg() to receive ancillary data is
too small, then the data is truncated and the MSG_CTRUNC flag is
set.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-08 21:13:08 +02:00
Michael Kerrisk 13600496d3 unix.7: Enhance the description of SCM_RIGHTS
The existing description is rather thin. More can be said.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-08 10:57:27 +02:00
Michael Kerrisk 8bdcf4bf81 unix.7: There is a limit on the size of the file descriptor array for SCM_RIGHTS
The limit is defined in the kernel as SCM_MAX_FD (253).

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-08 10:38:44 +02:00
Michael Kerrisk f1081bdc42 unix.7: Fix a minor imprecision in description of SCM_CREDENTIALS
To spoof credentials requires privilege (i.e., capabilities),
not UID 0.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-08 10:21:43 +02:00
Michael Kerrisk b66d5714b1 unix.7: grfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-08 10:20:52 +02:00
Michael Kerrisk bdef802116 unix.7: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-08 10:20:32 +02:00
Michael Kerrisk 2c77e8de08 capabilities.7: Note that v3 security.attributes are transparently created/retrieved
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-02 09:59:21 +02:00
Michael Kerrisk 00ae99b028 capabilities.7: Fix some imprecisions in discussion of namespaced file capabilities
The file UID does not come into play when creating a v3
security.capability extended attribute.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-01 11:42:13 +02:00
Michael Kerrisk 9b2c207a33 capabilities.7: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-01 11:42:13 +02:00
Michael Kerrisk c281d0505d capabilities.7: wfix
Fix some confusion between "mask" and "extended attribute"

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-01 11:42:13 +02:00
Michael Kerrisk 54254ef33a capabilities.7: srcfix: Removed FIXME
No credential match of file UID and namespace creator UID
is needed to create a v3 security extended attribute.

Verified by experiment using my userns_child_exec.c and
show_creds.c programs (available on http://man7.org/tlpi/code):

    $ sudo setcap cap_setuid,cap_dac_override=pe \
            ./userns_child_exec
    $ ./userns_child_exec -U -r setcap cap_kill=pe show_creds
    $ ./userns_child_exec -U -M '0 1000 10' -G '0 1000 1' \
            -s 1 ./show_creds
    eUID = 1;  eGID = 0;  capabilities: = cap_kill+ep

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-07-01 11:42:07 +02:00
Michael Kerrisk ffea2c14f2 capabilities.7: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-06-24 08:54:17 +02:00
Michael Kerrisk a607673bb8 epoll.7: Consistently use the term "interest list" rather than "epoll set"
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-06-22 12:21:56 +02:00
Michael Kerrisk d1d90ea54d epoll.7: Expand the discussion of the implications of file descriptor duplication
In particular, note that it may be difficult for an application
to know about the existence of duplicate file descriptors.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-06-22 12:20:25 +02:00
Michael Kerrisk a3961b2fd5 epoll.7: Note that edge-triggered notification wakes up only one waiter
Note a useful performance benefit of EPOLLET: ensuring that
only one of multiple waiters (in epoll_wait()) is woken
up when a file descriptor becomes ready.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-06-22 12:20:25 +02:00
Michael Kerrisk 0409116028 epoll.7: Introduce the terms "interest list" and "ready list"
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-06-22 12:20:25 +02:00
Michael Kerrisk 4524285a71 epoll.7: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-06-22 09:41:16 +02:00
Michael Kerrisk 1e79ad8cd8 epoll.7: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-06-22 09:30:02 +02:00
Michael Kerrisk b4ebb4ee79 epoll.7: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-06-22 09:27:46 +02:00
Michael Kerrisk 6832efaf3c epoll.7: Reformat Q&A list
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-06-22 09:27:24 +02:00
Helge Deller 0201f48246 vdso.7: Fix parisc gateway page description
The parisc gateway page currently only exports 3 functions:
The lws_entry for CAS operations (at 0xb0), the set_thread_pointer
function for usage in glibc (at 0xe0) and the Linux syscall entry
(at 0x100).

All other symbols in the manpage are internal labels and
shouldn't be used directly by userspace or glibc, so drop them
from the man page documentation.

Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-05-28 11:04:33 +02:00
Michael Kerrisk 0cec24722b signal.7: Clarify that sigsuspend() and pause() suspend the calling *thread*
Reported-by: Robin Kuzmin <kuzmin.robin@gmail.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-05-18 10:04:37 +02:00
Michael Kerrisk 390795d76a inotify.7: Note ENOTDIR error that can occur for IN_ONLYDIR
Note ENOTDIR error that occurs when requesting a watch on a
nondirectory with IN_ONLYDIR.

Reported-by: Paul Millar <paul.millar@desy.de>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-05-06 10:22:13 +02:00
Michael Kerrisk 0a719e9411 capabilities.7: tfix
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-05-02 21:16:20 +02:00
Michael Kerrisk c87cbea10f capabilities.7: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-05-02 11:37:29 +02:00
Michael Kerrisk c2b279afb7 capabilities.7: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-05-01 13:55:37 +02:00
Michael Kerrisk ddc1ad3079 capabilities.7: Add background details on capability transformations during execve(2)
Add background details on ambient and bounding set when
discussing capability transformations during execve(2).

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-05-01 13:55:37 +02:00
Michael Kerrisk 7c957134f1 capabilities.7: Minor rewording
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-05-01 13:55:37 +02:00
Michael Kerrisk bb1f24fab8 capabilities.7: Reorder text on capability bounding set
Reverse order of text blocks describing pre- and
post-2.6.25 bounding set. No content changes.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-05-01 13:55:37 +02:00
Michael Kerrisk 2e87ced3b5 capabilities.7: Rework bounding set as per-thread set in transformation rules
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-05-01 13:55:37 +02:00
Michael Kerrisk 36de80b984 capabilities.7: Add text introducing bounding set along with other thread capability sets
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-05-01 13:55:37 +02:00
Michael Kerrisk daf8312704 capabilities.7: Clarify which capability sets capset(2) and capget(2) apply to
capset(2) and capget(2) apply operate only on the permitted,
effective, and inheritable process capability sets.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-05-01 12:46:48 +02:00
Michael Kerrisk 1db1d36d82 capabilities.7: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-05-01 12:40:14 +02:00
Michael Kerrisk 09b8afdc04 execve.2, fallocate.2, getrlimit.2, io_submit.2, membarrier.2, mmap.2, msgget.2, open.2, ptrace.2, readv.2, semget.2, shmget.2, shutdown.2, syscall.2, wait.2, wait4.2, crypt.3, encrypt.3, fseek.3, getcwd.3, makedev.3, pthread_create.3, puts.3, tsearch.3, elf.5, filesystems.5, group.5, passwd.5, sysfs.5, mount_namespaces.7, posixoptions.7, time.7, unix.7, vdso.7, xattr.7, ld.so.8: tstamp
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-04-30 17:41:31 +02:00
Michael Kerrisk 29c0586f51 bpf.2, sched_setattr.2, crypt.3, elf.5, proc.5, fanotify.7, feature_test_macros.7, sched.7: spfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-04-27 14:48:33 +02:00
Michael Kerrisk 075f5e6592 namespaces.7: Mention that device ID should also be checked when comparing NS symlinks
When comparing two namespaces symlinks to see if they refer to
the same namespace, both the inode number and the device ID
should be compared. This point was already made clear in
ioctl_ns(2), but was missing from this page.

Reported-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-04-27 14:10:32 +02:00
Jakub Wilk 3eb078c52f unix.7: tfix
Signed-off-by: Jakub Wilk <jwilk@jwilk.net>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-04-27 14:01:50 +02:00
Jakub Wilk 90ef0f7bf8 capabilities.7: tfix
Signed-off-by: Jakub Wilk <jwilk@jwilk.net>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-04-27 14:01:43 +02:00
Michael Kerrisk 314d88f611 vdso.7: VDSO symbols (system calls) are not visible to seccomp(2) filters
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-04-24 18:25:44 +02:00
Michael Kerrisk 115c1eb46c capabilities.7: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-04-19 11:18:31 +02:00
Michael Kerrisk 690e62da71 capabilities.7: srcfix: FIXME
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-04-13 21:23:28 +02:00
Michael Kerrisk bcaa30c985 capabilities.7: Rework file capability versioning and namespaced file caps text
There was some confused missing of concepts between the
two subsections, and some other details that needed fixing up.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-04-13 21:23:28 +02:00
Michael Kerrisk 6442c03b68 capabilities.7: Explain when VFS_CAP_REVISION_3 file capabilities have effect
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-04-13 21:23:28 +02:00
Michael Kerrisk 7b45f4b2ad capabilities.7: Explain rules that determine version of security.capability xattr
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-04-13 21:23:28 +02:00
Michael Kerrisk 7da0c87a78 capabilities.7: Explain term "namespace root user ID"
Confirmed with Serge Hallyn that: "nsroot" means the UID 0
in the namespace as it would be mapped into the initial userns.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-04-13 21:23:28 +02:00
Michael Kerrisk 12dce73121 capabilities.7: Document namespaced-file capabilities
Cowritten-by: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-04-13 21:23:28 +02:00
Michael Kerrisk b684870410 capabilities.7: Describe file capability versioning
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-04-13 21:23:28 +02:00
Michael Kerrisk 873727f44a posixoptions.7: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-04-13 17:02:28 +02:00
Michael Kerrisk 11e9d8f890 posixoptions.7: Use a more consistent, less cluttered layout for option lists
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-04-13 17:02:18 +02:00
Michael Kerrisk 17282a589f posixoptions.7: Make function lists more consistent and less cluttered
Use more consistent layout for lists of functions, and
remove punctuation from the lists to make them less cluttered.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-04-13 10:44:01 +02:00
Michael Kerrisk 5a9ef49145 posixoptions.7: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-04-13 10:25:11 +02:00
Michael Kerrisk 6f131a899a posixoptions.7: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-04-13 10:25:11 +02:00
Michael Kerrisk 45adee316b posixoptions.7: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-04-13 10:25:11 +02:00
Michael Kerrisk 742ce8ddec posixoptions.7: tfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-04-13 10:25:11 +02:00
Michael Kerrisk 6b2300a2f3 posixoptions.7: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-04-13 09:42:26 +02:00
Carlos O'Donell 233b0395d8 posixoptions.7: Expand XSI Options groups
We define in detail the X/Open System Interfaces i.e. _XOPEN_UNIX
and all of the X/Open System Interfaces (XSI) Options Groups.

The XSI options groups include encryption, realtime, advanced
realtime, realtime threads, advanced realtime threads, tracing,
streams, and legacy interfaces.

Signed-off-by: Carlos O'Donell <carlos@redhat.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-04-13 09:39:10 +02:00
Michael Kerrisk 7934bcdfdc unix.7: ERRORS: add EBADF for sending closed file descriptor with SCM_RIGHTS
As noted by Rusty Russell:

I was really surprised that sendmsg() returned EBADF on a valid fd;
turns out I was using sendmsg with SCM_RIGHTS to send a closed fd,
which gives EBADF (see test program below).

But this is only obliquely referenced in unix(7):

       SCM_RIGHTS
              Send or receive a set  of  open  file  descriptors
              from  another  process.  The data portion contains
              an integer array of  the  file  descriptors.   The
              passed file descriptors behave as though they have
              been created with dup(2).

EBADF is not mentioned in the unix(7) ERRORS (it's mentioned in
dup(2)).

int fdpass_send(int sockout, int fd)
{
	/* From the cmsg(3) manpage: */
	struct msghdr msg = { 0 };
	struct cmsghdr *cmsg;
	struct iovec iov;
	char c = 0;
	union {         /* Ancillary data buffer, wrapped in a union
			   in order to ensure it is suitably aligned */
		char buf[CMSG_SPACE(sizeof(fd))];
		struct cmsghdr align;
	} u;

	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);
	memset(&u, 0, sizeof(u));
	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(fd));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

	msg.msg_name = NULL;
	msg.msg_namelen = 0;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_flags = 0;

	/* Keith Packard reports that 0-length sends don't work, so we
	 * always send 1 byte. */
	iov.iov_base = &c;
	iov.iov_len = 1;

	return sendmsg(sockout, &msg, 0);
}

int fdpass_recv(int sockin)
{
	/* From the cmsg(3) manpage: */
	struct msghdr msg = { 0 };
	struct cmsghdr *cmsg;
	struct iovec iov;
	int fd;
	char c;
	union {         /* Ancillary data buffer, wrapped in a union
			   in order to ensure it is suitably aligned */
		char buf[CMSG_SPACE(sizeof(fd))];
		struct cmsghdr align;
	} u;

	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	msg.msg_name = NULL;
	msg.msg_namelen = 0;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_flags = 0;

	iov.iov_base = &c;
	iov.iov_len = 1;

	if (recvmsg(sockin, &msg, 0) < 0)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
        if (!cmsg
	    || cmsg->cmsg_len != CMSG_LEN(sizeof(fd))
	    || cmsg->cmsg_level != SOL_SOCKET
	    || cmsg->cmsg_type != SCM_RIGHTS) {
		errno = -EINVAL;
		return -1;
	}

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
	return fd;
}

static void child(int sockfd)
{
	int newfd = fdpass_recv(sockfd);
	assert(newfd < 0);
	exit(0);
}

int main(void)
{
	int sv[2];
	int pid, ret;

	assert(socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == 0);

	pid = fork();
	if (pid == 0) {
		close(sv[1]);
		child(sv[0]);
	}

	close(sv[0]);
	ret = fdpass_send(sv[1], sv[0]);
	printf("fdpass of bad fd return %i (%s)\n", ret, strerror(errno));
	return 0;
}

Reported-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-04-12 10:55:29 +02:00
Michael Kerrisk d3e7786def unix.7: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-04-12 10:42:34 +02:00
Konstantin Grinemayer 04c8a02088 keyring.7: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-04-12 08:46:42 +02:00
Michael Kerrisk 3f6061d025 socket.7: Fix error in SO_INCOMING_CPU code snippet
The last argument is passed by value, not reference.
Reported-by: Tomi Salminen <tsalminen@forcepoint.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-03-27 22:06:52 +02:00
Michael Kerrisk d8c64e25f8 network_namespaces.7: Add cross reference to unix(7)
For further information on UNIX domain abstract sockets.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-03-16 08:50:36 +01:00
Michael Kerrisk 39ad46695f time.7: Mention clock_gettime()/clock_settime() rather than [gs]ettimeofday()
gettimeofday() is declared obsolete by POSIX. Mention instead
the modern APIs for working with the realtime clock.

See https://bugzilla.kernel.org/show_bug.cgi?id=199049

Reported-by: Enrique Garcia <cquike@arcor.de>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-03-16 08:50:36 +01:00
Michael Kerrisk 6b49df2229 mount_namespaces.7: Note another case where shared "peer groups" are formed
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-02-25 16:42:16 +01:00
Michael Kerrisk 46af719866 mount_namespaces.7: ffix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-02-25 16:37:08 +01:00
Michael Kerrisk a21658aad3 network_namespaces.7: Network namespaces isolate the UNIX domain abstract socket namespace
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-02-24 23:04:53 +01:00
Michael Kerrisk aeeb48005e user_namespaces.7: wfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-02-23 10:38:47 +01:00
Michael Kerrisk 1a7e08e367 namespaces.7: Note an idiosyncracy of /proc/[pid]/ns/pid_for_children
/proc/[pid]/ns/pid_for_children has a value only after first
child is created in PID namespace. Verified by experiment.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-02-21 17:31:48 +01:00
Michael Kerrisk 0813749503 capabilities.7: remove redundant mention of PTRACE_SECCOMP_GET_FILTER
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-02-21 10:38:17 +01:00
Michael Kerrisk 9863b9acfe xattr.7: SEE ALSO: add selinux(8)
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-02-21 08:43:14 +01:00
Michael Kerrisk 7747ed9789 cgroups.7: cgroup.events transitions generate POLLERR as well as POLLPRI
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-02-10 09:46:14 +01:00
Michael Kerrisk 2cd9bbfa48 Removed trailing white space at end of lines 2018-02-02 07:48:33 +01:00
Michael Kerrisk 8538a62b4c iconv.1, bpf.2, copy_file_range.2, fcntl.2, memfd_create.2, mlock.2, mount.2, mprotect.2, perf_event_open.2, pkey_alloc.2, prctl.2, read.2, recvmmsg.2, s390_sthyi.2, seccomp.2, sendmmsg.2, syscalls.2, unshare.2, write.2, errno.3, fgetpwent.3, fts.3, pthread_rwlockattr_setkind_np.3, fuse.4, veth.4, capabilities.7, cgroups.7, ip.7, man-pages.7, namespaces.7, network_namespaces.7, sched.7, socket.7, user_namespaces.7, iconvconfig.8: tstamp
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-02-02 07:38:54 +01:00
Michael Kerrisk 93b96116f0 vsock.7: Add license and copyright
Stefan noted on the mailing list that selection of the
verbatim license was fine.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-02-01 22:23:28 +01:00
Jakub Wilk 7a1cddd289 cgroups.7: tfix
Signed-off-by: Jakub Wilk <jwilk@jwilk.net>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-01-26 19:58:40 +01:00
Michael Kerrisk 42dfc34c33 capabilities.7: spfix
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-01-13 20:58:58 +01:00
Michael Kerrisk cd7f4c4958 cgroups.7: Add a detail on delegation of cgroup.threads
Some notes from a conversation with Tejun Heo:

    Subject: Re: cgroups(7): documenting cgroups v2 delegation
    Date: Wed, 10 Jan 2018 14:27:26 -0800
    From: Tejun Heo <tj@kernel.org>

    > > 1. When delegating, cgroup.threads should be delegated.  Doing that
    > >    selectively doesn't achieve anything meaningful.
    >
    > Understood. But surely delegating cgroup.threads is effectively
    > meaningless when delegating a "domain" cgroup tree? (Obviously it's
    > not harmful to delegate the the cgroup.threads file in this case;
    > it's just not useful to do so.)

    Yeap, unless we can somehow support non-root mixed domains.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-01-11 00:52:26 +01:00
Michael Kerrisk 6dc513cd38 cgroups.7: Subhierarchy under delegated subtree will be owned by delegatee
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-01-11 00:47:12 +01:00
Michael Kerrisk 7b327dd5f3 cgroups.7: Add a detail on delegation of cgroup.threads
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2018-01-11 00:47:12 +01:00