From 30d0d39a4f09e3e893a1d643f89d7163d32753bc Mon Sep 17 00:00:00 2001
From: Michael Kerrisk
Date: Mon, 23 Sep 2019 10:00:19 +0200
Subject: [PATCH] pidfd_open.2: Opening /proc/PID doesn't yield a pollable file descriptor

Thus, pidfd_open() is the preferred way of obtaining a PID file
descriptor.

Notes from a conversation with Christian Brauner:

[[
> A further question... We now have three ways of getting a
> process file descriptor [*]:
>
>     open() of /proc/PID
>     pidfd_open()
>     clone()/clone3() with CLONE_PIDFD
>
> I thought the FD was supposed to be equivalent in all three cases.
> However, if I try (on kernel 5.3) poll() an FD returned by opening
> /proc/PID, poll() tells me POLLNVAL for the FD. Is that difference
> intentional? (I am guessing it is not.)

It's intentional.
The short answer is that /proc/ is a convenience for sending signals.
The longer answer is that this stems from a heavy debate about what a
process file descriptor was supposed to be and some people pushing for
at least being able to use /proc/ dirfds while ignoring security
problems as soon as you're talking about returning those fds from
clone(); not to mention the additional problems discovered when trying
to implementing this.

A "real" pidfd is one from CLONE_PIDFD or pidfd_open() and all features
such as exit notification, read, and other future extensions will only
be implemented on top of them.

As much as we'd have liked to get rid of two different file descriptor
types it doesn't hurt us much and is not that much different from what
we will e.g. see with fsinfo() in the new mount api which needs to work
on regular fds gotten via open()/openat() and mountfds gotten from
fsopen() and fspick(). The mountfds will also allow for advanced
operations that the other ones will not. There's even an argument to be
made that fds you will get from open()/openat() and openat2() are
different types since they have very different behavior; openat2()
returning fds that are non arbitrarily upgradable etc.
]]

Signed-off-by: Michael Kerrisk
---
 man2/pidfd_open.2 | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/man2/pidfd_open.2 b/man2/pidfd_open.2
index 98fe449b1..51136e170 100644
--- a/man2/pidfd_open.2
+++ b/man2/pidfd_open.2
@@ -88,6 +88,19 @@ the file descriptor indicates as readable.
 Note, however, that in the current implementation,
 nothing can be read from the file descriptor.
 .PP
+The
+.BR pidfd_open ()
+system call is the preferred way of obtaining a PID file descriptor.
+The alternative is to obtain a file descriptor by opening a
+.I /proc/[pid]
+directory.
+However, the latter technique is possible only if the
+.BR proc (5)
+file system is mounted;
+furthermore, the file descriptor obtained in this way is
+.I not
+pollable.
+.PP
 See also the discussion of the
 .BR CLONE_PIDFD
 flag in