Sometime soon, we'll have to add documentation of clone3() to this
page. As a preparatorys step, make the names of the clone()
arguments the same as the fields in the clone3() 'args' struct:
ctid ==> child_pid
ptid ==> parent_tid
newtls ==> tld
child_stack ==> stack
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
As noted in kernel commit 821cc7b0b205c0df64cce59aacc330af251fa8f7,
threads create an ambiguity: what if the calling process's PGID
is changed by another thread while waitpid(0, ...) is blocked?
So, clarify that waitpid(0, ...) means wait for children whose
PGID matches the caller's PGID at the time of the call to
waitpid().
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Since Linux 5.4, idtype == P_PGID && id == 0 can be used to wait
on children in same process group as caller.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Since kernel commit a9a08845e9acbd224e4ee466f5c1275ed50054e8, the
equivalence between select() and poll()/epoll is defined in terms
of the EPOLL* constants, rather than the POLL* constants.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Christian noted that SA_NOCLDWAIT also matters in this scenario.
Reported-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
After review comments from Christian and Daniel.
Reported-by: Christian Brauner <christian.brauner@ubuntu.com>
Reported-by: Daniel Colascione <dancol@google.com>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Thus, pidfd_open() is the preferred way of obtaining a PID
file descriptor.
Notes from a conversation with Christian Brauner:
[[
> A further question... We now have three ways of getting a
> process file descriptor [*]:
>
> open() of /proc/PID
> pidfd_open()
> clone()/clone3() with CLONE_PIDFD
>
> I thought the FD was supposed to be equivalent in all three cases.
> However, if I try (on kernel 5.3) poll() an FD returned by opening
> /proc/PID, poll() tells me POLLNVAL for the FD. Is that difference
> intentional? (I am guessing it is not.)
It's intentional.
The short answer is that /proc/<pid> is a convenience for sending
signals.
The longer answer is that this stems from a heavy debate about what a
process file descriptor was supposed to be and some people pushing for
at least being able to use /proc/<pid> dirfds while ignoring security
problems as soon as you're talking about returning those fds from
clone(); not to mention the additional problems discovered when trying
to implementing this.
A "real" pidfd is one from CLONE_PIDFD or pidfd_open() and all features
such as exit notification, read, and other future extensions will only
be implemented on top of them.
As much as we'd have liked to get rid of two different file descriptor
types it doesn't hurt us much and is not that much different from what
we will e.g. see with fsinfo() in the new mount api which needs to work
on regular fds gotten via open()/openat() and mountfds gotten from
fsopen() and fspick(). The mountfds will also allow for advanced
operations that the other ones will not. There's even an argument to be
made that fds you will get from open()/openat() and openat2() are
different types since they have very different behavior; openat2()
returning fds that are non arbitrarily upgradable etc.
]]
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>