futex.2: Fixes after review comments from Thomas Gleixner

Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
2015-08-07 22:41:53 +02:00 · 2015-08-07 22:41:53 +02:00 · c3875d1d3a
parent 30239c10a8
commit c3875d1d3a
1 changed file with 76 additions and 50 deletions
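
Note for readers of this patch: the FUTEX_LOCK_PI text below assumes a user-space fast path that sets the futex word from 0 to the caller's TID and enters the kernel only on contention. The following is a minimal sketch of that protocol, not code taken from the man page or from glibc. The helper names futex(), pi_lock() and pi_unlock() are hypothetical, and error handling (EINTR, EDEADLK, FUTEX_OWNER_DIED) is omitted.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical thin wrapper: glibc provides no futex() function.
 * val, timeout, uaddr2 and val3 are unused by the two operations below;
 * a NULL timeout means "block without a timeout". */
static long futex(_Atomic uint32_t *uaddr, int op)
{
    return syscall(SYS_futex, uaddr, op, 0, NULL, NULL, 0);
}

/* Hypothetical helper: acquire a PI futex. */
static int pi_lock(_Atomic uint32_t *word)
{
    uint32_t expected = 0;
    uint32_t tid = (uint32_t) syscall(SYS_gettid);

    /* Fast path: atomically change 0 -> TID in user space. */
    if (atomic_compare_exchange_strong(word, &expected, tid))
        return 0;

    /* Slow path: the kernel retries the acquisition under its own
     * lock, sets FUTEX_WAITERS if that fails, and blocks the caller. */
    return (int) futex(word, FUTEX_LOCK_PI);
}

/* Hypothetical helper: release a PI futex held by the caller. */
static int pi_unlock(_Atomic uint32_t *word)
{
    uint32_t expected = (uint32_t) syscall(SYS_gettid);

    /* Fast path: TID -> 0 succeeds only while FUTEX_WAITERS is clear. */
    if (atomic_compare_exchange_strong(word, &expected, 0))
        return 0;

    /* Other bits are set, so the kernel must perform the unlock. */
    return (int) futex(word, FUTEX_UNLOCK_PI);
}
```

In the uncontended case neither helper makes a futex system call. The kernel becomes involved only once the futex word holds something other than 0 (on lock) or other than the bare owner TID (on unlock).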
--- a/man2/futex.2
+++ b/man2/futex.2
@@ -913,22 +913,58 @@ The operation checks the value of the futex word at the address
 .IR uaddr .
 If the value is 0, then the kernel tries to atomically set
 the futex value to the caller's TID.
-.\" FIXME What would be the cause(s) of failure referred to
+If the futex word's value is nonzero,
 .\"       in the following sentence?
 If that fails,
 or the futex word's value is nonzero,
 the kernel atomically sets the
 .B FUTEX_WAITERS
 bit, which signals the futex owner that it cannot unlock the futex in
 user space atomically by setting the futex value to 0.
-After that, the kernel tries to find the thread which is
+.\" tglx (July 2015):
-associated with the owner TID,
+.\"     The operation here is similar to the FUTEX_WAIT logic. When the user
-.\" FIXME Could I get a bit more detail on the next two lines?
+.\"     space atomic acquire does not succeed because the futex value was non
-.\"       What is "creates or reuses kernel state" about?
+.\"     zero, then the waiter goes into the kernel, takes the kernel internal
-.\"       (I think this needs to be clearer in the page)
+.\"     lock and retries the acquisition under the lock. If the acquisition
-creates or reuses kernel state on behalf of the owner
+.\"     does not succeed either, then it sets the FUTEX_WAITERS bit, to signal
-and attaches the waiter to it.
+.\"     the lock owner that it needs to go into the kernel. Here is the pseudo
-
+.\"     code:
 .\"
 .\"     	lock(kernel_lock);
 .\"     retry:
 .\"     	
 .\"     	/*
 .\"     	 * Owner might have unlocked in userspace before we
 .\"     	 * were able to set the waiter bit.
 .\"              */
 .\"             if (atomic_acquire(futex) == SUCCESS) {
 .\"     	   unlock(kernel_lock);
 .\"     	   return 0;
 .\"     	}
 .\"
 .\"     	/*
 .\"     	 * Owner might have unlocked after the above atomic_acquire()
 .\"     	 * attempt.
 .\"     	 */
 .\"     	if (atomic_set_waiters_bit(futex) != SUCCESS)
 .\"     	   goto retry;
 .\"
 .\"     	queue_waiter();
 .\"     	unlock(kernel_lock);
 .\"     	block();
 .\"
 After that, the kernel:
 .RS
 .IP 1. 3
 Tries to find the thread which is associated with the owner TID.
 .IP 2.
 Creates or reuses kernel state on behalf of the owner.
 (If this is the first waiter, there is no kernel state for this
 futex, so kernel state is created by locking the RT-mutex
 and the futex owner is made the owner of the RT-mutex.
 If there are existing waiters, then the existing state is reused.)
 .IP 3.
 Attaches the waiter to it
 (i.e., the waiter is enqueued on the RT-mutex waiter list).
 .RE
 .IP
 If more than one waiter exists,
 the enqueueing of the waiter is in descending priority order.
 (For information on priority ordering, see the discussion of the
@@ -946,16 +982,11 @@ policy) or the waiter's priority (if the waiter is scheduled under the
 or
 .BR SCHED_FIFO
 policy).
-.\"
+This inheritance follows the lock chain in the case of nested locking
-.\" FIXME Could I get some help translating the next sentence into
+(i.e., task 1 blocks on lock A, held by task 2,
-.\"       something that user-space developers (and I) can understand?
+while task 2 blocks on lock B, held by task 3)
-.\"       In particular, what are "nested locks" in this context?
+and performs deadlock detection.
 This inheritance follows the lock chain in the case of
 nested locking and performs deadlock detection.
 .\" FIXME tglx said "The timeout argument is handled as described in
 .\"       FUTEX_WAIT." However, it appears to me that this is not right.
 .\"       Is the following formulation correct?
 The
 .I timeout
 argument provides a timeout for the lock attempt.
@@ -980,21 +1011,19 @@ arguments are ignored.
 .\" commit c87e2837be82df479a6bae9f155c43516d2feebc
 This operation tries to acquire the futex at
 .IR uaddr .
 It is invoked when a user-space atomic acquire did not
 succeed because the futex word was not 0.
 The trylock in kernel might succeed because the futex word
 contains stale state
 .RB ( FUTEX_WAITERS
 and/or
 .BR FUTEX_OWNER_DIED ).
 This can happen when the owner of the futex died.
 .\" FIXME I think it would be helpful here to say a few more words about
 .\"       the difference(s) between FUTEX_LOCK_PI and FUTEX_TRYLOCK_PI.
 .\"       Can someone propose something?
 .\"
 .\" FIXME(Torvald)  Additionally, we claim above that just FUTEX_WAITERS
 .\"       is never an allowed state.
 It deals with the situation where the TID value at
 .I uaddr
 is 0, but the
 .B FUTEX_WAITERS
 bit is set.
 .\" FIXME How does the situation in the previous sentence come about?
 .\"       Probably it would be helpful to say something about that in
 .\"       the man page.
 .\" FIXME And *how* does FUTEX_TRYLOCK_PI deal with this situation?
 User space cannot handle this condition in a race-free manner.
 The
@@ -1083,31 +1112,28 @@ arguments serve the same purposes as for
 .BR FUTEX_WAIT_REQUEUE_PI " (since Linux 2.6.31)"
 .\" commit 52400ba946759af28442dee6265c5c0180ac7122
 .\"
-.\" FIXME I find the next sentence (from tglx) pretty hard to grok.
+Wait on a non-PI futex at
 .\"       Could someone explain it a bit more?
 Wait operation to wait on a non-PI futex at
 .I uaddr
-and potentially be requeued onto a PI futex at
+and potentially be requeued (via a
 .BR FUTEX_CMP_REQUEUE_PI
 operation in another task) onto a PI futex at
 .IR uaddr2 .
 The wait operation on
 .I uaddr
-is the same as
+is the same as for
 .BR FUTEX_WAIT .
-.\"
+
 .\" FIXME I'm not quite clear on the meaning of the following sentence.
 .\"       Is this trying to say that while blocked in a
 .\"       FUTEX_WAIT_REQUEUE_PI, it could happen that another
 .\"       task does a FUTEX_WAKE on uaddr that simply causes
 .\"       a normal wake, with the result that the FUTEX_WAIT_REQUEUE_PI
 .\"       does not complete? What happens then to the FUTEX_WAIT_REQUEUE_PI
 .\"       operation? Does it remain blocked, or does it unblock?
 .\"       In which case, what does user space see?
 The waiter can be removed from the wait on
 .I uaddr
 via
 .BR FUTEX_WAKE
 without requeueing on
-.IR uaddr2 .
+.IR uaddr2
 via a
 .BR FUTEX_WAIT
 operation in another task.
 In this case, the
 .BR FUTEX_WAIT_REQUEUE_PI
 operation returns with the error
 .BR EWOULDBLOCK .
 If
 .I timeout
@@ -1304,7 +1330,7 @@ The futex word at
 is already locked by the caller.
 .TP
 .BR EDEADLK
-.\" FIXME XXX I see that kernel/locking/rtmutex.c uses EDEADLK in some
+.\" FIXME . I see that kernel/locking/rtmutex.c uses EDEADLK in some
 .\"       places, and EDEADLOCK in others. On almost all architectures
 .\"       these constants are synonymous. Is there a reason that both
 .\"       names are used?