diff --git a/Documentation/RCU/Design/Requirements/Requirements.html b/Documentation/RCU/Design/Requirements/Requirements.html
index 59acd82e67d4..2a56031bfdd4 100644
--- a/Documentation/RCU/Design/Requirements/Requirements.html
+++ b/Documentation/RCU/Design/Requirements/Requirements.html
@@ -583,6 +583,17 @@ The first and second guarantees require unbelievably strict ordering!
Are all these memory barriers really required?
Answer
+
+Quick Quiz 7:
+You claim that rcu_read_lock() and rcu_read_unlock()
+generate absolutely no code in some kernel builds.
+This means that the compiler might arbitrarily rearrange consecutive
+RCU read-side critical sections.
+Given such rearrangement, if a given RCU read-side critical section
+is done, how can you be sure that all prior RCU read-side critical
+sections are done?
+Won't the compiler rearrangements make that impossible to determine?
+
+Answer
+
Note that these memory-barrier requirements do not replace the fundamental
RCU requirement that a grace period wait for all pre-existing readers.

@@ -626,9 +637,9 @@ inconvenience can be avoided through use of the
call_rcu() and kfree_rcu() API members
described later in this document.

-Quick Quiz 7:
+Quick Quiz 8:
But how does the upgrade-to-write operation exclude other readers?
-Answer
+Answer
This guarantee allows lookup code to be shared between read-side

@@ -714,9 +725,9 @@ to do significant reordering.
This is by design:  Any significant ordering constraints would slow down
these fast-path APIs.

-Quick Quiz 8:
+Quick Quiz 9:
Can't the compiler also reorder this code?
-Answer
+Answer
-Quick Quiz 9:
+Quick Quiz 10:
Suppose that synchronize_rcu() did wait until all readers had completed.
Would the updater be able to rely on this?
-Answer
+Answer
-Quick Quiz 10:
+Quick Quiz 11:
How long a sequence of grace periods, each separated by an RCU read-side
critical section, would be required to partition the RCU read-side
critical sections at the beginning and end of the chain?
-Answer
+Answer
-Quick Quiz 11:
+Quick Quiz 12:
What about sleeping locks?
-Answer
+Answer
It often comes as a surprise that many algorithms do not require a

@@ -1354,12 +1365,12 @@ write an RCU callback function that takes too long.
Long-running operations should be relegated to separate threads or
(in the Linux kernel) workqueues.

-Quick Quiz 12:
+Quick Quiz 13:
Why does line 19 use rcu_access_pointer()?
After all, call_rcu() on line 25 stores into the
structure, which would interact badly with concurrent insertions.
Doesn't this mean that rcu_dereference() is required?
-Answer
+Answer
However, all that remove_gp_cb() is doing is

@@ -1406,14 +1417,14 @@ This was due to the fact that RCU was not heavily used
within DYNIX/ptx, so the very few places that needed something like
synchronize_rcu() simply open-coded it.

-Quick Quiz 13:
+Quick Quiz 14:
Earlier it was claimed that call_rcu() and
kfree_rcu() allowed updaters to avoid being blocked
by readers.
But how can that be correct, given that the invocation of the callback
and the freeing of the memory (respectively) must still wait for
a grace period to elapse?
-Answer
+Answer
But what if the updater must wait for the completion of code to be

@@ -1838,11 +1849,11 @@ kthreads to be spawned.
Therefore, invoking synchronize_rcu() during
scheduler initialization can result in deadlock.

-Quick Quiz 14:
+Quick Quiz 15:
So what happens with synchronize_rcu() during
scheduler initialization for CONFIG_PREEMPT=n
kernels?
-Answer
+Answer
I learned of these boot-time requirements as a result of a series of

@@ -2547,10 +2558,10 @@ If you needed to wait on multiple different flavors of SRCU
(but why???), you would need to create a wrapper function resembling
call_my_srcu() for each SRCU flavor.

-Quick Quiz 15:
+Quick Quiz 16:
But what if I need to wait for multiple RCU flavors, but I also need
the grace periods to be expedited?
-Answer
+Answer
Again, it is usually better to adjust the RCU read-side critical sections

@@ -2827,6 +2838,39 @@ adhered to the as-if rule than it is to actually adhere to it!
+Quick Quiz 7:
+You claim that rcu_read_lock() and rcu_read_unlock()
+generate absolutely no code in some kernel builds.
+This means that the compiler might arbitrarily rearrange consecutive
+RCU read-side critical sections.
+Given such rearrangement, if a given RCU read-side critical section
+is done, how can you be sure that all prior RCU read-side critical
+sections are done?
+Won't the compiler rearrangements make that impossible to determine?
+
+Answer:
+In cases where rcu_read_lock() and rcu_read_unlock()
+generate absolutely no code, RCU infers quiescent states only at
+special locations, for example, within the scheduler.
+Because calls to schedule() had better prevent calling-code
+accesses to shared variables from being rearranged across the call to
+schedule(), if RCU detects the end of a given RCU read-side
+critical section, it will necessarily detect the end of all prior
+RCU read-side critical sections, no matter how aggressively the
+compiler scrambles the code.
+
+Again, this all assumes that the compiler cannot scramble code across
+calls to the scheduler, out of interrupt handlers, into the idle loop,
+into user-mode code, and so on.
+But if your kernel build allows that sort of scrambling, you have broken
+far more than just RCU!
+
+Back to Quick Quiz 7.
+
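For concreteness, here is a minimal C sketch of the scenario in this
answer (not part of the patch; do_something() and do_something_else()
are hypothetical helpers, and a CONFIG_PREEMPT=n build is assumed, so
that rcu_read_lock() and rcu_read_unlock() emit no code):

        struct foo { int a; };
        struct foo __rcu *gp1;
        struct foo __rcu *gp2;

        void reader(void)
        {
                struct foo *p;

                rcu_read_lock();                /* Emits no code. */
                p = rcu_dereference(gp1);
                if (p)
                        do_something(p->a);     /* Hypothetical helper. */
                rcu_read_unlock();              /* Also emits no code. */

                rcu_read_lock();                /* The compiler may freely */
                p = rcu_dereference(gp2);       /* intermingle the two     */
                if (p)                          /* critical sections...    */
                        do_something_else(p->a);
                rcu_read_unlock();

                schedule();     /* ...but may not move their accesses across
                                 * this call, at which RCU infers a quiescent
                                 * state covering both sections. */
        }

Once RCU observes the quiescent state implied by the call to
schedule(), both critical sections are over, no matter how the
compiler interleaved them.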
-Quick Quiz 7:
+Quick Quiz 8:
But how does the upgrade-to-write operation exclude other readers?

@@ -2835,10 +2879,10 @@
Answer:
It doesn't, just like normal RCU updates, which also do
not exclude RCU readers.
-Back to Quick Quiz 7.
+Back to Quick Quiz 8.
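The upgrade-to-write pattern at issue can be sketched as follows (an
illustrative fragment rather than the document's own listing; gp,
my_lock, and the ->a field are assumptions):

        struct foo { int a; };
        struct foo __rcu *gp;
        DEFINE_SPINLOCK(my_lock);       /* Hypothetical update-side lock. */

        void reader_then_updater(struct foo *newp)
        {
                struct foo *p;

                rcu_read_lock();
                p = rcu_dereference(gp);
                if (p && p->a == 0) {
                        spin_lock(&my_lock);    /* Excludes other updaters, */
                                                /* not readers. */
                        if (p == rcu_access_pointer(gp))
                                rcu_assign_pointer(gp, newp);
                        spin_unlock(&my_lock);
                }
                rcu_read_unlock();
        }

The spin_lock() serializes this update against other updaters, but
concurrent RCU readers run unimpeded throughout, exactly as the answer
states.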
-Quick Quiz 8:
+Quick Quiz 9:
Can't the compiler also reorder this code?

@@ -2848,10 +2892,10 @@
Answer:
No, the volatile casts in READ_ONCE() and
WRITE_ONCE() prevent the compiler from reordering in
this particular case.
-Back to Quick Quiz 8.
+Back to Quick Quiz 9.
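A sketch of what the volatile casts do and do not guarantee
(illustrative only; x, y, and do_something_with() are assumptions):

        int x, y;

        void updater(void)
        {
                WRITE_ONCE(x, 1);       /* The compiler must emit this store... */
                WRITE_ONCE(y, 1);       /* ...then this one, in this order. */
        }

        void reader(void)
        {
                int r1 = READ_ONCE(y);  /* These marked loads likewise cannot */
                int r2 = READ_ONCE(x);  /* be reordered by the compiler. */

                /* Note that the CPU might still reorder these accesses:
                 * READ_ONCE() and WRITE_ONCE() constrain only the compiler. */
                do_something_with(r1, r2);      /* Hypothetical helper. */
        }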
-Quick Quiz 9:
+Quick Quiz 10:
Suppose that synchronize_rcu() did wait until all readers had completed.
Would the updater be able to rely on this?

@@ -2866,10 +2910,10 @@
Answer:
No.
Even if synchronize_rcu() were to wait until all readers
had completed, a new reader might start immediately after
synchronize_rcu() completed.
Therefore, the code following
synchronize_rcu() cannot rely on there being no readers
in any case.
-Back to Quick Quiz 9.
+Back to Quick Quiz 10.
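The updater's side of this answer might look as follows (a sketch; gp,
gp_lock, and struct foo are assumptions):

        struct foo { int a; };
        struct foo __rcu *gp;
        DEFINE_SPINLOCK(gp_lock);       /* Hypothetical update-side lock. */

        void updater(void)
        {
                struct foo *p;

                spin_lock(&gp_lock);
                p = rcu_dereference_protected(gp, lockdep_is_held(&gp_lock));
                rcu_assign_pointer(gp, NULL);
                spin_unlock(&gp_lock);
                synchronize_rcu();      /* Waits only for pre-existing readers. */
                /* New readers might already be running here, but none of
                 * them can have gained a reference to the old structure. */
                kfree(p);
        }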
-Quick Quiz 10:
+Quick Quiz 11:
How long a sequence of grace periods, each separated by an RCU read-side
critical section, would be required to partition the RCU read-side
critical sections at the beginning and end of the chain?

@@ -2883,10 +2927,10 @@
Answer:
In theory, an infinite number.
In practice, an unknown number that is sensitive to both implementation
details and timing considerations.
Therefore, even in practice, RCU users must abide by the
theoretical rather than the practical answer.
-Back to Quick Quiz 10.
+Back to Quick Quiz 11.
-Quick Quiz 11:
+Quick Quiz 12:
What about sleeping locks?

@@ -2914,10 +2958,10 @@
Answer:
These are forbidden within Linux-kernel RCU read-side critical
sections because it is not legal to place a quiescent state (in this
case, a voluntary context switch) within an RCU read-side critical
section.
The key point is that mutex_trylock() either returns with
the mutex held, or returns an error indication if
the mutex was not immediately available.
Either way, mutex_trylock() returns immediately without sleeping.
-Back to Quick Quiz 11.
+Back to Quick Quiz 12.
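A sketch of the legal and illegal combinations (my_mutex and
do_update() are assumptions):

        DEFINE_MUTEX(my_mutex);         /* Hypothetical sleeping lock. */

        void reader(void)
        {
                rcu_read_lock();
                /*
                 * mutex_lock(&my_mutex) here would be a bug:  It can
                 * sleep, placing a voluntary context switch (a quiescent
                 * state) inside the RCU read-side critical section.
                 */
                if (mutex_trylock(&my_mutex)) { /* Legal:  Never sleeps. */
                        do_update();            /* Hypothetical helper. */
                        mutex_unlock(&my_mutex);
                }
                rcu_read_unlock();
        }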
-Quick Quiz 12:
+Quick Quiz 13:
Why does line 19 use rcu_access_pointer()?
After all, call_rcu() on line 25 stores into the
structure, which would interact badly with concurrent insertions.
Doesn't this mean that rcu_dereference() is required?

@@ -2933,10 +2977,10 @@
Answer:
No.
Any concurrent insertions would be excluded by ->gp_lock, which
is released on line 25, which in turn means that
rcu_access_pointer() suffices.
-Back to Quick Quiz 12.
+Back to Quick Quiz 13.
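The situation this answer describes resembles the following sketch
(illustrative; gp, gp_lock, struct foo, and my_free_cb() are
assumptions rather than the document's own listing, so its line numbers
do not apply here):

        struct foo {
                struct rcu_head rh;
                int a;
        };
        struct foo __rcu *gp;
        DEFINE_SPINLOCK(gp_lock);

        static void my_free_cb(struct rcu_head *rhp)
        {
                kfree(container_of(rhp, struct foo, rh));
        }

        bool remove_gp_like(void)
        {
                struct foo *p;

                spin_lock(&gp_lock);
                p = rcu_access_pointer(gp);     /* Fetch only:  Updates are */
                if (!p) {                       /* excluded by ->gp_lock, so */
                        spin_unlock(&gp_lock);  /* no ordering is needed.   */
                        return false;
                }
                rcu_assign_pointer(gp, NULL);
                call_rcu(&p->rh, my_free_cb);   /* Store into ->rh happens   */
                spin_unlock(&gp_lock);          /* before the lock releases. */
                return true;
        }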
-Quick Quiz 13:
+Quick Quiz 14:
Earlier it was claimed that call_rcu() and
kfree_rcu() allowed updaters to avoid being blocked
by readers.
But how can that be correct, given that the invocation of the callback
and the freeing of the memory (respectively) must still wait for
a grace period to elapse?

@@ -2957,10 +3001,10 @@
Answer:
We could define things this way, but keep in mind that this sort of
definition would say that updates in garbage-collected languages
cannot complete until the next time the garbage collector runs.
The key point is that in most cases the updater can proceed to the
next update as soon as it has invoked call_rcu() or
kfree_rcu(), rather than having to wait for a subsequent
grace period.
-Back to Quick Quiz 13.
+Back to Quick Quiz 14.
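A sketch of this non-blocking update pattern (gp, gp_lock, struct foo,
and my_free_cb() are the same assumptions as in the previous sketch):

        void updater(struct foo *newp)
        {
                struct foo *oldp;

                spin_lock(&gp_lock);
                oldp = rcu_dereference_protected(gp, lockdep_is_held(&gp_lock));
                rcu_assign_pointer(gp, newp);
                spin_unlock(&gp_lock);
                if (oldp)
                        call_rcu(&oldp->rh, my_free_cb); /* Returns immediately. */
                /* The updater may start its next update right away:  Only
                 * the invocation of my_free_cb() waits for a grace period. */
        }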
-Quick Quiz 14:
+Quick Quiz 15:
So what happens with synchronize_rcu() during
scheduler initialization for CONFIG_PREEMPT=n
kernels?

@@ -2976,10 +3020,10 @@
Answer:
In CONFIG_PREEMPT=n kernels, synchronize_rcu()
maps directly to synchronize_sched(), which works normally
throughout boot.
However, your code must also work in CONFIG_PREEMPT=y kernels,
so it is still necessary to avoid invoking synchronize_rcu()
during scheduler initialization.
-Back to Quick Quiz 14.
+Back to Quick Quiz 15.
-Quick Quiz 15:
+Quick Quiz 16:
But what if I need to wait for multiple RCU flavors, but I also need
the grace periods to be expedited?

@@ -2991,7 +3035,7 @@
Answer:
If you are using expedited grace periods, there should be less penalty
for waiting on them in succession.
But if that is nevertheless a problem, you can use workqueues or multiple
kthreads to wait on the various expedited grace periods concurrently.
-Back to Quick Quiz 15.
+Back to Quick Quiz 16.
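One way to wait for two expedited grace periods concurrently, as the
answer suggests, is to push each wait onto a workqueue (a sketch;
my_srcu and the function names are assumptions):

        DEFINE_SRCU(my_srcu);

        static void wait_rcu_exp(struct work_struct *unused)
        {
                synchronize_rcu_expedited();
        }

        static void wait_srcu_exp(struct work_struct *unused)
        {
                synchronize_srcu_expedited(&my_srcu);
        }

        void wait_both_expedited(void)
        {
                struct work_struct w1, w2;

                INIT_WORK_ONSTACK(&w1, wait_rcu_exp);
                INIT_WORK_ONSTACK(&w2, wait_srcu_exp);
                schedule_work(&w1);
                schedule_work(&w2);
                flush_work(&w1);        /* The two grace periods elapse */
                flush_work(&w2);        /* concurrently, not in series. */
                destroy_work_on_stack(&w1);
                destroy_work_on_stack(&w2);
        }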