Skip to content

Commit e7950f7

Browse files
author
Jakub Łopuszański
committed
Bug #30837136 RW_LOCK_X_LOCK_LOW: CONDITIONAL JUMP OR MOVE
Valgrind identified a "problem" (present since 2013) that we sometimes read "invalid" value of writer_thread. A thread which releases the latch does: 1. set lock->recursive to false 2. mark lock->writer_thread as invalid A thread which tries to check if it already owns the latch does: 0. check lock->recursive is true 3. check lock->writer_thread is myself Note that if they are executed in order 0,1,2,3 even with full memory barriers etc. you will end up in a situation where the "invalid" value of writer_thread is being inspected. This is not really a bug, because if the value is "invalid" it still can not be equal to our thread's id, unless a torn read happened (quite unlikely on modern architectures) and the mixture of old and new bytes looks like our id. Another (unlikely?) possibility would be that the `0x0` value we used for initialization of `writer_thread` field is somehow our thread's id (which anyway shouldn't matter because of initially setting `recursive` to `false`). To remove any doubt here, instead of `0`, we will use `std::thread().native_handle()` which is the official "invalid" value for a thread handle - no real thread can have such a native handle. To avoid "torn" values, this patch declares `writer_thread` as `std::atomic`, but uses `std::memory_order_relaxed` for accesses to it, so that it does not introduce any new ordering constraints - just makes sure operations are atomic (so, no "torn" read is possible). Rather than adding another supression for Valgrind, this patch simply removes the "2." step. What *is* important (and was recently fixed) is to make sure that IF we loaded `recursive==true`, THEN we will see `writer_thread` value set by the thread which set `recursive` to `true`, or some LATER value. In particular this is important, if our thread was recently owning the latch, and our thread id was left in `writer_thread` - we need to make sure that it gets overwritten by next thread before we look at it. In other words, we need a guarantee that ``` recursive.load(acquire) && writer_thread==me ``` happens if and only if, my thread is indeed the one who already owns the latch. And for that it suffices to follow the simple protocol: When acquiring the latch for yourself: ``` writer_thread = me recursive.store(true, release) ``` When acquiring the latch for someone else (pass=true): ``` writer_thread = me recursive.store(false, release) ``` (this recursive=false will prevent me from wrongly believe I can latch it recursively, while my intention was to pass the ownership to another thread). When recieving ownership from someone else, do the same thing as when acquiring the latch. Before releasing the latch do: ``` recursive.store(false, relaxed) ``` When checking if you already own the latch do: ``` if(recursive.load(acquire) && writer_thread.load(relaxed)==me) ``` The guarantee holds, because: (=>) If we indeed hold the latch we set writer_thread = me and recursive to true and nobody else have modified it since, so the `if`'s condition is true. (<=) If the `if` condition was evaluated to `true` then it means that our `recursive.load` has read `true`, so it synchronizes with a thread which performed `recursive.store(true, release)`, so the value of `writer_thread` we read, is (A) either the one it has stored, or (B) some even fresher one. (A) As we saw `writer_thread == me` and we are the only thread which could write this particular value to this field, it must mean that we are the thread which performed `recursive.store(true, release)` most recently, so we own the latch. (B) As we are now in a source code line which does `if` (as opposed to a source code line between write to `writer_thread` and `recursive.store`) we had to perform some `recursive.store` (`true`, or `false`) before getting here. So our `recursive.load` has to happen after our own `store` to it. But this contradicts the premise of (B), as we get a cycle: our write to `writer_thread` is before our `store`, which is before or the same `store` we synchronize with, which is, by premise of (B), before our write to writer_thread. So, case B is impossible. RB:23954 Reviewed by : Kevin Lewis <kevin.lewis@oracle.com> Reviewed by : Nikša Skeledžija <niksa.skeledzija@oracle.com>
1 parent a366f38 commit e7950f7

File tree

4 files changed

+59
-76
lines changed

4 files changed

+59
-76
lines changed

storage/innobase/include/sync0rw.h

Lines changed: 7 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -494,11 +494,10 @@ bool rw_lock_lock_word_decr(rw_lock_t *lock, ulint amount, lint threshold);
494494
UNIV_INLINE
495495
lint rw_lock_lock_word_incr(rw_lock_t *lock, ulint amount);
496496

497-
/** This function sets the lock->writer_thread and lock->recursive fields. For
498-
platforms where INNODB_RW_LOCKS_USE_ATOMICS is defined we set the
497+
/** This function sets the lock->writer_thread and lock->recursive fields. Sets
499498
lock->recursive field using atomic release after setting lock->writer thread to
500-
ensure proper memory ordering of the two. Otherwise we use lock->mutex to ensure
501-
this. Note that it is assumed that the caller of this function effectively owns
499+
ensure proper memory ordering of the two.
500+
Note that it is assumed that the caller of this function effectively owns
502501
the lock i.e.: nobody else is allowed to modify lock->writer_thread at this
503502
point in time. The protocol is that lock->writer_thread MUST be updated BEFORE
504503
the lock->recursive flag is set.
@@ -595,9 +594,10 @@ struct rw_lock_t
595594
causing much memory bus traffic */
596595
bool writer_is_wait_ex;
597596

598-
/** Thread id of writer thread. Is only guaranteed to have sane
599-
and non-stale value iff recursive flag is set. */
600-
os_thread_id_t writer_thread;
597+
/** Thread id of writer thread. Is only guaranteed to have non-stale value if
598+
recursive flag is set, otherwise it may contain native thread handle of a
599+
thread which already released or passed the lock. */
600+
std::atomic<os_thread_id_t> writer_thread;
601601

602602
/** Used by sync0arr.cc for thread queueing */
603603
os_event_t event;

storage/innobase/include/sync0rw.ic

Lines changed: 10 additions & 22 deletions
Original file line numberDiff line numberDiff line change
@@ -276,21 +276,8 @@ lint rw_lock_lock_word_incr(rw_lock_t *lock, /*!< in/out: rw-lock */
276276

277277
UNIV_INLINE
278278
void rw_lock_set_writer_id_and_recursion_flag(rw_lock_t *lock, bool recursive) {
279-
os_thread_id_t curr_thread = os_thread_get_curr_id();
280-
281-
#ifdef INNODB_RW_LOCKS_USE_ATOMICS
282-
283-
lock->writer_thread = curr_thread;
284-
lock->recursive.store(recursive, std::memory_order_release);
285-
286-
#else /* INNODB_RW_LOCKS_USE_ATOMICS */
287-
288-
mutex_enter(&lock->mutex);
289-
lock->writer_thread = curr_thread;
279+
lock->writer_thread.store(os_thread_get_curr_id(), std::memory_order_relaxed);
290280
lock->recursive.store(recursive, std::memory_order_release);
291-
mutex_exit(&lock->mutex);
292-
293-
#endif /* INNODB_RW_LOCKS_USE_ATOMICS */
294281
}
295282

296283
/** Low-level function which tries to lock an rw-lock in s-mode. Performs no
@@ -382,7 +369,8 @@ ibool rw_lock_x_lock_func_nowait(
382369
if (success) {
383370
rw_lock_set_writer_id_and_recursion_flag(lock, true);
384371
} else if (lock->recursive.load(std::memory_order_acquire) &&
385-
os_thread_eq(lock->writer_thread, os_thread_get_curr_id())) {
372+
os_thread_eq(lock->writer_thread.load(std::memory_order_relaxed),
373+
os_thread_get_curr_id())) {
386374
/* Relock: this lock_word modification is safe since no other
387375
threads can modify (lock, unlock, or reserve) lock_word while
388376
there is an exclusive writer and this is the writer thread. */
@@ -455,12 +443,13 @@ void rw_lock_x_unlock_func(
455443
ut_ad(lock->lock_word == 0 || lock->lock_word == -X_LOCK_HALF_DECR ||
456444
lock->lock_word <= -X_LOCK_DECR);
457445

458-
/* lock->recursive flag also indicates if lock->writer_thread is
459-
valid or stale. If we are the last of the recursive callers
460-
then we must unset lock->recursive flag to indicate that the
461-
lock->writer_thread is now stale.
462-
Note that since we still hold the x-lock we can safely read the
463-
lock_word. */
446+
/* lock->recursive == true implies that the lock->writer_thread is the
447+
current writer. If we are the last of the recursive callers then we must unset
448+
lock->recursive flag (or reset lock->writer_thread to the impossible
449+
std::thread().native_handle()) to indicate that the lock->writer_thread is now
450+
stale. Otherwise if our thread tried to reacquire the lock it would wrongly
451+
believe it already has it.
452+
Note that since we still hold the x-lock we can safely read the lock_word. */
464453
if (lock->lock_word == 0) {
465454
/* Last caller in a possible recursive chain. */
466455
lock->recursive.store(false, std::memory_order_relaxed);
@@ -518,7 +507,6 @@ void rw_lock_sx_unlock_func(
518507
/* Last caller in a possible recursive chain. */
519508
if (lock->lock_word > 0) {
520509
lock->recursive.store(false, std::memory_order_relaxed);
521-
UNIV_MEM_INVALID(&lock->writer_thread, sizeof lock->writer_thread);
522510

523511
if (rw_lock_lock_word_incr(lock, X_LOCK_HALF_DECR) <= X_LOCK_HALF_DECR) {
524512
ut_error;

storage/innobase/sync/sync0arr.cc

Lines changed: 4 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -530,13 +530,11 @@ static void sync_array_cell_print(FILE *file, /*!< in: file where to print */
530530

531531
if (writer != RW_LOCK_NOT_LOCKED) {
532532
fprintf(file,
533-
"a writer (thread id " UINT64PF
534-
") has"
535-
" reserved it in mode %s",
536-
(uint64_t)(rwlock->writer_thread),
533+
"a writer (thread id " UINT64PF ") has reserved it in mode %s\n",
534+
(uint64_t)(rwlock->writer_thread.load()),
537535
writer == RW_LOCK_X
538-
? " exclusive\n"
539-
: writer == RW_LOCK_SX ? " SX\n" : " wait exclusive\n");
536+
? "exclusive"
537+
: (writer == RW_LOCK_SX ? "SX" : "wait exclusive"));
540538
}
541539

542540
fprintf(file,

storage/innobase/sync/sync0rw.cc

Lines changed: 38 additions & 41 deletions
Original file line numberDiff line numberDiff line change
@@ -106,31 +106,31 @@ sx-lock holder.
106106
107107
The other members of the lock obey the following rules to remain consistent:
108108
109-
recursive: This and the writer_thread field together control the
110-
behaviour of recursive x-locking or sx-locking.
111-
lock->recursive must be FALSE in following states:
112-
1) The writer_thread contains garbage i.e.: the
113-
lock has just been initialized.
114-
2) The lock is not x-held and there is no
115-
x-waiter waiting on WAIT_EX event.
116-
3) The lock is x-held or there is an x-waiter
117-
waiting on WAIT_EX event but the 'pass' value
118-
is non-zero.
119-
lock->recursive is TRUE iff:
120-
1) The lock is x-held or there is an x-waiter
121-
waiting on WAIT_EX event and the 'pass' value
122-
is zero.
123-
This flag must be set after the writer_thread field
124-
has been updated with a memory ordering barrier.
125-
It is unset before the lock_word has been incremented.
126-
writer_thread: Is used only in recursive x-locking. Can only be safely
127-
read iff lock->recursive flag is TRUE.
128-
This field is uninitialized at lock creation time and
129-
is updated atomically when x-lock is acquired or when
130-
move_ownership is called. A thread is only allowed to
131-
set the value of this field to it's thread_id i.e.: a
132-
thread cannot set writer_thread to some other thread's
133-
id.
109+
recursive: This flag is true iff the lock->writer_thread is allowed to take
110+
the x or sx lock recursively.
111+
112+
In particular you should follow these rules:
113+
- make sure to change recursive to `false` (or reset the
114+
`writer_thread` to `std::thread().native_handle()`) before
115+
releasing the lock
116+
- make sure to change recursive to `false` (but do not reset the
117+
`writer_thread` as it will remove debug info) before passing
118+
the lock to other thread
119+
- make sure to put your thread's native handle before setting
120+
recursive to `true`
121+
- make sure that when you want to read both `recursive` and
122+
`writer_thread` you do this in this precise order
123+
writer_thread: Is used only in recursive x-locking.
124+
This field is initialized to an impossible thread native handle
125+
value at lock creation time and is updated atomically when sx or
126+
x-lock is being acquired or when move_ownership is called.
127+
A thread is only allowed to set the value of this field to its
128+
thread_id i.e.: a thread cannot set writer_thread to some other
129+
thread's id.
130+
The only reasonable way (except reporting in debug info) to use
131+
this field is to compare it to own thread's native handle
132+
AFTER checking that lock->recursive flag is set, to see if we
133+
are the current writer.
134134
waiters: May be set to 1 anytime, but to avoid unnecessary wake-up
135135
signals, it should only be set to 1 when there are threads
136136
waiting on event. Must be 1 when a writer starts waiting to
@@ -222,14 +222,9 @@ void rw_lock_create_func(
222222
lock->lock_word = X_LOCK_DECR;
223223
lock->waiters = 0;
224224

225-
/* We set this value to signify that lock->writer_thread
226-
contains garbage at initialization and cannot be used for
227-
recursive x-locking. */
228225
lock->recursive.store(false, std::memory_order_relaxed);
229226
lock->sx_recursive = 0;
230-
/* Silence Valgrind when UNIV_DEBUG_VALGRIND is not enabled. */
231-
memset((void *)&lock->writer_thread, 0, sizeof lock->writer_thread);
232-
UNIV_MEM_INVALID(&lock->writer_thread, sizeof lock->writer_thread);
227+
lock->writer_thread = std::thread().native_handle();
233228

234229
#ifdef UNIV_DEBUG
235230
lock->m_rw_lock = true;
@@ -499,10 +494,10 @@ ibool rw_lock_x_lock_low(
499494
ulint line) /*!< in: line where requested */
500495
{
501496
if (rw_lock_lock_word_decr(lock, X_LOCK_DECR, X_LOCK_HALF_DECR)) {
502-
/* lock->recursive also tells us if the writer_thread
503-
field is stale or active. As we are going to write
504-
our own thread id in that field it must be that the
505-
current writer_thread value is not active. */
497+
/* lock->recursive == true implies that the lock->writer_thread is the
498+
current writer. As we are going to write our own thread id in that field it
499+
must be the case that the current writer_thread value is not the current
500+
writer anymore, thus recursive flag must be false. */
506501
ut_a(!lock->recursive.load(std::memory_order_relaxed));
507502

508503
/* Decrement occurred: we are writer or next-writer. */
@@ -512,7 +507,8 @@ ibool rw_lock_x_lock_low(
512507

513508
} else {
514509
if (!pass && lock->recursive.load(std::memory_order_acquire) &&
515-
os_thread_eq(lock->writer_thread, os_thread_get_curr_id())) {
510+
os_thread_eq(lock->writer_thread.load(std::memory_order_relaxed),
511+
os_thread_get_curr_id())) {
516512
/* Decrement failed: An X or SX lock is held by either
517513
this thread or another. Try to relock. */
518514
/* Other s-locks can be allowed. If it is request x
@@ -562,10 +558,10 @@ ibool rw_lock_sx_lock_low(
562558
ulint line) /*!< in: line where requested */
563559
{
564560
if (rw_lock_lock_word_decr(lock, X_LOCK_HALF_DECR, X_LOCK_HALF_DECR)) {
565-
/* lock->recursive also tells us if the writer_thread
566-
field is stale or active. As we are going to write
567-
our own thread id in that field it must be that the
568-
current writer_thread value is not active. */
561+
/* lock->recursive == true implies that the lock->writer_thread is the
562+
current writer. As we are going to write our own thread id in that field it
563+
must be the case that the current writer_thread value is not the current
564+
writer anymore, thus recursive flag must be false. */
569565
ut_a(!lock->recursive.load(std::memory_order_relaxed));
570566

571567
/* Decrement occurred: we are the SX lock owner. */
@@ -578,7 +574,8 @@ ibool rw_lock_sx_lock_low(
578574
thread or another thread. If it is this thread, relock,
579575
else fail. */
580576
if (!pass && lock->recursive.load(std::memory_order_acquire) &&
581-
os_thread_eq(lock->writer_thread, os_thread_get_curr_id())) {
577+
os_thread_eq(lock->writer_thread.load(std::memory_order_relaxed),
578+
os_thread_get_curr_id())) {
582579
/* This thread owns an X or SX lock */
583580
if (lock->sx_recursive++ == 0) {
584581
/* This thread is making first SX-lock request

0 commit comments

Comments
 (0)