Add atomic loads and stores and barriers #9247


Merged: 1 commit merged on Jan 21, 2019

Conversation

@kjbracey (Contributor) commented Jan 3, 2019

Description

Add atomic load and store functions, and add barriers to the existing atomic functions.

The file currently has no explicit barriers - we don't support SMP, so we don't need CPU barriers.

But we do need to worry about compiler barriers - particularly if link-time optimisation is activated, so that the compiler can see inside these functions. The assembler or intrinsics that access PRIMASK for enter/exit critical act as compiler barriers, but LDREX, STREX and simple volatile pointer loads and stores do not.

Pull request type

[ ] Fix
[ ] Refactor
[ ] Target update
[X] Functionality change
[ ] Docs update
[ ] Test update
[ ] Breaking change

Reviewers

@c1728p9, @pan-

@kjbracey (Contributor, Author) commented Jan 3, 2019

I'm in two minds about this one - if people are directly accessing atomics with raw load or store, they're probably doing something weird and should be using something higher-level, but maybe there's no good alternative. This is being added to cover the _pending flag in InternetSocket - see #9248.

@pan- (Member) left a comment

I'm puzzled by the atomic load/store implementation proposed - I must be missing something 😕 .
From what I understand, MBED_BARRIER is a compiler fence, nothing more. I believe we should use __LDREX? and __STREX? on the M3 and M4 architectures, and we can use a critical section on the M0 architecture.

* Stop the compiler moving memory accesses.
*/
#ifdef __CC_ARM
#define MBED_BARRIER() __memory_changed()
@pan- (Member) commented on this diff:

It would be nice to have a detailed description - and maybe a more appropriate name - for MBED_BARRIER: where it should be used, what it does and what it doesn't do.

From my understanding, asm volatile("" : : : "memory") forces the compiler not to reorder the code it produces; however, there is no guarantee that the code won't be reordered by the CPU or that memory writes have completed. It doesn't replace CPU barrier instructions.
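The effect of such a pure compiler fence can be sketched in a few lines of GNU C. This is an illustration, not code from the PR - the macro and function names here are hypothetical:

```c
#include <stdint.h>

/* Hypothetical compiler-fence macro for GCC/Clang: the empty asm with a
 * "memory" clobber stops the compiler reordering or caching memory
 * accesses across it, but emits no instruction, so it provides no
 * CPU-level ordering guarantee. */
#define MY_COMPILER_BARRIER() __asm volatile("" : : : "memory")

static int payload;
static volatile int ready;

void publish(int value)
{
    payload = value;          /* plain store */
    MY_COMPILER_BARRIER();    /* compiler must not sink this store below */
    ready = 1;
}

int consume(void)
{
    while (!ready) {
        /* spin until the flag is observed */
    }
    MY_COMPILER_BARRIER();    /* compiler must not hoist the load above */
    return payload;
}
```

On a single core this ordering is all that is needed; on an SMP system the same code would additionally need CPU barrier instructions, which is exactly the distinction being discussed.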

MBED_FORCEINLINE void core_util_atomic_flag_clear(volatile core_util_atomic_flag *flagPtr)
{
MBED_BARRIER();
flagPtr->_flag = false;
A Contributor commented on this diff:

Why isn't there a memory barrier after _flag is set to false?

@kjbracey (Contributor, Author) replied:

You're right, one is needed there. I was thinking it wasn't needed for single-CPU, but sequential consistency requires it for the same reason you would need two DMBs (https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html). Assuming we want sequential consistency (the C++11 default), then we need to guarantee that

atomic_store(x); // release
atomic_load(y); // acquire

can't be reordered. So that has to end up as:

BARRIER // for release semantics - keep preceding above
x = foo;
BARRIER // stop store+load being reordered
bar = y;
BARRIER // acquire semantics - keep following below

And it's store that gets the extra sequential-consistency barrier (like the extra DMB), because stores are probably less common.
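The barrier placement described above can be sketched as plain C, with MBED_BARRIER reduced to the pure compiler fence (matching the non-SMP assumption); the function name is illustrative rather than the PR's exact code:

```c
#include <stdint.h>

#define MBED_BARRIER() __asm volatile("" : : : "memory")

/* Sequentially-consistent store sketch: one barrier before the store
 * gives release semantics (preceding accesses stay above it), and one
 * after stops a following atomic load being reordered ahead of it -
 * the "extra DMB" from the C++11 mappings referenced above. */
static inline void my_atomic_store_u8(volatile uint8_t *valuePtr, uint8_t value)
{
    MBED_BARRIER();
    *valuePtr = value;
    MBED_BARRIER();
}
```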

*/
MBED_FORCEINLINE uint8_t core_util_atomic_load_u8(const volatile uint8_t *valuePtr)
{
uint8_t value = *valuePtr;
A Contributor commented on this diff:

Why isn't there a memory barrier before *valuePtr is read?

@kjbracey (Contributor, Author) replied Jan 4, 2019:

Because load has "acquire" semantics, as per C++11/C11. It only needs to stop later accesses from moving before it.
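A sketch of that acquire-only shape, with the barrier macro again assumed to be the pure compiler fence and the function name mirroring the diff hunk above:

```c
#include <stdint.h>

#define MBED_BARRIER() __asm volatile("" : : : "memory")

/* Acquire-load sketch: no barrier is needed before the read; the single
 * barrier after it stops later accesses being hoisted above the load. */
static inline uint8_t my_atomic_load_u8(const volatile uint8_t *valuePtr)
{
    uint8_t value = *valuePtr;
    MBED_BARRIER();
    return value;
}
```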

@kjbracey (Contributor, Author) commented Jan 4, 2019

> I'm puzzled by the atomic load/store implementation proposed - I must be missing something 😕 .

mbed_critical currently has no SMP support, and has no CPU barriers at all. These new functions are consistent with the rest of the file - compiler barriers only, assuming 1 CPU, and protect against other threads or interrupts on that single CPU.

> I believe we should use __LDREX? and __STREX? on M3 and M4 architecture

Those don't have any barrier/ordering functionality in the CPU. The LDREX and STREX macros/intrinsics only provide compiler barriers - if you were doing SMP you would either need to add DMB instructions for CPU barriers (ARMv7) or use LDAEX and STLEX to get CPU acquire/release semantics (ARMv8).

So these implementations are to add the necessary equivalent compiler barrier on a plain load or store, that we get automatically whenever we happen to use LDREX/STREX or touch PRIMASK.

(Edit, oops, no, LDREX and STREX are not actually compiler barriers, necessarily. I'm going to need to add some.)

Without the barrier, this can go wrong:

if (!atomic_flag_test_and_set(&flag)) {
     do_some_stuff_to_state_protected_by_flag();
     atomic_flag_clear(&flag);
}

The "do some stuff" could be reordered after the flag clear by the compiler, so that we are modifying the state with the flag clear. Existing attempts to do the equivalent with a "volatile bool" suffer the same problem - they should change to use atomic_flag to get the barrier that's being added here.

With LTO off, this was avoided because the "call outside this C file" to atomic_flag_clear is an effective compiler barrier in itself. But the moment LTO is on, and it can see inside the definition (or once it's made inline, like it is in this patch), that reordering will happen.

An alternative would be to implement the atomic loads and stores like asm volatile("STR %0,[%1]" : : : "memory"), so that they are barriers like the STREX or PRIMASK, but that would significantly impede code generation.
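A fuller version of that rejected alternative might look like the following. The operand constraints are filled in by me, and the #else branch is a portable stand-in so the sketch compiles off-target; this is not code from the PR:

```c
#include <stdint.h>

/* Store implemented as an asm statement with a "memory" clobber: the
 * store itself then acts as a compiler barrier, like STREX or a PRIMASK
 * access, but the clobber forces the compiler to spill and reload
 * registers around every call - the code-generation cost noted above. */
static inline void barrier_store_u32(volatile uint32_t *ptr, uint32_t value)
{
#if defined(__arm__)
    __asm volatile("str %1, [%0]" : : "r"(ptr), "r"(value) : "memory");
#else
    /* portable stand-in: plain store bracketed by compiler fences */
    __asm volatile("" : : : "memory");
    *ptr = value;
    __asm volatile("" : : : "memory");
#endif
}
```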

@kjbracey kjbracey changed the title Add atomic loads and stores with barriers Add atomic loads and stores and barriers Jan 4, 2019
@pan- (Member) left a comment

Thanks for taking the time to answer my concerns @kjbracey-arm ; it all makes sense now.

@kjbracey (Contributor, Author) commented Jan 4, 2019

Reworked the comments more - provide both MBED_COMPILER_BARRIER and MBED_BARRIER (which is currently MBED_COMPILER_BARRIER because non-SMP). I can see that ARMv8 atomics will want "compiler barrier" specifically, while ARMv7 atomics want "DMB or compiler barrier", so provide both.
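The two-tier scheme might be sketched like this. MY_SMP is a hypothetical configuration flag of my own, not something defined by the PR:

```c
/* Pure compiler fence: always only stops compiler reordering. */
#define MBED_COMPILER_BARRIER() __asm volatile("" : : : "memory")

/* Full barrier: identical to the compiler fence while mbed OS is
 * single-core, but would grow a real DMB if SMP support were added. */
#if defined(MY_SMP) && defined(__arm__)
#define MBED_BARRIER() __asm volatile("dmb" : : : "memory")
#else
#define MBED_BARRIER() MBED_COMPILER_BARRIER()
#endif
```

This split lets ARMv8 atomics (whose LDAEX/STLEX already order at the CPU level) ask for only the compiler fence, while ARMv7 atomics can ask for the stronger of the two.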

@kjbracey (Contributor, Author) commented:

I don't think this needs any more work - maybe just more review.

@0xc0170 0xc0170 requested a review from a team January 14, 2019 09:36
@0xc0170 (Contributor) commented Jan 16, 2019

@SenRamakri @deepikabhavnani If you could review this one, it's nearly ready for CI

@deepikabhavnani left a comment

Looks good to me 👍

@cmonr (Contributor) commented Jan 17, 2019

CI started

@mbed-ci commented Jan 17, 2019

Test run: FAILED

Summary: 1 of 1 test jobs failed
Build number: 1
Build artifacts

Failed test jobs:

  • jenkins-ci/mbed-os-ci_unittests

@0xc0170 (Contributor) commented Jan 18, 2019

CI restarted

@cmonr cmonr removed the needs: work label Jan 18, 2019
@mbed-ci commented Jan 18, 2019

Test run: SUCCESS

Summary: 11 of 11 test jobs passed
Build number: 2
Build artifacts

@0xc0170 0xc0170 merged commit a23a850 into ARMmbed:master Jan 21, 2019
@kjbracey kjbracey deleted the atomic_load_store branch January 24, 2019 09:39
kjbracey added a commit to kjbracey/mbed-os that referenced this pull request Jan 29, 2019
When barriers were added in ARMmbed#9247, review comments pointed out that atomic
stores needed a second barrier, and they got one. But atomic_flag_clear
was missed.
7 participants