sinelaw sinelaw - 1 month ago 13
C Question

Do atomic_store/load from <stdatomic.h> work for unaligned, cross-cache-line data on Intel?

Will data stored with atomic_store, and loaded with atomic_load always appear consistent?

Specifically: A C11 program accesses 64-bit data placed deliberately on the boundary between cache lines on a modern Intel CPU. It uses atomic_store & atomic_load (from <stdatomic.h>) to access this data from multiple threads (running on different cores).

Will the data always appear consistent, or will loading it (atomic_load) sometimes have some bytes belonging to an old value, and other bytes belonging to a newer value?

Here are the essential struct and variable definitions and the interesting part of the program, happening in a loop, in parallel from multiple threads:

struct Data {
    uint8_t bytes[CACHELINE__BYTECOUNT - 4];
    atomic_uint_fast64_t u64;
} __attribute__((packed)) __attribute__((aligned ((CACHELINE__BYTECOUNT))));

#define VAL1 (0x1111111111111111)
#define VAL2 (0xFFFFFFFFFFFFFFFF)

static struct Data data = { .u64 = VAL1 };

...

for (uint32_t j = 0; j < 1000; j++) {
    atomic_store(&data.u64, VAL1);
    atomic_store(&data.u64, VAL2);
}
const uint64_t val = atomic_load(&data.u64);
/* is 'val' always VAL1 or VAL2? */


(Full runnable program: https://gist.github.com/sinelaw/1230d4675d6a4fff394110f17e463954)
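
To make the layout concrete, here is a minimal sketch that computes where u64 ends up in the packed struct. It assumes a 64-byte cache line (CACHELINE__BYTECOUNT is not defined in the excerpt above, so the value 64 is an assumption), and the helpers crosses_cache_line and u64_offset are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines, as on typical current Intel CPUs. */
#define CACHELINE__BYTECOUNT 64

/* Same layout as in the question: 'packed' removes the padding that
 * would otherwise align u64 to its natural boundary. */
struct Data {
    uint8_t bytes[CACHELINE__BYTECOUNT - 4];
    atomic_uint_fast64_t u64;
} __attribute__((packed)) __attribute__((aligned(CACHELINE__BYTECOUNT)));

/* Hypothetical helper: true if the byte range [offset, offset + size)
 * touches more than one cache line. */
static bool crosses_cache_line(size_t offset, size_t size)
{
    return offset / CACHELINE__BYTECOUNT
        != (offset + size - 1) / CACHELINE__BYTECOUNT;
}

/* Hypothetical helper: byte offset of u64 inside struct Data. */
static size_t u64_offset(void)
{
    return offsetof(struct Data, u64);
}
```

Under these assumptions u64 starts 4 bytes before the end of the first cache line, so its 8 bytes are split 4/4 across two lines, which matches the half-old, half-new value in the output below.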

Checking it with gcc 6.3.0 and clang 3.7 shows it isn't atomic:

$ clang -std=c11 -Wall -Wextra /tmp/atomic.c -o /tmp/atomic -lpthread
$ /tmp/atomic
ERROR: oh no, got: 11111111FFFFFFFF


So either there's a bug in the program, or I misunderstood <stdatomic.h>, or there's a bug in the compilers.

Art Art
Answer Source

A correctly written program cannot obtain an object that isn't correctly aligned. A correctly aligned int64 can't cross cache lines, because its address is a multiple of 8 and the cache-line size is itself a multiple of 8.

So the answer to your question is: there's a bug in your program. A bug deliberately introduced by you through using non-standard constructs (__attribute__) to break things.
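
As a rough illustration of the point about alignment, the sketch below declares a similar struct without the packed attribute, using only standard C11 alignas. It is not the asker's program: the name AlignedData, the helpers crosses_cache_line and u64_is_safe, and the 64-byte cache-line size are all assumptions made for the example.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. */
#define CACHELINE__BYTECOUNT 64

/* No 'packed': the compiler is free to insert padding after 'bytes'
 * so that u64 lands on its natural alignment boundary. */
struct AlignedData {
    alignas(CACHELINE__BYTECOUNT) uint8_t bytes[CACHELINE__BYTECOUNT - 4];
    atomic_uint_fast64_t u64;
};

/* Hypothetical helper: true if [offset, offset + size) spans two lines. */
static bool crosses_cache_line(size_t offset, size_t size)
{
    return offset / CACHELINE__BYTECOUNT
        != (offset + size - 1) / CACHELINE__BYTECOUNT;
}

/* Hypothetical helper: checks that u64 is naturally aligned and
 * therefore contained in a single cache line. */
static bool u64_is_safe(void)
{
    size_t off = offsetof(struct AlignedData, u64);
    return off % alignof(atomic_uint_fast64_t) == 0
        && !crosses_cache_line(off, sizeof(atomic_uint_fast64_t));
}
```

On a typical x86-64 toolchain the padding should move u64 to the start of the second cache line, so atomic_store and atomic_load operate on a value that lives entirely within one line.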

It would be crazy for the compiler to go out of its way to ensure that stdatomic works for unaligned values, because that would require a global lock, which is exactly what stdatomic is there to avoid.
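
For reference, <stdatomic.h> exposes macros that report whether an atomic type is implemented without locks. The sketch below reads ATOMIC_LLONG_LOCK_FREE, which is a standard C11 macro; the helper names lock_free_level and llong_lock_free are hypothetical. Note the assumption that this property describes objects laid out by the compiler with their normal alignment; it says nothing about an object forced onto an unaligned address.

```c
#include <assert.h>
#include <stdatomic.h>
#include <string.h>

/* Hypothetical helper: translate a C11 *_LOCK_FREE macro value
 * (0, 1 or 2) into a readable description. */
static const char *lock_free_level(int level)
{
    switch (level) {
    case 0:  return "never lock-free";
    case 1:  return "sometimes lock-free";
    case 2:  return "always lock-free";
    default: return "unknown";
    }
}

/* Hypothetical helper: lock-free status of atomic long long types
 * as reported by the implementation. */
static const char *llong_lock_free(void)
{
    return lock_free_level(ATOMIC_LLONG_LOCK_FREE);
}
```

Printing llong_lock_free() from main shows what the implementation promises for 64-bit atomics on the target in use.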