Janek Janek -4 years ago 72
Java Question

What does a LoadLoad barrier really do?

In Java, when we have two threads sharing the following variables:

int a;
volatile int b;

if thread 1 does:

a = 5;
b = 6;

Then a StoreStore barrier is inserted between these two instructions, and 'a' is flushed back to main memory.

Now if thread 2 does:

if(b == 6)

a LoadLoad barrier is inserted after the read of 'b', and we have a guarantee that if the new value of 'b' is visible, then the new value of 'a' is visible as well. But how is this actually achieved? Does LoadLoad invalidate the CPU caches/registers? Or does it just instruct the CPU to fetch again the values of the variables that are read after the volatile read?

I have found this information about LoadLoad barrier (http://gee.cs.oswego.edu/dl/jmm/cookbook.html):

LoadLoad Barriers The sequence: Load1; LoadLoad; Load2 ensures that
Load1's data are loaded before data accessed by Load2 and all
subsequent load instructions are loaded. In general, explicit LoadLoad
barriers are needed on processors that perform speculative loads
and/or out-of-order processing in which waiting load instructions can
bypass waiting stores. On processors that guarantee to always preserve
load ordering, the barriers amount to no-ops.

but it does not really explain how this is achieved.

Answer

I will give one example of how this is achieved. You can read more on the details here. For x86 processors, as you indicated, LoadLoad barriers end up being no-ops. In the article I linked, Mark points out that

Doug lists the StoreStore, LoadLoad and LoadStore

So in essence the only barrier needed on x86 architectures is a StoreLoad. So how is this achieved at a low level?

This is an excerpt from the blog:

Here's the code it generated for both volatile and non-volatile reads:

nop                       ;*synchronization entry
mov    0x10(%rsi),%rax    ;*getfield x

And for volatile writes:

xchg   %ax,%ax
movq   $0xab,0x10(%rbx)
lock addl $0x0,(%rsp)     ;*putfield x

The lock-prefixed instruction is the StoreLoad barrier listed in Doug's cookbook. But the lock instruction also synchronizes all reads with other processors, as noted:

Locked instructions can be used to synchronize data written by one processor and read by another processor.

This removes the overhead of having to issue LoadLoad and LoadStore barriers for volatile loads.
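To make the barrier placement concrete, here is a minimal sketch that spells out the cookbook barriers by hand using the static fence methods on java.lang.invoke.VarHandle (Java 9+). The class name FenceSketch and the methods writer() and reader() are made up for illustration. The fields are deliberately plain, and the sketch shows only the StoreStore, StoreLoad and LoadLoad barriers discussed above; it is not a full replacement for declaring 'b' volatile.

```java
import java.lang.invoke.VarHandle;

public class FenceSketch {
    static int a;
    static int b; // plain on purpose: the explicit fences stand in for 'volatile'

    // Writer side: roughly where the barriers sit around a volatile store.
    static void writer() {
        a = 5;
        VarHandle.storeStoreFence(); // StoreStore: the store to 'a' is ordered before the store to 'b'
        b = 6;
        VarHandle.fullFence();       // includes StoreLoad; on x86 this is the role of "lock addl"
    }

    // Reader side: roughly where the barrier sits after a volatile load.
    static int reader() {
        int seenB = b;
        VarHandle.loadLoadFence();   // LoadLoad: the load of 'b' is ordered before the load of 'a'
        int seenA = a;
        return seenB == 6 ? seenA : -1; // -1 means the write to 'b' was not observed
    }

    public static void main(String[] args) {
        writer();
        System.out.println(reader()); // single-threaded here, so this prints 5
    }
}
```

On x86 the JIT would be expected to emit no instruction for the StoreStore and LoadLoad fences, since the hardware already preserves those orderings; they still constrain compiler reordering.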

All that being said, I will reiterate what assylias noted. The way it happens should not be important to a developer (if you are a processor or compiler implementer, that is another story). The volatile keyword is a kind of interface saying

  1. A read will return the most up-to-date value written by another thread
  2. You will not get burned by JIT compiler optimizations.
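The two guarantees can be seen from the developer's side with the variables from the question. This is a minimal sketch; the class name VolatilePublish and the method runOnce() are made up for illustration. The reader spins on the volatile 'b', and once it observes 6, the Java memory model guarantees that it also observes a == 5.

```java
public class VolatilePublish {
    static int a;          // plain field, published through 'b'
    static volatile int b; // volatile guard

    // Runs one writer and one reader; returns the value of 'a' the reader saw
    // after it observed b == 6.
    static int runOnce() {
        a = 0;
        b = 0;
        final int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (b != 6) {          // volatile read: cannot be hoisted out of the loop
                Thread.onSpinWait();
            }
            seen[0] = a;              // happens-after the writer's a = 5
        });
        Thread writer = new Thread(() -> {
            a = 5;                    // plain store
            b = 6;                    // volatile store publishes the store to 'a'
        });
        reader.start();
        writer.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce()); // prints 5
    }
}
```

If 'b' were not volatile, the reader's loop could legally spin forever or see a stale 'a', which is the kind of JIT optimization the second point refers to.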