Motig Motig - 28 days ago 7
C# Question

Thread.VolatileRead() vs Volatile.Read()

We are told to prefer Volatile.Read over Thread.VolatileRead in most cases due to the latter emitting a full-fence, and the former emitting only the relevant half-fence (e.g. acquire fence); which is more efficient.

However, in my understanding, Thread.VolatileRead actually offers something that Volatile.Read does not, because of the implementation of Thread.VolatileRead:

public static int VolatileRead(ref int address) {
    int num = address;
    Thread.MemoryBarrier();
    return num;
}


Because of the full memory barrier on the second line of the implementation, I believe that VolatileRead actually ensures that the value last written to address will be read.
According to Wikipedia, "A full fence ensures that all load and store operations prior to the fence will have been committed prior to any loads and stores issued following the fence."

Is my understanding correct? And therefore, does Thread.VolatileRead still offer something that Volatile.Read does not?

Answer

I may be a little late to the game, but I would still like to chime in. First we need to agree on some basic definitions.

  • acquire-fence: A memory barrier in which other reads and writes are not allowed to move before the fence.
  • release-fence: A memory barrier in which other reads and writes are not allowed to move after the fence.

I like to use an arrow notation to help illustrate the fences in action. An ↑ arrow will represent a release-fence and a ↓ arrow will represent an acquire-fence. Think of the arrow head as pushing memory access away in the direction of the arrow. But, and this is important, memory accesses can move past the tail. Read the definitions of the fences above and convince yourself that the arrows visually represent those definitions.

Using this notation, let us analyze the examples from JaredPar's answer, starting with Volatile.Read. But first, let me point out that Console.WriteLine probably produces a full-fence barrier unbeknownst to us. We should pretend for a moment that it does not, to make the examples easier to follow. In fact, I will just omit the call entirely, as it is unnecessary in the context of what we are trying to achieve.

// Example using Volatile.Read
x = 13;
var local = y; // Volatile.Read
↓              // acquire-fence
z = 13;

So using the arrow notation we can more easily see that the write to z cannot move up and before the read of y. Nor can the read of y move down and after the write to z, because that would be effectively the same as the other way around. In other words, the fence locks the relative ordering of y and z. However, the read of y and the write to x can be swapped, as there is no arrow head preventing that movement. Likewise, the write to x can move past the tail of the arrow and even past the write to z. The specification technically allows for that, theoretically anyway. That means we have the following valid orderings.

Volatile.Read
---------------------------------------
write x    |    read y     |    read y
read y     |    write x    |    write z
write z    |    write z    |    write x
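For readers who want to see the first listing as compilable C#, here is a minimal sketch. The class name VolatileReadExample and the static fields x, y and z are hypothetical, chosen only to mirror the listing above. The acquire semantics are attached to the read of y itself.

```csharp
using System.Threading;

public static class VolatileReadExample
{
    // Hypothetical shared fields mirroring x, y and z from the listing above.
    private static int x, y, z;

    public static int Run()
    {
        x = 13;                           // may be reordered with the read of y
        int local = Volatile.Read(ref y); // read with acquire semantics
        z = 13;                           // may not move before the read of y
        return local;
    }
}
```

Running this on a single thread cannot demonstrate any reordering. It only shows where the acquire-fence sits relative to the three memory accesses.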

Now let us move on to the example with Thread.VolatileRead. For the sake of the example I will inline the call to Thread.VolatileRead to make it easier to visualize.

// Example using Thread.VolatileRead
x = 13;
var local = y; // inside Thread.VolatileRead
↑              // Thread.MemoryBarrier / release-fence
↓              // Thread.MemoryBarrier / acquire-fence
z = 13;

Look closely. There is no arrow (because there is no memory barrier) between the write to x and the read of y. That means these memory accesses are still free to move around relative to each other. However, the call to Thread.MemoryBarrier, which produces the additional release-fence, makes it appear as if the next memory access had volatile write semantics. This means the writes to x and z can no longer be swapped.

Thread.VolatileRead
-----------------------
write x    |    read y
read y     |    write x
write z    |    write z
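The same sketch for the second listing, with the body of Thread.VolatileRead inlined as a plain read followed by Thread.MemoryBarrier, following the implementation quoted in the question. The class name FullFenceReadExample and the fields are again hypothetical.

```csharp
using System.Threading;

public static class FullFenceReadExample
{
    // Hypothetical shared fields mirroring x, y and z from the listing above.
    private static int x, y, z;

    public static int Run()
    {
        x = 13;                 // may still be reordered with the read of y
        int local = y;          // plain read, as inside Thread.VolatileRead
        Thread.MemoryBarrier(); // full fence: nothing crosses it in either direction
        z = 13;                 // stays after both the write to x and the read of y
        return local;
    }
}
```

The only structural difference from a Volatile.Read version is that the fence is a separate full barrier after the read, which is what pins the write to x above the write to z.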

Of course it has been claimed that Microsoft's implementation of the CLI (the .NET Framework) and the x86 hardware already guarantee release-fence semantics for all writes. So in that case there may not be any difference between the two calls. On an ARM processor with Mono? Things might be different in that case.

Let us move on now to your questions.

Because of the full memory barrier on the second line of the implementation, I believe that VolatileRead actually ensures that the value last written to address will be read. Is my understanding correct?

No. This is not correct! A volatile read is not the same as a "fresh read". Why? It is because the memory barrier is placed after the read instruction. That means the actual read is still free to move up or backwards in time. Another thread could write to the address, but the current thread might have already moved the read to a point in time before that other thread committed it.

So this raises the question, "Why do people bother using volatile reads if they seemingly guarantee so little?" The answer is that a volatile read guarantees that the next read will not return an older value than the previous read did. That is its value! That is why a lot of lock-free code spins in a loop until the logic can determine that the operation completed successfully. In other words, lock-free code exploits the fact that a later read in a sequence of reads will return a value at least as new as an earlier one, but the code should not assume that any of the reads necessarily returns the latest value.
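The spin-and-retry pattern mentioned above might look roughly like the following. This is only an illustrative sketch, not code from either answer: the CappedCounter class and its TryIncrementBelow method are hypothetical names. Each pass treats the value it read as possibly stale and relies on Interlocked.CompareExchange to detect that.

```csharp
using System.Threading;

public sealed class CappedCounter
{
    private int count; // hypothetical shared state

    public int Count => Volatile.Read(ref count);

    // Increments the counter unless it has already reached the limit.
    public bool TryIncrementBelow(int limit)
    {
        while (true)
        {
            // This read may already be stale by the time it is used.
            int observed = Volatile.Read(ref count);
            if (observed >= limit)
            {
                return false;
            }

            // The exchange succeeds only if count still equals observed.
            if (Interlocked.CompareExchange(ref count, observed + 1, observed) == observed)
            {
                return true;
            }
            // Another thread changed count in the meantime, so read again and retry.
        }
    }
}
```

The loop never assumes that observed is the latest value. It only needs each retry to see a value no older than the one seen before, and lets the compare-and-exchange reject any decision that was based on stale data.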

Think about this for a minute. What does it even mean for a read to return the latest value anyway? By the time you use that value it might not be the latest anymore. Another thread may have already written a different value to the same address. Can you still call that value the latest?

But if, after considering the caveats discussed above about what it even means to have a "fresh" read, you still want something that acts like a "fresh" read, then you would need to place an acquire-fence before the read. Note that this is clearly not the same thing as a volatile read, but it would better match a developer's intuition of what "fresh" means. However, the term "fresh" in this case is not an absolute. Instead, the read is "fresh" relative to the barrier. That is, it cannot be any older than the point in time at which the barrier was executed. But, as was mentioned above, the value may not represent the latest value by the time you use it or make a decision based on it. Just keep that in mind.
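A fence-before-read helper could be sketched as follows. The FreshRead class and the ReadAfterBarrier method are hypothetical names. As an assumption for illustration, the sketch uses Thread.MemoryBarrier, which is a full fence and therefore stronger than the acquire-fence described above.

```csharp
using System.Threading;

public static class FreshRead
{
    // Reads the value only after a fence, so the read cannot move above the fence.
    public static int ReadAfterBarrier(ref int address)
    {
        Thread.MemoryBarrier(); // full fence; includes the acquire-fence behavior
        return address;         // no older than the moment the fence executed
    }
}
```

Compared with the Thread.VolatileRead implementation in the question, the only change is the order of the two statements. The returned value can still be out of date by the time the caller acts on it.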

And therefore, does Thread.VolatileRead still offer something that Volatile.Read does not?

Yes. I think JaredPar presented a perfect example of a case where it can offer something additional.
