ConditionRacer ConditionRacer - 3 months ago 18
C# Question

Why does Peterson's lock fail in this test?

I'm experimenting with locks that don't require atomic instructions. Peterson's algorithm seemed like the simplest place to start. However, with enough iterations, something goes wrong.

Code:

public class Program
{
private static volatile int _i = 0;
public static void Main(string[] args)
{
for (int i = 0; i < 1000; i++)
{
RunTest();
}
Console.Read();
}

private static void RunTest()
{
_i = 0;
var lockType = new PetersonLock();
var t1 = new Thread(() => Inc(0, lockType));
var t2 = new Thread(() => Inc(1, lockType));

t1.Start();
t2.Start();

t1.Join();
t2.Join();
Console.WriteLine(_i);
}

private static void Inc(int pid, ILock lockType)
{
try
{
for (int i = 0; i < 1000000; i++)
{
lockType.Request(pid);
_i++;
lockType.Release(pid);
}
}
catch (Exception ex)
{
Console.WriteLine(ex);
}
}
}

public interface ILock
{
void Request(int pid);
void Release(int pid);
}

public class NoLock : ILock
{
public void Request(int pid) { }
public void Release(int pid) { }
}

public class StandardLock : ILock
{
private object _sync = new object();

public void Request(int pid)
{
Monitor.Enter(_sync);
}

public void Release(int pid)
{
Monitor.Exit(_sync);
}
}

public class PetersonLock : ILock
{
private volatile bool[] _wantsCs = new bool[2];
private volatile int _turn;

public void Request(int pid)
{
int j = pid == 1 ? 0 : 1;
_wantsCs[pid] = true;
_turn = j;
while (_wantsCs[j] && _turn == j)
{
Thread.Sleep(0);
}
}

public void Release(int pid)
{
_wantsCs[pid] = false;
}
}


When I run this, I consistently get < 2,000,000. What's going on?

Answer

The problem here is these two statements:

_wantsCs[pid] = true;
_turn = j;

The memory model of .NET and C# allows these two writes to be reordered.

To force them to not be reordered, add a memory barrier between them:

_wantsCs[pid] = true;
Thread.MemoryBarrier();
_turn = j;

and you will get the expected output every time.

Note that this very problem is described on the Wikipedia page for Peterson's Algorithm in the note section (shortened down here, go read the article for the full notes):

Notes
...
Most modern CPUs reorder memory accesses to improve execution efficiency (see memory ordering for types of reordering allowed). Such processors invariably give some way to force ordering in a stream of memory accesses, typically through a memory barrier instruction. Implementation of Peterson's and related algorithms on processors which reorder memory accesses generally requires use of such operations to work correctly to keep sequential operations from happening in an incorrect order.

(my emphasis)