Pawel Troka - 1 month ago
C# Question

Concurrent Collection with fastest possible Add, Remove and Find the highest

I am doing some heavy computations in C# .NET, and while running them inside a Parallel.For loop I have to collect some data in a collection. Because of limited memory I can't keep every result, so I only store the best ones.

Those computations have to be as fast as possible because they already take too long. After a lot of optimizing I found out that the slowest part was my ConcurrentDictionary collection. I am wondering whether I should switch to something with faster add, remove and find-the-highest operations (perhaps a sorted collection) and just use locks for my main operation, or whether I can do something good with a concurrent collection and speed it up a little.

Here is my actual code. I know it's bad because of this huge lock, but without it I seem to lose consistency and a lot of my remove attempts fail.

public class SignalsMultiValueConcurrentDictionary : ConcurrentDictionary<double, ConcurrentBag<Signal>>
{
    public int Limit { get; set; }
    public double WorstError { get; private set; }

    public SignalsDictionaryState TryAddSignal(double key, Signal signal, out Signal removed)
    {
        SignalsDictionaryState state;
        removed = null;

        if (this.Count >= Limit && signal.AbsoluteError > WorstError)
            return SignalsDictionaryState.NoAddedNoRemoved;

        lock (this)
        {
            if (this.Count >= Limit)
            {
                ConcurrentBag<Signal> signals;
                if (TryRemove(WorstError, out signals))
                {
                    removed = signals.FirstOrDefault();
                    state = SignalsDictionaryState.AddedAndRemoved;
                }
                else
                    state = SignalsDictionaryState.AddedFailedRemoved;
            }
            else
                state = SignalsDictionaryState.AddedNoRemoved;

            this.Add(key, signal);
            WorstError = Keys.Max();
        }
        return state;
    }

    private void Add(double key, Signal value)
    {
        ConcurrentBag<Signal> values;
        if (!TryGetValue(key, out values))
        {
            values = new ConcurrentBag<Signal>();
            this[key] = values;
        }

        values.Add(value);
    }
}


Note also that because I key the dictionary by the signal's absolute error, I sometimes (it should be very rare) store more than one value under a single key.

The only operation used in my computations is TryAddSignal, because it does exactly what I want: if I already have more signals than the limit, it removes the signal with the highest error and adds the new one.

Because I set the Limit property at the start of the computations, I don't need a resizable collection.

The main problem is that even without that huge lock, Keys.Max is a little too slow. So maybe I need a different collection?

usr
Answer

Keys.Max() is the killer: it is O(N) and scans every key on every insert. There is no need for a dictionary at all if this is all you do with it.

You can't incrementally maintain the max value because you are both adding and removing, so use a data structure built for this; trees are. The BCL has SortedList, SortedSet and SortedDictionary; SortedSet and SortedDictionary are backed by a red-black tree, and SortedSet exposes Min and Max directly.
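
To make that concrete, here is a minimal sketch of the tree-based route (not the poster's class): a bounded "keep the best Limit signals" collection built on SortedSet<T>, whose Max property replaces the full Keys.Max() scan. The class name, the tuple layout and the sequence-number tie-breaker are illustrative assumptions; Signal and its AbsoluteError come from the question.

using System;
using System.Collections.Generic;

// Hypothetical sketch: keeps at most Limit signals ordered by error and
// evicts the worst one when a better signal arrives.
public sealed class BestSignals
{
    // SortedSet<T> is a red-black tree: Add, Remove and Max are all O(log n).
    // The running sequence number breaks ties so that two signals with the
    // same absolute error can both live in the set.
    private readonly SortedSet<(double Error, long Seq, Signal Signal)> _set =
        new SortedSet<(double Error, long Seq, Signal Signal)>(
            Comparer<(double Error, long Seq, Signal Signal)>.Create((a, b) =>
            {
                int byError = a.Error.CompareTo(b.Error);
                return byError != 0 ? byError : a.Seq.CompareTo(b.Seq);
            }));

    private readonly object _gate = new object();
    private long _seq;

    public int Limit { get; }

    public BestSignals(int limit) => Limit = limit;

    // Returns true if the signal was stored; the evicted signal (if any)
    // comes back through "removed".
    public bool TryAddSignal(double error, Signal signal, out Signal removed)
    {
        removed = null;
        lock (_gate)
        {
            if (_set.Count >= Limit)
            {
                var worst = _set.Max;            // O(log n), no Keys.Max() scan
                if (error >= worst.Error)
                    return false;                // not better than current worst
                _set.Remove(worst);
                removed = worst.Signal;
            }
            _set.Add((error, _seq++, signal));
            return true;
        }
    }
}

The lock stays, but the work inside it drops from an O(N) key scan to O(log Limit) tree operations, so the critical section becomes much cheaper.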

Or use a .NET collection library that provides a priority queue (a heap).
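
As a hedged illustration of that route only: the sketch below uses PriorityQueue<TElement, TPriority>, which ships with .NET 6 and later (at the time of the question a third-party heap would be needed instead). The comparer is inverted so the signal with the largest error sits at the head and can be evicted in O(log n); the class name is made up for the example.

using System;
using System.Collections.Generic;

// Hypothetical sketch of the priority-queue route (PriorityQueue requires
// .NET 6+). The inverted comparer turns it into a max-heap on error, so the
// head is always the worst signal kept so far.
public sealed class SignalHeap
{
    private readonly PriorityQueue<Signal, double> _heap =
        new PriorityQueue<Signal, double>(
            Comparer<double>.Create((a, b) => b.CompareTo(a)));

    private readonly object _gate = new object();

    public int Limit { get; }

    public SignalHeap(int limit) => Limit = limit;

    public bool TryAddSignal(double error, Signal signal, out Signal removed)
    {
        removed = null;
        lock (_gate)
        {
            if (_heap.Count >= Limit)
            {
                // Head of the inverted heap is the current worst error.
                _heap.TryPeek(out _, out double worstError);
                if (error >= worstError)
                    return false;                // not better than current worst
                removed = _heap.Dequeue();       // drop the worst, O(log n)
            }
            _heap.Enqueue(signal, error);
            return true;
        }
    }
}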

Bug: Add is racy. Two threads can both miss in TryGetValue, and the second this[key] = values then overwrites a bag that already holds a signal.
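
If the existing ConcurrentDictionary is kept, that race can be closed by letting GetOrAdd decide which bag wins instead of overwriting with the indexer. A minimal sketch of the poster's Add rewritten that way (the method body is the only assumed change, inside the same class):

private void Add(double key, Signal value)
{
    // GetOrAdd returns whichever bag actually ended up in the dictionary,
    // so a concurrently inserted bag can no longer be overwritten and lost.
    var values = GetOrAdd(key, _ => new ConcurrentBag<Signal>());
    values.Add(value);
}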
