I'm writing a stateful service hosted in Service Fabric. The service's job is to consume messages from an external queue, transform them, and place them onto our own messaging system. Throughput can be up to 6k messages/sec according to the supplier's docs.
I've configured the service into multiple partitions to spread the message load, and each partition has min 2/max 3 replicas. To recover from a failure I can subscribe to the supplier's queue and pass in a timestamp from which point I wish to receive messages. To do this I'm storing the timestamp of the last message processed in service state. Due to the volume of messages I decided to do this 'save' on a timer (and allow potential duplicates of messages downstream).
This is the code that is called by the timer:
```csharp
private async void _timer_Elapsed(object sender, ElapsedEventArgs e)
{
    var saveRetryPolicy = Policy
        .Handle<Exception>()
        // Back-off duration shown here is illustrative; the original was truncated.
        .WaitAndRetryAsync(5, retryAttempt => TimeSpan.FromSeconds(Math.Pow(2, retryAttempt)));

    await saveRetryPolicy.ExecuteAsync(async () =>
    {
        using (var tx = _stateManager.CreateTransaction())
        {
            var state = await _stateManager.TryGetAsync<IReliableDictionary<string, long>>(TimestampStateName);
            await state.Value.AddOrUpdateAsync(tx, TimestampStateName, _lastTXTimestamp, (s, l) => _lastTXTimestamp);
            await tx.CommitAsync();
        }
    });
}
```
Note that `TryGetAsync` returns a `ConditionalValue<T>`, so `state.Value` is null until the dictionary has been created; to avoid that, the dictionary can instead be created on demand inside the transaction:

```csharp
var s =
    await _stateManager.GetOrAddAsync<IReliableDictionary<string, long>>(tx, TimestampStateName);
```
Answer from comments:
Make sure you don't start the timer on all replicas, but only on the primary replica.
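As a sketch of that advice (assuming `_timer` is the `System.Timers.Timer` from the question; the exact keep-alive loop is an assumption, not from the original post): `RunAsync` is invoked by Service Fabric only while a replica is primary, so starting the timer there and stopping it on cancellation keeps secondaries from ever writing state.

```csharp
// Sketch only: RunAsync runs on the primary replica, so the timer is
// started here and stopped when the replica loses primary status.
protected override async Task RunAsync(CancellationToken cancellationToken)
{
    _timer.Start();

    try
    {
        // Keep the replica's run loop alive until demotion or shutdown
        // signals cancellation.
        await Task.Delay(Timeout.Infinite, cancellationToken);
    }
    finally
    {
        // Cancellation fires when this replica is no longer primary;
        // stop the timer so no further saves are attempted.
        _timer.Stop();
    }
}
```

Service Fabric treats an `OperationCanceledException` thrown out of `RunAsync` as a normal shutdown, so no extra handling is needed around the `Task.Delay`.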