Schaemelhout - 18 days ago
Node.js Question

Add thousands of messages to an Azure Storage Queue

I am trying to add about 6000 messages to my Azure Storage Queue in an Azure Function with Node.js.

I have tried multiple ways to do this. Right now I wrap the QueueService method in a Promise and resolve the 6000 promises through a Promise.map with a concurrency of about 50 using Bluebird.

// Promise here is Bluebird, which provides Promise.map
const Promise = require("bluebird");

const addMessages = Promise.map(messages, (msg) => {
    // returns a promise wrapping the Azure QueueService method
    return myQueueService.addMessage(msg);
}, { concurrency: 50 });

// this returns a promise that resolves when all promises have resolved.
// it rejects as soon as one of the promises has rejected.
addMessages.then((results) => {
    console.log("SUCCESS");
}, (error) => {
    console.log("ERROR");
});


My QueueService is created with an ExponentialRetry policy.
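For readers who want to reproduce the setup, here is a minimal sketch of the pattern described above: a callback-style method wrapped in a promise and mapped over the messages with a concurrency limit. It avoids external dependencies, so `mapWithConcurrency` is a hand-rolled stand-in for Bluebird's `Promise.map`, and `fakeQueueService` is a hypothetical stub rather than the real Azure client. The queue name `my-queue` is also an assumption.

```javascript
const { promisify } = require("util");

// Hypothetical stand-in for the Azure QueueService: a callback-style
// createMessage that succeeds asynchronously. Swap in the real client.
const fakeQueueService = {
    sent: [],
    createMessage(queueName, text, callback) {
        setImmediate(() => {
            this.sent.push(text);
            callback(null, { queueName, text });
        });
    },
};

// Wrap the callback-style method in a promise-returning function.
const addMessage = promisify(
    fakeQueueService.createMessage.bind(fakeQueueService)
);

// Run `worker` over `items` with at most `limit` calls in flight,
// similar in spirit to Promise.map with a concurrency option.
async function mapWithConcurrency(items, limit, worker) {
    const results = new Array(items.length);
    let next = 0;
    async function runner() {
        // Each runner keeps pulling the next unclaimed index until none remain.
        while (next < items.length) {
            const index = next++;
            results[index] = await worker(items[index], index);
        }
    }
    const runners = [];
    for (let i = 0; i < Math.min(limit, items.length); i++) {
        runners.push(runner());
    }
    // Rejects as soon as any worker call rejects.
    await Promise.all(runners);
    return results;
}

const messages = Array.from({ length: 200 }, (_, i) => "message " + i);
const done = mapWithConcurrency(messages, 50, (msg) =>
    addMessage("my-queue", msg)
);
done.then(
    (results) => console.log("SUCCESS", results.length),
    (error) => console.log("ERROR", error.message)
);
```

Note that the outer promise can only settle if every wrapped callback is eventually invoked. If a single underlying call never calls back, the whole batch stays pending, which would match the "does not resolve (or reject)" symptom.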




I have had mixed results using this strategy:


  • All messages get added to my queue and the promise resolves correctly.

  • All messages get added to my queue and the promise does not resolve (or reject).

  • Not all messages get added to my queue and the promise does not resolve (or reject).






Am I missing something or is it possible for my calls to sometimes take 2 minutes to resolve and sometimes more than 10 minutes?

In the future, I will probably have to add about 100,000 messages, so I'm kind of worried about the unpredictable results I get now.


What would be the best strategy to add a large number of messages in Node (in an Azure Function)?


EDIT:

Not sure how I missed this, but a pretty reliable way to add my messages to my Storage Queue is to use the queue output binding of my Azure Function:

https://docs.microsoft.com/en-us/azure/azure-functions/functions-bindings-storage-queue#storage-queue-output-binding

Makes my code a lot easier as well!

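// Initialize the output binding as an array before pushing to it.
context.bindings.outputQueue = [];
for (var i = 0; i < messages.length; i++) {
    // Push each individual message, not the whole array.
    context.bindings.outputQueue.push(messages[i]);
}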

Answer

What triggers this function? What I would recommend, instead of having a single function add all of those messages, is to fan out and allow those functions to scale and take better advantage of concurrency by limiting the amount of work they're doing.

With what I'm proposing above, the function that handles the trigger you have in place today would queue up the work, which would in turn be processed by another function that performs the actual work of adding a (much) smaller number of messages to the queue. You may need to play with the numbers to see what works well based on your workload, but this pattern would allow those functions to scale better (including across multiple machines), handle failures better, and improve reliability and predictability.

As an example, you could include the number of messages to add in the message you queue to trigger the work. If you wanted 1000 messages as the final output, you could queue 10 messages, each instructing a "worker" function to add 100 messages. I would also recommend playing with much smaller numbers per function.
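The arithmetic of that split can be sketched as a small helper. This is only an illustration of the fan-out idea: the function name `planWorkItems` and the `offset` and `count` fields are hypothetical, and how the work items reach the worker functions (for example through a queue output binding) is left out.

```javascript
// Split a total message count into work items for "worker" functions.
// Each work item says where to start and how many messages to add.
function planWorkItems(total, perWorker) {
    if (!Number.isInteger(total) || total < 0) {
        throw new RangeError("total must be a non-negative integer");
    }
    if (!Number.isInteger(perWorker) || perWorker < 1) {
        throw new RangeError("perWorker must be a positive integer");
    }
    const items = [];
    for (let offset = 0; offset < total; offset += perWorker) {
        // The last item may be smaller when total is not a multiple of perWorker.
        items.push({ offset, count: Math.min(perWorker, total - offset) });
    }
    return items;
}

const workItems = planWorkItems(1000, 100);
console.log(workItems.length); // 10
console.log(workItems[9]);     // { offset: 900, count: 100 }
```

The dispatching function would queue one message per work item, and each worker would add only its own `count` messages, so a failed worker can be retried without redoing the whole batch.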

I hope this helps!
