alkis alkis - 1 month ago 9
Node.js Question

Prevent dangerous regex halting the application

Regexes are dangerous, very dangerous. Regex operations are processed by the main thread, the one that runs the event loop. Is it possible to make sure that a dangerous regex won't halt my application? Should I be passing the regex operations to a thread pool of my own? Is there a norm for this? Of course testing, monitoring etc. will be done, but is there a generic approach to preventing this kind of disaster?

Answer

You can use a trick based on Node.js's core vm module (available in the v0.12.x, v4.x and v5.x branches), described in the article Mitigating Catastrophic Backtracking in Node.js Regular Expressions. The idea is to set a timeout on the regex match operation and terminate the match once it exceeds that period of time.

Here is a snippet from the article you may leverage:

const util = require('util');
const vm = require('vm');

// The sandbox object receives the result of the match.
const sandbox = {
    result: null
};
const context = vm.createContext(sandbox);
console.log('Sandbox initialized: ' + vm.isContext(sandbox));

// /^(A+)*B/ backtracks catastrophically on a long run of 'A's with no 'B' after it.
const script = new vm.Script('result = /^(A+)*B/.test(\'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAC\');');
try {
    // If the match has not finished within the timeout, it is
    // likely to be taking exponential time, so it is aborted.
    script.runInContext(context, { timeout: 1000 }); // milliseconds, as a number (not a string)
} catch (e) {
    console.log('ReDoS occurred'); // Take some remedial action here...
}
console.log(util.inspect(sandbox)); // Check the results
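
If you need this in more than one place, the same idea can be wrapped in a small function. The following is only a sketch, not something taken from the article: the name testWithTimeout and the shape of its return value are made up for illustration. It assumes a Node.js version that provides RegExp.prototype.flags, and it assumes that a timeout can be recognised either by the error code ERR_SCRIPT_EXECUTION_TIMEOUT (recent Node.js releases) or by a "timed out" error message (older releases). Any other error is rethrown rather than reported as a ReDoS.

```javascript
const vm = require('vm');

// Hypothetical helper: runs regex.test(input) inside a vm context and
// gives up once timeoutMs milliseconds have passed.
function testWithTimeout(regex, input, timeoutMs) {
    // The pattern and the input are passed in as sandbox data, so nothing
    // user-controlled is concatenated into the script source.
    const sandbox = {
        source: regex.source,
        flags: regex.flags,
        input: input,
        result: null
    };
    const context = vm.createContext(sandbox);
    const script = new vm.Script('result = new RegExp(source, flags).test(input);');

    try {
        script.runInContext(context, { timeout: timeoutMs });
        return { timedOut: false, matched: sandbox.result };
    } catch (e) {
        // Assumption: timeouts carry this code or a "timed out" message.
        const isTimeout = Boolean(e) && (
            e.code === 'ERR_SCRIPT_EXECUTION_TIMEOUT' ||
            String(e.message).indexOf('timed out') !== -1
        );
        if (!isTimeout) {
            throw e; // a genuine error, e.g. an invalid timeout value
        }
        return { timedOut: true, matched: null };
    }
}

// Usage: a harmless input finishes at once, a hostile one is cut off.
console.log(testWithTimeout(/^(A+)*B/, 'AAAB', 1000));
console.log(testWithTimeout(/^(A+)*B/, 'A'.repeat(50) + 'C', 200));
```

Note that the call is still synchronous: the event loop stays blocked until the match finishes or the timeout fires, so the timeout should be kept small. Creating a context per call also has a cost, so this is better suited to a few risky patterns than to every regex in the application.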