NanoWizard NanoWizard - 4 months ago 24
Javascript Question

Efficient computation of n choose k in Node.js

I have some performance sensitive code on a Node.js server that needs to count combinations. From this SO answer, I used this simple recursive function for computing n choose k:

function choose(n, k) {
if (k === 0) return 1;
return (n * choose(n-1, k-1)) / k;
}


Then since we all know iteration is almost always faster than recursion, I wrote this function based on the multiplicative formula:

function choosei(n,k){
var result = 1;
for(var i=1; i <= k; i++){
result *= (n+1-i)/i;
}
return result;
}


I ran a few benchmarks on my machine. Here are the results of just one of them:

Recursive x 178,836 ops/sec ±7.03% (60 runs sampled)
Iterative x 550,284 ops/sec ±5.10% (51 runs sampled)
Fastest is Iterative


The results consistently showed that the iterative method is indeed about 3 to 4 times faster than the recursive method in Node.js (at least on my machine).

This is probably fast enough for my needs, but is there any way to make it faster? My code has to call this function very frequently, sometimes with fairly large values of
n
and
k
, so the faster the better.

EDIT



After running a few more tests with le_m's and Mike's solutions, it turns out that while both are significantly faster than the iterative method I proposed, Mike's method using Pascal's triangle appears to be slightly faster than le_m's log table method.

Recursive x 189,036 ops/sec ±8.83% (58 runs sampled)
Iterative x 538,655 ops/sec ±6.08% (51 runs sampled)
LogLUT x 14,048,513 ops/sec ±9.03% (50 runs sampled)
PascalsLUT x 26,538,429 ops/sec ±5.83% (62 runs sampled)
Fastest is PascalsLUT


The logarithmic look up method has been around 26-28 times faster than the iterative method in my tests, and the method using Pascal's triangle has been about 1.3 to 1.8 times faster than the logarithmic look up method.

Note that I followed le_m's suggestion of pre-computing the logarithms with higher precision using mathjs, then converted them back to regular JavaScript
Number
s (which are always double-precision 64 bit floats).

Answer

Never compute factorials, they grow too quickly. Instead compute the result you want. In this case, you want the binomial numbers, which have an incredibly simple geometric construction: you can build pascal's triangle, as you need it, and do it using plain arithmetic.

Start with [1] and [1,1]. The next row has [1] at the start, [1+1] in the middle, and [1] at the end: [1,2,1]. Next row: [1] at the start, the sum of the first two terms in spot 2, the sum of the next two terms in spot three, and [1] at the end: [1,3,3,1]. Next row: [1], then 1+3=4, then 3+3=6, then 3+1=4, then [1] at the end, and so on and so on. As you can see, no factorials, logarithsm, or even multiplications: just super fast addition with clean integer numbers. So simple, you can build a massive lookup table by hand.

And you should.

But you can't build an infinite lookup table, so you compromise: you start with a prespecified LUT, and a function that can "fill it up" to some term you need that's not in it yet:

module.exports = (function() {
  // step 1: a basic LUT with a few steps of Pascal's triangle
  var binomials = [
    [1],
    [1,1],
    [1,2,1],
    [1,3,3,1],
    [1,4,6,4,1]
  ];

  // step 2: a function that builds out the LUT if it needs to
  function binomial(n,k) {
    while(n >= binomials.length) {
      let s = binomials.length;
      nextRow = [];
      nextRow[0] = 1;
      for(let i=1, prev=s-1; i<s; i++) {
        nextRow[i] = binomials[prev][i-1] + binomials[prev][i];
      }
      nextRow[s] = 1;
      console.log(nextRow);
      binomials.push(nextRow);
    }
    return binomials[n][k];
  }

  return binomial;
}());

Since this is an array of ints, the memory footprint is tiny. For a lot of work involving binomials, we realistically don't even need more than two bytes per integer, making this a minute lookup table: we don't need more than 2 bytes until you need binomials higher than n=19, and the full lookup table up to n=19 takes up a measly 380 bytes. This is nothing compared to the rest of your program. Even if we allow for 32 bit ints, we can get up to n=35 in a mere 2380 bytes.

So the lookup is fast: either O(constant) for previously computed values, (n*(n+1))/2 steps if we have no LUT at all, or O(n²), and somewhere in between for terms we need that aren't in the LUT yet. Run some benchmarks for your application, which will tell you how big your initial lut should be, simply hard code that (seriously. these are constants, they're are exactly the kind of values that should be hard coded), and keep the generator around just in case.

However, do remember that you're in JavaScript land, and you are constrained by the JavaScript numerical type: integers only go up to 2^53, beyond that the integer property (every n has a distinct m=n+1 such that m-n=1) is not guaranteed. This should hardly ever be a problem, though: once we hit that limit, we're dealing with binomial coefficients that you should never even be using.

Comments