SepehrM SepehrM - 28 days ago 12
C# Question

Parallel calculation of BigInteger factorial

As a part of my BigDecimal library, I need to calculate the factorial of any given non negative integer. So I'm using .Net 4.0 's

System.Numerics.BigInteger
to be able to store huge numbers. Here is the function I'm using:

private BigInteger Factorial(BigInteger x)
{
BigInteger res = x;
x--;
while (x > 1)
{
res *= x;
x--;
}
return res;
}


It's working but not optimized. Now I want to use parallel computing, so here is what I've tried: (I have no experience with parallel programming)

public BigInteger Factorial(long x)
{
BigInteger res = 1;
ParallelLoopResult r = Parallel.For(2L, (x + 1), i =>
res *= i
);
return res;
}


The weird problem is that the above function works perfectly for small numbers like 5! but doesn't work for big numbers like 1000! and each time returns a completely different result. So I realized it's not thread safe and the problem is with the variable
res
. I wonder what the correct implementation is?

And It would be nicer if I could use BigInteger instead of long for variable
x
.

Answer

You need to make sure your parallel processes do not share any state.

For example, in the case of the factorial, I would do the following:

  • set a degree of parallelism (DOP) - usually the number of processors on your machine for compute-bound operations
  • divide the problem into independent subsets
  • process each subset in parallel
  • aggregate the results obtained from the parallel processes

This is somehow a simplified Map-Reduce.

The problem consists in multiplying together a set numbers. One way to divide this set into subsets is to use N parallel for loops where each start at a value i (where 0 < i <= N) with a step of N (and N = DOP).

Here's the code that does that:

/// <summary>
/// The max number of parallel tasks
/// </summary>
static readonly int DegreeOfParallelism = Environment.ProcessorCount;

public BigInteger Factorial(long x)
{
    // Make as many parallel tasks as our DOP
    // And make them operate on separate subsets of data
    var parallelTasks =
        Enumerable.Range(1, DegreeOfParallelism)
                    .Select(i => Task.Factory.StartNew(() => Multiply(x, i),
                                 TaskCreationOptions.LongRunning))
                    .ToArray();

    // after all tasks are done...
    Task.WaitAll(parallelTasks);

    // ... take the partial results and multiply them together
    BigInteger finalResult = 1;

    foreach (var partialResult in parallelTasks.Select(t => t.Result))
    {
        finalResult *= partialResult;
    }

    return finalResult;
}

/// <summary>
/// Multiplies all the integers up to upperBound, with a step equal to DOP
/// starting from a different int
/// </summary>
/// <param name="upperBoud"></param>
/// <param name="startFrom"></param>
/// <returns></returns>
public BigInteger Multiply(long upperBound, int startFrom)
{
    BigInteger result = 1;

    for (var i = startFrom; i <= upperBound; i += DegreeOfParallelism)
        result *= i;

    return result;
}

On my machine this computes 100000! in about 30 seconds and the result is what Wolfram Alpha says it should be.

Update

After running a few more tests, I discovered something which I didn't expect: printing out the 100000! result to the console takes ~18 seconds (the result has 456574 digits).

The results of the 100000! computation alone (without printing the number) the are:

  • ~10 seconds for parallel execution
  • ~16 seconds for sequential execution