Fast Double Factorial Functions ‼

Definitions and formulas: the double factorial of n is n!! = n·(n-2)·(n-4)···, the product descending in steps of two down to 2 for even n and to 1 for odd n, with 0!! = 1. Split by parity this reads (2m)!! = 2^m·m! and (2m+1)!! = (2m+1)!/(2^m·m!), so both cases reduce to ordinary factorials.
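As a baseline, here is a minimal sketch of the naive product using the mpz interface shared by GMP and MPIR (the name dfactorial_naive is mine, not taken from the sources below); it is quadratic in the size of the result and only useful for checking faster routines:

#include <stdio.h>
#include <gmp.h>   /* MPIR offers the same mpz_* interface */

/* Naive double factorial: multiply n, n-2, n-4, ... down to 2 or 1. */
static void dfactorial_naive(mpz_t rop, unsigned long n)
{
    mpz_set_ui(rop, 1);
    for (unsigned long k = n; k > 1; k -= 2)
        mpz_mul_ui(rop, rop, k);
}

int main(void)
{
    mpz_t r;
    mpz_init(r);
    dfactorial_naive(r, 9);          /* 9!! = 9*7*5*3*1 = 945 */
    gmp_printf("9!! = %Zd\n", r);
    mpz_clear(r);
    return 0;
}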

The name 'doublegamma' is ad hoc. If you ever wonder where the name 'swing' in the text below comes from, this plot might give you a clue: the world behind the standard factorial is no longer monotonic! Looking only at the values at the integers will deceive you‼

[Plot: log(abs(gamma)) (red) versus log(abs(doublegamma)) (green).]

Paul Zimmermann challenged us to write a faster function for calculating the double factorial: "Challenge: do better than mpz_fac_ui2 ..."

Zimmermann's implementation is based on an idea of Schönhage et al. (from 1994(!); see also the TP page).

I wrote a version which is not based on Schönhage's algorithm but which, like it, also uses prime factorization. I named it 'SwingDouble'. It seems to be faster, and, what might be even more significant, it uses less memory. Here is the output from a test run.

Some timings on a 2.83 GHz Core 2:

Testing number:   1000000
SwingDouble:      0.252 sec memory:   83074 long
ZimmermannDouble: 0.571 sec memory: 2078497 long

Testing number:   1000001
SwingDouble:      0.461 sec memory:  156994 long
ZimmermannDouble: 0.602 sec memory: 2078499 long

Testing number:   2000000
SwingDouble:      0.754 sec memory:  156994 long
ZimmermannDouble: 1.494 sec memory: 4148932 long

Testing number:   2000001
SwingDouble:      0.908 sec memory:  297864 long
ZimmermannDouble: 1.436 sec memory: 4148934 long

Testing number:   4000000
SwingDouble:      1.555 sec memory:  297864 long
ZimmermannDouble: 2.954 sec memory: 8283145 long

Testing number:   4000001
SwingDouble:      2.331 sec memory:  566290 long
ZimmermannDouble: 3.308 sec memory: 8283147 long

I always test two consecutive numbers because both algorithms behave slightly differently depending on the parity of the input (and, in the general case, on some modulus).
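The parity dependence is a direct consequence of the reduction given above: the even case is a factorial times a power of two, while the odd case needs an additional exact division. Here is a sketch of this reduction in terms of the mpz interface (this only demonstrates the identity, not the SwingDouble code, which works from the prime factorization instead of computing full factorials):

#include <stdio.h>
#include <gmp.h>

static void dfactorial_parity(mpz_t rop, unsigned long n)
{
    unsigned long m = n / 2;
    mpz_fac_ui(rop, m);              /* m! */
    mpz_mul_2exp(rop, rop, m);       /* (2m)!! = 2^m * m! */
    if (n & 1) {                     /* odd n = 2m+1 */
        mpz_t f;
        mpz_init(f);
        mpz_fac_ui(f, n);            /* (2m+1)! */
        mpz_divexact(rop, f, rop);   /* (2m+1)!! = (2m+1)!/(2m)!! */
        mpz_clear(f);
    }
}

int main(void)
{
    mpz_t r;
    mpz_init(r);
    dfactorial_parity(r, 7);         /* 7!! = 7*5*3*1 = 105 */
    gmp_printf("7!! = %Zd\n", r);
    mpz_clear(r);
    return 0;
}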

The C/C++ sources are here. They require the MPIR library. Note that the 'parallel' variants additionally require the Boost libraries.

A nice feature of the given implementation is that just by deleting two or three lines of code you get an implementation of the factorial function, or, by adding two lines of code, a version which runs efficiently on parallel threads (see ParallelDouble). A few further lines of code will give a fast implementation of the binomial function.
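The reason the binomial comes almost for free is the identity swing(2m) = (2m)!/(m!)^2 = binomial(2m, m): at even arguments the swinging factorial is exactly the central binomial coefficient. A small check of this identity against the library binomial (a sketch of the identity only, not the author's binomial code):

#include <stdio.h>
#include <gmp.h>

int main(void)
{
    unsigned long m = 10;
    mpz_t s, f, b;
    mpz_init(s); mpz_init(f); mpz_init(b);

    mpz_fac_ui(s, 2 * m);            /* (2m)!   */
    mpz_fac_ui(f, m);                /* m!      */
    mpz_mul(f, f, f);                /* (m!)^2  */
    mpz_divexact(s, s, f);           /* swing(2m) = (2m)!/(m!)^2 */

    mpz_bin_uiui(b, 2 * m, m);       /* C(2m, m) from the library */
    printf("swing(2m) == C(2m,m): %s\n",
           mpz_cmp(s, b) == 0 ? "ok" : "FAIL");

    mpz_clear(s); mpz_clear(f); mpz_clear(b);
    return 0;
}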

Amazingly, it was not before January 2010 that an asymptotically fast algorithm for the factorial function crept into one of the leading bignum libraries: MPIR 1.3 included an implementation of the Schönhage algorithm (without giving a reference to its origin).

The MPIR implementation, however, is a good example of a poor implementation, violating the rules of modularity: it mingles the Schönhage algorithm with the sieve of Eratosthenes. Should MPIR support a faster prime sieve (a quadratic sieve or a parallel sieve) in some later release, the factorial function will not automatically benefit from it. In our implementation the connection to the prime sieve is encapsulated in a single call to a sieve function, so the factorial function is not affected by the details of the sieve implementation.
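To illustrate this design point, here is a sketch of the prime-swing factorial with the sieve hidden behind a single call. The names and the code are mine, assumed for illustration, not the actual sources; simple_sieve is a stand-in (plain Eratosthenes) that could be replaced by any faster or parallel sieve without touching swing or factorial.

#include <stdio.h>
#include <stdlib.h>
#include <gmp.h>

/* The only contact with the sieve: fill primes[] with all primes <= n
   and return their count. Any sieve implementation can be swapped in
   here; swing() and factorial() below are unaffected. */
static size_t simple_sieve(unsigned long n, unsigned long *primes)
{
    char *comp = calloc(n + 1, 1);
    size_t count = 0;
    for (unsigned long p = 2; p <= n; p++) {
        if (comp[p]) continue;
        primes[count++] = p;
        for (unsigned long q = p * p; q <= n; q += p)
            comp[q] = 1;
    }
    free(comp);
    return count;
}

/* swing(n) = n! / (floor(n/2)!)^2, built from its prime factorization:
   the exponent of p in swing(n) is the number of odd values among
   floor(n/p), floor(n/p^2), floor(n/p^3), ... */
static void swing(mpz_t rop, unsigned long n,
                  const unsigned long *primes, size_t count)
{
    mpz_t pw;
    mpz_init(pw);
    mpz_set_ui(rop, 1);
    for (size_t i = 0; i < count && primes[i] <= n; i++) {
        unsigned long p = primes[i], q = n, e = 0;
        while ((q /= p) > 0)
            e += q & 1;
        if (e > 0) {
            mpz_ui_pow_ui(pw, p, e);
            mpz_mul(rop, rop, pw);
        }
    }
    mpz_clear(pw);
}

/* n! = swing(n) * (floor(n/2)!)^2, by recursion on n/2. */
static void factorial(mpz_t rop, unsigned long n,
                      const unsigned long *primes, size_t count)
{
    if (n < 2) { mpz_set_ui(rop, 1); return; }
    mpz_t s;
    mpz_init(s);
    factorial(rop, n / 2, primes, count);   /* floor(n/2)! */
    mpz_mul(rop, rop, rop);                 /* (floor(n/2)!)^2 */
    swing(s, n, primes, count);
    mpz_mul(rop, rop, s);
    mpz_clear(s);
}

int main(void)
{
    unsigned long n = 100;
    unsigned long *primes = malloc((n + 1) * sizeof *primes);
    size_t count = simple_sieve(n, primes);
    mpz_t r;
    mpz_init(r);
    factorial(r, n, primes, count);
    gmp_printf("100! = %Zd\n", r);
    mpz_clear(r);
    free(primes);
    return 0;
}

Only simple_sieve would change if a better sieve arrived; the factorial code never sees how the primes were produced.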


Skip if you are not interested in some esoteric background information.
There is another kind of number related both to the factorials and to the way the double factorial is implemented here via the swinging factorials: the double swinging factorials. The swinging factorials in turn have some interesting connections to the Al-Haytham primes, which are related to the Wilson primes and even to the Wieferich primes.

Back to the Homepage of Factorial Algorithms.