Factorial Benchmark 2010

Timings (seconds)
CPU: AMD 64 X2 Dual Core 4400, 2.31 GHz, RAM: 2.0 GB, OS: 32 bit Win XP
MPIR-1.3.0 Factorial Benchmark (29 Jan. 2010)
| (N x 10^6) ! | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 100 |
| ParallelPrimeSwing | 0.9 | 1.9 | 4.2 | 9.2 | 20.6 | 45.5 | 101.8 | 181.1 |
| ParallelPrimeSchoenhage | 0.9 | 2.1 | 4.6 | 10.2 | 22.6 | 50.2 | 112.1 | 197.4 |
| PrimeSwing | 1.0 | 2.1 | 4.8 | 10.7 | 23.9 | 53.1 | 118.9 | 209.0 |
| PrimeSchoenhage | 1.0 | 2.1 | 4.8 | 10.7 | 23.9 | 53.2 | 119.0 | 209.2 |
| MPIR-Library | 1.0 | 2.2 | 4.8 | 10.7 | 24.1 | 53.4 | 119.3 | 209.6 |
GMP-5.0.1 Factorial Benchmark (07 Feb. 2010)
| (N x 10^6) ! | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 100 |
| ParallelPrimeSwing | 0.8 | 1.9 | 4.6 | 10.9 | 25.0 | 58.1 | 129.7 | 218.5 |
| ParallelPrimeSchoenhage | 0.9 | 2.2 | 5.1 | 11.9 | 27.1 | 62.9 | 140.3 | 237.1 |
| PrimeSwing | 0.9 | 2.2 | 5.2 | 12.4 | 28.3 | 65.5 | 146.9 | 249.5 |
| PrimeSchoenhage | 0.9 | 2.2 | 5.3 | 12.4 | 28.4 | 65.8 | 147.2 | 249.7 |
| GMP-Library | 1.6 | 3.7 | 9.3 | 21.8 | 51.2 | 117.9 | 276.9 | 468.7 |

Note that this benchmark only compares a tiny fraction of the two multiple precision libraries. It compares the factorial function as provided by the libraries and it compares how some factorial functions implemented on the more basic functions of the libraries perform. So basically only the multiplication algorithms are tested. GMP 5.0.1 performs relatively poor compared to MPIR-1.3.0. It might be the case that earlier versions of GMP perform better. However, I did not test this. (I know that the developers of GMP are hard working to improve things.)
The following plot sums our findings up. It shows in units of [dd/ms] (decimal digit per millisecond) the performance of ParallelPrimeSwing (red) with MPIR 1.3.0 and the GMP library function (blue) in the region tested (higher figures are better here).

PPSMPIR := listplot([[1,6472],[2,6157],[4,5935],[8,5624],[16,5256],[32,4970],[64,4635],[100,4178]],color=red): LIBGMP := listplot([[1,3525],[2,3142],[4,2663],[8,2376],[16,2114],[32,1920],[64,1705],[100,1615]],color=blue):
You might also be interested to compare different implementations of the double factorial function. On that page you can also find the source code (see the class diagram below) of the above benchmark. Benchmark yourself on your favorite system!

Java Factorial Benchmark (2009)
Java 1.7.0 Sun VM (server), 32 bit
Arithmetic: Apfloat Library 1.5.2
(int32-based) by Mikko Tommila
CPU: AMD 64 X2 Dual, 2.3 GHz, RAM: 2.0 GB, OS: Win XP
Timings (seconds)
| (N x 10^6) ! | 1 | 2 | 4 | 8 | 16 | 32 |
| ParallelPrimeSwing | 4,2 | 8,5 | 18,1 | 39,1 | 84,5 | 174,7 |
| ParallelPrimeSplit | 4,3 | 9,1 | 18,8 | 40,0 | 101,1 | 180,1 |
| PrimeSwing | 4,3 | 9,5 | 20,1 | 49,0 | 106,8 | 192,2 |
| PrimeSchoenhage | 4,3 | 9,6 | 20,6 | 43,5 | 104,0 | 197,4 |
| PrimeVardi | 4,4 | 9,8 | 20,7 | 43,8 | 102,6 | 195,0 |
| PrimeSwingList | 4,4 | 9,8 | 20,7 | 44,1 | 104,0 | 199,4 |
| ParallelSplit | 6,9 | 15,0 | 33,3 | 80,2 | 188,2 | 366,8 |
| ParallelSwing | 8,8 | 18,7 | 40,6 | 88,2 | 224,6 | 425,0 |
| Split | 9,5 | 20,8 | 44,8 | 97,1 | 211,0 | 460,1 |
Ranking relative to PrimeSwing
| (N x 10^6) ! | 1 | 2 | 4 | 8 | 16 | 32 |
| ParallelPrimeSwing | 0,9 | 0,9 | 0,9 | 0,9 | 0,9 | 0,9 |
| ParallelPrimeSplit | 1,0 | 0,9 | 0,9 | 0,9 | 0,9 | 0,9 |
| PrimeSwing | 1,0 | 1,0 | 1,0 | 1,0 | 1,0 | 1,0 |
| PrimeSchoenhage | 1,0 | 1,0 | 1,0 | 1,0 | 1,0 | 1,0 |
| PrimeVardi | 1,0 | 1,0 | 1,0 | 1,0 | 1,0 | 1,0 |
| PrimeSwingList | 1,0 | 1,0 | 1,0 | 1,0 | 1,0 | 1,0 |
| ParallelSplit | 1,7 | 1,6 | 1,6 | 1,7 | 1,9 | 1,9 |
| ParallelSwing | 2,0 | 2,0 | 2,0 | 2,1 | 2,2 | 2,2 |
| Split | 2,2 | 2,2 | 2,2 | 2,2 | 2,4 | 2,4 |
| Overview | Conclusions | Source code | Download |
![]()