Recently William Hart posed the following question:

What can you do in 30 lines
of C or C++ code?

What William intends is a mathematical coding challenge. He explains: The idea is to implement mathematical operations following a strict set of rules:

No implementation of an algorithm may consume more than 30 lines (excluding comment and blank lines, which are not counted).
No two algorithms for computing the same mathematical operation may have the same complexity.
An additional top-level function which selects between various algorithms for a given operation may be present, but must also not exceed 30 lines.
All functions must implement recognized mathematical operations and be generally useful.
The implementations should not be unduly inefficient.
No line of code should contain more than one complete C statement.
No line of code may exceed 80 characters.
All code should be readable, commented and well spaced out (remember blank and comment lines are not counted in the total).
Macros of up to 7 lines are allowed, but must be generally useful.
All non-generic code must be implemented in pure C with no external libraries.
Each C module may have a C++ class and wrapper associated with it. There is no limit on the size of the class declaration.
All generic algorithms should be implemented in C++. The challenge is to come up with any of the following:
Implementations of new mathematical operations not already in CAS Challenge
Algorithms with a better complexity than those already implemented
Substantially faster version of a function within the space limit
New implementation of an existing function in less space

William has set up a GitHub repository to which you can contribute if you develop such things.

I really like William's idea very much. For me personally this is unfortunately clouded by the fact that I am neither proficient in C nor do I appreciate C much. If William had set up the challenge in D this would have been an excuse for me to learn D.

Nevertheless I wanted to learn the framework William had set up and so I decided to contribute a mathematical hello-world-function: the Jacobi symbol. To make it not quite so boring I included a small twist: it does not use the usual remainder function but the 'least absolute remainder' function (larem in the code below).

At least in one respect I do not conform to the idiom William presumably expects: instead of macros I use one-liner functions and I use boolean variables: this expresses my reservations with the way C handles these things.

   2:  // Copyright (c) 2013, Peter Luschny

   3:  // All rights reserved.

   4:  // Licence: CC BY-SA 3.0

   5:  // Attribution-ShareAlike 3.0 Unported

6:

7:

   8:  #include <iostream>

   9:  #include <cassert>

  10:  #include <stdbool.h>

  11:  using namespace std;

12:

  13:  #include "ZZp.hpp"

14:

  15:  int rem(const ZZ_t& a, unsigned b);

  16:  ZZ_t larem(const ZZ_t& a, const ZZ_t& b);

  17:  int jacobi_symbol(const ZZ_t& a, const ZZ_t& b);

  18:  bool jacobi_requires(const ZZ_t& a, const ZZ_t& b);

  19:  void test_jacobi(rand_t state);

20:

  21:  bool is_zero    (const ZZ_t& a) { return a.size == 0; }

  22:  bool is_positive(const ZZ_t& a) { return a.size > 0; }

  23:  bool is_negative(const ZZ_t& a) { return a.size < 0; }

  24:  bool is_nonnegative(const ZZ_t& a) { return a.size >= 0; }

25:

  26:  bool is_odd( const ZZ_t& a) { return is_zero(a) ? 0 : (a.n[0] & 1) == 1;}

  27:  bool is_even(const ZZ_t& a) { return is_zero(a) ? 1 : !is_odd(a); }

28:

  29:  void  halve(ZZ_t& a) { zz_div_2exp(&a, &a, 1); }

  30:  int rem(const ZZ_t& a, unsigned b) { return a.n[0] % b; }

31:

  32:  /* least absolute remainder */

  33:  ZZ_t larem(const ZZ_t& a, const ZZ_t& b)

  34:  {

  35:      ZZ_t q, r, h;

36:

  37:      zz_div_2exp(&h, &a, 1);

  38:      zz_divrem(&q, &r, &b, &a);

39:

  40:      if (zz_cmp(&h, &r) < 0)

  41:      {

  42:          zz_sub(&r, &r, &a);

  43:      }

  44:      return r;

  45:  }

46:

  47:  bool jacobi_requires(const ZZ_t& a, const ZZ_t& b) {

  48:      return is_odd(b) && is_nonnegative(a) && is_positive(b);

  49:  }

50:

  51:  int jacobi_symbol(const ZZ_t& A, const ZZ_t& B)

  52:  {

  53:      // assert(jacobi_requires(A, B));

54:

  55:      ZZ_t a = A, b = B, q, t, one = 1;

  56:      int j = 1, r;

  57:      bool remb4, remb8;

58:

  59:      if (is_zero(a))

  60:          return (one == b) ? 1 : 0;

61:

  62:      while  (!is_zero(a)) {

63:

  64:          remb4 = rem(b, 4) == 3;

65:

  66:          if (is_negative(a)) {

  67:              a = -a;

  68:              if (remb4) j = -j;

  69:          }

70:

  71:          r = rem(b, 8);

  72:          remb8 = (r == 3) || (r == 5);

73:

  74:          while (is_even(a)) {

  75:              halve(a);

  76:              if (remb8) j = -j;

  77:          }

78:

  79:          if (rem(a, 4) == 3) {

  80:              if (remb4) j = -j;

  81:          }

82:

  83:          t = a;

  84:          a = larem(a, b);

  85:          b = t;

  86:      }

87:

  88:      return zz_cmp(&one, &b) < 0 ? 0 : j;

  89:  }

90:

  91:  /* (a, b, j) example trace:

  92:  (109981, 737777, 1)

  93:  (-32090, 109981, 1)

  94:  (-2334, 16045, -1)

  95:  (-293, 1167, 1)

  96:  (-5, 293, -1)

  97:  (-2, 5, -1)

  98:  1

  99:  */

 100:

 101:  void test_jacobi(rand_t state)

 102:  {

 103:      ZZ_t a, b;

 104:      int r1, r2, i, j = 0;

 105:

 106:      cout << "jacobi_symbol... ";

 107:

 108:      static int JS[] = {-1,-1,1,1,-1,1,1,-1,1,-1,1,1,-1,1};

 109:

 110:      for (i = 0; i < 10; i++)

 111:      {

 112:          do {

 113:              randbits(a, state, n_randint(state, 500));

 114:              randbits(b, state, n_randint(state, 500));

 115:

 116:          } while (!jacobi_requires(a, b));

 117:

 118:          r1 = jacobi_symbol(a, b);

 119:          r2 = JS[j++];

 120:          assert(r1 == r2);

 121:

 122:          if (is_odd(a))

 123:          {

 124:              r1 = jacobi_symbol(b, a);

 125:              r2 = JS[j++];

 126:              assert(r1 == r2);

 127:          }

 128:      }

 129:      cout << "PASS" << endl;

 130:  }

 131:

 132:  int main()

 133:  {

 134:      rand_t state;

 135:      randinit(state);

 136:      test_jacobi(state);

 137:

 138:      cout << endl << "Done!" << endl;

 139:      cin.get();

 140:

 141:      return 0;

 142:  }

C versus C++

When should a function be implemented in C and when in C++?

One of the rules says: All generic algorithms should be implemented in C++ and where an underlying C module exists, functions should be pushed down into it where practical, especially if they may be useful by other C algorithms in the module.

William Hart further explains: "Certainly larem needs to be at the C level. For example it can be used in gcd, which is already at the C level. Of course jacobi could have gone at the C++ level, as it doesn't feature in many low level algorithms."

The nice thing is that William now implemented both functions at the C level and we can compare the two different representations.

   1:  /*

   2:  Copyright (c) 2013, William Hart

   3:  All rights reserved.

4:

   5:  Redistribution and use in source and binary forms, with or without modification,

   6:  are permitted provided that the following conditions are met:

7:

   8:  1. Redistributions of source code must retain the above copyright notice, this

   9:  list of conditions and the following disclaimer.

10:

  11:  2. Redistributions in binary form must reproduce the above copyright notice,

  12:  this list of conditions and the following disclaimer in the documentation and/or

  13:  other materials provided with the distribution.

14:

  15:  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND

  16:  ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED

  17:  WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.

  18:  IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,

  19:  INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,

  20:  BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,

  21:  DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF

  22:  LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE

  23:  OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED

  24:  OF THE POSSIBILITY OF SUCH DAMAGE.

  25:  */

26:

  27:  /* w.b. hart */

  28:  void zz_larem(zz_ptr r, zz_srcptr a, zz_srcptr b)

  29:  {

  30:      long asize = ABS(a->size);

  31:      long bsize = ABS(b->size);

  32:      long rsize = bsize;

  33:      long qsize = asize - bsize + 1;

  34:      zz_t q, h, t;

35:

  36:      zz_init(t);

  37:      zz_copy(t, a);

38:

  39:      zz_init(q);

  40:      zz_init(h);

  41:      zz_fit(h, asize);

42:

  43:      zz_div_2exp(h, a, 1);

  44:      if (a->size < 0)

  45:          zz_add_1(h, h, 1);

46:

  47:      if (asize >= bsize) {

  48:          zz_fit(q, qsize);

49:

  50:          nn_divrem(q->n, t->n, asize, b->n, bsize);

51:

  52:          rsize = nn_normalise(t->n, rsize);

53:

  54:          t->size = a->size >= 0 ? rsize : -rsize;

55:

  56:          if ((a->size ^ b->size) < 0 && t->size != 0 && zz_cmpabs(t, h) >= 0)

  57:              zz_add(t, t, b);

  58:          else if (zz_cmpabs(t, h) >= 0)

  59:              zz_sub(t, t, b);

  60:      }

61:

  62:      zz_swap(t, r);

  63:      zz_clear(t);

  64:      zz_clear(h);

  65:      zz_clear(q);

  66:  }

67:

  68:  /* w.b. hart */

  69:  int zz_jacobi(zz_srcptr A, zz_srcptr B)

  70:  {

  71:      int j = 1, res, r8, remb4, remb8;

  72:      zz_t a, b;

73:

74:

  75:      if (zz_is_zero(A))

  76:          return zz_equal_1(B, 1);

77:

  78:      zz_init(a);

  79:      zz_init(b);

80:

  81:      zz_copy(a, A);

  82:      zz_copy(b, B);

83:

  84:      while (!zz_is_zero(a)) {

  85:          remb4 = (b->n[0] % 4) == 3;

86:

  87:          if (a->size < 0) {

  88:              zz_neg(a, a);

  89:              j = remb4 ? -j : j;

  90:          }

91:

  92:          remb8 = ((r8 = (b->n[0] % 8)) == 3 || r8 == 5);

93:

  94:          while ((a->n[0] % 2) == 0) {

  95:              zz_div_2exp(a, a, 1);

  96:              if (remb8) j = -j;

  97:          }

98:

  99:          j = (a->n[0] % 4) == 3 && remb4 ? -j : j;

 100:

 101:          zz_larem(b, b, a);

 102:          zz_swap(a, b);

 103:      }

 104:

 105:      res = zz_cmp_1(b, 1) > 0 ? 0 : j;

 106:

 107:      zz_clear(a);

 108:      zz_clear(b);

 109:

 110:      return res;

 111:  }

Can we see a performance difference?

The increase in complexity is obvious when we compare the C with the C++ version. Does it pay back in terms of efficiency? Let us look at a small benchmark.

		MS-VC	GNU-GCC (minGW)	GNU-GCC (Ubuntu)
Jacobi C++	larem C++	0.510	0.754	0.276
Jacobi C++	larem C	0.316	0.420	0.186
Jacobi C	larem C	0.312	0.417	0.184

Fair warning: people who understand cflags might get different results. In any case it is obvious that William did an excellent job in implementing larem. On the other hand it also shows that there is no big difference between Jacobi C++ and Jacobi C.

The benchmark used is:

   1:  #include <iostream>

   2:  #include <chrono>

   3:  #include "ZZ.h"

4:

   5:  using namespace std;

6:

   7:  int jacobi_symbol(const ZZ_t& a, const ZZ_t& b);

   8:  int jacobi_symbol_mix(const ZZ_t& A, const ZZ_t& B);

   9:  bool jacobi_requires(const ZZ_t& a, const ZZ_t& b);

10:

  11:  int main()

  12:  {

  13:      ZZ_t a, b;

  14:      rand_t state;

  15:      long i, rep;

  16:      chrono::time_point<chrono::system_clock> start, end;

17:

  18:      start = chrono::system_clock::now();

19:

  20:      for (rep = 0; rep < 5; rep++)

  21:      {

  22:          randinit(state);

  23:          for (i = 0; i < 10000; i++)

  24:          {

  25:              do {

  26:                  randbits(a, state, n_randint(state, 500));

  27:                  randbits(b, state, n_randint(state, 500));

  28:              } while (!jacobi_requires(a, b));

29:

  30:              // jacobi_symbol(a, b);

  31:              // jacobi_symbol_mix(a, b);

  32:              jacobi(a, b);

  33:          }

  34:      }

35:

  36:      end = chrono::system_clock::now();

  37:      chrono::duration<double> elapsed_seconds = end - start;

38:

  39:      cout << "elapsed time: " << elapsed_seconds.count() / rep

  40:           << " seconds."      << endl;

41:

  42:      cout << endl << "Done!" << endl;

  43:      cin.get();

44:

  45:      return 0;

  46:  }

But of course this is only the first part of the benchmark: C versus C++. Perhaps more important is to look at the performance difference larem versus rem. Jacobi is usually implemented with the positive remainder; indeed I have not yet seen an implementation which uses the least absolute remainder. Does it pay to switch to larem? I wrote a second test program with Sage and counted the number of calls to the remainder function.

counter_larem = 0
counter_rem = 0

for r in [ZZ.random_element(72) for _ in range(200)]:
    for s in [ZZ.random_element(72) for _ in range(200)]:
        a = ZZ.random_element(10^r)
        b = ZZ.random_element(10^s)
        if is_even(b): b += 1
        J = Jacobi_rem(a, b)
        L = Jacobi_larem(a, b)
        j = jacobi_symbol(a, b)
        if J <> j or L <> j:
            print "ERROR", a, b, J, L, j

print "larem", counter_larem 
print "rem  ", counter_rem 
print (counter_larem/counter_rem).n()

The result is interesting:

larem-calls = 1013304
rem-calls   = 1256957
larem/rem   = 0.806

It indicates that we can indeed save quite a few division steps if we use the least absolute remainder instead of the positive remainder.

All in all we can conclude that it will be hard to implement a more efficient Jacobi symbol with two 30-lines C-functions.

Next step: is_prime(n).

"The problem of distinguishing prime numbers from composite numbers is known to be one of the most important and useful in arithmetic. Further, the dignity of the science itself seems to require that every possible means be explored for the solution of a problem so elegant and so celebrated." (Gauss in Disquisitiones Arithmeticae (1801))

Excitingly we are now only 30 lines of code away from the solution of this problem!

   1:  def isprime(n):

2:

   3:      if n <= 1: return false

   4:      if n <= 3: return true

   5:      if is_even(n): return false

6:

   7:      k = (n-1)//2

   8:      l = floor(2*log(n)^2)

   9:      for a in (2..min(n-1, l)):

  10:          j = jacobi_symbol(a, n)

  11:          if j == 0: return false

  12:          m = power_mod(a, k, n)

  13:          if j == 1:

  14:              if m-1 <> 0: return false

  15:          else:

  16:              if m+1 <> n: return false

17:

  18:      return true

19:

20:

  21:  def power_mod(a, e, n):

22:

  23:     if e == 0: return 1

  24:     res = 1

25:

  26:     for b in reversed(e.bits()):

  27:         res = (res*res) % n

  28:         if b == 1: res = (res*a) % n

29:

  30:     return res

Admittedly these are 30 lines of Python/Sage code but certainly can also be written in 2x 30 lines of C code in accordance with William's challenge.

Note that this is not a probabilistic prime test; assuming ERH is_prime(n) returns true if and only if n is prime. Thus even the NSA cannot build a backdoor in this algorithm. And the dignity of science now requires to prove the Extended Riemann Hypotheses.

It is also quite efficient as it computes the answer in O((log n)^5). Indeed it is more efficient than similar implementations in some popular computer algebra systems (for example in sympy) which first check if a and n are relatively prime. But this is dispensable as the Jacobi symbol returns 0 if they are not.

Recall that we based our computation of the Jacobi symbol on the least absolute remainder (larem) instead of the least positive remainder as it is done more often. Similarly it would have been more consistent to implement power_larem instead of power_mod. Actually the only difference would be to add the line "if n//2 < res: return res - n" before the final return in power_mod. Then in is_prime the lines "if j == 1: if m-1 <> 0: return false" and "else: if m+1 <> n: return false" collapse to the much more intuitive "if j <> m: return false". Clearly Jacobi designed his symbol with the least absolute remainder in mind.

Finally note that l = floor(0.96090602783640285*n.nbits()^2) can be used instead of l = floor(2*log(n)^2) to avoid the evaluation of log(n).

Reference: E. Bach and J. Shallit: Algorithmic Number Theory. 'larem' implements formula (4.9) page 79, 'jacobi_symbol' is on page 113, and 'is_prime' is the deterministic Solovay-Strassen test on page 284.

What can you do in 30 linesof C or C++ code?

C versus C++

Can we see a performance difference?

Next step: is_prime(n).

What can you do in 30 lines
of C or C++ code?