Coding for the OEIS

Good coders have a personal style — and bad coders also.

In this post I want to explain how good programs in the program sections of the OEIS database look like in my opinion and why.

I will proceed exclusively by examples and add some basic principles in the comments. I will update this list from time to time when I have been annoyed by some contribution or add new examples when I meet good ones (many excellent examples are of course already in OEIS!)

The goal of this post is to enhance the OEIS user experience and although my criticism will sometimes be merciless I am aware that my criteria and ratings are subjective.

In addition I must confess that there is no single rule which I have not violated in my contributions. Thus this post is written above all to educate myself.

The definition of A034895 reads: "Dropping any digit gives a prime number."

P:=proc(q) local d, k, n; for n from 1 to q do d:=ilog10(n); for k from 0 to d do if not isprime(trunc(n/10^(k+1))*10^k+(n mod 10^k)) then break; fi; od; if k=d+1 then print(n); fi; od; end: P(10^4);

A034895_list := proc(up_to) local k, n, L, prime; L := NULL; for n from 1 to up_to do prime := true; for k from 0 to ilog10(n) while prime do prime := isprime(trunc(n/10^(k+1))*10^k+(n mod 10^k)); od; if prime then L := L,n fi; od; [L] end: A034895_list(1673);

*Do formatting!*

If the program is worth to be published then there is also enough room for a good formatting.

*Use a systematic name.*

Here 'A034895_list' instead of 'P'. This avoids name conflicts and one can include the function easily into a larger context or a library without having to adapt the name. The name should also indicate the type of the return value: Aaaaaaa(n) returns the value of the function evaluated at n whereas Aaaaaaa_list(len) returns the initial segment of the sequence of length len and Aaaaaaa_list(up_to) the initial values below or equal to up_to.

*Do not use a 'print' command in the function!*

Rather return a value (an integer) or a list of integers from the function. Then print this list — if this is what you want. Of course in many cases one only wants to process the list and the print statements are contraproductive.

*Always give a meaningful return value.*

For example if you execute P(10) virtually nothing (visible) will happen and the user will be irritated. Executing A034895_list(10) will return [], the empty list.

*Learn and use the syntax and the idiomatic style of the computer language you use!*

For example the loop

for k from 0 to d do if not isprime(..) then break fi; od;

is correct but not idiomatic for the language used. Maple is not 'C' and offers here the nice construct

for k from 0 to d while prime do prime := isprime(..); od;

This logic is also easier to understand than the negated logic of the 'break' construct.

*Avoid pseudo-mathematification!*

Use library functions which are appropriate for the problem at hand.

A034895 is a base sequence. It's a 'nice' base sequence, well, but that's a matter of taste. Anyway, base sequences are not really mathematics. They are rooted in symbolic representations and the computational tools for handling them are in the first place the tools of string processing.

The use of the functions 'ilog10' and 'n mod k' in P is not entirely appropriate. They pretend mathematics where there is none. We can do also very well without them, as the variant below shows, which is much more honest in my eyes.

A034895_list := proc(up_to) local prime, k, n, m, j, K, L, S; S := NULL; for n from 1 to up_to do L := convert(n, base, 10); prime := true; for k from 1 to nops(L) while prime do K := subsop(k = NULL, L); m := add(K[j]*10^(j-1), j=1..nops(K)); prime := isprime(m) od; if prime then S := S,n fi; od; [S] end:

Still this is not the best way to code A034895. Before we proceed we switch to sequence A051362, the "super-prime numbers" as Jaime Gutierrez named them. The only difference is that we want the original number also to be prime. All we have to do is to replace in line 6 "prime := true" by "prime := isprime(n)". On Math.StackExchange there is a very enjoyable discussion of this sequence Deleting any digit yields a prime.

One question remains open: Why did the original author chose to implement the function as a list? The most common reason to implement a function returning a list rather than returning a single item is efficiency. If the overhead of computing a single item is large compared to the overhead computing a list then implementing a list is natural. However if this is not the case it is more flexible to work with filters. |

In our example this amounts essentially to remove the for-loop over n from the function body and rename 'A051362_list' to 'is_A051362'.

is_A051362 := proc(n) local L, K, j, k, m, prime; L := convert(n, base, 10); prime := isprime(n); for k from 1 to nops(L) while prime do K := subsop(k = NULL, L); m := add(K[j]*10^(j-1), j=1..nops(K)); prime := isprime(m) od; prime end:

To reintroduce the function A051362_list we now define:

A051362_list := n -> select(is_A051362, [$1..n]);

I think the increase in conciseness and readability of the implementation is visible.

The OEIS has two main types of sequences, the normal ones like n^2 and the 'lists', which enumerate integers with some special attribute like the list of the prime numbers. The first step when implementing a sequence is to identify the type of the sequence.

In our example we looked at numbers with the property that dropping any digit from their decimal number representation gives the decimal representation of a prime number. Thus we are looking at a list in the sense of the OEIS.

We want to associate an implementation pattern with an OEIS-list.
According to Wikipedia a software design pattern
*"is a description or template for how to solve a problem that can be used
in many different situations. Patterns are formalized
best practices that the programmer can use to solve common problems when designing
an implementation."*

In the discussion above we outlined such an implementation pattern. Roughly it says: If a sequence is a list then implement the function in two steps:

- Implement an isAaaaaaa function of type integer -> boolean which answers whether or not an integer n has the property Aaaaaaa.
- Implement a function Aaaaaaa_list(up_to) based on the isAaaaaaa function filtering those values in the range (1..up_to) which satisfy the property.

This is the filter (or selection) pattern and we recommend it as the template for implementing OEIS-lists.

On a rule of the thumb basis 99% of all sequence implementations in the OEIS need none of the more sophisticated abilities Mathematica or Maple offer to the power user. So why use Maple or Mathematica when there is the open source software Sage? Note that you can start using Sage without going through the hassles of installing Sage by using Sage in the cloud, you only need a browser to do so. Besides this there are no hassles to install Sage as you can download binaries from a server of a nearby university.

Sage uses Python as a base language which emphasizes code readability and enables clear and succinct programs. And this is exactly the kind of programs which should be presented in the OEIS.

def is_A051362(n): prime = is_prime(n) if prime: L = ZZ(n).digits(10) for k in range(len(L)): K = L[:]; del K[k] prime = is_prime(ZZ(K, base=10)) if not prime: break return prime A051362_list = lambda n: filter(is_A051362, range(n))

Of course one could also layer A051362 on A034895. To do so one has only to restrict the range of the filter to prime numbers. Assuming that the prime numbers were also generated by a filter (predicate 'is_prime') this effectively amounts to a composition of filters. Indeed, the possibility to easily compose filters is one of the great advantages of this pattern.

Using Sage with IPython offers a nice and almost Mathematica-like workflow (only better formatted). In an IPython notebook the functions look like this:

While we are talking about Sage we might look at another example of the filter implementation pattern which shows how nicely this system integrates into OEIS. Note that the sloane.Annnnnn functions (written mainly by Jaap Spies) are build into Sage.

A formula like

a(n) = (5 * (1 - (-1)^n) + 2 * n * n) / 4

is easy to compute. However, seen as a sequence the following form is somewhat more illuminating:

a(n) = 2*a(n-1) - 2*a(n-3) + a(n-4).

It shows how a term of the sequence depends on preceding terms. This is an example of a linear recurrence with constant coefficients. The constant coefficients are here the coefficients of [a(n-1), a(n-2), a(n-3), a(n-4)], i. e. (2, 0, -2, 1).

The advantages of writing a(n) in the form of a recurrence are obvious: it exhibits a structure of the sequence which cannot be seen by the naked eye looking at the formula and it is computationally advantageous to use this form. The closed formula needs 3 multiplication, 2 additions, 1 division and the evaluation of the term (-1)^n. In contrast the recurrence formula needs 2 multiplications and 2 additions. So it is more efficient.

In Sage the recurrence can be implemented like this:

def G(): a, b, c, d = 0, 3, 2, 7 while True: yield a a, b, c, d = b, c, d, a + 2*(d - b) A245581 = G() [A245581.next() for n in range(20)]

We did not mention it yet, but it is obvious that we also need 4 initial values to start. So we could generalize the program by calling the generator with the initial values.

def G(A, B, C, D): a, b, c, d = A, B, C, D while True: yield a a, b, c, d = b, c, d, a + 2*(d - b)

In this form the program represents at least 34 sequences in the OEIS!

A001840, A001859, A002620, A002623, A004652, A006578, A007590, A014616, A033638, A035608, A047838, A061925, A077043, A085622, A092634, A097063, A102214, A110907, A114113, A121470, A137928, A139595, A142717, A164486, A173511, A174929, A176222, A179272, A184005, A194073, A195605, A209350, A245578, A245581

All these sequences are computed in the same way and differ only by the
initial terms thrown in. For example `G(0,3,2,7)` is A245581,
`G(1,10,18,22)` is A245578 and `G(1,0,1,5)` is A209350.
You can find thousands of such examples
listed in the index to the OEIS.

By the way, the sequence `G(0,3,2,7)`, seen as a real function, has a nice
property: it is symmetric to the y-axis. And indeed the identity a(n) = a(-n) holds for the terms of
the sequence A245581.

Some users submit scripts which look like contributions to the infamous 'Pimp My Formula' TV-show. As seen in the database:

[seq(ceil(binomial(n+1, 1)*2^(n-1)), n=-1..29)];

The author does not respect the offset (which is 0), apparently does not know what `binomial(n+1, 1)`
is, or does know it, but wants make things look more
interesting.
Neither the function 'binomial' nor the
function 'ceil' is necessary here in any way.
Not only is there no reason to use the function 'ceil', its use is misleading: it
suggests that the definition is based on non-integral numbers, which is not the case.
*Into the trash!*

Recommended form:

A057711 := n -> `if`(n<3, n, n*2^(n-2));

*Do not beef up scripts. Make them simple!*

Beefing up scripts is the criterion which identifies the mediocre programmer.

*Do not use library functions if not necessary!*

*Implement*mathematical formulas, do not copy them without thought.

A slightly more intellectually stimulating example is this one:

Select[ Prime@ Range[2, 110], JacobiSymbol[-1, #] == -1 &]

The Jacobi symbol is a non-trivial function and an implementation requires a non-trivial effort, as you can see for example
here. In the event
that it is not clear what the author wants to compute,
this is the solution: *Primes of form 4n+3*.
However why the author suggests this particular program one can only speculate: the name 'Jacobi' surely adorns
every script.
But is the use of it appropriate? Or is it only a pimped script?

In my opinion this script should never have entered the database in particular since there already was a perfect solution by Alonso del Arte:

Select[4 Range[150] - 1, PrimeQ]

The recommended form with Sage:

A002145_list = lambda n: filter(is_prime, range(3, n, 4))

To prevent any misunderstandings: The fact that for primes p of the form 4n+3 the identity Legendre(-1, p) = -1 holds (because (-1)^((p-1)/2)) = -1 for p = -1 (mod 4)) is noteworthy, but only in the comment or formula section of A002145, not as a method of calculation.