## Squares in Arithmetic Progression (mod p)

```It's well known that there are no four squares in arithmetic
progression among the integers, but there does exist a non-trivial
arithmetic progression of four or more squares (i.e., quadratic
residues) modulo p for each prime p greater than 11.  To prove this,
consider four integers a,b,c,d and the four quantities

A = ab+cd     B = ac+bd     C = ab-cd     D = ac-bd

so that we have

B^2 - A^2  =  D^2 - C^2  =  (a^2 - d^2)(c^2 - b^2)

and
C^2 - B^2  =  (a^2 - d^2)(b^2 - c^2) - 4abcd

The four squared quantities are in arithmetic progression if we set
these two differences equal to each other, which gives the condition

(a^2 - d^2)(b^2 - c^2)  =  2abcd

Solving for b gives
____________________
b =  c -------------------------------
a^2 - d^2

If there exists an integer m such that

a^4 - (ad)^2 + d^4  =  m^2      (mod p)            (1)

the preceding equation gives the proportionality between b and c mod
p, so we can select the convenient solution

b = ad - m          c = a^2 - d^2

and use this to construct a 4-term arithmetic progression of squares.
For example, in Z_13 we can set a=1 and d=5, and we find that the
square root is m=9, so we can set b = 9 and c = 2 (mod 13), and then
we can compute four values whose squares are in arithmetic progression
(mod 13) as follows

A  =  ab+cd  =   6           6^2 = -3
B  =  ac+bd  =   8           8^2 = -1
C  =  ab-cd  =  12          12^2 =  1
D  =  ac-bd  =   9           9^2 =  3

Thus, one possible way of proving that there is a 4-term arithmetic
progression of squares modulo every prime p greater than 11 would be
to show that there is a solution of the congruence (1) (with a^2 not
congruent to d^2 mod p) for every prime p, and such that the step size
of the corresponding progression is not zero modulo p for any prime
p greater than 11.  To prove this, note the following values of m^2
in integers

a   d              m^2              ad(a^2 - d^2)
--- ---     -------------------    -----------------
1   2        13                    (2)(3)
2   9      6253  =  (13^2)(37)     (2)(3^2)(7)(11)
3   5       481  =  (13)(37)       (2^4)(3)(5)

The first of these shows that there is a solution for every prime
modulo which 13 is a square (i.e., a quadratic residue), and the
second shows that there is a solution for every prime modulo which
37 is a square.  The only primes not covered are those modulo which
13 and 37 are both non-squares, but in that case the product (13)(37)
is a square, so the third case above shows that there is also a
solution for these primes.  (We have made use of the easily proven
fact that the product of two non-squares mod p is a square mod p.)
The only thing that can prevent this from giving a non-trivial
arithmetic progression of squares mod p is if the step size is zero
mod p.  The step size is

The factor (ad-m) vanishes modulo p if and only if ad = m (mod p),
which implies that m^2 - (ad)^2 = (a^2 - d^2) = 0 (mod p).  Hence
we are assured of a non-trivial solution for every prime except
those that divide ad(a^2 - d^2) for one or more of these three cases.
Thus the only primes with no non-trivial 4-term arithmetic progression
of squares are 2, 3, 5, 7, and 11, which completes the proof.

For a less elegant proof, we can exhibit an unavoidable set of
combinations of Legendre symbols for a small set of residues, and then
show that each of these combinations yields an arithmetic progression
of (at least) four quadratic residues.  For example, letting [r]
denote the Legendre symbol for the residue r modulo p (where 1
signifies a square, -1 signifies a non-square, and * signifies "don't
care"), Kurt Foster noted that for each prime p greater than 13 the
combination of Legendre symbols for the residues -1, 2, 3, 5, and 13
must fall into one of the following nine cases:

[-1]  [2]  [3]  [5]  [13]    4-square progression       d
----  ---  ---  ---  ----    --------------------     -----
1    1    *    *    *        -1   0   1   2           1
1    *    1    *    *        -3  -1   1   3           2
1   -1   -1    *    *        -6  -1   4   9           5
*    1    1    *    *         1   2   3   4           1
-1    *   -1    1    *        -3   1   5   9           4
-1    1   -1   -1    *        -6   1   8  15           7
-1   -1    *   -1    *        -5  -2   1   4           3
-1   -1    *    1   -1       -13  -2   9  20          11
*    *    *    1    1         1   5   9  13           4

Notice that we are making use of the fact that the product of two
the 3rd row the residues 2 and 3 are stipulated to be NON-squares,
which implies that 6 is a square.  Also, since -1 is stipulated to
be a square, it follows that -6 is a square.  Of course, the residues
4 and 9 are always squares, so in this case we have the arithmetic
progression of four squares -6, -1, 4, 9 with a common difference
of 5.  Similarly each of the other rows gives an arithmetic
progression of four squares.

To prove that the combination of Legendre symbols for those five
residues MUST fall into (at least) one of those nine combinations
(for any prime greater than 13), we can simply apply Boolean
expansion.  For convenience, let -1 be denoted by 0, so we have
the nine "mincuts"

11***
1*1**
100**
*11**
0*01*
0100*
00*0*
00*10
***11

We can now extract from this list those combinations compatible with
a left-most bit equal to 0, and those compatible with a left-most bit
equal to 1, leading to the following two sub-sets of mincuts (dropping
the leading bit, since it is 0 in the left-hand set and 1 in the right
hand set)

0            1

11**         1***
*01*         *1**
100*         00**
0*0*         11**
0*10         **11
**11

Now we need to show that the four bits of each of these two subsets
cover all possible combinations.  Each of these subsets can be split
into two sub-subsets, representing the combinations of the four bits
compatible with a leading 0 or 1 in those four bits.  Thus we have
(again dropping the leading bit, since it is stipulated)

00    01          10    11

01*   1**         1**   ***
*0*   01*         0**   1**
*10   00*         *11   1**
*11   *11               *11

Obviously the column "11" is covered, because the remaining bits are
all wildcards.  Also, the column "10" is covered, because it contains
"1**" and "0**".  So, we need only expand the "00" and "01" columns,
as shown below:

000   001     010  011

1*    0*      1*   **
0*    10      0*   11
10    11      11
11

Thus all four of these subsets provide complete coverage, so we have
proven that the original set of nine combinations of Legendre symbols
is unavoidable (for primes greater than 13).

Is the above set of nine the MINIMAL unavoidable set, or is there an
unavoidable set with fewer than nine progressions?  Or involving
fewer than five Legendre symbols?  It turns out there is a smaller
unavoidable set, consisting of just eight progressions and involving
just four Legendre symbols.  Again letting [r] denote the Legendre
symbol for the residue r modulo p, we have an unavoidable set given
by the stipulations

Legendre symbols             Progression             d
-----------------------     -----------------       -----
[-1]=[2]=[5]                -5  -2   1   4           3
[-1]=[3], [5]=1             -3   1   5   9           4
[-1]=[2]=-1, [3]=[5]=1      -6   5  16  27          11
[-1]=[5], [2]=1             -5   2   9  16           7
[2]=[3]=1                    1   2   3   4           1
[-1]=1, [2]=[3]             -6  -1   4   9           5
[-1]=[3]=1                  -3  -1   1   3           2
[-1]=[2]=1                  -1   0   1   2           1

Notice that the set contains progressions with common differences of
2, 3, 5, 7, and 11, as did the original set.  This is to be expected,
because our Legendre symbols are unavoidable for ALL primes, so the
only reason we do not have non-trivial progressions for p = 2, 3, 5,
7, or 11 is that in these cases the applicable common difference
equals the prime itself.  For example, the quadratic residues mod 7
satisfy the conditions in the 4th row of the above table, but the
common difference for that row is 7, leading to the degenerate
progression 2,2,2,2...

These questions can be generalized to cover the cases of k squares
in arithmetic progression modulo m.  For any natural number m, let
f(m) denote the length of the longest arithmetic progression of
squares among the integers modulo m, with step size co-prime to m.
The values of f(m) for composite m are determined by the simple rule
that f(ab) = min[f(a),f(b)], so ultimately f(m) equals the least
value of f(p) for any prime p that divides m.  The only exception is
that f(2m)=f(m) for odd values of m.  Hence it is sufficient to
consider only prime arguments.  The primes p with f(p) equal to 2
through 6 are listed below

f(p)
------
2      2   3
3      5   7  11
4     13  19  29  31  37
5     17  23  41  43  47  59 113 139
6     53  61  67  79  89 107 137 149 157 163 167 173 179
181 199 229 257 349 373 617

This shows that for every prime p greater than 3 the value of f(p)
exceeds 2.  Likewise for every prime greater than 11 we see that
f(p) exceeds 3, which is the same as saying that the integers modulo
p contain at least four squares in arithmetic progression for every
p greater than 11.  This is the proposition we proved above.

The table also indicates that for every prime p greater than 37
the integers modulo p contain at least FIVE squares in arithmetic
progression, and for every prime p greater then 139 they contain at
least SIX squares in arithmetic progression, and for every prime p
greater than 617 they contain at least SEVEN squares in arithmetic
progression.  If we continue this process, we find the sequence of
thresholds
3, 11, 37, 139, 617, 1069, 3389, ...

These values increase by roughly a factor of 3 or 4 (or PI?) on each
level.  The number of primes on the first several levels are 2, 3, 5,
8, 20, 30, ...

To actually prove that, for every prime p greater than 37, there is an
arithmetic progression of five squares (mod p), we could proceed as
we did in the case of four squares.  We expect that the set of un-
avoidable progressions must include progressions with step sizes equal
to each of the primes  2, 3, 5, 7, 11, 13, 19, 29, 31, and 37.  Our
objective is to minimize the total number of Legendre symbols, and
to include cases that involve both signs of these symbols, in order
to maximize the coverage.  Building on the unavoidable set for 4-term
progressions, we might begin with a table based on the four Legendre
symbols [-1], [2], [3], [5], and [7] as shown below.

Legendre symbols             Progression             d
-----------------------  ----------------------     -----
[2]=[3]=1                0   1   2   3   4           1
[-1]=[2]=1              -2  -1   0   1   2           1
[-1]=[3]=[5]=1              -3  -1   1   3   5       2
[-1]=[3]=[7], [5]=1         -7  -3   1   5   9       4
[-1]=[2]=[5]            -8  -5  -2   1   4           3
[-1]=1, [2]=[3]=[7]         -6  -1   4   9  14       5
[-1]=[3]=[5], [2]=1        -12  -5   2   9  16       7
[-1]=[2]=[7]                -8   3  14  25  36      11

This covers all but seven of the 32 possible combinations of those
five Legendre symbols.  The task is to find conditions, probably
involving one or two more Legendre symbols, and progressions with
step sizes of 13, 19, 29, 31, and 37 to complete the unavoidable set.
What is the smallest unavoidable set of five squares in arithmetic
progression, and how many Legendre symbols are needed?

Incidentally, it is known that for all k there are k CONSECUTIVE
squares mod p for all sufficiently large p.  This follows from Van
der Waerden's theorem, which also immediately solves the original
problem, since we're partitioning p things into 2 classes, and an
arithmetic progression of non-squares can be multiplied by a non-
square to give an arithmetic progression of squares.  Further,
Szemeredi's theorem implies that for sufficiently large p, ANY
subset of size p/2 (indeed, any fixed fraction of p) contains a
k-term arithmetic progression.

This is certainly borne out by observations, since it's not hard to
find primes modulo which there are quite long arithmetic progressions
of squares.  For example, in Z mod 109 we have sets of ten squares in
arithmetic progression:

x    =  10  50  39  18  33   1  49  21  15   3
x^2  =  -9  -7  -5  -3  -1   1   3   5   7   9

Even better, in Z mod 2689 we have 25 consecutive squares, ranging
from -12 to +12 (mod 2689).  The valencies of f(p) for all the primes
less than 5000 are distributed as shown below

f(p):  2 3 4 5  6  7  8   9  10  11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
v(f):  2 3 5 8 20 30 61 118 130 124 69 53 19  8  4  2  0  0  0 11  0  0  0  2

The other prime less than 5000 with f(p) = 25 is 3529.  The first prime
p with f(p) = 21 is 1009.
```