## Infinite Descent versus Induction

```It is often said that the principle of "infinite descent", developed
by Fermat in his study of Diophantine analysis, is equivalent to
induction.  In assessing this claim we first need to understand what
kind of "equivalence" is being claimed.  Typically those who claim
such an equivalence will substantiate it with a demonsration whereby
they re-write a proof by descent in the form of a proof by induction,
by means of some set of logical connectives and the other primitive
axioms of Peano arithmetic.  However, it's obvious that since Peano's
axioms for the natural numbers consist of nothing other than

(i) the naturals have a first member
(ii) each member has a unique successor
(iii) there are no loops in the chain of successors
(iv) mathematical induction

the entirety of Peano arithmetic, including every theorem of Peano
arithmetic, can be derived from these axioms.  Would it be suitable
to replace (iv) with "infinite descent"?  If the two are completely
equivalent, then the answer would have to be yes, but I would argue
that the real answer is no.  It is true that the axiom of induction
is implicit in the principle of infinite descent, but it is just as
true that the other three axioms are implicit in the principle of
infinite descent.  This means that infinite descent is of a higher
order than these basic axioms, and is constructed by invoking a
combination of them.  From this point of view, it makes no more
(and no less) sense to say that infinite descent is equivalent to
induction than to say that infinite descent is equivalent to the
unique successor axiom, or to the first member axiom.  All of these
axioms (plus the tacit axioms of logic) are implicated in the
principle of infinite descent, but no one of them individually
can rightly be said to be "equivalent" to infinite descent.

Those who argue that infinite descent is equivalent to induction
typically refer to a proof such as the following for the fact that
the sum of the first n integers is n(n+1)/2:

If not, there is a least integer, say N, for which this is
false.  N =/= 0, since 0 = 0(0+1)/2.  Since N - 1 < N, we
must have the sum of the first N - 1 integers is (N-1)N/2 and
adding N to both sides and doing obvious arithmetic, we reach the
contradiction that the sum of the first N numbers is N(N+1)/2.

However, this example is actually a proof by induction, not a
proof by infinite descent.  The proof supposes the existence of
a least integer N such that the sum from 1 to N is NOT equal to
N(N+1)/2.  So far so good.  A proof by infinite descent would
then show that there is an absolutely SMALLER number with the same
property.  This is the key distinction: the number must have a
smaller MAGNITUDE than N, not just PRECEED N in the ordered
sequence of numbers.

Pure induction is a chain of implications on the truth values
of a given ordered sequence of propositions, and the induction
is "activated" by citing the truth value(s) of one (or more)
particular propositions (such as the case N=0 in the above
example), from which the others follow.  Notice that the actual
magnitudes of the indices of the sequence are irrelevant; all
that matters is their discrete order.  (For a related discussion,
see Representing Sets of Pure Order).

In contrast, a proof by infinite descent operates essentially on
the magnitudes of the indices of an ordered sequence of propositions,
and shows that the indices cannot all differ by more than any fixed
finite magnitude "epsilon".  This is a subtle but important concept,
analagous to the different kinds of justifications that can be given
for the fact that the slope of  y = x^2  is  2x.  We know that for
some small non-zero increment dx the slope near any given x is
approximately equal to [(x+dx)^2 - x^2]/dx, which is 2x + dx.  Now,
on one level it seems that we can just set dx equal to 0, and we're
done.  However, the key step in our derivation involved a division
by dx, so we really need to invoke a rigorous "infinite descent"
argument leading to the concept of a LIMIT for a sequence of
values of dx approaching nearer to zero than any fixed finite
magnitude.  Thus, there is some validity in the claim that Fermat's
infinite descent concept is one of the proto-types for reasoning
about infintesimals and limiting sequences, which is perhaps not
surprising in view of Fermat's own contributions to the foundation
of the differential calculus.

To illustrate more clearly the distinction between pure induction
and infinite descent, let's define the cumulative sum S(k) by
the recurrence S(k) = S(k-1) + k with S(0) = 0, and suppose that
S(N) = N(N+1)/2 + D for some non-zero integer D.  The question
is, can we necessarily construct a number with the same property
but a smaller absolute value?  There may be a way, but we can't
simply use the number N-1, because this doesn't ensure a smaller
magnitude, i.e., we can continue to subtract 1 indefinitely (into
negative values), and the numbers at some point become larger in
magnitude, even though they are lesser in value.  Thus, we may be
able to use a chain of implication from N to N-1 as part of an
induction argument, but not an infinite descent (at least not
without some other constraint on the indices).

Notice that if N has the above property then simple subtraction
gives
S(N-1)   =   S(N) - N   =   (N-1)N/2 + D

so N-1 has the same property.  Therefore, if we can find ANY integer
k such that S(k)=k(k+1)/2 we can assert that D=0 for all integers
(positive and negative.)  However, this is NOT a proof by infinite
descent, it is a proof by induction from the particular value of k
for which we know that S(k) is of the form k(k+1)/2.

In contrast, consider a proof by infinite descent that sqrt(2) is
irrational.  We suppose there exist a pair of positive integers
n0,n1 such that  1 + (n1/n0) = sqrt(2).  Now consider the integer
n2 = n0 - 2n1.  It's easy to verify that 1 + (n2/n1) = sqrt(2),
from which it follows that n2 = [sqrt(2)-1] n1.  Now, we have
the inequality 0 < [sqrt(2)-1] < 1 , from which it follows that
0 < n2 < n1.  Repeating this procedure, we have n3 = n2 - 2n1,
and so on.  Thus, we have an infinite sequence of positive integers
n0, n1, n2,... of strictly decreasing magnitude, which is impossible
because there are only a finite number of positive integers less
than any given integer.  This is an example of a genuine proof by
infinite descent.

Much of the confusion in this area arises from a failure to clearly
distinguish between descent and infinite descent.  We often see
classifications of various "induction hypotheses" allowing us to
assert that the proposition P(n) is true for all natural numbers n
if any of these conditions are satisfied

WEAK      P(0) and ( P(n-1) => P(n) )
STRONG     P(0) and P(1) and ... and P(n-1)  =>   P(n)
DESCENT   -P(0) or -P(1) or  ... or -P(n-1)  <=  -P(n)

Whether it is appropriate to say that these conditons are "equivalent"
depends on precisely what kind of equivalence we mean, because they
also involve logical connectives and other common notions, but in any
case this definition of "DESCENT" is certainly close to pure induction,
since it is simply a contrapositive formulation of strong induction
(i.e.  Q => P  is equivalent to  -Q <= -P,  its contrapositive).

However, the above definition of "DESCENT" is quite distinct from the
principle of infinite descent.  Here we see a reason for much of the
confusion in this area, because people often fail to recognize this
distinction (especially since the term "descent" is often used as
shorthand for "infinite descent").

It's important to keep in mind that the essential distinction between
"induction" and "infinite descent" is not whether the argument
proceeds in an "upward" or "downward" direction, nor whether it's
applied in a positive or negative logical sense.  The essential
distinction is that induction relies on a principle of discrete
ORDER and completeness, whereas "infinite descent" represents a
higher-order principle of discrete absolute MAGNITUDE combined
with something very much like the pigeon-hole principle.

The above schema for "DESCENT" clearly doesn't embody the principle
of "infinite descent", because it simply asserts that if the integer
n does not have a specific property P, then at least one of the
integers 0 to n-1 must also not have the property P.  From this
it follows that 0 must not have the property.  Thus, if we happen to
know that 0 DOES have the property, we have a contradiction.  This
is indeed equivalent to induction, but it isn't an "infinite descent"
argument.

A true argument by "infinite descent" doesn't assert something about
the set of numbers 0 to n-1, it asserts THE EXISTENCE of an integer
that is both non-zero and smaller in *magnitude* than n.  Specifically,
it asserts that if an integer n has the property P, then THERE EXISTS
a non-zero integer of magnitude less than |n| with the same property.
This implies an INFINITE sequence of strictly decreasing integer
magnitudes beginning with |n|, which is impossible.  Notice that
the argument does not depend on knowledge of the P-status of ANY
particular number.  Also, notice that we can apply this form of
argument in any context where every element has a non-negative
"norm" (i.e., magnitude), and the difference between any two
distinct magnitudes is greater than some fixed constant.

To express the essence of induction and infinite descent a little
more precisely, let's first consider the classical notion of
mathematical induction in a context slightly beyond the naturals,
namely, the integers.  An inductive argument relies on the fact
that we can arrange the elements of some infinite set in a definite
ordered linear sequence, one that can be indexed with the integers.
Thus we have (in general) a "doubly infinite" ordered sequence of
elements
..., e[-2], e[-1], e[0], e[1], e[2], ...

and we can assert that this sequence contains ALL the elements e[]
of the set in question.  Now, induction says that if the possession
of a specific property P by the element e[j] for any arbitrary
integer j implies that e[j+1] also has the property P, and if we
can establish for some specific integer k that e[k] possesses the
property P, then EVERY element e[j] for j greater than or equal to
k possesses the property P.

Needless to say, we can replaces e[j+1] with e[j-1], and replace
"greater than" with "less than" in the above proposition, and the
resulting proposition also holds.  Thus, we can "ascend" or "descend"
with this argument.  Furthermore, we can also logically negate
either of these propositions (using DeMorgan's rule) to arrive at
equivalent true statements.  However, in all cases, the statements
produced in this way lead us to a definite conclusion ONLY with the
addition of knowledge of the P-status of at least one particular
element.  Thus, the essential technique of mathematical induction
consists of establishing the P-status of one or more element(s)
e[k] by prior means, and then invoking some general relation
between the P-status of one element and the P-status of some others
in the ordered sequence of elements.  Notice that we haven't used
any notion of the relative "magnitudes" of the elements e[] (they
may not even HAVE magnitudes).  We have simply used the ordering
and completeness of the sequence.

Now let me describe my conception of "infinite descent", which seems
to be different than some popular conceptions.  We have (in general)
an infinite set of elements, each of which has a finite non-negative
"magnitude" (norm), and the difference between any two distinct
magnitudes is greater than some fixed finite quantity D.  Notice that
these elements need not be given in any particular linear order.  For
example, we might have elements (e.g., Gaussian integers) that can be
indexed by two integers

etc.
etc...  e[-2,-2]  e[-1,-2]  e[0,-2]  e[1,-2]  e[2,-2] ...etc
etc...  e[-2,-1]  e[-1,-1]  e[0,-1]  e[1,-1]  e[2,-1] ...etc
etc...  e[-2, 0]  e[-1, 0]  e[0, 0]  e[1, 0]  e[2, 0] ...etc
etc...  e[-2, 1]  e[-1, 1]  e[0, 1]  e[1, 1]  e[2, 1] ...etc
etc...  e[-2, 2]  e[-1, 2]  e[0, 2]  e[1, 2]  e[2, 2] ...etc
etc.

Suppose we can show that if any one of these elements has the property
P, then the property P must also be possessed by another element whose
magnitude is strictly less than the magnitude of the former.  Clearly
this is impossible, because it implies an infinite descent in steps
no smaller than D, beginning from a finite starting magnitude, whereas
there are only finitely many distinct magnitudes less than any given
magnitude.

It might seem this argument can be re-cast in the form of a simple
induction on the linear sequence of magnitudes, but it can't.  To see
why, suppose we arrange the elements of our set according to their
magnitudes, and we get something like this

0         1       sqrt(2)
-------   -------   --------
e[1, 0]   e[-1,-1]
e[0, 0]   e[0, 1]   e[ 1,-1]      etc.
e[-1,0]   e[-1, 1]
e[0,-1]   e[ 1, 1]

To apply induction here, we would first need to regard all the
elements of a certain magnitude as "equivalent" with respect to
the property P (i.e., they all possess P or none of them possess P).
This is already a slight problem, since many properties depend on
more than just the norm of an element.  However, we can probably
fix that up.

The real problem becomes apparent if we try to apply the Proposition
which says that if the elements of a certain magnitude m[k] possess a
certain (negated) property notP, then this property notP is also
possessed by AT LEAST one of the magnitudes m[0] to m[k-1].  Fair
enough, but this doesn't prove that m[k] cannot possess notP.  Of
course, if we happen to know that m[0] does not possess notP, then
we would have a contradiction, proving that m[k] must possess P.
This is simply the contra-positive of the inductive argument that
if m[0] possess P, and if m[j]'s possession of P implies m[j+1]'s
possession of P for any given j, then we know m[k] possesses P.

However, suppose we cannot determine by independent means whether
m[0], or any other specific magnitude, possesses P.  Does our
induction argument (whether we phrase it in positive or contra-
positive form) enable us to conclude anything now?  Evidently not,
because it relies on knowledge of the P-status of some specific
element to make it effective.

Now suppose we're given one additional piece of information:  If m[k]
possesses P, then there is a magnitude m[j] strictly less than m[k]
such that m[j] possesses P.  This information still does not enable
us to use our induction argument (positive or contra-positive) to give
a definite result, because we still haven't been given the P-status
of any magnitude.  However, we CAN now invoke an infinite descent to
prove that NO magnitude can possess P.  We reach this conclusion
simply from the fact that there are only a finite number of distinct
magnitudes less than any given magnitude, whereas we have shown that
if m[k] possesses P there must be infinitely many distinct magnitudes
less than m[k].  This argument is actually much closer to the pigeon-
hole principle than to induction.  In fact, one could argue that it
IS the pigen-hole principle.

In summary, induction (whether applied upward or downward, and in the
positive or contra-positive sense) always relies on the known P-status
of at least one element, whereas the conclusion in an infinite descent
is based purely on the structure of the implicative relation and the
field of available elements.  Thus, they clearly operate on a
fundamentally different principle.  One could make a strong case
that infinite descent is equivalent to a specialized application of
the pigeon-hole principle (which is another high-order principle),
but not to induction.

One more comment on the above schema for DESCENT:

-P(0) or -P(1) or  ... or -P(n-1)  <=  -P(n)

Notice that this is explicitly a FINITE descent implication, and
can only be made effective by stipulating the P-status of one or
more indicies.  The schema for a true infinite descent looks quite
different, because as explained previously, it isn't an implication
over a specific set of integers (such as those from 0 to n-1).
Rather, it is an implication of the EXISTENCE of a positive integer
of smaller magnitude with the same property, ad infinitum.  So it
looks like this:

INFINITE DESCENT       P(n) =>  {m | P(m) AND |m| < |n|}

which can also be expressed in the form

P(n0)  =>  P(n1) and P(n2) and P(n3) and ... ad infinitum

where  |n0| > |n1| > |n2| > ...  ad infinitum

By the way, it's worth being clear about the effect of taking the
contra-positive of the usual statement of induction.  Suppose we
know that P(k) implies P(k+1), and let's consider two cases.  First,
suppose we can show that P(20) is TRUE.  In this case we can conclude
that P(21), P(22),... etc are all TRUE, but we cannot say anything at
all about P(19), P(18), etc.  Now let's switch cases, and suppose we
can show that P(20) is FALSE.  In that case we know P(19), P(18),...,
P(0), P(-1), P(-2)... must all be FALSE, but we know nothing about
the status of P(21), P(22) and so on.  In view of this, consider the
basic definition of mathematical induction

If P(k) => P(k+1), and if P(N), then P(k) for all k > N

Notice that "k" is an arbitrary index, and the symbol "+1" merely
signifies the "next" discrete index in the sequence.  This arbitrary
quality of the origin and discrete step sizes is an essential aspect
of pure induction, which tends to get glossed over if we set k to the
numerical magnitude 0 and regard "+1" as numerical addition.  By doing
that, we introduce non-inductive elements to the schema, which then
can be exploited in various ways, such as by taking the contra-
positive.

However, if we are careful to define the pure induction schema with
arbitrary indicies  of order as shown above, then notice that taking
the contra-positive of this statement does not yield an essentially
different statement, because the contra-positive of  P(k)=>P(k+1)  is
simply  notP(k)=>notP(k-1), and the implication of notP(N) is simply
notP(k)  for all k less than (viz, prior to) N.  So, if we denote
notP by Q and if we map the indicies j=10-k, the contra-positive of
the basic induction schema is

If Q(j) => Q(j+1), and if Q(N), then Q(j) for all j > N

which is formally identical to the previous expression, and still
fundamentally different from an infinite descent.
```