Ancient Square Roots

 

The ancient Babylonians had a nice method of computing square roots that can be applied using only simple arithmetic operations. To find a rational approximation for the square root of an integer N, let m be any number such that m2 is less than N. Then m is slightly less than the square root of N, and so N/m is slightly greater than the square root of N. It follows that the average of these two numbers gives an even closer estimate

 

 

Iterating this formula leads to m values that converge very rapidly on the square root of N. This formula is sometimes attributed to Heron of Alexandria, because he described it in his "Metrica", but it was evidently known to the Babylonians much earlier.

 

This formula can be seen as a special case of a more general formula for the nth roots of N, which arises from the ancient Greek "ladder arithmetic", which we would call linear recurring sequences. The basic "ladder rule" for generating a sequence of integers to yield the square root of a number N is the recurrence formula

 

 

where m is the largest integer such that m2 is less than N. The ratio sn+1/sn approaches √A + N as n goes to infinity. The sequence sn is closely related to the continued fraction for √N. This method does not converge as rapidly as the Babylonian formula, but it's often more convenient for finding the "best" rational approximations for denominators of a given size.

 

As an example, to find r = √7 we have N=7 and m=2, so the recurrence formula is

 

 

If we choose the initial values s0 = 0 and s1 = 1 we have the following sequence

 

 

On this basis, the successive approximations and their squares are

 

 

Incidentally, applying this method to the square root of 3 gives the upper and lower bounds used by Archimedes in his calculation of π. (See Archimedes and √3.)

 

The reasoning by which the ancient Greeks arrived at this "ladder arithmetic" is not known, but from a modern perspective it follows directly from the elementary theory of difference equations and linear recurrences. If we let q denote the ratio of successive terms q = sn+1/sn, then the proposition is that q approaches √N + m. In other words, we are trying to find the number q such that (q – m)2 = N. Expanding this and re-arranging terms gives

 

 

It follows that the sequence of numbers s0=1, s1=q, s2=q2, s3=q3,.. satisfies the recurrence

 

 

as can be seen by dividing through by sn–2 and noting that q equals sn–1/sn–2 and q2 = sn/sn–2. Not surprisingly, then, if we apply this recurrence to arbitrary initial values s0, s1, the ratio of consecutive terms of the resulting sequence approaches q. (This is shown more rigorously below.)

 

It's interesting that if we assume an exact rational (non-integer) value for √A and exercise the algorithm in reverse, we generate an infinite sequence of positive integers whose magnitudes are strictly decreasing, which of course is impossible. From this the Greeks might have inferred that there cannot be an exact rational expression for any non-integer square root, and thereby discovered irrational numbers.

 

A similar method can be used to give the 3rd, 4th, or any higher root of any integer using only simple integer operations. In fact, this same method can be applied to find the largest real root of arbitrary polynomials. To illustrate, consider the general polynomial of degree d

 

 

The corresponding linear recurrence is

 

 

If we let r1, r2, ..., rd denote the d distinct roots of the polynomial, then sn is given by

 

 

where the coefficients A1, A2, …, Ad are constants which may be determined by d initial values of the sequence. If r1 is the greatest of the roots (in terms of the norms) and is a positive real number then, as n increases, the first term of sn becomes dominant, and so the ratio sn+1/sn approaches

 

 

Hence we can use the recurrence relation (2) to generate arbitrarily precise rational approximations of the largest root of (1). This is why the ratio of successive Fibonacci numbers 1, 1, 2, 3, 5, 8, 13, 21, ... goes to the dominant root of x2 – x – 1, which is [1 + √5]/2.

 

To illustrate, consider the "Tribonacci sequence", whose recurrence formula is

 

 

and whose characteristic equation is the cubic

 

 

The roots of this polynomial are

 

 

If we arbitrarily select the initial values s0=1, s1=2, and s2=4, we have the coefficients

 

 

All these numbers can be expressed as nested radicals given by the solution of the cubic. The nth term of the recurrence is

 

 

Notice that the magnitudes of r2 and r3 are less than 1, so as we take higher powers these terms become progressively smaller, and soon become negligible. On the other hand, r1 is greater than 1, so as n increases the first term becomes larger, and we have sn → A1 r1n as n goes to infinity, and so sn/sn–1 approaches the root r1.

 

There are many fascinating things about linear recurrences. For example, they can be used to test numbers for primality. (See the article on Symmetric Pseudoprimes.) By the way, to avoid complex numbers, we can also express sn for the Tribonacci sequence as

 

 

where

 

Naturally we can also use this method to generate rational approximations for the jth root of any integer M, simply by considering the polynomial xj – M = 0. The only difficulty is that the magnitudes of all n roots of this polynomial (the complex jth roots of M) are all equal, so the sequence doesn't converge. This is clear from the recurrence sn = M sn–j, which just gives j disjoint geometric sequences. However, we can solve this problem by a simple change of variables. If we define y such that x = y – k where kj is the greatest jth power less than M, we can make this substitution to give the polynomial (y–k)j – M = 0. For example, with j=2 (for square roots) we have

 

 

The corresponding recurrence is the same as for the "ladder arithmetic" square root method given earlier. Notice that in order to give the roots of the characteristic polynomial very unequal magnitudes, so that the positive real square root would be dominant and cause rapid convergence, we make the substitution x = y – k, which just shifts the roots in the positive real direction, so that the negative square root was closer to zero and the positive square root was further from zero.

 

The same simple method can be applied to higher-order roots, noting that the nth roots of the number N are arranged in a circle on the complex plane, centered on the origin. By simply shifting this circle of roots in the positive real direction we can make the magnitude of the positive real root greater than the magnitude of any of the other roots (because the radius of the original circle of roots is smaller than the new positive real root). To illustrate, the cube root of 2 can be found using this method by means of the recurrence

 

 

With the initial values 0,0,1 we have the sequence

 

 

Each sequence of 3 terms is commonly divisible by 3, so we can factor these out to reduce the growth rate of the numbers. This gives

 

 

and so on. In this way we can find the rational approximations

 

 

The cube of 635/504 is 1.999998024...

 

This approach gives a simple linear recurrence, but when determining higher order roots it is clearly not very efficient, because the nth roots of the number N are uniformly arranged in a circle on the complex plane, centered on the origin. It's easy to see that we can't make the magnitude of the positive real root much greater than the magnitudes of the neighboring roots by just shifting in the positive real direction.

 

To improve the convergence, let's approach the problem systematically. The polynomial xn – N = 0 has n complex roots x1, x2, ..., xn, all of equal magnitude. Let x1 denote the positive real root, and suppose X is a positive real estimate of x1. We can then define a polynomial with the roots w1, w2, ...,wn given by

 

 

Since X is a positive real number, it is closer to x1 than to any of the other x roots, and so x1 – X is the smallest denominator, and x1 + X is the largest numerator. Consequently, w1 is the w value with the largest magnitude. Solving the above expression for xj gives

 

 

Substituting xj into the original polynomial xn – N = 0 gives an equation with the roots wj

 

 

Expanding this and clearing the denominator gives the polynomial

 

 

Since w1 is the largest root (probably by a large margin), and it's one of at most two real roots, and the other real root is the smallest, a rough estimate of w1 is just the sum of the w roots, which equals the coefficient of wn–1 of the monic form of the polynomial (i.e., the above polynomial divided by the first coefficient). Letting W denote this coefficient, we have

 

 

An improved estimate for x1 is therefore X(W+1)/(W–1), which gives

 

 

Although this looks quite different, it is actually a generalization of the ancient Babylonian (and Heron's) method for square roots. To see this, note that a slightly less efficient application of this method would be to replace w with 2(w+1), which leads to the substitution

 

 

If we substitute this expression for x into the original equation xn – N = 0 and expand, we get a polynomial in w such that the real positive x root maps to the largest w root, and the remaining w roots are relatively small (although they are not as small as with the previous substitution, which is why this formula isn't quite as efficient). For square roots we have n=2 and we take the coefficient of w of the monic form of this polynomial as a good estimate of the largest root w, leading to the result

 

 

which is the ancient Babylonian/Heron formula. In contrast, using the more nearly optimum substitution with the same approach leads to the formula for square roots

 

 

which converges slightly faster than the ancient formula.

 

It's interesting to notice the appearance of the quantity (w+1)/(w–1) in this context, considering the relation between root extractions and logarithms (if xn = N then ln(x) = ln(N)/n), because we also have ln[(w+1)/(w–1)] = 2[1/w + 1/(3w3) + 1/(5w5) + ...]. It is true that the substitution (w+1)/(w–1) is optimal for extracting the nth root?

 

Return to MathPages Main Menu