The ancient Babylonians had a nice method of computing square roots that can be applied using only simple arithmetic operations. To find a rational approximation for the square root of an integer N, let k be any number such that k^2 is less than N. Then k is slightly less than the square root of N, and so N/k is slightly greater than the square root of N. It follows that the average of these two numbers gives an even closer estimate

                 k + N/k
        k_new = ---------
                    2

Iterating this formula leads to values of k that converge very rapidly on the square root of N. This formula is sometimes attributed to Heron of Alexandria, because he described it in his "Metrica", but it was evidently known to the Babylonians much earlier.

It's interesting to note that this formula can be seen as a special case of a more general formula for the nth roots of N, which arises from the ancient Greek "ladder arithmetic", what we would now call linear recurring sequences. The basic "ladder rule" for generating a sequence of integers to yield the square root of a number N is the recurrence formula

        s[j] = 2k s[j-1] + (N - k^2) s[j-2]

where k is the largest integer such that k^2 is less than N. The ratio s[j+1]/s[j] approaches sqrt(N) + k as j goes to infinity. The sequence s[j] is closely related to the continued fraction for sqrt(N). This method does not converge as rapidly as the Babylonian formula, but it's often more convenient for finding the "best" rational approximations with denominators of a given size.

As an example, to find r = sqrt(7) we have N=7 and k=2, so the recurrence formula is simply s[j] = 4 s[j-1] + 3 s[j-2]. If we choose the initial values s[0]=0 and s[1]=1 we have the following sequence

        0, 1, 4, 19, 88, 409, 1900, 8827, 41008, 190513, 885076, ...

On this basis, the successive approximations and their squares are

                 r                    r^2
        ---------------------    --------------
        (4/1)           - 2       4.0
        (19/4)          - 2       7.5625
        (88/19)         - 2       6.9252...
        (409/88)        - 2       7.0104...
        (1900/409)      - 2       6.99854...
        (8827/1900)     - 2       7.000201...
        (41008/8827)    - 2       6.9999719...
        (190513/41008)  - 2       7.00000390...
        (885076/190513) - 2       6.999999457...
         etc.

Incidentally, applying this method to the square root of 3 gives the upper and lower bounds used by Archimedes in his calculation of pi. (See Archimedes and Sqrt(3).)

The reasoning by which the ancient Greeks arrived at this "ladder arithmetic" is not known, but from a modern perspective it follows directly from the elementary theory of difference equations and linear recurrences. If we let q denote the ratio of successive terms, q = s[j+1]/s[j], then the proposition is that q approaches sqrt(N) + k. In other words, we are trying to find the number q such that

        (q - k)^2 = N

Expanding this and re-arranging terms gives

        q^2 - 2k q - (N - k^2) = 0

It follows that the sequence of numbers s[0]=1, s[1]=q, s[2]=q^2, .., s[n]=q^n, .. satisfies the recurrence

        s[n] = 2k s[n-1] + (N - k^2) s[n-2]

as can be seen by dividing through by s[n-2] and noting that q equals s[n-1]/s[n-2] and q^2 equals s[n]/s[n-2]. Not surprisingly, then, if we apply this recurrence to arbitrary initial values s[0], s[1], the ratio of consecutive terms of the resulting sequence approaches q. (This is shown more rigorously below.)

It's interesting that if we assume an EXACT rational (non-integer) value for sqrt(N) and exercise the algorithm in reverse, we generate an infinite sequence of positive integers whose magnitudes are strictly decreasing, which of course is impossible. From this the Greeks might have inferred that there cannot be an exact rational expression for any non-integer square root, and thereby discovered irrational numbers.

A similar method can be used to give the 3rd, 4th, or any higher root of any integer using only simple integer operations. In fact, this same method can be applied to find the largest real root of arbitrary polynomials. To illustrate, consider the general polynomial

        x^k - c1 x^(k-1) - c2 x^(k-2) - ...
            - ck = 0                                        (1)

The corresponding linear recurrence is

        s[n] = c1 s[n-1] + c2 s[n-2] + ... + ck s[n-k]      (2)

If we let r1, r2, ..., rk denote the k distinct roots of this polynomial, then s[n] is given by

        s[n] = A1 r1^n + A2 r2^n + ... + Ak rk^n

where the coefficients A1, A2, .., Ak are constants which may be determined by k initial values of the sequence. If r1 is the greatest of the roots (in terms of the norms) and is a positive real number then, as n increases, the first term of s[n] becomes dominant, and so the ratio s[n+1]/s[n] approaches

        [A1 r1^(n+1)] / [A1 r1^n]  =  r1

Hence we can use the recurrence relation (2) to generate arbitrarily precise rational approximations of the largest root of (1). This is why the ratio of successive Fibonacci numbers 1, 1, 2, 3, 5, 8, 13, 21, ... goes to the dominant root of x^2 - x - 1, which is [1 + sqrt(5)]/2.

To illustrate, consider the "Tribonacci sequence", whose recurrence formula is s[n] = s[n-1] + s[n-2] + s[n-3], and whose characteristic equation is the cubic

        x^3 - x^2 - x - 1 = 0

The roots of this polynomial are

        r1 =  1.839286755214161..
        r2 = -0.419643377607081.. + i 0.606290729207199..
        r3 = -0.419643377607081.. - i 0.606290729207199..

If we arbitrarily select the initial values s[0]=1, s[1]=2, and s[2]=4, we have the coefficients

        A1 =  1.137451572282629..
        A2 = -0.068725786121314.. + i 0.12352180763419..
        A3 = -0.068725786121314.. - i 0.12352180763419..

All these numbers can be expressed as nested radicals given by the solution of the cubic, but I'm too lazy to type them. Anyway, the nth term of the recurrence is

        s[n] = A1 r1^n + A2 r2^n + A3 r3^n

Notice that the magnitudes of r2 and r3 are less than 1, so as we take higher powers these terms become progressively smaller, and soon become negligible. On the other hand, r1 is greater than 1, so as n increases the first term becomes dominant, and we have s[n] -> A1 r1^n as n goes to infinity, and so s[n]/s[n-1] approaches the root r1.

There are many fascinating things about linear recurrences like this.
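To see this convergence numerically, here is a minimal Python sketch (the helper name `recurrence_ratios` is my own, not from the text) that iterates a linear recurrence of the form (2) and returns the exact ratios of successive terms, applied to both the sqrt(7) ladder and the Tribonacci sequence:

```python
from fractions import Fraction

def recurrence_ratios(coeffs, init, steps):
    """Iterate s[n] = coeffs[0]*s[n-1] + coeffs[1]*s[n-2] + ... from the
    given initial terms, returning the ratios s[n]/s[n-1] as exact fractions."""
    s = list(init)
    for _ in range(steps):
        s.append(sum(c * t for c, t in
                     zip(coeffs, reversed(s[-len(coeffs):]))))
    return [Fraction(s[i], s[i - 1]) for i in range(len(init), len(s))]

# Ladder for sqrt(7): s[j] = 4 s[j-1] + 3 s[j-2]; ratio -> sqrt(7) + 2
ratios = recurrence_ratios([4, 3], [0, 1], 20)
print(float(ratios[-1]) - 2)    # ~ 2.6457513... = sqrt(7)

# Tribonacci: ratio -> dominant root r1 = 1.8392867...
trib = recurrence_ratios([1, 1, 1], [1, 2, 4], 30)
print(float(trib[-1]))          # ~ 1.8392867...
```

With exact Fraction arithmetic the early ratios reproduce the approximations tabulated above (4/1, 19/4, 88/19, ...).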
For example, they can be used to test numbers for primality. (See the article on "Symmetric Pseudoprimes".) By the way, to avoid complex numbers, we can also express s[n] for the Tribonacci sequence as

        s[n] = A1 r1^n + 2 B^n [u cos(nq) - v sin(nq)]

where

        B =  0.737352705760327...
        q =  2.176233545491871...
        u = -0.068725786121314...
        v =  0.12352180763419...

Naturally we can also use this method to generate rational approximations for the jth root of any integer M, simply by considering the polynomial x^j - M = 0. The only difficulty is that the j roots of this polynomial (the complex jth roots of M) all have the same magnitude, so the sequence doesn't converge. This is clear from the recurrence s[n] = M s[n-j], which just gives j disjoint geometric sequences. However, we can solve this problem by a simple change of variables. If we define y such that x = y - k, where k^j is the greatest jth power less than M, we can make this substitution to give the polynomial (y - k)^j - M = 0. For example, with j=2 (for square roots) we have

        y^2 - 2k y - (M - k^2) = 0

The corresponding recurrence is

        s[n] = 2k s[n-1] + (M - k^2) s[n-2]

which is the same as the "ladder arithmetic" square root method given earlier. Notice that the substitution x = y - k simply shifts the roots in the positive real direction, so the negative square root moves closer to zero and the positive square root moves further from zero. This gives the two roots of the characteristic polynomial very unequal magnitudes, so the positive real square root is dominant and the convergence is rapid. The same simple method can be applied to higher-order roots, noting that the jth roots of the number M are arranged in a circle on the complex plane, centered on the origin.
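As a numerical sanity check, the trigonometric closed form can be compared against the integer Tribonacci terms; this sketch uses the approximate constants quoted above:

```python
import math

# approximate constants quoted in the text
A1 = 1.137451572282629
r1 = 1.839286755214161
B  = 0.737352705760327
q  = 2.176233545491871
u  = -0.068725786121314
v  = 0.12352180763419

def closed_form(n):
    # s[n] = A1*r1^n + 2*B^n*(u*cos(n q) - v*sin(n q))
    return A1 * r1 ** n + 2 * B ** n * (u * math.cos(n * q) - v * math.sin(n * q))

# integer Tribonacci terms with s[0]=1, s[1]=2, s[2]=4
s = [1, 2, 4]
for _ in range(12):
    s.append(s[-1] + s[-2] + s[-3])

for n, sn in enumerate(s):
    print(n, sn, closed_form(n))   # the closed form reproduces the integers
```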
By simply shifting this circle of roots in the positive real direction we can make the magnitude of the positive real root greater than the magnitude of any of the other roots (because the radius of the original circle of roots is smaller than the new positive real root). To illustrate, the cube root of 2 can be found using this method by means of the recurrence

        s[n] = 3 s[n-1] - 3 s[n-2] + 3 s[n-3]

With the initial values 0, 0, 1 we have the sequence

        0, 0, 1, 3, 6, 12, 27, 63, 144, 324, 729, 1647, 3726, ...

The terms repeatedly acquire a common factor of 3, which we can divide out to keep them from growing so fast. This gives

         0,   0,   1,   3,    6,   12, ...
         1,   2,   4,   9,   21,   48, ...
         3,   7,  16,  36,   81,  183, ...
        12,  27,  61, 138,  312,  705, ...
        46, 104, 235, 531, 1200, 2712, ...

and so on. In this way we can find the rational approximations

        s[6]/s[5]   - 1  =  5/4
        s[13]/s[12] - 1  =  29/23
        s[17]/s[16] - 1  =  63/50
        s[25]/s[24] - 1  =  635/504

The cube of 635/504 is 1.999998024...

This approach gives a simple linear recurrence, but for determining higher-order roots it is clearly not very efficient, because the nth roots of the number N are uniformly arranged in a circle on the complex plane, centered on the origin, and it's easy to see that we can't make the magnitude of the positive real root much greater than the magnitudes of the neighboring roots just by shifting in the positive real direction.

To improve the convergence, let's approach the problem systematically. The polynomial x^n - N = 0 has n complex roots x1, x2, .., xn, all of equal magnitude. Let x1 denote the positive real root, and suppose X is a positive real estimate of x1. We can then define a polynomial with the roots w1, w2, .., wn given by

               xj + X
        wj  = --------
               xj - X

Since X is a positive real number, it is closer to x1 than to any of the other x roots, and so x1 - X is the smallest denominator, and x1 + X is the largest numerator. Consequently, w1 is the w value with the largest magnitude.
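The cube-root ladder above can be sketched in a few lines of Python (omitting the factor-of-3 reductions, which only serve to keep the integers small):

```python
from fractions import Fraction

# Characteristic polynomial (y-1)^3 - 2 = 0, i.e. y^3 = 3y^2 - 3y + 3,
# so the shifted ladder for the cube root of 2 is
#     s[n] = 3 s[n-1] - 3 s[n-2] + 3 s[n-3]
s = [0, 0, 1]
for _ in range(40):
    s.append(3 * s[-1] - 3 * s[-2] + 3 * s[-3])

r = Fraction(s[-1], s[-2]) - 1     # approximates 2**(1/3)
print(float(r), float(r) ** 3)     # the cube is ~ 2
```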
Solving the above expression for xj gives

                wj + 1
        xj = X --------
                wj - 1

Substituting this into the original polynomial x^n - N = 0 gives an equation with the roots wj

             / w + 1 \ n
        X^n ( ------- )  -  N  =  0
             \ w - 1 /

Expanding this and clearing the denominator gives the polynomial

        (X^n - N) w^n + n (X^n + N) w^(n-1) + ... = 0

Since w1 is the largest root (probably by a large margin), and it's one of at most two real roots, and the other real root is the smallest, a rough estimate of w1 is just the sum of the w roots, which equals the negative of the coefficient of w^(n-1) in the monic form of the polynomial (i.e., the above polynomial divided by its leading coefficient). Letting W denote this sum, we have

                X^n + N
        W = n ----------
                N - X^n

An improved estimate for x1 is therefore X(W+1)/(W-1), which gives

                   (n-1) X^n + (n+1) N
        X_new = X ---------------------
                   (n+1) X^n + (n-1) N

Although this looks quite different, it is actually a generalization of the ancient Babylonian (and Heron's) method for square roots. To see this, note that a slightly less efficient application of this method would be to replace w with 2w + 1, which reduces the substitution to

               w + 1
        x = X -------
                 w

If we substitute this expression for x into the original equation x^n - N = 0 and expand, we get a polynomial in w such that the real positive x root maps to the largest w root, and the remaining w roots are relatively small (although they are not as small as with the previous substitution, which is why this formula isn't quite as efficient). For square roots we have n=2, and taking the negative of the coefficient of w in the monic form of this polynomial as an estimate W of the largest root, the improved estimate X(W+1)/W gives the result

                   X^2 + N      X + N/X
        X_new = X ---------  =  ---------
                    2X^2            2

which is the ancient Babylonian/Heron formula. In contrast, the more nearly optimal substitution with the same approach leads to the formula for square roots

                   X^2 + 3N
        X_new = X ----------
                   3X^2 + N

which converges slightly faster than the ancient formula.
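Here is a sketch of the resulting iteration (the function names are mine), with the numerator and denominator arranged so that the iteration converges; for n=2 it reduces to X(X^2+3N)/(3X^2+N), shown alongside the Babylonian rule for comparison:

```python
def improved_root(N, n, X, steps):
    # X_new = X * ((n-1) X^n + (n+1) N) / ((n+1) X^n + (n-1) N),
    # the orientation that converges to the positive real nth root of N
    for _ in range(steps):
        Xn = X ** n
        X = X * ((n - 1) * Xn + (n + 1) * N) / ((n + 1) * Xn + (n - 1) * N)
    return X

def babylonian(N, X, steps):
    # the ancient rule X_new = (X + N/X) / 2
    for _ in range(steps):
        X = (X + N / X) / 2
    return X

print(improved_root(2.0, 2, 1.0, 4))   # ~ 1.41421356... = sqrt(2)
print(improved_root(2.0, 3, 1.0, 5))   # ~ 1.25992105... = 2**(1/3)
print(babylonian(2.0, 1.0, 6))         # ~ 1.41421356... = sqrt(2)
```

Starting from X = 1, a handful of iterations already reaches machine precision, reflecting the cubic (versus quadratic) convergence of the improved rule.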
It's interesting to notice the appearance of the quantity (w+1)/(w-1) in this context, considering the relation between root extractions and logarithms (if x^n = N then ln(x) = ln(N)/n), because we also have

        ln[(w+1)/(w-1)] = 2 [1/w + 1/(3w^3) + 1/(5w^5) + ...]

Is it true that the substitution (w+1)/(w-1) is optimal for extracting the nth root?
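The series identity quoted above is easy to check numerically, e.g. at w = 3, where (w+1)/(w-1) = 2:

```python
import math

# ln[(w+1)/(w-1)] = 2 [1/w + 1/(3 w^3) + 1/(5 w^5) + ...]
w = 3.0
series = 2 * sum(1 / ((2 * i + 1) * w ** (2 * i + 1)) for i in range(25))
print(series, math.log((w + 1) / (w - 1)))   # both ~ 0.693147... = ln(2)
```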
