The e in Petersburg
In the decimal representation of the transcendental number e the
first occurrence of a string of 10 digits consisting of a permutation
of all 10 of the numerals 0,1,2,..,9 begins at the 1730th digit, i.e.,
the 1729th digit past the decimal point. These 10 digits are
...0719425863...
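The location quoted above can be checked directly. The following is a minimal sketch, assuming plain integer arithmetic is acceptable for generating the digits; the helper names `e_digits` and `first_pandigital` are illustrative and not part of the original discussion. It sums the series for e scaled by a power of ten, then scans for the first window of 10 distinct digits. If the figures above are right, it should report position 1730 (counting the leading 2 as the first digit).

```python
def e_digits(n_digits, guard=10):
    """First n_digits decimal digits of e, including the leading 2.

    Sums floor(scale / k!) for k = 0, 1, 2, ...; the guard digits absorb
    the accumulated truncation error (an assumption that holds unless the
    guard digits happen to be a long run of 0s or 9s).
    """
    scale = 10 ** (n_digits - 1 + guard)
    total, term, k = 0, scale, 0
    while term > 0:              # term == floor(scale / k!)
        total += term
        k += 1
        term //= k
    return str(total // 10 ** guard)


def first_pandigital(digits, base=10):
    """1-based index of the first window of `base` distinct digits, or None."""
    for i in range(len(digits) - base + 1):
        if len(set(digits[i:i + base])) == base:
            return i + 1
    return None


digits = e_digits(1800)
pos = first_pandigital(digits)
print(pos, digits[pos - 1:pos + 9])
```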
If the decimal digits of a number are regarded as essentially random,
what is the EXPECTED location of the first such occurrence? The most
direct approach to this problem is to define a Markov model with 9
states S_1, S_2,.., S_9, where S_n represents a state in which the n
most recent digits are distinct. For example, if we take the first
10 digits of e we have [2718281828], and this is in the state S_2,
because the last two digits (8 and 2) are distinct, whereas the last
three digits are not distinct (because there are two 8's).
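The definition of the state can be made concrete with a small sketch. The function name `state` is illustrative only; it returns the largest n for which the last n digits of a string are distinct.

```python
def state(digits):
    """Largest n such that the last n digits of `digits` are all distinct."""
    seen = set()
    for d in reversed(digits):
        if d in seen:
            break
        seen.add(d)
    return len(seen)


print(state("2718281828"))   # first 10 digits of e: state S_2
print(state("0719425863"))   # a complete permutation: the "10th state"
```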
The state of a string of digits tells us everything we need to
know about that string, i.e., from the state we can determine the
probability of transitioning to any other state when we shift one
digit to the right. This means we discard the left-most digit and
append a new randomly chosen digit on the right. Suppose the current
string is in State 4, which means the digits look something like this
[XXXXXBDCBA]
The last four digits are distinct, A,B,C,D, but the 5th from the last
digit is B (it could also have been any of A,C, or D), so the last
five digits are not distinct. The preceding digits (denoted by X's)
don't matter. If we now discard the leftmost digit and append a new
randomly chosen numeral on the right, what are the State transition
probabilities? Since we are dealing with the base 10, there are 10
possible numerals, and if the new digit is any of the six numerals
other than A,B,C,D we will transition to State 5. This obviously has
probability 6/10. On the other hand, if the new digit is D, the state
will remain S_4, and this has probability 1/10. If the new digit is
C the state will transition to S_3, and this transition has probability
1/10, and so on. In this way we can determine the probability of the
transition from any given state to any other state. Writing the
probability of going from S_n to S_m in row m, column n, this leads
to the transition matrix
                _                     _
               |  1 1 1 1 1 1 1 1 1    |
               |  9 1 1 1 1 1 1 1 1    |
               |  0 8 1 1 1 1 1 1 1    |
          1    |  0 0 7 1 1 1 1 1 1    |
     M = ----  |  0 0 0 6 1 1 1 1 1    |
          10   |  0 0 0 0 5 1 1 1 1    |
               |  0 0 0 0 0 4 1 1 1    |
               |  0 0 0 0 0 0 3 1 1    |
               |_ 0 0 0 0 0 0 0 2 1   _|
Given any column vector P of state probabilities, we can produce the
state probabilities for the next set of digits (i.e., after moving one
place to the right) by simply multiplying by M. It's worth noting
that we have not included the "10th state" in our model, so the last
column in M doesn't sum to 1. As a result, the total probability in
the nine states of our model will steadily decrease. The probability
of the 10th state is just 1 minus the probabilities of the other nine
states.
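As a check on the matrix and on the remark about its last column, here is a minimal sketch that builds M with exact fractions and sums each column. The function name `transition_matrix` and the `base` parameter are illustrative assumptions, not part of the original text.

```python
from fractions import Fraction


def transition_matrix(base=10):
    """(base-1) x (base-1) matrix; entry [m-1][n-1] is Pr(S_n -> S_m)."""
    size = base - 1
    M = [[Fraction(0)] * size for _ in range(size)]
    for n in range(1, size + 1):          # current state S_n
        for m in range(1, n + 1):         # new digit repeats one of the last n
            M[m - 1][n - 1] = Fraction(1, base)
        if n < size:                      # new digit is a fresh numeral
            M[n][n - 1] = Fraction(base - n, base)
    return M


M = transition_matrix()
col_sums = [sum(M[m][n] for m in range(9)) for n in range(9)]
print(col_sums)
```

Every column except the last should sum to 1; the last falls short by the 1/10 that leaks into the omitted 10th state.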
For the initial set of state probabilities P[1] we note that the
probability of a random string of 10 digits being in State k is
given by taking one of 10 choices for the last digit, one of 9 for
the 2nd to last, and so on, down to the kth from the last digit.
Then we have k choices for the "spoiler digit", which must be the
same as one of the last k (distinct) digits, and then we have 10
choices each for the remaining 10-(k+1) digits. Now, since the
total number of possible strings (of all States) is 10^10, the
probability of a given random string being in the kth State is
                      k 10!
     P[1]_k  =  ------------------                      (1)
                (10-k)! 10^(k+1)
Of course this gives P[1]_1 = 1/10 and P[1]_10 = 10!/10^10. So, this
gives us the nine components of our initial column vector P[1]. Then
we have P[2] = M P[1], and in general
P[n+1] = M^n P[1]
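Equation (1) is easy to evaluate exactly. The sketch below (the name `initial_vector` is illustrative) builds the nine components of P[1] and confirms that the probability they leave over is exactly 10!/10^10, the chance that a random 10-digit string is already a complete permutation.

```python
from fractions import Fraction
from math import factorial


def initial_vector(base=10):
    """A priori probabilities of states S_1 .. S_(base-1), per equation (1)."""
    return [Fraction(k * factorial(base), factorial(base - k) * base ** (k + 1))
            for k in range(1, base)]


P1 = initial_vector()
q = Fraction(factorial(10), 10 ** 10)
print(P1[0], 1 - sum(P1) == q)
```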
If we let {P[j]} denote the sum of the components of P[j], then the
probability of having entered State 10 by the jth string of digits
is 1 - {P[j]}. Thus the probability of entering State 10 on exactly
the jth string is [1-{P[j]}] - [1-{P[j-1]}]. Hence the expected number
of strings to be examined before finding a complete permutation of the
10 numerals can be expressed as
       1 ( [1-{P[1]}] )  +  2 ( [1-{P[2]}] - [1-{P[1]}] )

     + 3 ( [1-{P[3]}] - [1-{P[2]}] )  +  4 ( [1-{P[4]}] - [1-{P[3]}] )

     + ...

     =  1 + {P[1]} + {P[2]} + {P[3]} + ...

     =  1 + { (M^0 + M^1 + M^2 + M^3 + ...) P[1] }
The sum of the geometric series of transition matrices is (I-M)^-1,
and so if we let R denote the row vector
     R = [1 1 1 1 1 1 1 1 1]
we have the expected number E(B) of digits for the base B=10
     E(10)  =  1 + R (I-M)^-1 P  =  3526013/1134
            =  3109.358906526...
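The series 1 + {P[1]} + {P[2]} + ... can also be summed numerically as a sanity check on the closed form. This is only a sketch in floating point, not the exact computation: the helper `step` is an illustrative name, and it multiplies by M using the matrix's structure rather than storing it. The running total should settle near the exact value quoted above.

```python
from math import factorial

BASE = 10
# A priori state probabilities P[1]_k from equation (1), as floats.
p = [k * factorial(BASE) / (factorial(BASE - k) * BASE ** (k + 1))
     for k in range(1, BASE)]


def step(p):
    """Return M p, where index m (0-based) stands for state S_(m+1)."""
    new = [0.0] * len(p)
    tail = 0.0                                  # running sum p[m] + ... + p[-1]
    for m in range(len(p) - 1, -1, -1):
        tail += p[m]
        # A fresh numeral moves S_m up to S_(m+1) with probability (BASE-m)/BASE.
        fresh = (BASE - m) * p[m - 1] if m > 0 else 0.0
        new[m] = (tail + fresh) / BASE
    return new


expected = 1.0
while sum(p) > 1e-13:                           # stop once the remainder is negligible
    expected += sum(p)                          # adds {P[j]} for j = 1, 2, 3, ...
    p = step(p)
print(round(expected, 4))
```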
This same method can obviously be applied to other bases. In general
for the base B we define the arrays and vectors with indices ranging
from 1 to B-1
                /  1   if m = n
     I[m,n] =  (                              Identity matrix
                \  0   if m != n

                /  (B-n)/B   if n = m-1
     M[m,n] =  (   1/B       if n > m-1       Transition matrix
                \  0         if n < m-1

                     m B!
     P[m]   =  ----------------               State column vector
               (B-m)! B^(m+1)
In terms of these matrices the expected number of digits for the
base B is
E(B) = 1 + R (I-M)^-1 P
The table below shows the result for bases from 2 to 20.
  B         E(B)        E(B)/E(B-1)   E(B) as a fraction    (B-1)! E(B)
 ---   --------------   -----------   ------------------   ------------
  2           2.0000                  2                    2
  3           5.0000       2.5000     5                    10
  4          12.6666       2.5333     38/3                 76
  5          31.5833       2.4934     379/12               758
  6          78.2000       2.4759     391/5                9384
  7         194.0722       2.4817     34933/180            139732
  8         485.0984       2.4995     152806/315           2444896
  9        1223.0468       2.5212     78275/64             49313250
 10        3109.3589       2.5423     3526013/1134         1128324160
 11        7963.0504       2.5609     3612039703/453600    28896317624
 12       20520.0176       2.5769
 13       53151.9215       2.5902
 14      138270.8627       2.6014
 15      361004.9270       2.6108
 16      945426.8069       2.6188
 17     2482475.8890       2.6257
 18     6533289.2310       2.6317
 19    17228381.2502       2.6370
 20    45511474.3001       2.6416
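The entries of this table can be reproduced with exact rational arithmetic. The following is a sketch under the assumption that a hand-written Gauss-Jordan elimination on (I-M) x = P is an acceptable way to apply (I-M)^-1; the function name `expected_strings` is illustrative. It uses B-1 states and divides the matrix entries by B, matching the 9-state base-10 matrix given earlier.

```python
from fractions import Fraction
from math import factorial


def expected_strings(base):
    """Exact E(B) = 1 + R (I-M)^-1 P, solving (I-M) x = P by elimination."""
    size = base - 1
    A = []                                   # augmented matrix [I - M | P]
    for m in range(size):                    # row m stands for state S_(m+1)
        row = []
        for n in range(size):                # column n stands for state S_(n+1)
            if n >= m:
                entry = Fraction(1, base)
            elif n == m - 1:
                entry = Fraction(base - m, base)
            else:
                entry = Fraction(0)
            row.append((1 if m == n else 0) - entry)
        k = m + 1
        row.append(Fraction(k * factorial(base),
                            factorial(base - k) * base ** (k + 1)))
        A.append(row)
    for c in range(size):
        pivot = next(r for r in range(c, size) if A[r][c] != 0)
        A[c], A[pivot] = A[pivot], A[c]
        A[c] = [x / A[c][c] for x in A[c]]   # normalise the pivot row
        for r in range(size):
            if r != c and A[r][c] != 0:
                A[r] = [x - A[r][c] * y for x, y in zip(A[r], A[c])]
    return 1 + sum(row[-1] for row in A)     # 1 + R x


prev = None
for B in range(2, 12):
    E = expected_strings(B)
    ratio = "" if prev is None else f"{float(E / prev):.4f}"
    print(B, f"{float(E):.4f}", ratio, E, factorial(B - 1) * E)
    prev = E
```

Fractions keep the results exact, at the cost of speed; for bases much larger than those tabulated a floating-point solver would be the more practical choice.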
Notice that E(B) increases by a factor of about 2.5 or a bit more on
each step. It's tempting to conjecture that the asymptotic ratio of
successive values of E(B) is e.
It's also interesting to explore other ways of approaching this
problem, even if they only give approximate answers. For example,
we might estimate that the MEAN location of the first occurrence is
roughly 3060, taking into account the fact that such occurrences tend
to be clustered more than would be expected if each sequence of 10
digits were independent. (They are not independent because they
overlap.) If we let q denote the quantity 10!/(10^10), then the
inclusion/exclusion principle gives the following probabilities that
the first n sets of 10 digits (i.e., the first n+9 digits) contain
a complete permutation string
          Independent       Inc-Exc
    n    [1-(1-q)^n]/q      Prob/q
 -----   -------------   -------------
     1      1.0000..       1
     2      1.9996..       1.9
     3      2.9989..       2.79
     4      3.9978..       3.677
     5      4.9964..       4.5627
     6      5.9946..       5.44769
     7      6.9924..       6.332219
     8      7.9898..       7.2164033
     9      8.9869..       8.10029...
    10      9.9836..       8.9839....
    20     19.9312..      17.8045....
   100     98.2248..     < 87.61
   400    372.3874       < 335.20
  1000    838.7784       < 775.28
The inclusion/exclusion values in the right-hand column agree exactly
with the results of the Markov model. The center column gives the
probabilities based on the assumption that each string of 10 digits is
independent, which gives a population with a mean of about
-1/ln(1-q) = 2755.23. In contrast, the actual probabilities, taking
into account the dependence of overlapping strings, give a cumulative
probability approaching 1 - e^(-rn) where r is approximately 1/3062.
Thus the mean location of the first occurrence is about 3062. If we
generate sequences of
"random" decimal digits and make 5 runs of 5000 trials each, using a
different random number generator for each run, we get the average
results 3044, 3062, 3045, 3068, and 3074, for an overall mean of
3058.6. Notice that this appears to be systematically less than the
exact value of 3109.35 given by the Markov model.
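A simulation of this kind can be sketched as follows. This is not the program used for the runs quoted above; it assumes Python's standard `random` module as the generator, and the function name `first_permutation_index` is illustrative. It tracks the run of distinct trailing digits (the state of the Markov model) and reports how many strings were examined before a complete permutation appeared.

```python
import random


def first_permutation_index(base, rng):
    """Number of length-`base` strings examined until one is a permutation."""
    recent = []                  # the longest run of distinct trailing digits
    n_digits = 0
    while len(recent) < base:
        d = rng.randrange(base)
        n_digits += 1
        if d in recent:
            # Keep only the digits after the earlier copy of d.
            recent = recent[recent.index(d) + 1:]
        recent.append(d)
    return n_digits - base + 1   # the first string ends at digit number `base`


rng = random.Random(12345)
trials = 200                     # small and therefore noisy; raise for better estimates
mean = sum(first_permutation_index(10, rng) for _ in range(trials)) / trials
print(mean)
```

With so few trials the average scatters widely around the theoretical value, so a single run says little about the small gap discussed here.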
To explain this, it's useful to examine the probabilities of having
found a complete permutation after examining the first n sets of 10
consecutive digits. We can compute a table of values using the
recurrence P[n+1] = M P[n] (i.e., P[n+1] = M^n P[1]) for the column
vectors P[j], and then letting P(j) = 1 - {P[j]} denote the cumulative
probability of having reached the 10th state by the jth string.
Several values are tabulated below,
and they agree exactly with the inclusion/exclusion values. (Note
that the tabulated values are the probabilities divided by
q = 10!/10^10.)
n P(n)/q
----- ------------
1 1.0000000
2 1.9000000
3 2.7900000
4 3.6770000
5 4.5627000
6 5.4476900
7 6.3322190
8 7.2164033
... ...
24 21.325
56 49.325
120 104.468
248 211.405
504 412.515
1016 768.302
2040 1326.014
4088 2015.841
8184 2557.570
16376 2741.519
32760 2755.658
65528 2755.731
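The first rows of this table can be regenerated exactly. The sketch below uses exact fractions and an illustrative helper `step` that applies M through its structure; it prints P(n)/q for n = 1 to 8.

```python
from fractions import Fraction
from math import factorial

BASE = 10
q = Fraction(factorial(BASE), BASE ** BASE)
p = [Fraction(k * factorial(BASE), factorial(BASE - k) * BASE ** (k + 1))
     for k in range(1, BASE)]              # P[1], equation (1)


def step(p):
    """Return M p, where index m (0-based) stands for state S_(m+1)."""
    new, tail = [None] * len(p), Fraction(0)
    for m in range(len(p) - 1, -1, -1):
        tail += p[m]                       # sum of p over states >= S_(m+1)
        fresh = (BASE - m) * p[m - 1] if m > 0 else 0
        new[m] = (tail + fresh) / BASE
    return new


cumulative = []                            # cumulative[n-1] = P(n)/q
for n in range(1, 9):
    cumulative.append((1 - sum(p)) / q)
    p = step(p)
print([float(c) for c in cumulative])
```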
One method for quickly estimating the expected value of the number
of strings from this table would be to observe that the cumulative
probabilities P(n) of being in the 10th state have virtually an
exponential distribution, i.e., P(n) = 1 - exp(-rn) where r is the
"rate". Taking the value of
P(1016) = 768.302*q = 1 - exp(-1016*r)
we can compute the expected value (assuming an exponential
distribution)
           1              n
     E  = ---  =  ------------------  =  3108.544
           r      -ln(1-768.302*q)
which is extremely close to the value of 3109. The result is
virtually the same for any n>500 or so. However, the method is very
sensitive to the value of P(n). I had previously estimated P(1000)
as being slightly less than 775 based on just the first 3 terms of
the inclusion/exclusion formula. This is fairly accurate, but if I
compute a rate from this value it comes out as r = 1/3028.
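The sensitivity described here is easy to see numerically. A minimal sketch, assuming the exponential form P(n) = 1 - exp(-rn) and using the two values of P(n)/q mentioned in the text; the function name `mean_from_point` is illustrative.

```python
from math import factorial, log

q = factorial(10) / 10 ** 10


def mean_from_point(n, p_over_q):
    """Mean 1/r of the law 1 - exp(-r n) that passes through P(n) = p_over_q * q."""
    return n / -log(1 - p_over_q * q)


markov = mean_from_point(1016, 768.302)   # Markov value of P(1016)/q
rough = mean_from_point(1000, 775.0)      # rough inclusion/exclusion estimate
print(markov, rough)
```

A change of under one percent in P(n) moves the implied mean by about 80 strings.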
The explanation for the "systematic error" is that while E_inf[n]
is 3109, the expected value for trials of 18600 digits or fewer is
E_18600[n] = 3062. The simulated averages were computed from 25000
trials, and not a single trial exceeded 18600 digits, so it isn't
surprising that the results underestimate the true theoretical value.
We would have needed to run many more trials to approach the
theoretical result. There's a strong resemblance to the Petersburg
Paradox here, because the theoretical expected value is forced up by
some extremely unlikely outcomes that we may never observe "in
practice".
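The truncated expectation can be approximated in a few lines. This sketch rests on two assumptions that are not stated in the text: that the waiting time is treated as exactly exponential with mean 3109.36, and that E_18600[n] means the expectation conditional on the waiting time not exceeding 18600. The function name is illustrative.

```python
from math import exp


def truncated_exponential_mean(mean, cutoff):
    """E[T | T <= cutoff] for an exponentially distributed T with the given mean."""
    tail = exp(-cutoff / mean)                  # Pr(T > cutoff)
    return mean - cutoff * tail / (1 - tail)


result = truncated_exponential_mean(3109.36, 18600)
print(result)
```

Under those assumptions the result lands close to the 3062 quoted above.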
As an example, if we examine the first 2 million digits of the number
e, we find that there are 761 occurrences of complete permutations in
10 consecutive digits, which implies that the mean "distance" between
occurrences is only about 2585 digits. This is shown in the plot
below (using digits of e computed by Nemiroff & Bonnell):
The reason the expected time for the first occurrence (3109.35)
is greater than the mean time between occurrences is that these
occurrences are not independent. Indeed, the 2nd and 3rd complete
permutations of decimal numerals in e are on consecutive strings of
10 digits. In other words, the occurrences tend to occur in "clumps",
and so there are more and longer "dry spells" than we would expect
based on the average frequency of occurrence if the events were
independent. To generate the closed-loop response of this system, in
which State 10 is retained and feeds back into the other states, we
use the complete 10x10 transition matrix
                _                       _
               |  1 1 1 1 1 1 1 1 1 1    |
               |  9 1 1 1 1 1 1 1 1 1    |
               |  0 8 1 1 1 1 1 1 1 1    |
          1    |  0 0 7 1 1 1 1 1 1 1    |
     N = ----  |  0 0 0 6 1 1 1 1 1 1    |
          10   |  0 0 0 0 5 1 1 1 1 1    |
               |  0 0 0 0 0 4 1 1 1 1    |
               |  0 0 0 0 0 0 3 1 1 1    |
               |  0 0 0 0 0 0 0 2 1 1    |
               |_ 0 0 0 0 0 0 0 0 1 1   _|
Then from any initial state vector P[0] the state vector after k
steps is
P[k] = N^k P[0]
If we take as our initial state vector P the a priori probabilities
for the states given by equation (1), then we find that this is the
equilibrium condition, i.e., we have N^k P = P for all k, which
shows that P is an eigenvector (with eigenvalue 1) of every N^k. As k
increases the columns of N^k approach P.
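The equilibrium claim can be verified exactly. The sketch below builds the 10x10 matrix N and the ten a priori probabilities from equation (1) with exact fractions, then checks that N P = P; the variable names are illustrative.

```python
from fractions import Fraction
from math import factorial

BASE = 10
# Full matrix N: entry [m-1][n-1] is Pr(S_n -> S_m), with state 10 included.
N = [[Fraction(1, BASE) if m <= n
      else Fraction(BASE - n, BASE) if m == n + 1
      else Fraction(0)
      for n in range(1, BASE + 1)]
     for m in range(1, BASE + 1)]

# A priori probabilities of all ten states, equation (1) with k = 1..10.
P = [Fraction(k * factorial(BASE), factorial(BASE - k) * BASE ** (k + 1))
     for k in range(1, BASE + 1)]

NP = [sum(N[m][n] * P[n] for n in range(BASE)) for m in range(BASE)]
print(NP == P, float(1 / P[-1]))
```

The second printed value is 1/P_10, the mean spacing implied by the equilibrium frequency of the 10th state.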
Hence this state vector P gives the probabilities for being found in
any of the 10 states on a randomly selected string. Consequently we
would expect the frequency of being in state 10 to correspond to
a mean number of steps between occurrences of 1/P_10 = 2755.73. This
differs from the actual mean (2585) over the first 2 million steps by
about 6.1 percent. In contrast, if we check the first million digits of
the number pi (using digits computed by Duane Bailey) we find
excellent agreement with the predicted value of 2755, as shown below:
Since e is presumed to be a normal number (i.e., every block of
decimal digits is presumed to occur with its expected frequency),
the 6.1% discrepancy in the frequency of complete permutations led
me to suspect that there might be a systematic error in the digits of
e computed by Nemiroff & Bonnell, so I acquired another set of digits
computed independently by James Davis, but the results agree exactly
with the set from Nemiroff & Bonnell. So, it seems (at least over this
range) that the decimal digits of e really do exhibit a statistically
significant non-normality. However, this eventually resolves into a more
normal distribution, as discussed in "Is e Normal?".