Is Pi normal?



by Stan Wagon, from The Math. Intell. 7, 65 [Ref.]

The idea of normality, first introduced by E. Borel in 1909, is an attempt to formalize the notion of a real number being random. The definition is as follows:

A real number x is normal in base b if, in its base-b representation, all digits occur equally often in an asymptotic sense and, more generally, for each m the b^m different m-strings occur equally often. In other words, lim_{n -> infinity} N(s,n)/n = b^(-m) for each m-string s, where N(s,n) is the number of occurrences of s in the first n base-b digits of x. A number that is normal in all bases is called normal.
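To make the quantity N(s,n)/n concrete, here is a minimal Python sketch. The function names `count_occurrences` and `relative_frequency` are illustrative and not from the article, and the sketch assumes that overlapping occurrences are counted and that an occurrence must lie wholly within the first n digits.

```python
def count_occurrences(digits: str, s: str, n: int) -> int:
    """N(s, n): occurrences of s lying wholly within the first n digits.

    Overlapping occurrences are counted separately (an assumption of this sketch).
    """
    if n > len(digits):
        raise ValueError("need at least n digits")
    window = digits[:n]
    m = len(s)
    return sum(1 for i in range(n - m + 1) if window[i:i + m] == s)


def relative_frequency(digits: str, s: str, n: int) -> float:
    """N(s, n) / n, the ratio whose limit must equal b**(-m) for a normal number."""
    return count_occurrences(digits, s, n) / n


# First ten decimal digits of pi after the decimal point.
pi_digits = "1415926535"
print(relative_frequency(pi_digits, "5", 10))   # a single digit, m = 1
print(relative_frequency(pi_digits, "53", 10))  # a 2-string, m = 2
```

Normality is a statement about the limit of these ratios, so no finite computation of this kind can prove or disprove it; it can only gather evidence.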

The apparent randomness of Pi's digits had been observed before normality was precisely defined. De Morgan, for example, pointed out that one would expect the digits to occur equally often, yet the number of 7's in the first 608 digits is 44, much lower than expected. However, it turned out that his count was based on inaccurate data.

There are lots of normal numbers - Borel proved that the set of non-normal numbers has measure zero - but it is difficult to provide concrete examples. While an undergraduate at Cambridge University, D. Champernowne proved that 0.12345678910111213 ... is normal in base 10, but an explicit example of a number that is normal in all bases is still lacking.
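Champernowne's number is easy to generate, which makes it a convenient test case. The following Python sketch (the generator name `champernowne_digits` and the sample size are choices made here, not part of the article) produces its decimal digits by concatenating 1, 2, 3, ... and tallies single-digit frequencies in a finite prefix.

```python
from collections import Counter
from itertools import count, islice


def champernowne_digits():
    """Yield the decimal digits of 0.123456789101112... one character at a time."""
    for k in count(1):
        yield from str(k)


# Tally the single-digit frequencies in the first n digits (n is an arbitrary choice).
n = 100_000
freq = Counter(islice(champernowne_digits(), n))
for digit in "0123456789":
    print(digit, freq[digit] / n)
```

The frequencies in any finite prefix need not be close to 1/10; Champernowne's theorem concerns only the limit.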

The question of Pi's normality only scratches the surface of the deeper question of whether the digits of Pi are "random". That normality is not sufficient follows from the observation that a truly random sequence of digits ought to be normal when only digits in positions corresponding to perfect squares are examined. But if all such positions in a normal number are set to 0, the number is still normal. On the other hand, more rigorous definitions of "random" exclude Pi because Pi's decimal expansion is a recursive (that is, computable) sequence.
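The claim that zeroing the perfect-square positions preserves normality follows from a short counting argument. The sketch below is not taken from the article; x' and N'(s,n) are notation introduced here for the modified number and its occurrence counts.

```latex
% x  : a number normal in base b
% x' : x with every digit in a perfect-square position replaced by 0
% At most \lfloor\sqrt{n}\rfloor of the first n positions are perfect squares,
% and each changed digit lies in at most m of the length-m windows,
% so for every m-string s:
\[
  \left| \frac{N'(s,n)}{n} - \frac{N(s,n)}{n} \right|
  \;\le\; \frac{m \,\lfloor \sqrt{n} \rfloor}{n}
  \;\le\; \frac{m}{\sqrt{n}}
  \;\longrightarrow\; 0
  \qquad (n \to \infty),
\]
\[
  \text{hence} \quad
  \lim_{n \to \infty} \frac{N'(s,n)}{n}
  \;=\; \lim_{n \to \infty} \frac{N(s,n)}{n}
  \;=\; b^{-m}.
\]
```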

Thus deeper questions are lurking, but so little is known about Pi's decimal expansion that it is reasonable to focus on whether Pi is normal to base ten. To put our ignorance in perspective, note that it is not even known that all digits appear infinitely often: perhaps

Pi = 3.1415926.....01001000100001000001...

In order to gather evidence for Pi's normality one would like to examine as many digits as possible. Those who have pursued the remote digits of Pi have often been pejoratively referred to as "digit hunters", but certain recent developments have added some glamor to the centuries-old hunt.

[...]

Kanada, by calculating 6,442,450,000 decimal digits in 1995, found the following frequency distribution for pi-3 up to 6,000,000,000 decimal places, which shows no unusual deviation from expected behavior:

'0': 599963005;  '1': 600033260;  '2': 599999169;  '3': 600000243
'4': 599957439;  '5': 600017176;  '6': 600016588;  '7': 600009044
'8': 599987038;  '9': 600017038;  Chi square = 9.00
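The chi-square value can be recomputed from the ten counts. This Python sketch assumes the usual Pearson statistic with equal expected counts of one tenth of the total; the variable names are illustrative.

```python
# Digit counts for the first 6,000,000,000 decimal places of pi - 3, as listed above.
counts = {
    "0": 599963005, "1": 600033260, "2": 599999169, "3": 600000243,
    "4": 599957439, "5": 600017176, "6": 600016588, "7": 600009044,
    "8": 599987038, "9": 600017038,
}

total = sum(counts.values())   # should equal 6,000,000,000
expected = total / 10          # equal frequencies under the normality hypothesis

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum((c - expected) ** 2 / expected for c in counts.values())
print(f"total = {total}, chi-square = {chi_square:.2f}")
```

With ten categories the statistic has 9 degrees of freedom, and a chi-square variable with 9 degrees of freedom has mean 9, so a value near 9.00 is entirely typical of random digits.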

Moreover, the speed with which the relative frequencies are approaching 1/10 agrees with theory. Consider the digit 7 for example. Its relative frequencies in the first 10^i digits (i = 1, ..., 7) are 0, .08, .095, .097, .10025, .0998, .1000207, which seem to be approaching 1/10 at the speed predicted by probability theory for random digits, namely at a speed approximately proportional to 1/sqrt(n). The poker test is relevant to the question of normality in base ten, and Table 1 contains the frequencies of poker hands from the first ten million digits; there is no significant deviation from the expected values.
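One way to read the 1/sqrt(n) claim: for independent, uniformly random digits, the relative frequency of a given digit in n digits has standard deviation sqrt(p(1-p)/n) with p = 1/10. The Python sketch below, which assumes that binomial model, expresses each observed deviation for the digit 7 in units of that standard deviation.

```python
import math

p = 0.1  # probability of any fixed digit under the random-digit model
# Relative frequencies of the digit 7 in the first 10^i digits, i = 1..7 (from the text).
observed = [0, 0.08, 0.095, 0.097, 0.10025, 0.0998, 0.1000207]

z_scores = []
for i, freq in enumerate(observed, start=1):
    n = 10 ** i
    sigma = math.sqrt(p * (1 - p) / n)   # standard deviation of the relative frequency
    z = (freq - p) / sigma               # deviation measured in standard deviations
    z_scores.append(z)
    print(f"n = 10^{i}: deviation = {freq - p:+.7f}, sigma = {sigma:.7f}, z = {z:+.2f}")
```

If the deviations shrink like 1/sqrt(n), the z values should stay of order 1 rather than grow or collapse as n increases.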

Type of hand              Expected number   Actual number
No two digits the same            604,800         604,976
One pair                        1,008,000       1,007,151
Two pair                          216,000         216,520
Three of a kind                   144,000         144,375
Full house                         18,000          17,891
Four of a kind                      9,000           8,887
Five of a kind                        200             200

Table 1: Distribution of the first two million poker hands in the digits of pi.
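The expected numbers in Table 1 can be reproduced by brute force. The Python sketch below assumes that a "hand" is a block of five digits, each digit independent and uniform, so that ten million digits give two million hands; it classifies all 10^5 possible blocks by their pattern of repeated digits. The names `HAND_NAMES` and `classify` are illustrative.

```python
from collections import Counter
from itertools import product

# Map the sorted multiplicities of the digits in a hand to the hand's name.
HAND_NAMES = {
    (1, 1, 1, 1, 1): "No two digits the same",
    (2, 1, 1, 1): "One pair",
    (2, 2, 1): "Two pair",
    (3, 1, 1): "Three of a kind",
    (3, 2): "Full house",
    (4, 1): "Four of a kind",
    (5,): "Five of a kind",
}


def classify(hand):
    """Return the poker-hand name of a 5-tuple of digits."""
    pattern = tuple(sorted(Counter(hand).values(), reverse=True))
    return HAND_NAMES[pattern]


# Enumerate every possible five-digit block and tally the hand types.
tally = Counter(classify(hand) for hand in product(range(10), repeat=5))

hands = 2_000_000  # ten million digits split into blocks of five
expected = {name: tally[name] * hands // 10**5 for name in HAND_NAMES.values()}
for name, value in expected.items():
    print(f"{name:<24}{value:>12,}")
```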


Writers over the years have been fond of mentioning that 20 decimals of pi suffice for any application imaginable. Moreover, the millions of digits now known shed absolutely no light on how to prove pi's normality. But these criticisms miss the point. Huygens, in using an extrapolative technique to extend Archimedes' calculations, was the first to use an important technique that, in this century, has come to be known as the Romberg method for approximating definite integrals. And the arithmetic-geometric mean algorithms and their refinements are closely connected to the fastest known techniques for evaluating multi-precision transcendental functions. Thus digit-hunting has an importance that goes beyond the mere extension of the known decimal places of pi.