# The Prime Number Theorem explained

People have asked me to explain how Gauss’ Prime Number Theorem actually works. I mentioned it in my “How to Find Prime Numbers” post, but never explained it, so here is a short and (relatively) simple guide.

To begin with, let’s take a look at the trends in the density of primes up to a certain integer. We can find the density of primes by calculating p / x, where x is the number of integers and p is the number of primes from 1 to x. If we make a table of the density to each power of 10, it will look like this:

 x p Density 10 4 40% 100 25 25% 1,000 168 16.8% 10,000 1,229 12.29% 100,000 9,592 9.592% 1,000,000 78,498 7.8498% 10,000,000 664,579 6.64579% 100,000,000 5,761,455 5.761455%

You can see that the density is always decreasing, but the rate is also decreasing. This is very similar to a logarithmic pattern. In fact, if we plot the graph of y = ln(x), mirror it along the horizontal axis so that it becomes y = 1 / ln(x), and compare it to the graph of the density of prime numbers, there is a striking similarity.

Green: y = 1 / ln(x)
Blue: Density of primes

You’ll find that as the value of x increases, the correlation between the two graphs becomes stronger and stronger. The density of primes follows an almost instinctive natural curve.

This link makes it possible for us to predict the density of primes up to any value of x with the formula 1 / ln(x). The higher the value of x, the higher the accuracy of the prediction. To find the expected number of primes, you only need to find the area under the graph of the density, which can be found using integration. Gauss refined this a little to create the logarithmic integral Li(x), which gives a slightly more accurate estimate for the lower values of x. I ought to give Adrien-Marie Legendre some credit here as well, because he and Gauss both independently conjectured Li(x) at around the same time. Gauss is generally more associated with this, however, mainly because of his legacy.

l talk about how this links into the Riemann hypothesis (and more) in my previous post, which you can read here.

Thank you for reading.

-Ollie James

# How To Find Prime Numbers

The concept of a prime number is very simple. It’s one we’ve all been taught quite early on. If a number has only two factors, those factors being 1 and itself, it is a prime number. Simple enough. However, you will probably also know as far as we are aware, prime numbers do not have a pattern of regular occurrence as, say, even numbers do. This is what makes them so captivating. Even after we have known about prime numbers for millennia, we haven’t found an exact formula for finding primes. I’ll be taking you through the progress humanity has made so far on what is one of the most sought-after problems ever to arise in mathematics.

Prime numbers have been a source of academic anguish almost since the birth of human understanding of mathematics. So many millions of (arguably, mostly fruitless) hours have been poured into trying to further our understanding of these mysterious numbers. The first evidence of our knowledge of primes is in the Rhind Papyrus, an Ancient Egyptian document dating back to c. 1600BC.

The Rhind Papyrus, which is currently located at the British Museum

The papyrus demonstrates various ancient Egyptian fractions, and shows that they have different forms for prime numbers and composite (non-prime) numbers. After this, we have little evidence of progress with primes until we reach the Ancient Greeks, c. 300BC.

~

We first meet Euclid, an Ancient Greek mathematician. Often referred to as the ‘father of Geometry’, the majority of his work was based around geometric principles, but his collection of works, called Elements, included some number theory. Most notably, he proved that there were an infinite number of primes. His proof is as follows.

Suppose there are a finite number of primes – specifically, n primes. When we list them, they create the list p1, p2, p3, … , p.

Let’s now create a new number called p, where p = p1 x p2 x p3 x … x pn + 1. This is larger than any of the primes in the list, so it can’t equal any of them. Also, since the list of p1, p2, p3, … , pn contains all primes, p cannot be prime. Therefore p must be divisible by at least one prime.

However, if we divide p by any one of our primes, we will always get a remainder of 1. This contradiction tells us that our assumption of there being a finite number of primes cannot be true, and that there must be an infinite number of primes.

This may seem like common sense to some of us, but had Euclid not constructed this proof when he did, our understanding of primes could have been far less developed than it is today.

Euclid was also responsible for a theorem that one would likely not expect from a mathematician of his time. In the final pages of the number theory section of his Elements, he proved that 2p-1(2p-1) is a perfect number, where (2p-1) is prime. If you’re unaware, a perfect number is a positive integer with factors that add up to the number. Six is the first perfect number, as its factors, 1, 2 and 3, add up to 6. In the same way as primes, we are currently unaware of a formula that dictates when the next perfect number will occur.

We will revisit Euclid’s theorem later.

~

Shortly after Euclid’s work, in 276BC, another Greek mathematician by the name of Eratosthenes devised a method of finding all primes up to a certain integer. You have probably been taught this method, called the Sieve of Eratosthenes, as a way of easily determining which numbers are prime and which are not.

To begin, create a number grid of the numbers 2 to N (where N is the integer you want to find primes up to).

Next, starting at 2, mark off all the multiples of 2 (excluding 2 itself).

Once they are all marked off, move to the next number that has not been marked off, and mark off all the multiples of that number (excluding the number itself).

Continue the process until you cannot mark any more numbers off. All of the remaining numbers are primes.

A completed sieve, where N = 99.

Once again, this may seem very simple to us in the modern world, as our understanding has developed far, far beyond that of the Ancient Greeks, but the Sieve was a fundamental foundation for the search for primes. It is also still widely used in programming when generating small quantities of primes due to its simplicity. However, it is an inefficient algorithm compared to many of the ones created for the purpose of discovering new primes, which we will cover later.

By the way, if you’re wondering why we exclude 1 from the number grid, it’s because 1 is not prime. The definition of a prime number is “a number whose only two integer factors are 1 and itself”. Seeing as 1 only has one factor, it is not considered prime.

~

We must now travel rather far forward on the timeline to the 17th Century, where we are introduced to Pierre de Fermat, a French lawyer and mathematician. Though he is renowned for his many developments in various areas of mathematics, he is also known for coming up with ideas and not proving them. Fermat’s Last Theorem (probably his most famous), which proposes that no three positive integers a, b and c can satisfy the equation an + bn = cn where n is an integer greater than 2, is an example of this. In a note he wrote in a book, he teased that he had proof, but there was not enough space to write it. The note was discovered after his death. Whether he had proof or not we shall never know, but in 1994 Sir Andrew Wiles finally proved the theorem.

Pierre de Fermat, apparently in his later years.

Earlier in his life, Fermat came up with another theorem of similar circumstances. He stated that 2(2^n) + 1 is prime where n is any positive integer. He also demonstrated that this theorem was true up to n = 4.

It turns out that his theorem was, unfortunately, not true, as we will soon see. However, his theorem was slightly altered by an acquaintance of his, namely Marin Mersenne, who was also a French mathematician as well as a theologian and philosopher. Mersenne took inspiration from Fermat and from Euclid’s theorem on perfect numbers, and compiled a list of all values of p for which 2p – 1 was prime.

2, 3, 5, 7, 13, 17, 19, 31, 67, 127, 257

His list was actually rather flawed. It was later proven that 267 – 1 and 2257 – 1 are in fact composite. Mersenne also missed a few values of p that are prime: 61, 89 and 107. Nevertheless, it is impressive that he managed to create a list with only a few errors by hand. In fact, the largest prime calculated by hand is still 2127 – 1. Primes of the form 2p – 1 are called Mersenne primes in his honour.

Marin Mersenne, the man to which primes of the form 2^p – 1 are attributed.

Now Mersenne primes might not seem that remarkable, as 2p – 1 doesn’t provide us with any sort of pattern in which primes appear. While this is true, a wonderful rule is tied to these primes.

If 2p – 1 is prime, then p is prime.

It is the first glimpse we see of solid, uninterrupted proof of certain primes. Even today, almost all equations, expressions, conjectures, theorems etc. to do with primes are not true in every instance. It means that whenever a new Mersenne prime is discovered (which, by the way, are denoted as Mp), we can say without a doubt that p is also prime. As a little bonus, because 2p – 1 is taken directly from Euclid’s theorem that we saw earlier, a new perfect number is discovered every time a Mersenne prime is discovered. Unfortunately, the rule does not work the other way around; we can’t say that if p is prime, then 2p – 1 is prime. Substituting p with the first prime that is not in Mersenne’s list, which is 11, easily proves this.

211 – 1 = 2047

2047 = 23 x 89

If the rule did work both ways, we would have an infallible method of generating primes. Of course, mathematics won’t let us win that easily.

In the modern search for primes, we aren’t looking through every single positive integer in the hopes that we find a new prime. While primes bigger than any other are still rarely found, we search for Mersenne primes. We do this because we know that we only need to check Mersenne primes for which p is prime, giving us a much higher discovery rate than checking every integer, and making the search as efficient as is currently possible.

~

Around a century later, a new figure arose in the field of mathematics. His name was Leonhard Euler (pronounced oi-ler), a Swiss mathematician born in 1707. In summary, Euler tied up all the loose ends of what had already been discovered about primes, and contributed his own developments.

Firstly, Euler refuted Fermat’s theorem that 2(2^n) + 1 is always prime, simply by showing that when n = 5, the result is composite.

22^5 + 1 = 4292967297

4292967297 = 641 x 6700417

Next, Euler added his own proof to Euclid’s ancient theorem on perfect numbers (remember it?). Euler acknowledged Euclid’s proof, and thereafter proved that every even perfect number is of the form 2p-1(2p-1). The theorem is named the Euclid-Euler theorem in recognition of each of their contributions. It is still unknown whether any odd perfect numbers exist or not, but we are sure that none with less than 1500 digits exist.

Euler then moved on to his own ideas on how primes could be generated. He began looking into polynomials, and eventually came up with the first known prime generating polynomial.

f(n) = n2 + n + 41

For the first 39 integer values of n, f(n) is prime, and is also for most values shortly beyond n = 39. While this is evidently not a solid pattern, it was of great interest to many because of the density of the primes. It is certainly now seen as a step forward, not specifically in the search for primes, but in prime theory and the ‘regular irregularity’ of primes.

Leonhard Euler, who is the only mathematician so far to have two numbers named after him.

If you’re interested in mathematics and have delved even slightly into other areas, it is very likely you have already heard of Euler; if not, you will certainly come across him again. His works span across almost every area of mathematics, and cover an array of physics problems as well. He was a monumental mathematical exemplar, and few come close to equalling the volume of his contributions to the academic world.

I’ve allowed myself one paragraph to devote to my favourite mathematician, so I hope it convinces you to read a little more about his work (perhaps start with Euler’s identity). Let’s get back on track.

~

Further towards the end of the 18th Century, it became clear that if an exact formula for primes existed, it was well beyond their reach at the current time. A number of mathematicians found themselves taking a step back and searching for notable patterns in the distribution of prime numbers. The first significant step forward in this area was made in 1798 by yet another French mathematician by the name of Adrien-Marie Legendre.

I’m just going to take a second to explain briefly about logarithms, which are rather heavily involved in the next section. A logarithm is the reverse function of the exponent, or power, as you may know it.

If y = ax

Then loga(y) = x

For example, as 25 = 32, log2(32) = 5.

If a logarithm is written without a base (which is the number written in subscript), it means the logarithm is in base 10.

log(x) = log10(x)

Similarly, if a logarithm is written as ln, it means it has base e, which is approximately equal to 2.718. Among many other applications, on the graph of y = ex, the gradient is equal to the y-value at any point.

ln(x) = loge(x)

The graph of y = e^x .

Moving back to 1798, Legendre conjectured that the number of primes below a certain number, x, was approximately:

­­­­­

Coincidentally, a young German mathematician, Carl Friedrich Gauss, also conjectured this at around the same time. Fifteen-year-old Gauss, however, was not satisfied with the limited accuracy of this theorem. He set to work, and came up with an even more accurate formula for calculating the number of primes below x:

He called this formula Li(x), where the ‘Li’ stands for ‘Logarithmic integral’, which is exactly what it is. Now, sadly, there was no known way of telling exactly how accurate this formula was. It was more accurate than the original Legendre-Gauss conjecture with all the various values of x that Gauss tried, but mathematicians wanted to know just how inaccurate Li(x) could get compared to the actual number of primes below x.

A few decades later, in 1859, a German chap called Bernhard Riemann was fiddling about with a series that Euler had come up with, which is:

Which happens to be equal to:

Notice how all the bottom right numbers are the primes in order?

For most values of s, however, the sequence adds to infinity (provided it continues to infinity, which is what we’re assuming here). This isn’t considered interesting, and despite Euler’s best efforts, he was unable to make anything better.

Riemann decided he would continue from where Euler had left off, and try to make a sequence where any value of s would not total to infinity. He eventually came up with this, which is called the Riemann Zeta function.

This may look horrible, but don’t worry, you don’t need to understand it exactly (If you want to understand it, I suggest reading Thomas Wright’s ‘A Friendly Introduction to the Riemann Hypothesis’, which explains it well and is entertaining at the same time). What Riemann has done here is created a function where, for any value of s:

If Z(s) does not equal infinity, then ζ(s) will produce the same value.

If Z(s) does equal infinity, then ζ(s) will produce a non-infinite value.

This is only untrue for ζ(1), which is undefined.

Now, it so happens that for any negative even value of s, ζ(s) = 0. Riemann began looking at positive values of s for which ζ(s) also equals 0. He found several values, all of which were in the form of 0.5 + it, where i = √(-1) and t is any real number (that is, any number which is not the root of a negative number). He could not find any positive values of s that were not in that form. From this information, he formed the famous Riemann hypothesis:

If ζ(s) = 0 and s is not a negative even number, then s = 0.5 + it.

From what I’ve told you, it doesn’t look like the whole Riemann escapade links into prime numbers very much. Fear not, for this is where it all ties together.

Remember Li(x)? Well, it turns out that the Riemann hypothesis and Li go hand in hand. The following sentence is why Li and the Riemann hypothesis are considered to be so important.

If the Riemann hypothesis is true, then Li(x) never differs from the number of primes below x by more than (√x)ln(x).

Therefore, if the Riemann hypothesis is proven, we will always be able to tell (within a considerably small degree of accuracy) how many prime numbers there are below any number, and our understanding of primes will become incredibly more developed. It’s no wonder that the Riemann hypothesis has been placed on the list of Millennium Prize Problems, and anyone who can prove it will be awarded a prize of \$1,000,000.

~

I appreciate that the last section was rather confusing, so let’s move toward the end of our journey to something I imagine to be a little more pleasing. In 1963, a bored Stanislaw Ulam was sitting in a scientific meeting when he began to absent-mindedly draw all the natural numbers (1, 2, 3, 4, …) in a square spiral. He then circled all of the primes, and noticed that the majority of them seemed to line up in diagonals.

You can begin to see the diagonal ‘spokes’. Ulam took his discovery to the Los Alamos Scientific Laboratory, where he took advantage in recent developments in computing to use MANIAC II, a first-generation computer, to generate the same spiral he drew at a much larger scale.

There’s no doubt that Ulam must have been astounded at what he had discovered while doodling. Mathematicians had known that there was some regularity in primes, but to have visual proof is a step forward from having to trudge through equations to see the patterns.

Ulam published his findings soon afterwards, and other mathematicians began to analyse Ulam’s spiral. It wasn’t long before it was discovered that Euler’s prime generating polynomial function, n2 + n + 41, was represented by one of the lines. As computational capabilities were rapidly increasing, mathematicians were able to find more and more obscure polynomial functions, each of which represented a line on the spiral. Like Euler’s, none of the functions generated a prime for every value of n. Many of the functions did, however, produce a higher number of consecutive primes than Euler’s. The function with the largest number of consecutive primes to date is:

which gives 56 consecutive primes from n = 0 to n = 55.

~

This whole post I’ve been nattering on about primes and how these discoveries are so important. To mathematicians, yes, but how are prime numbers actually useful?

Until very recently, prime numbers were just a concept – a fascinating one, but ultimately rather useless in the real world. This would explain why up until the 18th Century, when recreational mathematics was on the rise, very few people devoted lots of their time to prime numbers. However, that all changed with the development of computers. In the 1970s a series of cryptographers theorised and eventually created an encryption system that didn’t require the entire process of encryption to be secret. This is now known as public-key cryptography. Prime numbers are at the heart of this encryption system, and with larger prime numbers comes greater security. The hunt for primes is on now more than it ever has been.

~

Before I finish, I’d briefly like to introduce a project called GIMPS, which stands for the Great Internet Mersenne Prime Search. The project began in 1995, and has since found 13 Mersenne primes out of the known 48. From the time the first prime was found by GIMPS, every Mersenne prime has also been found by GIMPS.

What makes this project so effective is that it utilises everyone who is willing to participate. By downloading and running a small program, a participant can ‘lend’ a fraction of their processing power to the overall search for primes. If you’d like to be a part of a search that has been going on since 1600BC, you can learn more about GIMPS here(There are cash prizes if you find a prime.)

Thank you for reading.

-Ollie James