Logo
Logo
Innovation­game

A practical method for predicting any sequence of prime numbers

April 2012


Abstract

Generations of mathematicians have looked for patterns among primes.  If such patterns exist, it would become simple to predict the next prime in a sequence without any knowledge of any of the preceding primes.  This method arises from the simple observation that the first two prime numbers are 2 and 3.  Based on this, all prime numbers greater than 3 must be 1(Mod 6).  It is based on what we have termed the P Number Cayley table.  The method constructs a sequence of prime numbers, Pn, within a specified range starting with the lowest value of n, by taking each value of n in turn and checking to see whether it appears in the Cayley Table.  There are patterns in the rows of the Cayley table that allow the calculations to be simplified to the extent that calculations can be made with the minimum of CPU time.

Preamble

Generations of mathematicians have looked for patterns among primes.  If such patterns exist, it would become simple to predict the next prime in a sequence without any knowledge of any of the preceding primes.  Although it has proved impossible to find a pattern among primes, the author has previously identified an algorithm based on patterns in prime candidates [1], but did not develop the ideas further at the time.  Using this method, it is possible to predict any prime, such as the smallest prime greater than 1013, without reference to any other prime.

Outline of the Basic Method

This method arises from the simple observation that the first two prime numbers are 2 and 3. Based on this, all prime numbers greater than 3 must be 1(Mod 6).  All candidate primes greater than 3 can be designated Pn, where the index, n, is the P–number of the prime candidate.  Thus P1 = 5, P2 = 7, P3 = 11, P44 = 13, P5 = 17, etc.  Where n is odd, Pn = -1 Mod(6) and where n is even, Pn = 1 Mod(6).

Now where n is odd Pn = 3n + 2 [1]
And where n is even Pn = 3n + 1 [2]
Conversely n = Pn Div(3) [3]

Now the product of every pair of prime candidates is another prime candidate, as shown in [4]:

For i, j > 0 (6i + 1)(6j + 1) = 36ij + 6i + 6j + 1 [4a]
For i, j < 0 (6i - 1)(6j - 1) = 36ij - 6i - 6j + 1 [4b]
For i<0<j (6i - 1)(6j + 1) = 36ij + 6i - 6j - 1 [4c]
For j<0<i (6i + 1)(6j - 1) = 36ij - 6i + 6j - 1 [4c]

In all three cases the product can be expressed as 6c±1.  The method is based on what we have termed the P–Number Cayley table shown in Table 1.

  1 2 3 4 5 6 7 8 9 10
1 8 11 18 21 28 31 38 41 49 51
2 11 16 25 30 39 44 53 58 67 72
3 18 25 40 47 62 69 84 91 106 113
4 21 30 47 56 73 82 99 108 125 134
5 28 39 62 73 96 107 130 141 164 175
6 31 44 69 82 107 120 145 158 183 196
7 38 53 84 99 130 145 176 191 222 237
8 41 58 91 108 141 158 191 208 241 258
9 48 67 106 125 164 183 222 241 280 299
10 51 72 113 134 175 196 237 258 299 320

Table 1: P-Number Cayley Table.  The numbers of the rows and columns, respectively i and j, are the P–Numbers of prime candidates to be multiplied together.  The table entries, r(i,j), are such that Pr = PiPj.

It is immediately obvious that the table is reflected around the leading diagonal, which means that we need only concern ourselves with the entries above the leading diagonal.  The method constructs a sequence of prime numbers, Pn, within a specified range starting with the lowest value of n, by taking each value of n in turn and checking to see whether it appears in the Cayley Table.  This starts by proceeding along the leading diagonal until a value of r(j,j) is found that is not less than n.  The value or r(j,j) is stored for use with the next value of n. All the columns to the left of the diagonal contain values less than n so can be neglected.

The process then travels up the column until a value of n(i,j) is found that is not greater than n, so that all the values of above it are less than n.  The process then proceeds along the row until a value of r(i,j) is found that is not less than n, so that all the values of above it are greater than n.  The sequence of column and row is repeated until there are no remaining values to check in the top row.  As soon as a vaue of r is found that matches n, the process stops and that value of n is discarded.  If no match is found, Pn is prime.  The process for the next value of n begins at the diagonal stored during the calculation for the previous value of n.

Detail of the Method

Calculating the value of the next diagonal

Because this method predicts a series of prime numbers, it is easier to calculate the value of the leading diagonal from the preceding one, rather than making the whole calculation again.  Given that a diagonal is assigned the index r(j,j):

If j is even P(j) = 3j + 1 [5]
so P(j)2 = 9j2 + 6j + 1 [6]
giving r(j,j) = 3j2 + 2j [7]
Also P(j+1) = 3(j + 1) + 2 [8]
so P(j+1)2 = 9j2 + 30j + 25 [9]
giving r(j+1,j+1) = 3j2 + 10j + 8 [10]
From [7] and [10] r(j+1,j+1) - r(j,j) = 3j2 + 10j + 8 - (3j2 + 2j) [11]
So r(j+1,j+1) - r(j,j) = 8(j+1) [12]

If j is odd P(j) = 3j + 2 [13]
so P(j)2 = 9j2 + 12j + 4 [14]
giving r(j,j) = 3j2 + 4j + 1 [15]
Also P(j+1) = 3(j + 1) + 1 [16]
so P(j+1)2 = 9j2 + 24j + 16 [17]
giving r(j+1,j+1) = 3j2 + 8j + 5 [18]
From [14] and [17] r(j+1,j+1) - r(j,j) = 3j2 + 8j + 5 - (3j2 + 4j + 1) [19]
So r(j+1,j+1) - r(j,j) = 4(j+1) [20]

It is worth noting that [11] and [18] demonstrate by induction that the leading diagonal is always divisible by 8, given that r(0,0) = 0 (for the sake of completeness P0 has been assumed to be 1).

Calculating the row decrements

To calculate the row decrements, we need to look at odd numbered columns and even numbered columns separately and also the decrements from odd rows to even rows and vice versa.

For an even column j, Pj =3j + 1. For an even row, i, Pi = 3i + 1.

So PiPj = (3i + 1)(3j + 1) [21]
Giving PiPj = 9ij + 3i + 3j + 1 [22]
Hence r(i,j) = 3ij + i + j [23]

For the preceding odd row, i - 1, P(i-1) = 3i - 1

So P(i-1)Pj = (3i - 1)(3j + 1) [24]
Giving P(i-1)Pj = 9ij + 3i - 3j - 1 [25]
Or P(i-1)Pj = 9ij + 3(i - 1) - 3j + 2 [26]
Hence r(i-1,j) = 3ij + i - j - 1 [27]

Thus, from [21] and [25], the decrement is given by

r(i,j) - r(i-1,j) = 3ij + i + j - (3ij + i - j - 1) [28]
So r(i,j) - r(i-1,j) = 2j + 1 [29]

For the further preceding row, i-2, P(i-2) = 3(i - 2) + 1

So P(i-2)Pj = (3i - 5) (3j + 1) [30]
Giving P(i-2)Pj = 9ij + 3i - 15j - 5 [31]
Or P(i-2)Pj = 9ij + 3(i - 2) - 15j + 1 [32]
Hence r(i-2,j) = 3ij + i - 5j - 2 [33]

Thus, from [27] and [33], the decrement is given by

r(i-1,j) - r(i-2,j) = 3ij + i - j -1 - (3ij + i - 5j - 2) [34]
So r(i-1,j) - r(i-2,j) = 4j + 1 [35]

For an odd column j, Pj =3j + 2. For an even row, i, Pi = 3i + 1.

So PiPj = (3i + 1)(3j + 2) [36]
Giving PiPj = 9ij + 6i + 3j + 2 [37]
Hence r(i,j) = 3ij + 2i + j [38]

For the preceding odd row, i - 1, P(i-1) = 3i - 1

So P(i-1)Pj = (3i - 1)(3j + 2) [39]
Giving P(i-1)Pj = 9ij + 6i - 3j - 2 [40]
Hence r(i-1,j) = 3ij + 2i - j -1 [41]

Thus, from [38] and [41], the decrement is given by

r(i,j) - r(i-1,j) = 3ij + 2i + j - (3ij + 2i - j -1) [42]
r(i,j) - r(i-1,j) = 2j + 1 [43]

For the further preceding row, i-2, P(i-2) = 3(i - 2) +1

So P(i-2)Pj = (3i - 5)(3j + 2) [44]
Giving P(i-2)Pj = 9ij + 6i - 15j - 10 [45]
Or P(i-2)Pj = 9ij + 6(i -2) - 15j + 2 [46]
Hence r(i-2,j) = 3ij + 2i - 5j - 4 [47]

Thus, from [41] and [47], the decrement is given by

r(i-1,j) - r(i-2,j) = 3ij + 2i - j -1 - (3ij + 2i - 5j - 4) [48]
r(i-1,j) - r(i-2,j) = 4j + 3 [49]

Because the matrix is symmetrical, reflected about the leading diagonal, proceeding along the rows above the leading diagonal is identical to proceeding down the columns below the leading diagonal (with i and j reversed) and so no further analysis of this is necessary.

Implementation

To implement the algorithm it is necessary to define the minimum and maximum range of interest. Let S be the lowest value allowed in the sequence.  The baseline P–Number, B, is given by:

B = S(Div 3) [50]

However, because S may or may not be a prime candidate, we need to find out where it lies in terms of Modulo 6.  So we assign it an M value

M = S(Mod 6) [51]

If M = 0 then the first prime candidate is S + 1 and if M = 1 then the first prime candidate is S, so B is the correct choice for n where Pn is the first of the prime candidates to be checked.  However, if M = 2, the first prime candidate is S + 3, but B is the same as for the two previous cases, so n = B + 1.  For M = 3, M = 4 and M = 5, B has been increased by 1 so, once again, n = B.

Now consider that the highest value allowed in the sequence is to be less than F.  The baseline P Number, B, is given by:

B = F(Div 3) [52]

However, again because F may or may not be a prime candidate, we need to find out where it lies in terms of Modulo 6.  So we assign it an M value

M = F(Mod 6) [53]

If M= 0 then the last prime candidate is F - 1 and if M = 1 then the last prime candidate is F - 2, so B - 1 is the correct choice for n, where Pn is the last prime candidate to be checked.  However, if M = 2, the last prime candidate is F - 1, but B is the same as for the two previous cases so n = B.  For M = 3, M = 4 and M = 5, B has been increased by 1 so, once again, n = B - 1.  Table 2 illustrates the choice of n(S) and n(F).

S or F 96 97 98 99 100 101
B 32 32 32 33 33 33
M 0 1 2 3 4 5
n(S) 32 32 33 33 33 33
Ps 97 97 101 101 101 101
n(F) 31 31 32 32 32 32
Pf 95 95 97 97 97 97

Table 2: The calculation of n(S) and n(F).  Ps and Pf are respectively the first and last prime candidates in a sequence.

Limitations of the Basic Method

The basic method was tested on a Dell computer using an Intel Core 2 Quad CPU Q8300 running at 2.5 GHz with 4 GB RAM, solving for all prime numbers in the first hundred numbers starting at each power of 10.  The computer was running other real time applications, including an Email application and a weather station, plus other interactive applications.

For numbers up to about 1010, the basic method produced results in a fraction of a second.  However for numbers in the region of 1013, the time extended to more than three hours and, for 1014, the time required was more than one day.  To understand the reason for this, it is necessary to consider Equations [29], [35], [43] and [49].  Where Pn is in the region of 1012, the row decrement is of the order 106.  However, as the calculations approach the upper rows of the Cayley table, the column increment is relatively small.  So the number of calculations required to proceed along a row run to many powers of 10.  Clearly what is required is a method using a single calculation for each row.  Fortunately there is a clear pattern in the Cayley table which makes this possible.

The Row Patterns

The visual evidence the row patterns is clearly seen in row 1, where alternate columns are 1 Mod(10) and (-2) Mod(10).  To explore this further, we need to consider the odd and even values of r(i,j) along each row.  For an even row and an even column we have:

r(i,j) = 3ij + i + j [23]
Which reduces to r(i,j) = j(3i + 1) + i [54]

Remembering the j is even, we can write j =2m for some m.  Also,for i even, (3i+1) = Pi:

So [54] becomes r(i,j) = 2m.Pi + i [55]
In other words r(i,j) = (i) Mod(2Pi) [56]

For an even row and an odd column we have:

r(i,j) = 3ij + 2i + j [38]
Which reduces to r(i,j) = j(3i + 1) + 2i [57]

Remembering the j is odd, we can write j =2m - 1 for some m.  Also, for i even, (3i+1) = Pi:

So [57] becomes r(i,j) = 2m.Pi + 2i - (3i + 1) [58]
Which simplifies to r(i,j) = 2m.Pi - (i + 1) [59]
In other words r(i,j) = -(i + i) Mod(2Pi) [60]

For an odd row and an even column we have:

P(i,j) = (3i + 2)(3j + 1) [61]
Or P(i,j) = 9ij +3i + 6j + 2 [62]
So r(i,j) = 3ij + i + 2j [63]
Which reduces to r(i,j) = j(3i + 2) + i [64]

Remembering the j is even, we can write j =2m for some m.  Also, for i odd, (3i+2) = Pi:

So [64] becomes r(i,j) = 2m.Pi + i [65]
In other words r(i,j) = (i) Mod(2Pi) [66]

For an odd row and an odd column we have:

r(i,j) = (3i + 2)(3j + 2) [67]
Or r(i,j) = 9ij +6i + 6j + 4 [68]
So r(i,j) = 3ij + 2i + 2j + 1 [69]
Which reduces to r(i,j) = j(3i + 2) + 2i + 1 [70]

Remembering the j is odd, we can write j =2m - 1 for some m.  Also,for i odd, (3i+2) = Pi:

So [70] becomes r(i,j) = 2m.Pi + 2i - (3i + 2) + 1 [71]
Which simplifies to r(i,j) = 2m.Pi + - (i + 1) [72]
In other words r(i,j) = -(i + i) Mod(2Pi) [73]

From [56], [60], [66] and [73] it can be seen that, for each row, i, it is only necessary to check whether n = (i) Mod(2Pi) or n = -(i + i) Mod(2Pi). This means that the number of calculation steps becomes greatly reduced to j - 1, where the starting diagonal is (j,j).

Implementation of Modified Algorithm

The modified algorithm was run for the first 1000 numbers above each power of 10 starting at 100, then from 103 to 1018. The final run took less than 5 minutes. It was not possible to run 1019 because it is beyond the range of 263, the limit of the computer used. The largest positive integer accepted by the system used for the test was 9223372036854775807, or 263 - 1, which is not a prime. The largest prime found below this value was 9223372036854775783.

Corollary

Because the method uses P numbers, it is possible to investigate prime candidates where Pn is greater than 263, but with an index, n, less than 263.  In order to do this, the diagonal calculation has to be modified because original method calculates the smallest diagonal greater than the n under investigation.  Where n is close to the limit of 263 - 1, the diagonal would be calculated as a negative integer, because numbers greater than 263 - 1 represent negative numbers within the computer's integer handling routines.  Therefore the difference between Pn and the previous value of diagonal is compared with the increment between the preceding diagonal and the required diagonal.  The increment can be calculated from [12] or [20].  If the difference is less than the increment then the next diagonal would be larger than n and the calculation can proceed.

In other words, if n - r(j,j) < r(j+1,j+1) - r(j,j) [74]

The index of the largest prime found was 9223372036854775799, which is the P–number of the prime 27670116110564327399 (~2.767 X 1019).  This is almost three times greater than 263.

References

[1] L D Howe, The P-No Algorithm, M500 95, 1986