Prime number generation

In continuation of the topic that began in this series of articles , today we will talk about the generation of primes. Generation is probabilistic and is guaranteed. So, our case is with a guarantee that allows you to use the numbers obtained in applications related to cryptography, which ensures, for example, the security of managing your money through banking applications. Further you will see how it is guaranteed to receive prime numbers of sufficient size for cryptography, as well as what problems can be if you neglect guarantees.

The general security scheme for data exchange is based on the complexity of factorizing a large composite number. Moreover, if there are many factors, then they will be small, and this greatly simplifies the hacking of the security system. Therefore, a large composite number is obtained by multiplying two prime numbers of a certain size. But here one problem is obvious - how to choose a sufficiently large prime number?

First, determine the size. In modern cryptography, the order of the composite number is determined by the formula

$2 ^ {2048}$ , that is, the actual order is 2048 in binary number systems. And the divisors of this composite number are of the order of 1024, that is, somewhere around the values

$2 ^ {1024}$ . Why 1024? Because ten years ago, a number of the order of 768 was factorized, and today it is very likely that the composite numbers of the order of 1024 are decomposed. So it was decided to double the order immediately, that is, to 2048 for reliability. Well, starting from this order is easy to determine order of necessary factors.

But what happens if the order of the factors is much less than 1024? Then, in an acceptable time, you can decompose a composite number, even if it is of the order of 2048. Therefore, it is necessary to guarantee that the factors chosen by us do have an order close to 1024 and are simple, that is, they are not factorized. But here the choice itself is carried out according to a probabilistic scheme, and this just leads to potential problems.

The probability of the simplicity of a number is checked based on the assumption that the number is “most likely” prime if its period (which is described here ) is a divisor of the number itself, minus one (

$N-1$ ) In the form of a formula, it looks like this:

$a ^ p \ pmod N = 1$

$N-1 \ pmod p = 0$

Here

$N$ - test number

$a$ - value from

$2$ before

$N-2$ (it determines the base of the number system in which the period is calculated),

$p$ - period of the investigated number. Operation

$mod$ here means taking the remainder of dividing the first operand by the second.

Existing checks rely on just such an approach, since for large numbers the unconditional determination of simplicity by known tests takes a very long time. But the minus of such a check is obvious - sometimes you can get a composite number, instead of a simple one. Moreover, no one knows what divisors will be part of such a number. More precisely, an attacker will be able to find out by applying well-known factorization methods that start working in an acceptable time if the factors are on the order of less than a few hundred bits.

How often is probabilistic simplicity error? Rarely enough. There is one mistake for several tens of thousands of simple ones, and sometimes two, but nothing guarantees us even three. In addition, much depends on the test used. So, for example, in the standard Java language library in the BigInteger class, a check is implemented that allows a miss 2-3 times per thousand simple ones. That is, you should not think that if in theory there are very few errors, then in practice everything will be in chocolate.

How is this dangerous? It seems to be not as scary as some may say, because well, if a site that closes browsing its pages with the https protocol receives an easily calculated key, then what will be the problem? Well, hackers will find out what news you read on this site, and what's the point? But in fact, encryption is not only done when browsing the web. A more unpleasant situation will happen if hackers find out the key of your favorite banking service for remotely managing your money. Although, again, it seems that in order to surely open the exchange with the bank (and if the bank uses a weak simplicity check), the hacker will have to wait about 500 years until the probability of issuing an easily calculated key is finally realized, because the keys usually have a validity period of 1 year and therefore, to guarantee hacking, you need to wait until 500 issues of a new key are carried out. Everything seems to be logical, but there is another “but” - there are much more banks than 500 in the world. Therefore, right now, perhaps in your favorite bank, hackers can gain access to your own money.

Are you scared? Surely not, because we are all very brave, until the roasted rooster pecks. But there is still something to be afraid of. Although the probability of a successful attack by hackers precisely on your bank is not so high, but nevertheless, if it is realized, then your money will probably not be found. Why? It’s very simple - the standard first assumption of the bank’s security service is that someone with the appropriate access rights is to blame. Not a hacker, whose likelihood of interference they also consider as minimal, namely someone of their own. Therefore, a hacker may well go unpunished.

How to fix this problem? You need to learn how to get guaranteed prime numbers. But how to guarantee their simplicity? There are four methods for this.

The first method is enumeration with a minimum of restrictions. This is usually a variant of the improved sieve of Eratosthenes. The method is reliable and guaranteed, but limited to very modest orders (less than a hundred). Therefore, this method is not applicable in cryptography.

The second method is enumeration with stronger restrictions. So, if the period of a number is equal to the number itself, minus one, then the number is guaranteed to be prime. The formula here is -

$N-1 = p$ . But to determine the equality of the period of the difference between the number and the unit, very heavy methods are used that do not work for all classes of numbers. Therefore, their use in cryptography rests either on the limited number of numbers of specific classes, or during verification, if we increase the size of the number. And besides, even numbers of a certain class by themselves do not guarantee anything, which means the need to sort out many candidate numbers. Here, for example, the Mersenne numbers, for which there is an unconditional test of simplicity, have already been sorted out and found out that in the range used in cryptography they are practically absent. Here is their entire list with close orders - 1279, 2203, 2281, 3217. As you can see, from 1024 to 2048 there is only one suitable number. But even if we take the rest, their number hints to us very clearly - it’s not worth it! So again we were out of luck with the verification method.

The third method is probabilistic. It is it that is used today, including in your favorite bank. How does he work? Very simple - the remainder of dividing an arbitrary number in a power is checked

$N-1$ on

$N$ if the remainder is one, it is likely that the number is prime. And here the word "probably" is the most unpleasant here. Although probabilistic simplicity verification methods also contain additional improvements, nevertheless, as was shown above, the standard Java library quite often makes mistakes.

And finally, the fourth method. It does not work as fast as probabilistic tests (although something can be optimized here as well), but it gives guaranteed primes, and even with an easily definable period. What is a prime period for, you ask? For example, to generate a pseudo-random or encryption sequence. More precisely, knowing the period, we know exactly how many numbers will be in the sequence, which makes it easy to plan how long we can use the sequence generator based on this number. True, this already applies to symmetric encryption, but it can also be useful in a number of cases. Next, we will consider a theory that guarantees the simplicity of checking candidate numbers.

Theoretical basis

To understand how to generate guaranteed prime numbers, you need to dive into a little math. But don't be alarmed, math is here at the fifth grade level.

For completeness, it is recommended to look here for details about what a period and series of residues are. Well, briefly say that a number of residues are formed by dividing the “corner” (a method known to any student) of the unit by the number chosen for research. At the same time, at each step, a difference appears between the higher number, with a zero added to it, and the product of the number under investigation by a value from 0 to 9 (for the decimal number system) - this is the remainder. But there are many steps in dividing by the “corner”, therefore there are also many residues, and together they form a series of residues, in which the same set of numbers is periodically repeated an infinite number of times. A sign of the beginning of a new cycle is a remainder equal to one. The amount of residual between units is the length of the period (or cycle). That, in fact, is all, but it’s also worthwhile to understand that such a method of constructing a series of balances can be applied using any number system, and in particular, a system with base 2 (and not 10, as is customary in school) will be most often used. When using other number systems, all the rules are preserved, but the number of product variants at each step will be different, equal to b (from base), that is, the base of the number system.

So, from the previously shown, we know that composite numbers always give relatively short periods that do not include a number of “forbidden” values. Knowing the factorization of a compound number, it is easy to calculate how many values are required to be absent in a number of residues forming the period of a given number. A series of residuals will not contain factors and all numbers that are multiples of these factors if the series itself is built on the basis of a number system that is not a multiple of any of the divisors (that is, the basis is not divided by divisors of the number). For example, if there are only two factors, then the number of multiple residuals is determined by the formula

$m = a + b-2$ where

$m$ - the number of multiple residues,

$a$ and

$b$ - factors forming our compound number. Knowing the number of “forbidden” residues, it is easy to calculate that a series of residues longer than half the value

$N-1$ will have a length equal to

$L = N-1- (a + b-2) = N-a-b + 1$ . That is, such a series will be on

$a + b-2$ shorter than the series that would turn out if the given number were prime. This explains why all numbers with a period equal to the full period (i.e.

$N-1$ ), turn out to be simple - if at least one element (which is “forbidden”) were excluded from the sequence of residues, then the period would become less

$N-1$ . As a result of the presence of such a regularity, we observe the operability of all those probability tests, which today verify the simplicity of a number. These tests calculate values in a series of balances per position.

$N-1$ or

$(N-1) / 2$ , and if the values are equal

$1$ or

$N-1$ , then it is highly likely that the number is prime. Why only with high probability, and not guaranteed? Because the period probabilistic tests are not calculated, but between positions

$N-1$ and

$(N-1) / 2$ there may be more units, which will mean the inequality of the period to the value

$N-1$ but if the period is not equal

$N-1$ , then the number may be composite. Moreover, the absence of equality in itself is not critical, another thing is important - pseudo-simple numbers can give such an arrangement of units. Among the numbers verified by such a test, you can find pseudo-simple ones that are composite and that help hackers who steal your money. After all, what are the divisors of such composite numbers? They may well be small enough for an attacker to easily reveal an encrypted data exchange.

Now remember why pseudo-prime numbers may appear at all. Such numbers have short periods that repeat many times within the full period.

$N-1$ therefore, they can sometimes “fit in” and fit in a whole number of times in a full period. So, for example, the number 25 has a period of 4 for the number system with base 7.

$N-1$ for 25 is 24, try to divide: 24/4 = 6. That is, this period is laid out an integer number of times in a full period. This fact can be verified by the above formula to shorten the period, but taking into account the fact that a and b are the same in this case. The maximum possible period will be 24-4 = 20, here 24 is the total number of residues, 4 is the number of residues that are multiples of 5. But the period is not always the maximum, and all other options can be obtained by dividing the maximum period completely. 20 is divided by 2,4,5,10, and just by dividing 20 by a number from the above list we get a period of 4 in length, which gives us at the end of the full period

$N-1$ unit. And when checking simplicity, just the values at the position are checked

$N-1$ , that is, the last value in the full period. And for 25 it is equal to 1, which is a sign of the simplicity of the number. Although in fact a clear sign of simplicity, this is when for all bases of number systems, less

$N$ , the last value in the full period is equal to one. But there is no way to check all the number systems for large numbers, so pseudo-simple numbers appear, which are sometimes used even in cryptography, which increases the vulnerability of our finances.

How to eliminate the influence of pseudo-prime numbers? In principle, it is simple - you can check the periods for all number systems, but then for this operation for large numbers there will not be enough time. Therefore, we will go the other way. And the path is also simple (in the literal sense - it uses other prime numbers). We have seen that short periods can fit into a full period an integer number of times, and this gives us pseudo-simple numbers. But what prevents us from taking and canceling these Mondays? Yes, in general, nothing prevents.

For short periods, the full period (

$N-1$ ) must be divided into many divisors. So for the number 25, the full period 24 is divided into 2, 3, 4, 6, 8, 12. There are so many possibilities for penetrating into the protected territory of pseudo-simple numbers. So we just need to take a prime as a full period, because it does not divide into anything at all and therefore automatically saves us from the enemy element. True, there is one “but” - we need prime numbers, and they are, as you know, odd (except for one exception - 2), then if

$N-1$ simple then itself

$N$ it cannot be simple anymore. Therefore, we have to admit the possibility for the appearance of incomplete periods. Why is this bad? As we saw above, the full period guarantees the simplicity of the number, but we have not proved yet the incomplete period. So we need to prove this moment.

The proof is pretty simple - for compound

$N$ period in length

$N-1$ two times, obtained in a number system that is not a multiple of the divisors of N, has only two rows of residues, and none of them should contain numbers that are multiples of the divisors of

$N$ ("Forbidden" numbers). If at least one element is excluded from one row of residues, then the second one automatically decreases by the same amount (after all, one row is equal to another multiplied by any number that is not in the first). So if there are divisors in the number

$N$ , we get a shorter total length of two periods with all “allowed” balances than the value of the full period and thereby move the unit from its place at the end of the full period, thus ensuring the absence of pseudo-simplicity. But can the number of multiple divisor remainders be such as to give exactly half or one third of the full period? After all, then we would receive, for example, two-thirds of the “allowed” balances (two rows) and one third of the “forbidden” ones, and in total the full period. But to get one third we need to ensure the divisibility of the full period (

$N-1$ ) by 3, which we can quite easily exclude - take a prime number, multiply it by two and add one to the result - voila, we are guaranteed to exclude pseudo-simplicity. With him, the number of divisors divisible by divisors cannot be equal to one third of the full period, because he cannot divide by three full periods. There remains the option with half the remainder that is a multiple of the divisors

$N$ . This option is eliminated a little more difficult, so there will be a slight digression below.

Calculation of the period, or all numbers - Mersenne's children

The period of any number can be calculated. In the simplest case, the calculation is done by simply dividing the corner

$1 / N$ until we meet a remainder equal to one (in the number system, not a multiple of the divisors

$N$ ) But for large numbers, this is unrealistically long. Therefore, there is a need to derive a formula for calculating the period. And such a formula exists, though not in perfect form, when we have a number at the input, and a period at the output. The formula is:

$N = \ frac {b ^ {m + n + 1} -1} {b ^ n + k}$

Here

$N$ - investigated number

$b$ - the base of the used number system (base),

$m$ - maximum degree

$b$ at which the result of exponentiation is less

$N$ (in other words, the senior level index in

$N$ in the number system

$b$ ),

$n$ - distance from left to right in the row of residues from the index

$m + 1$ (equal to the number of bits in

$N$ ) to one,

$k$ Is some integer coefficient.

Formula output

By the definition of the formula, it is clear that (1)

$b ^ {m + 1} -N = x$ , that is, the first degree

$b$ which is larger

$N$ , allows you to subtract

$N$ and get some difference in the form

$x$ . It also follows from the definition that (2)

$x * b ^ n-k * N = 1$ , here

$k$ Is an integer coefficient. If you multiply (1) by

$b ^ n$ and replace

$x * b ^ n$ to its value from (2) , which is equal to

$kN + 1$ then we get

$b ^ {m + n + 1} = N * b ^ n + kN + 1 = N * (b ^ n + k) +1 => N = \ frac {b ^ {m + n + 1} -1 } {b ^ n + k}$ .

Beneficial features

As we can see, using this formula you can get numbers with a known period. True, there is one difficulty - you need to select a coefficient

$k$ that for large numbers again turns into something unattainable. But the formula gives us something else - we clearly see how all numbers are formed. Yes, absolutely all positive integers can be obtained by this formula, because all numbers have some period. And interestingly, all numbers are obtained by dividing the Mersenne number by one of its divisors. Recall that a Mersenne number is a number that is equal to

$2^n-1$ .In the formula in the numerator, we see a generalization for Mersenne numbers with any base (including 2, of course). And if we are interested in primes, then we will not need any other bases besides 2.

Knowing that we divide the Mersenne number, it would be useful for us to be able to calculate the period of numbers precisely for the case of division of Mersenne numbers (recall the extended concept of period here ). But what is very remarkable is that the formula for calculating the Mersen period coincides with the formula for calculating a period of the type

$1/N$ . Well, to derive the formula for dividing the Mersenne numbers, the same principle is used as when deriving the formula for dividing

, with the exception of the formula for calculating the value in a series of residues with index

$1/N$

that looks like this -

$n$

, and for binary numbers we get

$\frac{b^{n+1}-1}{b-1}$

$2^{n+1}1$ .

Now we have all the cards on hand and we can start the game by opening the pseudo-simple archers in the ranks of primes.

As we know from previous series, when divisible, the whole Mersen period of a number must fit the number of digits of the Mersenne number an integer number of times. Therefore, in the formula above, a solution in integers is possible only if the denominator has a period either equal to the numerator or multiple times smaller. But a much shorter period will give us, including divisors of the number itself

, because if

$N$

divides the Mersenne number, then its divisors also divide the Mersenne number. And this is a very important point, because it follows from it that the lengths of the periods of the divisors N divide the length of the period

$N$

completely. That is, if we take

$N$

is such that its period is equal to the period of the numerator, then all the divisors

$N$

will be at the same time divisors of the Mersenne number, and therefore must fit into the period and

$N$

, and Mersenne numbers an integer number of times.

$N$

Again about half the period

Now recall that we stopped on the search for a way to prove that we can exclude the case when half the length of a series of residual numbers is a multiple of its divisors. We wanted to prove this to eliminate pseudo-prime numbers when choosing one prime number as the basis for constructing the period of the next prime number. So now we know that if the constructed number has divisors, then their periods always divide the period of the constructed number completely. Then it remains for us to check which divisors can fit into the previously given restrictions on the constructed number. And such restrictions - his period is divided only into

and on

$2$

$(N-1)/2$ . Therefore, only numbers with periods can be divisors

$2$ and

$(N-1)/2$ . Period

has a number

$2$

, no other number of such a period has more.

$2$ Period

does not fit an integer number of times in

$2$

, because

$(N-1)/2$

is prime, therefore divisibility into three is excluded for such a period. So there is only one possibility left - the dividers have a period

$(N-1)/2$

$(N-1)/2$ . But the number cannot be less than or equal to its period, so the minimum value for the divisors is

$(N-1)/2+1$ . If we multiply two minimal divisors, we get -

, this value must be no more

$(N-1)^2/4+(N-1)+1=(N^2-2N+1+4N)/4=(N^2+2N+1)/4$

, because we are talking about dividers

$N$

$N$ . Then we obtain the inequality

$(N^2+2N+1)/4\leq N => N^2-2N+1\leq 0$ . This inequality can be negative or equal to zero only for

, therefore, for all numbers constructed in this way, pseudo-simplicity is excluded if the resulting number is greater than

$N=1$

$1$ .Well, for smaller numbers, we can test simplicity even in the mind.

Now there are two options - the new number is a prime number, which is what we needed, or this is a composite number and then its period does not fit an integer number of times in a full period, and therefore this number can be easily checked for simplicity by calculating the last value in a full series of residuals. That is, the constructed number can be easily checked with standard probabilistic simplicity tests, and what is important, the test result will not be probabilistic, but guaranteed. So, in the end, we freed ourselves from the curse of pseudo-simplicity, which puts pressure on all probabilistic tests.

But of course, life would be honey if all the problems were solved so simply. Eliminating pseudo-simplicity, we did not exclude numbers that are not hidden from anyone and are not masked under anything. And there is one more problem with them - for now, we can check an arbitrary term in a sequence of residues only by raising to a power and then taking the remainder. But this method is very slow. More precisely, it is fast enough for the numbers used in cryptography, but it is not suitable for finding large primes at home, because it requires many years of computing a regular home computer, which does not allow us to get 400k $ (as shown here ).

Nevertheless, almost everything is ready for computing cryptographic primes. Although our old friend remains - maximalism. He asks - since you can use a period

, what prevents the use of periods

$(N-1)/2$

, etc.? And it turns out that with due care - nothing prevents! In the same way as with the period

$(N-1)/4, (N-1)/6, (N-1)/8$

, for the period

$(N-1)/2$

we are able to set a lower bound by multiplying a prime by 4 and adding 1. Then we guarantee ourselves from pseudo-simple with periods less than

$(N-1)/4$

$(N-1)/4$ .So it remains to consider the possibility of the appearance of pseudo-simple with a period multiple of 4, 3, 2 (1 for the components is excluded, as shown above). From the formula for calculating the number by the period it is clear that the period of the dividend must be equal to the least common multiple of its divisors, which implies that the possible period of the number

must not only fit integer times in

$N$

(otherwise there will be no pseudo-simplicity), but also contain an integer number of periods of dividers. Thus, the number of possible divisors of a pseudo-prime number is sharply limited. Since the period of any number cannot be longer

$N-1$

, then for a possible divider

$N-1$

, giving as a result of division 3, the period cannot be longer

$N$

, in addition, we take into account that the period of 3 is 2.

$N/3-1$

must be odd because it is odd

$N/3$

$N$ . From the above it follows that the even period N / 3-1 is the smallest common multiple with a period 2 of number 3, which means that the possible period of a potentially pseudo-simple N should be equal to the period of the number

$N/3$ . In total there is one value of the period

, for which pseudo-simplicity is possible, is

$N$

$(N-1)/4$ . For other values, either

too small, or its period (and with it the period

$N/3$

) will go to the forbidden area below

$N$

$(N-1)/4$ . The same story with odd numbers, less

, but their periods cannot be longer

$N/3$

, and therefore all of them are simply excluded from consideration. Now show that

$(N-1)/4$

cannot have a period

$N/3$

, which means that he cannot divide the entire period entirely. First, recall the formula

$(N-1)/4$

, which gives us the number of multiples of the remainder divisors for a number divisible only by

$m=a+b-2$

$a$ and

$b$ . In our case

is supposed to be divisible by

$N$

and by 3, all other dividers are excluded, so we get -

$N/3$

$m=N/3+3-2=N/3+1$ . Now from the full period you need to subtract the “forbidden” values, which will be

, and then divide by 4. We get the possible period of the work

$N/3+1$

$3*N/3$ :

, which is less

$p=(N-1-N/3-1)/4=N/6-1/2$

, i.e. a period of length

$(N-1)/4$

impossible due to the need to take into account the “forbidden” residues, which takes all possible pseudo-simple periods to the forbidden region (less

$(N-1)/4$

$(N-1)/4$ )So this situation guarantees us that the constructed number is not pseudo-simple, and therefore we can again be sure of the results of the next probabilistic simplicity test.

Similarly, and taking into account the divisibility of the full period, one can obtain the same results for other values. But if we want to get cryptographic primes this way, then multiplying by 2,4,6 we will increase the size of the number for a very long time, so it makes sense to look in the direction of another option - multiplication of two primes. If we multiply one prime by another, we get an odd number, so be sure to multiply by another 2 to get an even full period and an odd candidate number in prime. The full period in this case will be divided into

$2, a, b, 2a, 2b, ab$ where

$a$ and

are prime numbers. Now we need to prove that either the indicated periods will not give pseudo-simplicity, or to find signs warning us about the presence of pseudo-simplicity. Just note that if the period is equal to

$b$

, then the number will be prime (as shown above). It is also shown above that a half-period number cannot be pseudo-simple, therefore a period of length

$2*a*b$

can be excluded. Although for completeness, you can check the period

$ab$

alternative methods. So if the period is

$ab$

, then considering that

$ab$

$a$ and

simple, it becomes clear that the least common multiple of divisor periods

$b$

can only be equal

$N$

, while the periods of the dividers can be equal to either

$ab$

for both, or one

$ab$

and the other

$a$

$b$ . Since the period is always less than the number itself, then

will obviously be larger

$(ab+x)*(ab+y)$

$2ab+1$ .So pseudo-simplicity is not possible with such a period. Therefore, it remains to check the periods

$2, a, b, 2a, 2b$ .2 is less than the minimum possible period period for all numbers, more than 3, therefore we exclude such a period. Now suppose the constructed number is divided by

$m$ and

, then with the equality of the period

$n$

value

$N$

, their periods will also be equal

$a$

, because this is the only least common multiple of two numbers, because

$a$

not divisible by anything. It follows that

$a$

$(a+x)*(a+y)=N=ab*2+1$ where

- the first possible factor with a period

$a+x=m$

and

$a$

is the second.

$a+y=n$ Further:

$a^2+ax+ay+xy=2ab+1$

$=>$

$a^2+ax+ay+(ka+r)=2ab+1$

$=>$

$a+x+y+k=(2ab+1-r)/a$ . Here

can be equal to only one, otherwise the integer will not work on the left.

$r$ Then:

, which implies that if

$x+y+k=2b-a$

, then pseudo-simple with a period

$\geq 2b$

cannot be. It remains to verify pseudo-simplicity at

$a$

, which can be done by checking the values in the series of residuals at the position

$a<2b$

, if there is 1, then such a number can be pseudo-simple and should be excluded from further operations. Reasoning for

$a$

completely analogous, which means that it is necessary to verify the equality of the unit and its remainder, provided

$b$

$b<2a$ .

For the period

$2a$ we have:

$4a^2+2ax+2ay+xy=2ab+1$

$=>$

$4a^2+2ax+2ay+(ka+r)=2ab+1$

$=>$

$4a+2x+2y+k=(2ab+1-r)/a$

$=>$

$2x+2y+k=2b-4a$ . We get that for

(or equivalently -

$4a \geq 2b$

) again, there can be no simplicity, but if

$2a \geq b$

, then you need to check the balance at the position

$2a<b$

$2a$ . Similarly for

- check if

$b$

$2b<a$ . In total, we obtain for

$a$ and for

need to check only for line items

$b$

$2a$ and

because periods

$2b$

$a$ and

will give a repeat just in position

$b$

$2a$ and

$2b$ . Verification is performed only under the above conditions, which will almost double the process when checking large values

$a$ or

, because for them it is mandatory

$b$

$a \geq 2b$ with the scheme of generation simple below, and lower values will be checked very quickly because of their small size.

Thus, the theoretical foundations have been shown above to be able to generate guaranteed prime numbers for critical areas such as cryptography.

In addition, a way is opened for checking the simplicity of an arbitrary number, provided that the divisors of its full period are found (

$N-1$ ) So, for primes <126,000,000, the number of “multiplied primes” belonging to the class is 676625, with the total number of primes 7163812, which gives us a little less than 9.5%. For primes up to a billion, we have 3.49% belonging to the class 2p + 1, 1.8116% from the class 4p + 1, 2.4709% from the class 6p + 1, and only 7.7746%, where p is a prime number. The truth should be borne in mind that the decomposition of the full period of large numbers is very difficult. In this case, you can offer a recursive check, which will slightly increase the size of the numbers available for checking, but at the same time greatly reduce the proportion of numbers passing such a check, because if the coefficient determining the belonging to the classes of recursively prime numbers is taken equal to 0.2, then already in the second check we will have only 0.04, or 4% of the total number of primes. Nevertheless, in some cases, this approach can be beneficial.

Practical results

The generator itself, of course, was written and tested, since there is minimal complexity. During the check, it turned out that the following algorithm would be fully functional:

We get, for example, 1000 initial primes, which can be generated using the sieve of Eratosthenes or simply downloaded from here . Then we multiply each value with each remaining and do not forget to multiply by two, and then add one. The resulting candidate is often divided by 3, because primes have a specific feature similar to the presence of a charge on particles in physics. Simple with the same “charge” repel, and with different - attract. In this case, the “charge” is the remainder of dividing by 3. That is, if the remainder of dividing by 3 is the same for both primes, the new prime will not work, because it will be divided by 3. And if the remainder is different, then we can get the right one candidate to simple. Moreover, all simple ones obtained by multiplication are “synchronized”, that is, they receive the same remainder equal to 2. Therefore, it is useless to multiply them by themselves again. So, to get new primes, we need to filter out that part of the initial 1000 for which the modulus of the triple (the remainder of dividing by 3) is 1. Thus, after the first multiplication of all with everyone, we get grown up in the amount of numbers that are already pointless to multiply with each other, therefore, to further increase the size (to that used in cryptography), we must again and again multiply the obtained primes by the previously selected “oppositely charged” numbers. After multiplying and adding a unit, we carry out checks according to the criterion for the possibility of pseudo-simplicity and if the criterion is met, then we check the remainder at position 2a for each candidate. If there is 1, then the candidate is rejected. Next, each candidate is checked by the usual probabilistic test, that is, the value is calculated in the series of residuals at the position

$(N-1) / 2$ if it is 1 or

$N-1$ , then in front of us is a guaranteed prime number.

When performing generation, it should be borne in mind that at each iteration, an increase in the number of generated primes is obtained, that is, the multiplication coefficient for 1000 initial primes is much more than unity. Therefore, to obtain cryptographic primes, it is necessary to limit the generation, otherwise it will take a very long time, and there may not be enough memory. At the same time, you should not reduce the initial set of simple ones, because the generation should be as random as possible, so that, knowing its algorithm, the attacker would not predict the resulting values. For this, it is necessary to implement a mechanism for cutting off the branches of the generation tree, which allows each time to obtain unique simple ones located far enough from the previously generated ones. And of course, cutting off the excess significantly speeds up the process.

Test execution time depends on the number of candidates generated. Each of the candidates passes a probability test, which slows down the generation to the greatest extent. In test runs to get a few hundred simple in range

$2 ^ {900}$ -

$2 ^ {1200}$ 5 to 20 minutes were spent. This time difference is explained by various ways that the algorithm goes through in the generation tree. So, initially, generation is limited to about 10 numbers, but as the cryptographic size approaches, generation fades due to a significant reduction in the multiplication coefficient with increasing number. Therefore, it is necessary to increase the number of intermediate primes, but it is difficult to say how specific it is, and therefore, a simple increase in the permissible amount by a factor of two is used to limit it with an increase in the number size and an excess of a roughly set threshold. As a result, within the limits of the limitations, the number of new simple ones can float and thereby make a significant difference in the total generation time.

The following is the text of a Java procedure that generates prime numbers. Naturally, it does not satisfy cryptographic requirements that go far beyond the program code and mainly relate to organizational issues. In the purely software part, the procedure does not implement the mechanism for trimming the branches of the generation tree, with the exception of the simplest restriction. Nevertheless, as an example of the implementation of the generation algorithm, the program can help with something.

Input parameters is the initial list of simple and PrintWriter for saving the result to a file. After the algorithm is completed, the file will contain all the products of the generated primes with those initial numbers that have a three module equal to one. The result can be checked using online services that implement factorization of numbers, but it should be understood that they can use a probabilistic test for simplicity before factoring, and therefore they become useless to verify the proposed approach (after all, all numbers are already checked by a probabilistic test). But a number of them immediately begin to factor the proposed number into factors; such sites can instill some additional confidence in the correctness of the proposed method.

public static void generatePrimes(List<Integer> primes, PrintWriter pw) { List<BigInteger> mod3_1 = new ArrayList<BigInteger>(); List<BigInteger> l = new ArrayList<BigInteger>(); BigInteger three=BigInteger.valueOf(3), two=BigInteger.valueOf(2); for (int k=0;k<primes.size()-1;k++) { BigInteger seed=BigInteger.valueOf(primes.get(k)); BigInteger s2=seed.shiftLeft(1); BigInteger r0=seed.remainder(three); if (r0.intValue()==1) mod3_1.add(seed); for (int i=k+1;i<primes.size();i++) { BigInteger p=BigInteger.valueOf(primes.get(i)); BigInteger r=p.remainder(three); if (r.intValue()==r0.intValue()) continue; // divisible by 3 else addIfPrime(p,seed,s2,two,l); } } List<BigInteger> ps=l; do { System.out.println("found "+l.size()+", bits="+l.get(0).bitLength()); l=new ArrayList<BigInteger>(); for (int k=0;k<ps.size();k++) { BigInteger seed=ps.get(k); BigInteger s2=seed.shiftLeft(1); for (int i=0;i<mod3_1.size();i++) addIfPrime(mod3_1.get(i),seed,s2,two,l); int n=100000; // change following to adjust for required bit count if (l.size()>0) if (l.get(0).bitLength()<700) n=10; else if (l.get(0).bitLength()<800) n=20; else if (l.get(0).bitLength()<900) n=40; if (l.size()>n) break; } ps=l; for (int i=0;i<l.size();i++) pw.println(l.get(i)); pw.flush(); } while (l.size()>0); System.out.println("Done"); } private static void addIfPrime(BigInteger a, BigInteger b, BigInteger b2, BigInteger two, List<BigInteger> l) { BigInteger a2=a.shiftLeft(1), fp=b.multiply(a2), n=fp.add(BigInteger.ONE); BigInteger r=null; if (a2.compareTo(b)<0) r=two.modPow(a2,n); // 2a<b else if (a.compareTo(b2)<0) r=two.modPow(a,n); // a<2b if (r!=null && r.compareTo(BigInteger.ONE)==0) return; r=null; if (b2.compareTo(a)<0) r=two.modPow(b2,n); // 2b<a else if (b.compareTo(a2)<0) r=two.modPow(b,n); // b<2a if (r!=null && r.compareTo(BigInteger.ONE)==0) return; r=two.modPow(fp.shiftRight(1),n); if (r.compareTo(BigInteger.ONE)!=0) return; l.add(n); }

Well, now that you (well, almost) know everything about generating prime numbers, perhaps your interests will go beyond cryptography alone and it will become interesting for you to study number theory, since, as mentioned above, it is available for fifth grade students, but at the same time it also allows you to independently find true diamonds without waiting for help from serious mathematicians.

All Articles

Prime number generation

Theoretical basis

Calculation of the period, or all numbers - Mersenne's children

Formula output

Beneficial features

Again about half the period

Practical results

More articles: