Prime Number Search Algorithms

"The largest known prime number is $2^{32582657} - 1$. And I am proud to say that I have memorized all of its digits... in binary."

Carl Pomerance



A natural number is called prime if it has exactly two distinct divisors: one and itself. The problem of finding prime numbers has occupied mathematicians for a very long time. For centuries it had no direct practical application, but everything changed with the advent of public-key cryptography. This article covers several ways to search for prime numbers, some of purely academic interest and some used today in cryptography.



Sieve of Eratosthenes



The sieve of Eratosthenes is an algorithm proposed by the ancient Greek mathematician Eratosthenes. It finds all prime numbers less than a given number $n$. The essence of the method is as follows. Take the set of numbers from 2 to $n$ and cross out all numbers divisible by 2, except 2 itself. Then move on to the next number that has not been crossed out, 3, and again cross out everything divisible by 3. Proceed to the next remaining number, 5, and so on, until we reach $n$. After performing these steps, only prime numbers remain in the original list.



The algorithm can be slightly optimized. Since every composite number $n$ necessarily has a divisor $\leqslant \sqrt{n}$, the algorithm can be stopped once the multiples of all numbers up to $\sqrt{n}$ have been crossed out.
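To make the description concrete, here is a minimal sketch of the sieve in Go (the function name is illustrative; this is not the implementation from the repository linked at the end of the article). Crossing out starts at $i^2$, which already incorporates the $\sqrt{n}$ optimization:

```go
package main

import "fmt"

// sieveOfEratosthenes returns all primes not exceeding n.
// Crossing out starts at i*i because smaller multiples of i
// have already been removed by smaller prime factors.
func sieveOfEratosthenes(n int) []int {
	composite := make([]bool, n+1)
	var primes []int
	for i := 2; i <= n; i++ {
		if composite[i] {
			continue
		}
		primes = append(primes, i)
		for j := i * i; j <= n; j += i {
			composite[j] = true
		}
	}
	return primes
}

func main() {
	fmt.Println(sieveOfEratosthenes(30)) // [2 3 5 7 11 13 17 19 23 29]
}
```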



An animated illustration of the algorithm's operation can be found on Wikipedia.









The complexity of the algorithm is $O(n \log \log n)$; at the same time, storing the information about which numbers have been crossed out requires $O(n)$ memory.



There are a number of optimizations that improve these bounds. A technique called wheel factorization includes in the initial list only numbers coprime to the first few primes (for example, to the primes less than 30). In theory, it is suggested to take the primes up to about $\sqrt{\log n}$. This reduces the complexity of the algorithm by a factor of $\log \log n$. In addition, so-called segmentation is used to reduce memory consumption: the initial set of numbers is split into segments of size $\leqslant \sqrt{n}$, and the sieve of Eratosthenes is applied to each segment separately. Memory consumption drops to $O(\sqrt{n})$.
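Below is a rough sketch of the segmentation idea in Go, without wheel factorization; all names are illustrative. Base primes up to $\sqrt{n}$ are found with the ordinary sieve, and each segment of size about $\sqrt{n}$ is then sieved with them, so only $O(\sqrt{n})$ booleans are held in memory at a time:

```go
package main

import (
	"fmt"
	"math"
)

// segmentedSieve lists all primes up to n while keeping only about sqrt(n)
// booleans in memory at any moment.
func segmentedSieve(n int) []int {
	if n < 2 {
		return nil
	}
	segSize := int(math.Sqrt(float64(n))) + 1

	// Base primes up to sqrt(n), found with the ordinary sieve.
	baseComposite := make([]bool, segSize+1)
	var basePrimes []int
	for i := 2; i <= segSize; i++ {
		if !baseComposite[i] {
			basePrimes = append(basePrimes, i)
			for j := i * i; j <= segSize; j += i {
				baseComposite[j] = true
			}
		}
	}
	primes := append([]int(nil), basePrimes...)

	// Sieve each segment [low, high) with the base primes.
	for low := segSize + 1; low <= n; low += segSize {
		high := low + segSize
		if high > n+1 {
			high = n + 1
		}
		composite := make([]bool, high-low)
		for _, p := range basePrimes {
			start := (low + p - 1) / p * p // first multiple of p in the segment
			if start < p*p {
				start = p * p // smaller multiples have a smaller prime factor
			}
			for j := start; j < high; j += p {
				composite[j-low] = true
			}
		}
		for i := low; i < high; i++ {
			if !composite[i-low] {
				primes = append(primes, i)
			}
		}
	}
	return primes
}

func main() {
	fmt.Println(segmentedSieve(100)) // primes up to 100
}
```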



Sieve of Atkin



A better algorithm for sifting out composite numbers was proposed by Atkin and Bernstein and is called the sieve of Atkin. The method is based on the following three properties of prime numbers.



Property 1



If $n$ is a positive number that is not a multiple of the square of a prime and such that $n \equiv 1 \pmod 4$, then $n$ is prime if and only if the number of solutions of the equation $4x^2 + y^2 = n$ is odd.



Property 2



If $n$ is a positive number that is not a multiple of the square of a prime and such that $n \equiv 1 \pmod 6$, then $n$ is prime if and only if the number of solutions of the equation $3x^2 + y^2 = n$ is odd.



Property 3



If $n$ is a positive number that is not a multiple of the square of a prime and such that $n \equiv 11 \pmod{12}$, then $n$ is prime if and only if the number of solutions of the equation $3x^2 - y^2 = n$ with $x > y$ is odd.



Proofs of these properties are given in the original article.



At the initial stage of the algorithm, the sieve of Atkin is an array $A$ of size $n$ filled with zeros. To determine the primes, all pairs $x, y < \sqrt{n}$ are enumerated. For each such pair, the values $4x^2 + y^2$, $3x^2 + y^2$ and $3x^2 - y^2$ are computed, and the corresponding array elements $A[4x^2 + y^2]$, $A[3x^2 + y^2]$, $A[3x^2 - y^2]$ are incremented by one (provided the candidate satisfies the corresponding congruence from the properties above). At the end of this stage, the indices of all elements with odd values are either primes or squares of primes. In the last step of the algorithm, all numbers divisible by the square of some remaining number are crossed out.
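Here is a Go sketch of the standard formulation of the sieve of Atkin, with the congruence classes modulo 12 written out explicitly and without wheel factorization or segmentation; a boolean flag is toggled instead of keeping an integer counter, which tracks the same parity:

```go
package main

import (
	"fmt"
	"math"
)

// sieveOfAtkin returns all primes not exceeding n.
func sieveOfAtkin(n int) []int {
	if n < 2 {
		return nil
	}
	isPrime := make([]bool, n+1)
	limit := int(math.Sqrt(float64(n))) + 1

	for x := 1; x <= limit; x++ {
		for y := 1; y <= limit; y++ {
			// Property 1: n ≡ 1 (mod 4) and not divisible by 3, i.e. remainders 1 and 5 mod 12.
			m := 4*x*x + y*y
			if m <= n && (m%12 == 1 || m%12 == 5) {
				isPrime[m] = !isPrime[m]
			}
			// Property 2: n ≡ 1 (mod 6); only remainder 7 mod 12, remainder 1 is covered above.
			m = 3*x*x + y*y
			if m <= n && m%12 == 7 {
				isPrime[m] = !isPrime[m]
			}
			// Property 3: n ≡ 11 (mod 12), counted only for x > y.
			m = 3*x*x - y*y
			if x > y && m <= n && m%12 == 11 {
				isPrime[m] = !isPrime[m]
			}
		}
	}

	// Cross out multiples of squares of the remaining numbers.
	for r := 5; r*r <= n; r++ {
		if isPrime[r] {
			for k := r * r; k <= n; k += r * r {
				isPrime[k] = false
			}
		}
	}

	primes := []int{}
	if n >= 2 {
		primes = append(primes, 2)
	}
	if n >= 3 {
		primes = append(primes, 3)
	}
	for i := 5; i <= n; i++ {
		if isPrime[i] {
			primes = append(primes, i)
		}
	}
	return primes
}

func main() {
	fmt.Println(sieveOfAtkin(50)) // [2 3 5 7 11 13 17 19 23 29 31 37 41 43 47]
}
```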



It follows from the description that both the computational complexity of the sieve of Atkin and its memory consumption are $O(n)$. When wheel factorization and segmentation are used, the complexity estimate is reduced to $O(n / \log \log n)$ and the memory consumption to $O(\sqrt{n})$.



Mersenne numbers and the Lucas-Lehmer test



Of course, with such complexity, even the optimized sieve of Atkin cannot be used to search for truly large primes. Fortunately, there are fast tests that check whether a given number is prime. Unlike sieve algorithms, such tests are not designed to find all primes; they can only tell, with some probability, whether a particular number is prime.



One such method is the Lucas-Lehmer test. It is a deterministic and unconditional primality test, which means that passing the test guarantees that the number is prime. Unfortunately, the test applies only to numbers of a special form, $2^p - 1$, where $p$ is a natural number. Such numbers are called Mersenne numbers.



The Lucas-Lehmer test states that the Mersenne number $M_p = 2^p - 1$ is prime if and only if $p$ is prime and $M_p$ divides the $(p-1)$-th member of the sequence $S_k$ defined recursively as $S_1 = 4$, $S_k = S_{k-1}^2 - 2$ for $k > 1$.



For a number $M_p$ of bit length $p$, the computational complexity of the algorithm is $O(p^3)$.
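A straightforward sketch of the Lucas-Lehmer test in Go using math/big; the value of $S_k$ is reduced modulo $M_p$ at every step so the numbers never grow beyond $p$ bits (the function name is illustrative):

```go
package main

import (
	"fmt"
	"math/big"
)

// lucasLehmer reports whether M_p = 2^p - 1 is prime for a prime exponent p > 2.
func lucasLehmer(p uint) bool {
	one := big.NewInt(1)
	two := big.NewInt(2)

	// M_p = 2^p - 1
	m := new(big.Int).Lsh(one, p)
	m.Sub(m, one)

	// S_1 = 4, S_k = S_{k-1}^2 - 2; M_p is prime iff M_p divides S_{p-1}.
	s := big.NewInt(4)
	for i := uint(1); i < p-1; i++ {
		s.Mul(s, s)
		s.Sub(s, two)
		s.Mod(s, m)
	}
	return s.Sign() == 0
}

func main() {
	// 2^11 - 1 = 2047 = 23 * 89 is composite; the other exponents give Mersenne primes.
	for _, p := range []uint{3, 5, 7, 11, 13, 17, 19, 31} {
		fmt.Println(p, lucasLehmer(p))
	}
}
```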



Thanks to the simplicity and determinism of the test, the largest known primes are Mersenne numbers. The largest known prime today is $2^{82589933} - 1$; its decimal notation consists of 24,862,048 digits. You can admire this beauty here.



Fermat's little theorem and the Miller-Rabin test



Not many Mersenne primes are known, so public-key cryptography requires a different way of finding primes. One such method is the Fermat primality test. It is based on Fermat's little theorem, which states that if $n$ is prime, then for any $a$ not divisible by $n$, the equality $a^{n-1} \equiv 1 \pmod n$ holds. The proof of the theorem can be found on Wikipedia.



The Fermat primality test is a probabilistic test that tries several values of $a$: if at least one of them gives $a^{n-1} \not\equiv 1 \pmod n$, then the number $n$ is composite. Otherwise, $n$ is probably prime. The more values of $a$ used in the test, the greater the confidence that $n$ is prime.



Unfortunately, there are composite numbers $n$ for which the congruence $a^{n-1} \equiv 1 \pmod n$ holds for every $a$ coprime to $n$. Such numbers are called Carmichael numbers. Composite numbers that pass the Fermat test are called Fermat pseudoprimes. The number of Fermat pseudoprimes is infinite, so the Fermat test is not the most reliable way to identify primes.
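A sketch of the Fermat test in Go with random bases drawn via crypto/rand (illustrative only; it assumes an odd $n > 4$). Note how the Carmichael number 561 can only be exposed by a base that shares a factor with it:

```go
package main

import (
	"crypto/rand"
	"fmt"
	"math/big"
)

// fermatTest runs k rounds of the Fermat test on n (assumed odd, n > 4).
// false means n is certainly composite, true means n is probably prime.
func fermatTest(n *big.Int, k int) bool {
	one := big.NewInt(1)
	nMinusOne := new(big.Int).Sub(n, one)
	span := new(big.Int).Sub(n, big.NewInt(3)) // size of the range [2, n-2]

	for i := 0; i < k; i++ {
		// Random base a in [2, n-2].
		a, err := rand.Int(rand.Reader, span)
		if err != nil {
			panic(err)
		}
		a.Add(a, big.NewInt(2))

		// Check a^(n-1) ≡ 1 (mod n).
		if new(big.Int).Exp(a, nMinusOne, n).Cmp(one) != 0 {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(fermatTest(big.NewInt(104729), 10)) // prime: true
	// 561 = 3*11*17 is a Carmichael number: every base coprime to it passes
	// the check, so only a base sharing a factor reveals compositeness.
	fmt.Println(fermatTest(big.NewInt(561), 10))
}
```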



Miller-Rabin Test



More reliable results can be achieved by combining Fermat's little theorem with the fact that for a prime $p$ the equation $x^2 \equiv 1 \pmod p$ has no roots other than $1$ and $-1$. The Miller-Rabin test tries several values of $a$ and checks that the following conditions hold.



Let $p$ be a prime and $p - 1 = 2^s d$ with $d$ odd; then for any $a$ not divisible by $p$, at least one of the conditions is true:



  1. $a^d \equiv \pm 1 \pmod p$
  2. there is an integer $r < s$ such that $a^{2^r d} \equiv -1 \pmod p$


By Fermat's theorem $a^{p-1} \equiv 1 \pmod p$, and since $p - 1 = 2^s d$, the property of the roots of $x^2 \equiv 1 \pmod p$ implies that if we find an $a$ for which neither condition is satisfied, then $p$ is composite. If one of the conditions holds, the number $a$ is called a witness of the primality of $n$ according to Miller, and $n$ itself is probably prime.
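A sketch of the Miller-Rabin test in Go following these two conditions directly (illustrative; it assumes an odd $n > 3$ and does not handle small inputs):

```go
package main

import (
	"crypto/rand"
	"fmt"
	"math/big"
)

// millerRabin runs k rounds of the Miller-Rabin test on an odd n > 3.
// false means n is certainly composite, true means n is probably prime.
func millerRabin(n *big.Int, k int) bool {
	one := big.NewInt(1)
	two := big.NewInt(2)
	nMinusOne := new(big.Int).Sub(n, one)

	// Write n-1 = 2^s * d with d odd.
	d := new(big.Int).Set(nMinusOne)
	s := 0
	for d.Bit(0) == 0 {
		d.Rsh(d, 1)
		s++
	}

	for i := 0; i < k; i++ {
		// Random base a in [2, n-2].
		a, err := rand.Int(rand.Reader, new(big.Int).Sub(n, big.NewInt(3)))
		if err != nil {
			panic(err)
		}
		a.Add(a, two)

		x := new(big.Int).Exp(a, d, n)
		if x.Cmp(one) == 0 || x.Cmp(nMinusOne) == 0 {
			continue // condition 1: a^d ≡ ±1 (mod n)
		}
		witness := false
		for r := 1; r < s; r++ {
			x.Exp(x, two, n) // x = x^2 mod n
			if x.Cmp(nMinusOne) == 0 {
				witness = true // condition 2: a^(2^r d) ≡ -1 (mod n)
				break
			}
		}
		if !witness {
			return false // a proves that n is composite
		}
	}
	return true
}

func main() {
	fmt.Println(millerRabin(big.NewInt(2147483647), 20)) // 2^31 - 1, prime: true
	fmt.Println(millerRabin(big.NewInt(2147483649), 20)) // 2^31 + 1, composite: false with overwhelming probability
}
```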



The more witnesses of primality are found, the greater the confidence that $n$ is prime. According to Rabin's theorem, the probability that a randomly chosen number $a$ turns out to be a witness of primality for a composite number does not exceed $1/4$.



Therefore, if we check $k$ random numbers $a$, the probability of accepting a composite number as prime is at most $(1/4)^k$.



The complexity of the algorithm is $O(k \log^3 p)$, where $k$ is the number of checks.



Thanks to its speed and high accuracy, the Miller-Rabin test is widely used when searching for prime numbers. Many modern cryptographic libraries rely on this test alone when checking large numbers for primality, and, as Martin Albrecht showed in his work, this is not always enough.



He was able to generate composite numbers that pass the primality checks in OpenSSL, CryptLib, JavaScript Big Number and many other libraries.



Lucas test and the Baillie-PSW test



To avoid vulnerabilities in situations where a composite number generated by an attacker is passed off as prime, Martin Albrecht suggests using the Baillie-PSW test. Although the Baillie-PSW test is probabilistic, to date no composite number has been found that passes it. In 1980 the authors of the algorithm offered a $30 reward for finding such a number. The prize has not been claimed yet.



Researchers have checked all numbers up to $2^{64}$, and not a single composite number passed the Baillie-PSW test. Therefore, for numbers less than $2^{64}$ the test is considered deterministic.



The essence of the test is to check the number for primality with two different methods, one after the other. The first is the Miller-Rabin test already described above; the second is the Lucas test for strong pseudoprimes.



Lucas test for strong pseudoprimes



Lucas sequences are pairs of recurrent sequences $\{U_n(P, Q)\}$, $\{V_n(P, Q)\}$ defined by the expressions:





$U_0(P,Q)=0, \quad U_1(P,Q)=1, \quad U_{n+2}(P,Q)=P \cdot U_{n+1}(P,Q) - Q \cdot U_n(P,Q), \quad n \geq 0$

$V_0(P,Q)=2, \quad V_1(P,Q)=P, \quad V_{n+2}(P,Q)=P \cdot V_{n+1}(P,Q) - Q \cdot V_n(P,Q), \quad n \geq 0$







Let $U_n(P, Q)$ and $V_n(P, Q)$ be Lucas sequences whose integer parameters $P$ and $Q$ satisfy the condition $D = P^2 - 4Q \neq 0$.



We compute the Jacobi symbol: $\left(\frac{D}{n}\right) = \varepsilon$.



Find $r$ and $s$ such that $n - \varepsilon = 2^r s$ with $s$ odd.



For a prime $n$, one of the following conditions holds:



  1. $n$ divides $U_s$
  2. $n$ divides $V_{2^j s}$ for some $j < r$


Otherwise, n is composite.
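The Go sketch below follows this definition literally: $U_k$ and $V_k$ are computed modulo $n$ straight from the recurrences, which takes $O(k)$ steps and is therefore suitable only for small $n$ (real implementations use logarithmic doubling formulas, and Baillie-PSW chooses $P$ and $Q$ by Selfridge's method instead of taking them as parameters). It also assumes $\gcd(n, 2QD) = 1$:

```go
package main

import (
	"fmt"
	"math/big"
)

// lucasUV returns U_k(P,Q) mod n and V_k(P,Q) mod n, computed naively from
// the recurrences in O(k) steps. For illustration only.
func lucasUV(k, P, Q, n int64) (int64, int64) {
	u, uNext := int64(0), int64(1)%n // U_0, U_1
	v, vNext := int64(2)%n, P%n      // V_0, V_1
	for i := int64(0); i < k; i++ {
		u, uNext = uNext, (((P*uNext-Q*u)%n)+n)%n
		v, vNext = vNext, (((P*vNext-Q*v)%n)+n)%n
	}
	return u, v
}

// strongLucasTest checks the strong Lucas conditions for an odd n > 2 and
// parameters P, Q with D = P^2 - 4Q, assuming gcd(n, 2QD) = 1.
// false means n is certainly composite.
func strongLucasTest(n, P, Q int64) bool {
	D := P*P - 4*Q
	eps := int64(big.Jacobi(big.NewInt(D), big.NewInt(n)))

	// Write n - eps = 2^r * s with s odd.
	s := n - eps
	r := 0
	for s%2 == 0 {
		s /= 2
		r++
	}

	if u, _ := lucasUV(s, P, Q, n); u == 0 {
		return true // n divides U_s
	}
	k := s
	for j := 0; j < r; j++ {
		if _, v := lucasUV(k, P, Q, n); v == 0 {
			return true // n divides V_{2^j * s}
		}
		k *= 2
	}
	return false
}

func main() {
	fmt.Println(strongLucasTest(101, 3, -1)) // prime: true
	fmt.Println(strongLucasTest(15, 3, -1))  // composite: false
}
```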



The probability that a composite number $n$ passes the Lucas test for a given pair of parameters $P, Q$ does not exceed $4/15$. Consequently, after applying the test $k$ times this probability is at most $(4/15)^k$.



The Miller-Rabin test and the Lucas test produce sets of pseudoprimes that are believed to be disjoint; accordingly, if a number $p$ passes both tests, it is considered prime. It is on this property that the Baillie-PSW test is based.
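In Go such a combined check is available out of the box: since Go 1.8, big.Int.ProbablyPrime(n) performs n rounds of Miller-Rabin with pseudorandomly chosen bases plus a Baillie-PSW test, and it is guaranteed to be correct for inputs below $2^{64}$:

```go
package main

import (
	"fmt"
	"math/big"
)

func main() {
	// 2^31 - 1 is a Mersenne prime.
	m31 := new(big.Int).Sub(new(big.Int).Lsh(big.NewInt(1), 31), big.NewInt(1))
	fmt.Println(m31, m31.ProbablyPrime(20)) // 2147483647 true

	carmichael := big.NewInt(561) // smallest Carmichael number
	fmt.Println(carmichael, carmichael.ProbablyPrime(20)) // 561 false
}
```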



Conclusion



Depending on the task, different methods of finding primes can be used. For example, when searching for large Mersenne primes, a list of prime exponents up to a certain bound, say $10^8$, is first built with the sieve of Eratosthenes or the sieve of Atkin. Then, for each number $p$ from the list, $M_p = 2^p - 1$ is checked for primality with the Lucas-Lehmer test.



To generate a large prime for cryptographic purposes, a random number $a$ is chosen and checked with the Miller-Rabin test or the more reliable Baillie-PSW test. According to the prime number theorem, a randomly chosen number between 1 and $n$ is prime with probability approximately $\frac{1}{\ln n}$. Therefore, to find a 1024-bit prime, it is enough to try about a thousand candidates.
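For instance, the standard library's crypto/rand.Prime performs this kind of search, drawing random candidates of the requested bit length until one passes a probabilistic primality check:

```go
package main

import (
	"crypto/rand"
	"fmt"
)

func main() {
	// Generate a 1024-bit probable prime using a cryptographically secure
	// source of randomness.
	p, err := rand.Prime(rand.Reader, 1024)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d-bit probable prime:\n%v\n", p.BitLen(), p)
}
```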



PS Sources



Implementations of all the described algorithms in Go can be found on GitHub.


