# Rabin-Miller Primality Test

In Qualification Round of Google Code Jam 2016, there is an interesting Coin Jam problem. The summarized problem statement is as follows:

A jamcoin is a string of N ≥ 2 digits with the following properties:

1) Every digit is either 0 or 1.
2) The first digit is 1 and the last digit is 1.
3) If you interpret the string in any base between 2 and 10, inclusive, the resulting number is not prime.

Can you produce J different jamcoins of length N, along with proof that they are legitimate?

For example, for the jamcoin 1001, a possible set of nontrivial divisors for the base 2 through 10 interpretations of the jamcoin would be: 3, 7, 5, 6, 31, 8, 27, 5, and 77, respectively.

The name “jamcoin” is probably a play on Bitcoin, since it deals with prime/composite numbers, a topic commonly found in cryptography. In this problem, we apparently need to determine lots of large numbers (32 digits for Large dataset) if they are composite numbers.

The very first idea, building a sieve of primes for up to 1016 for trial division, seems not feasible for this problem since it will take lots of time and space (e.g., $\mathcal{O}(n\log{}n \log{}\log{}n)$ and $\mathcal{O}(n)$ for sieve of Eratosthenes, respectively).

Note that we don’t need to find all but only J of those jamcoins. Therefore, we can keep iterating over all possible “jam coins” to find the first J numbers that satisfy the conditions. To quickly determine if a large number is a composite/prime number, we can use Rabin-Miller primality test. For reference, the Rabin-Miller primality test is based on the following theorem:

• If p is a prime, let s be such that $p-1 = 2^{s}d$ and $d$ is odd. Then for any $1 \leq n \leq p-1$, one of two things happens:

\begin{align} & n^d = 1 \mod p \mbox{, or} \\ & n^{2^j d} = -1 \mod p \mbox{ for some } 0 \leq j < s. \end{align}

In Rabin-Miller test, we pick $k$ random samples of $n$ in the interval $1 \leq n \leq p-1$. If p is not a prime, then it is at least a ¾ chance that a randomly chosen $n$ will be a fail. For large $k$ independent tests, the probability that it passes all trials is (¼)k ~ 0.

The test is very fast, with runtime complexity of $k \log{}^3 n$ where k is the trial number. Since we looks for composite numbers, this algorithm is even better-suited: even if a number passes all Rabin-Miller trials, we are still NOT sure if it is a prime. However, if a number fails one of Rabin-Miller trial, we are sure that it is a composite number.

Implementation of this algorithm in different languages can be found on the web, such as here. I re-implemented this algorithm in Python (shown below) since 1) it is simple enough (just slightly more complex than Euclid’s gcd algorithm), and 2) I want to avoid disqualification from Google Code Jam for plagiarism.

• Because the function rabin_miller_trial is unlikely reused anywhere else, it is nested inside is_pseudo_prime to keep its function signature simple, intuitive.
• Use pow(x, y, z) in Python to compute more efficiently than (x ** y % z).
• random.randint(2, prime - 2) is used since it is useless to pick 1 and p-1 and trials would be wasted.
• Labor saving steps: we first test for divisibility by small primes that are less than 100 before starting Rabin-Miller trials.

Going back to the Coin Jam problem, note that the problem requires us not only to check if numbers are composite but also find any non-trivial factor for those numbers. Fortunately, as explained in Wikipedia, we can modify the Rabin-Miller test to add greatest common divisor gcd calculations to find a factor of p with minimal additional computational cost. The modified Rabin-Miller for finding factor of composite numbers is shown below.

The final solution to the problem, using the modified Rabin-Miller test above, can be found in this file (search for CoinJam class). Note that the suggested solution to this problem is even nicer by using a mathematical trick and the fact that J is pretty small (relative to 10N). If J is much larger and close to the number of all jamcoins with length N available (e.g., more than 90%), then using modified Rabin-Miller test is probably required.