11. Randomized Algorithms

Types of Random Algorithms

Las Vegas: An algorithm that always produces a correct result, but whose runtime is a random variable (we typically bound its expected runtime).
Monte Carlo: An algorithm whose runtime is bounded deterministically, but whose output may be incorrect with some bounded probability.

Primality Testing

Fermat Test

A naive solution would be to factor a number into its prime factors to determine whether it is prime. However, the fastest known general-purpose factoring algorithm, the General Number Field Sieve, runs in roughly $O(C^{n^{1/3}\log(n)^{2/3}})$ for an $n$-bit input, which is sub-exponential but still super-polynomial. Can we do better?

Recall Fermat's Little Theorem:

If $N$ is prime, then $a^{N-1}\equiv1\; (\text{mod} \; N )$ , $\forall a\in \{ 1,\dots,N-1 \}$ .

We can thus develop a very simple test known as the Fermat Test for testing if a number is prime.

Pick $a\in \{ 1,\dots,N-1 \}$ uniformly at random
If $a^{N-1}\equiv1\; (\text{mod} \; N )$ , output prime. Otherwise, output composite.
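The two steps above can be sketched in Python as follows. This is a minimal illustration, not a production primality test: the function name `fermat_round` and the optional `a` parameter (which lets the caller fix the base instead of sampling it) are assumptions made for this example.

```python
import random


def fermat_round(N, a=None, rng=random):
    """One round of the Fermat Test for an integer N >= 2.

    Returns True for "prime" (the base passed) and False for "composite".
    """
    if a is None:
        a = rng.randrange(1, N)  # uniform over {1, ..., N-1}
    # Three-argument pow performs modular exponentiation: a^(N-1) mod N.
    return pow(a, N - 1, N) == 1
```

For example, `fermat_round(15, 2)` returns `False` (correctly reporting composite), while `fermat_round(15, 4)` returns `True` even though 15 is composite, since $4^{14}\equiv 1\; (\text{mod} \; 15 )$.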

However, primality testing is not as simple as just choosing an $a$ and checking if $a^{N-1}\equiv1\; (\text{mod} \; N )$ . If $N$ is composite, this congruence could also be true, for certain values of $a$ . In essence, the Fermat Test guarantees that, if $N$ is prime, it will output prime, but if $N$ is composite, it may output prime or composite.

Thus, the Fermat Test will not always be correct. In fact, you may notice that this seems to fit the definition of a Monte Carlo randomized algorithm! Let's examine this further, though. In particular, we will determine an upper bound on the probability of some $a$ "passing" the Fermat Test (i.e. outputting prime) when $N$ is composite, for all $N$ .

Fermat Witness

Any $a$ that fails the Fermat Test when $N$ is composite, i.e. any $a$ with $a^{N-1}\not\equiv1\; (\text{mod} \; N )$ (so the test correctly outputs composite), is known as a Fermat witness. If $a$ is coprime to $N$, we further describe it as a nontrivial Fermat witness, since any $a$ that is not coprime to $N$ is trivially a Fermat witness. (See the proof below for an explanation.)

Carmichael Numbers

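Note that there do exist composite numbers that pass the Fermat Test for every $a$ coprime to $N$, i.e. composite numbers with no nontrivial Fermat witnesses. These are the Carmichael numbers; the smallest is $561=3\cdot 11\cdot 17$. For simplicity, we will ignore these.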

Claim:
Let $N$ be composite, and $S=\{ 1,\dots,N-1 \}$ . Suppose there exists at least one $a\in S$ such that $a$ is a nontrivial Fermat witness (i.e. $N$ is not a Carmichael number). Let $F$ be the set of Fermat witnesses. Then $\lvert F \rvert\geq \frac{1}{2}\lvert S \rvert$ .

Proof:
We partition $S$ into four disjoint subsets:

\begin{align*} A &= \{ a\in S\mid a^{N-1}\equiv 1\; (\text{mod} \; N ),\ \mathrm{gcd}(a,N)=1 \} \\ B &= \{ b\in S\mid b^{N-1}\not\equiv 1\; (\text{mod} \; N ),\ \mathrm{gcd}(b,N)=1 \} \\ C &= \{ c\in S\mid c^{N-1}\not\equiv 1\; (\text{mod} \; N ),\ \mathrm{gcd}(c,N)\neq 1 \} \\ D &= \{ d\in S\mid d^{N-1}\equiv 1\; (\text{mod} \; N ),\ \mathrm{gcd}(d,N)\neq 1 \} \\ \end{align*}

Consider $D$. As mentioned above, any number that is not coprime to $N$ is trivially a Fermat witness. Why? Let $k=\mathrm{gcd}(d,N)>1$. If $d^{N-1}\equiv1\; (\text{mod} \; N )$, then $\exists q\in \mathbb{Z}$ such that $d^{N-1}-1=qN$. However, $k\mid d^{N-1}$ and $k\mid N\implies k\mid qN$, so $k\mid (d^{N-1}-qN)=1$, which contradicts $k>1$. Thus, $D=\emptyset$.

Meanwhile, $B$ represents the nontrivial Fermat witnesses, $C$ represents the trivial Fermat witnesses, and $A$ represents the non Fermat witnesses. Note that $F=B\cup C$ . Also note that $A,B,C\neq \emptyset$ since $1\in A$ , $B$ is nonempty as given by the claim, and $C$ is nonempty since $N$ is composite.

Assume for contradiction that $\lvert F \rvert=\lvert B\cup C \rvert< \frac{1}{2}\lvert S \rvert$ . Then $\lvert A \rvert> \frac{1}{2}\lvert S \rvert$ . Let $A=\{ a_{1},\dots,a_{k} \}$ , where $k> \frac{1}{2}\lvert S \rvert$ . Choose an arbitrary $b\in B$ , and consider $Ab$ , i.e.

$$Ab=\{ a_{1}b\; (\text{mod} \; N ),\ \dots ,\ a_{k}b\; (\text{mod} \; N ) \}$$

$Ab$ has two useful properties that we will prove.

1. $Ab\subseteq B$.
2. The elements of $Ab$ are distinct, i.e. $\lvert Ab \rvert=\lvert A \rvert$.

$(1)$ For each $a_{i}\in A$, $(a_{i}b)^{N-1}\equiv a_{i}^{N-1}b^{N-1}\equiv b^{N-1}\not\equiv1\; (\text{mod} \; N )$. Moreover, $\mathrm{gcd}(a_{i}b,N)=1$ since $\mathrm{gcd}(a_{i},N)=1$ and $\mathrm{gcd}(b,N)=1$. In particular $a_{i}b\not\equiv0\; (\text{mod} \; N )$, so the residue $a_{i}b\; (\text{mod} \; N )$ lies in $S$. Therefore, $Ab\subseteq B$.

$(2)$ Suppose for contradiction that $a_{i}b\equiv a_{j}b\; (\text{mod} \; N )$ for some $i\neq j$. This implies that $(a_{i}-a_{j})b\equiv0\; (\text{mod} \; N )$, i.e. $N\mid(a_{i}-a_{j})b$. Since $\mathrm{gcd}(b,N)=1$, we must have $N\mid(a_{i}-a_{j})$. However, $a_{i},a_{j}\in S$ implies that $\lvert a_{i}-a_{j} \rvert<N$, so $a_{i}-a_{j}=0\implies a_{i}=a_{j}\implies i=j$. This yields a contradiction, and thus the elements of $Ab$ must be distinct.

The above two properties imply that $\lvert B \rvert\geq \lvert Ab \rvert=\lvert A \rvert=k> \frac{1}{2}\lvert S \rvert$ (recall that $B\neq \emptyset$, so such a $b$ exists). But $B\subseteq F$ and we assumed $\lvert F \rvert< \frac{1}{2}\lvert S \rvert$, which is a contradiction. Therefore it must be true that $\lvert F \rvert\geq \frac{1}{2}\lvert S \rvert$.

$\square$
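For small $N$, the partition used in the proof can be checked by brute force. The sketch below is only an illustration; the helper name `fermat_partition` is hypothetical and the sets are returned as sorted lists.

```python
from math import gcd


def fermat_partition(N):
    """Split S = {1, ..., N-1} into the sets A, B, C, D used in the proof."""
    A, B, C, D = [], [], [], []
    for a in range(1, N):
        passes = pow(a, N - 1, N) == 1   # a^(N-1) ≡ 1 (mod N)?
        coprime = gcd(a, N) == 1
        if passes and coprime:
            A.append(a)                  # non-witnesses
        elif coprime:
            B.append(a)                  # nontrivial Fermat witnesses
        elif not passes:
            C.append(a)                  # trivial Fermat witnesses
        else:
            D.append(a)                  # shown to be empty in the proof
    return A, B, C, D
```

For $N=15$ this gives $A=\{1,4,11,14\}$ and $B=\{2,7,8,13\}$, so $\lvert F \rvert=\lvert B \rvert+\lvert C \rvert=10\geq 7$. For the Carmichael number $N=561$, $B$ is empty and the claim's hypothesis does not apply.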

Proof Source

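Therefore, given an integer $N$ with unknown primality, suppose we run the Fermat Test for $k$ independent rounds and no Fermat witness is identified. Then either $N$ is prime, or $N$ is one of the (very rare) Carmichael numbers, or $N$ is some other composite that slipped through all $k$ rounds, which happens with probability $\leq\left( \frac{1}{2} \right)^{k}$. In other words, a composite, non-Carmichael $N$ is detected with probability $\geq1-\left( \frac{1}{2} \right)^{k}$. Thus, we have a nice, randomized Monte Carlo test for primality! (Effective for all but the Carmichael numbers.)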

Miller-Rabin Test

The Miller-Rabin test is a more robust test than the Fermat Test, in the sense that it works on all numbers, including Carmichael numbers. It is a Monte Carlo algorithm, like the Fermat Test: a single round always outputs "prime" if $n$ is prime, and outputs "composite" with probability $\geq \frac{3}{4}$ if $n$ is composite. Thus, over $k$ independent rounds, a composite $n$ is caught with probability $\geq 1- (\frac{1}{4})^{k}$.

The Miller-Rabin test relies on a different witness of compositeness, based on the following theorem.

If $p$ is prime, then all integer roots of $x^{2}\equiv 1\; (\text{mod} \; p )$ satisfy $x\equiv1\; (\text{mod} \; p )$ or $x\equiv-1\; (\text{mod} \; p )$ .

Proof:

\begin{align*} x^{2}\equiv 1\; (\text{mod} \; p ) &\implies p \mid (x^{2}-1) \\ & \implies p \mid(x-1)(x+1) \end{align*}

Since $p$ is prime, either $p\mid x-1$, $p\mid x+1$, or both. This means $x-1\equiv0\; (\text{mod} \; p )$, $x+1\equiv 0\; (\text{mod} \; p )$, or both. Thus, the only possible roots are $x\equiv\pm1\; (\text{mod} \; p )$. Finally, we note that these are not extraneous, i.e. $(\pm1)^{2}\equiv1\; (\text{mod} \; p )$ is in fact true.

$\square$

Corollary: Let $n,x \in \mathbb{Z}$ such that $x^{2}\equiv1\; (\text{mod} \; n )$ . If $x\not\equiv\pm 1\; (\text{mod} \; n )$ , then $n$ must be composite.

Now, we can introduce some similar notation as for the Fermat Test. For a composite number $n$ , we denote $x$ as a root witness of $n$ 's compositeness if $x^{2}\equiv 1\; (\text{mod} \; n )$ and $x\not\equiv \pm 1\; (\text{mod} \; n )$ . By definition, a root witness is a nontrivial root.
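To make the definition concrete, the following brute-force sketch lists the square roots of $1$ modulo a small $n$ and filters out $\pm1$. The function names are hypothetical, and the linear scan is only sensible for tiny $n$.

```python
def square_roots_of_one(n):
    """All x in {1, ..., n-1} with x^2 ≡ 1 (mod n)."""
    return [x for x in range(1, n) if (x * x) % n == 1]


def root_witnesses(n):
    """Roots of 1 other than ±1 (mod n); nonempty only if n is composite."""
    return [x for x in square_roots_of_one(n) if x not in (1, n - 1)]
```

For the prime $13$ the only roots are $1$ and $12\equiv-1$, while for $15$ the roots are $1,4,11,14$, so $4$ and $11$ are root witnesses. Note that a composite number need not have any: `root_witnesses(9)` is empty.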

We'll subsequently derive the intuition and logic behind the Miller-Rabin test.

In essence, the Miller-Rabin test determines that $n$ is composite by using both Fermat and root witnesses. Let us assume $n>2$ is odd (an even $n>2$ is immediately composite), and let $a$ be chosen uniformly at random from $S=\{1,\dots,n-1 \}$.

Since $n$ is odd, $n-1$ is even. Let $n-1=2^{r}\cdot d$ , where $d\equiv1\; (\text{mod} \; 2 )$ . We can substitute this into the Fermat Test's equation to produce

$$a^{n-1}\equiv a^{2^{r}\cdot d}\; (\text{mod} \; n )$$

Suppose that $a^{2^{r}\cdot d}\equiv1\; (\text{mod} \; n )$ , i.e. $a$ is not a Fermat witness. Consider rewriting the expression as

$$(a^{2^{r-1}\cdot d})^{2} \equiv 1\; (\text{mod} \; n )$$

Then, if

$$a^{2^{r-1}\cdot d} \not\equiv \{ 1,-1 \}\; (\text{mod} \; n )$$

we have successfully identified a root witness of compositeness, and thus $n$ is correctly identified as composite. However, there's still more to this!

If $a^{2^{r-1}\cdot d}\equiv-1\; (\text{mod} \; n )$ , then we can do nothing further. However, if $a^{2^{r-1}\cdot d}\equiv1\; (\text{mod} \; n )$ , we can form yet another equation of the form $x^{2}\equiv1\; (\text{mod} \; n )$ . In particular,

$$(a^{2^{r-2}\cdot d})^{2}\equiv 1\; (\text{mod} \; n )$$

Thus, as long as we get $a^{2^{r-\alpha}\cdot d} \equiv 1 \; (\text{mod} \; n )$ and $r-\alpha>0$ , we can continue trying to find a root witness.
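The chain of repeated square roots is easiest to see on a concrete number. The sketch below (with a hypothetical helper name `squaring_chain`) computes $r$, $d$ and the values $a^{2^{y}\cdot d} \bmod n$ for $y=0,\dots,r$, starting from $a^{d}$ and squaring upward.

```python
def squaring_chain(n, a):
    """Return (r, d, chain) where n - 1 = 2**r * d with d odd and
    chain[y] = a**(2**y * d) mod n for y = 0, ..., r."""
    r, d = 0, n - 1
    while d % 2 == 0:        # pull out factors of two from n - 1
        r += 1
        d //= 2
    chain = [pow(a, d, n)]   # a^d mod n
    for _ in range(r):
        chain.append(chain[-1] ** 2 % n)  # square the previous entry
    return r, d, chain
```

For the Carmichael number $n=561$ and $a=2$, we have $560=2^{4}\cdot 35$ and the chain is $263, 166, 67, 1, 1$. The last entry is $a^{n-1}\equiv1$, so $2$ is not a Fermat witness, but $67^{2}\equiv1$ with $67\not\equiv\pm1\; (\text{mod} \; 561 )$, so $67$ is a root witness.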

Therefore, we can now define the steps of the Miller-Rabin algorithm. For simplicity, we consider only a single round. (Note: any output immediately returns from the function).

1. Express $n-1=2^{r}\cdot d$ for some odd $d$
2. Choose $a\in S=\{ 1,\dots,n-1 \}$ uniformly at random
3. If $a^{2^{r}\cdot d}\not\equiv 1 \; (\text{mod} \; n )$ , output "composite (Fermat)"
4. For $y=r-1$ down to $y=0$ ,
   1. If $a^{2^{y}\cdot d}\not\equiv \{ 1,-1 \}\; (\text{mod} \; n )$ , output "composite (Root)"
   2. If $a^{2^{y}\cdot d} \equiv -1 \; (\text{mod} \; n )$ , output "probably prime"
5. Output "probably prime"

Actually, though, this algorithm is a little more computationally expensive than necessary, since it must compute $a^{2^{r}\cdot d}$ . We can instead start from $a^{d}$ and square this quantity $r$ times in the for loop. Thus, a more optimized version looks like this.

1. Express $n-1=2^{r}\cdot d$ for some odd $d$
2. Choose $a\in S=\{ 1,\dots,n-1 \}$ uniformly at random
3. Let $y=0$
   1. If $a^{2^{y}\cdot d} \equiv 1 \; (\text{mod} \; n )$ , output "probably prime"
      Why? All future squares will be $1$ , so there are no root witnesses.
   2. If $a^{2^{y}\cdot d} \equiv -1 \; (\text{mod} \; n )$ , output "probably prime"
      Why? Same reason as 3.1.
4. For $y=1$ to $y=r-1$
   1. If $a^{2^{y}\cdot d}\equiv 1\; (\text{mod} \; n )$ , output "composite (Root)"
   2. If $a^{2^{y}\cdot d} \equiv -1 \; (\text{mod} \; n )$ , output "probably prime"
      Why? Same reason as 3.1.
5. Output "composite"
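A minimal Python sketch of the optimized single round, plus a $k$-round wrapper, is given below. The names `miller_rabin_round` and `is_probably_prime`, the default `k=20`, and the handling of $n<3$ and even $n$ in the wrapper are assumptions made for this example rather than part of the algorithm as stated above.

```python
import random


def miller_rabin_round(n, a):
    """One round of the optimized test for odd n > 2 and a in {1, ..., n-1}.

    Returns "probably prime" or "composite".
    """
    r, d = 0, n - 1
    while d % 2 == 0:            # write n - 1 = 2^r * d with d odd
        r += 1
        d //= 2
    x = pow(a, d, n)             # y = 0
    if x == 1 or x == n - 1:
        return "probably prime"
    for _ in range(1, r):        # y = 1, ..., r - 1
        x = x * x % n
        if x == 1:
            return "composite"   # previous value was a root witness
        if x == n - 1:
            return "probably prime"
    return "composite"           # Fermat or root witness at y = r


def is_probably_prime(n, k=20, rng=random):
    """Run k independent rounds; False means n is certainly composite."""
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False
    for _ in range(k):
        a = rng.randrange(1, n)  # uniform over {1, ..., n-1}
        if miller_rabin_round(n, a) == "composite":
            return False
    return True
```

Unlike the Fermat Test, this catches the Carmichael number $561$ already with base $a=2$: `miller_rabin_round(561, 2)` returns `"composite"`.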

Why stop at $y=r-1$ ?

Suppose we continued to $y=r$ . Then it must have been the case that $a^{2^{r-1}\cdot d} \not\equiv \{ 1,-1 \}\; (\text{mod} \; n )$ . Now, consider $a^{2^{r}\cdot d}$ . If it is $1\; (\text{mod} \; n )$ , we have a root witness. Otherwise, if it is $\not\equiv1\; (\text{mod} \; n )$ , we have a Fermat witness (recall $a^{2^{r}\cdot d}=a^{n-1}$ ).

Finally, it can be shown that, for a composite $n$ and a uniformly random $a$, a single round of the Miller-Rabin primality test outputs composite with probability $\geq \frac{3}{4}$. The proof of this is nontrivial, and thus is excluded from these notes. In conclusion, though, after running the Miller-Rabin test for $k$ independent rounds on a composite $n$, it will find a witness of compositeness with probability $\geq1-\left( \frac{1}{4} \right)^{k}$.

Agrawal-Kayal-Saxena (AKS)

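This is a deterministic algorithm for primality testing that runs in time polynomial in the number of bits $n$ of the input: roughly $O(n^{12})$ in its original form (actually, it's been shortened to roughly $O(n^{6})$ now). It is not a derandomized version of the Miller-Rabin algorithm; it is instead based on a generalization of Fermat's Little Theorem to polynomials. For brevity, this algorithm's details are left out.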

Minimum Cut

We previously saw an algorithm in Section 7 that solved for the maximum flow / minimum cut for a pair of vertices $s,t$ deterministically in $O(nm^{2})$, or $O(n^{5})$ for dense graphs. However, we now consider a related, but slightly different problem: given an unweighted, undirected graph $G=(V,E)$, we seek the minimum cut across all possible pairs $s,t$, i.e. the cut with the globally minimum number of crossing edges. There actually exists a Monte Carlo algorithm for this problem, a single run of which takes $O(n^{2})$ time: Karger's Algorithm.

Karger's Algorithm proceeds simply as follows.

For $i=1,\dots,n-2$
1. Pick a uniformly random edge $e_{i}$ of the current graph
2. Contract $e_{i}$ , i.e. merge its endpoints $(u,v)$ into a "supervertex", keeping parallel edges and discarding self-loops

At the end of the loop, we are left with two supervertices, and we return the value of the cut between these two supervertices, which effectively represent the two sets of vertices we've divided the graph into.
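One way to sketch a single run in Python is with a union-find structure over the original vertices. This is an illustrative variant under stated assumptions, not the notes' reference implementation: the name `karger_cut`, the edge-list input, and the rejection sampling of self-loops are choices made for this example, and the input graph is assumed to be connected. Sampling an original edge and rejecting it when both endpoints lie in the same supervertex is equivalent to picking a uniformly random edge of the contracted multigraph, but this simple version does not aim for the $O(n^{2})$ per-run bound.

```python
import random


def karger_cut(n, edges, rng=random):
    """One run of Karger's contraction on vertices 0..n-1.

    `edges` is a list of (u, v) pairs of a connected, undirected multigraph.
    Returns the number of edges crossing the cut found by this run.
    """
    parent = list(range(n))      # union-find: supervertex representative

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    remaining = n
    while remaining > 2:
        u, v = rng.choice(edges)
        ru, rv = find(u), find(v)
        if ru == rv:
            continue             # self-loop inside a supervertex: resample
        parent[ru] = rv          # contract the edge
        remaining -= 1
    # Count edges whose endpoints ended up in different supervertices.
    return sum(1 for u, v in edges if find(u) != find(v))
```

A single run returns the size of some cut, which is never smaller than the minimum cut; the analysis that follows explains how often it is exactly the minimum.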

...what? How does this work??

The critical idea behind this seemingly nonsensically-simple algorithm is that a randomly chosen edge is unlikely to cross the minimum cut. Why? Because the minimum cut, by definition, has the fewest crossing edges of any cut. Thus, a uniformly random edge is far more likely not to cross the minimum cut, since the minimum cut edges form only a small subset of the edges in the graph. As long as no crossing edge is ever contracted, the two final supervertices are exactly the two sides of the minimum cut.

Astute readers may notice that the idea of this algorithm is very similar to the randomized algorithm we briefly discussed in Section 5. :)

In fact, that algorithm is better!

Let's not stop here though—we'd like to formalize this intuition, and derive the probability that Karger's algorithm does, in fact, output the minimum cut. In particular, we will prove the following theorem.

Theorem: Let $C=(S,\overline{S})$ be a minimum cut of size $k$ . Then $\mathrm{Pr}[\text{Karger's Algorithm Result}=C]\geq \frac{1}{\binom{n}{2}}=\frac{2}{n(n-1)}$ .

Proof:
First, let's outline some definitions. Let $G_{i}$ be the graph at the beginning of the $i$ th iteration of the algorithm, with base case $G_{1}=G$ , and let $H_{i}$ be a "happy" (^w^) event, i.e. the event in which the $i$ th contracted edge $e_{i}$ doesn't cross the cut $C$ . Let $A$ denote the event that $\text{Karger's Algorithm Result}=C$ .

Then,

\begin{align*} \mathrm{Pr}[A] &= \mathrm{Pr}[H_{1}\land H_{2}\land\dots \land H_{n-2}] \\ &= \mathrm{Pr}[H_{1}]\cdot \mathrm{Pr}[H_{2}\mid H_{1}]\cdot \mathrm{Pr}[H_{3}\mid H_{1},H_{2}] \dots \mathrm{Pr}[H_{n-2}\mid H_{1},\dots ,H_{n-3}] \\ &\geq \frac{n-2}{n}\cdot \frac{n-3}{n-1} \dots \frac{1}{3} \\ &= \frac{2}{n(n-1)} \end{align*}

The inequality follows from the following

\begin{align*} \mathrm{Pr}[H_{i}\mid H_{1},\dots ,H_{i-1}] &= 1 - \mathrm{Pr}[\text{Contract edge in }C\mid H_{1},\dots ,H_{i-1}] \\ &= 1- \frac{\text{\# crossing edges of }C}{\text{\# edges in }G_{i}} \\ &\geq 1- \frac{k}{\frac{1}{2}k(n-i+1)} = 1-\frac{2}{n-i+1}= \frac{n-i-1}{n-i+1} \end{align*}

where the lower bound on the number of edges in $G_{i}$ follows from the following sequence of facts

$\mathrm{MinCut}(G_{i})\geq k$ (since any cut in $G_{i}$ corresponds precisely to some cut in $G$ )
Number of vertices in $G_{i}$ , $\lvert G_{i} \rvert$ , is $n-(i-1)=n-i+1$ , since $i-1$ vertices have been removed
Degree of each vertex in $G_{i}$ is $\geq k$ (otherwise, if $\mathrm{deg\;}v<k$ , the cut $S=\{ v \}$ would be smaller than $k$ )
Number of edges in $G_{i}$ is $\frac{1}{2}\underset{ v\in G_{i} }{ \sum }\mathrm{deg\;}v\geq \frac{1}{2}\underset{ v\in G_{i} }{ \sum }k= \frac{1}{2}k\lvert G_{i} \rvert= \frac{1}{2}k(n-i+1)$ .
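The telescoping product of the per-iteration bounds $\frac{n-i-1}{n-i+1}$ can be checked with exact rational arithmetic. The helper name below is hypothetical; the code only verifies the arithmetic, not the probabilistic argument.

```python
from fractions import Fraction


def karger_success_lower_bound(n):
    """Product of (n-i-1)/(n-i+1) for i = 1, ..., n-2, as an exact fraction."""
    bound = Fraction(1)
    for i in range(1, n - 1):
        bound *= Fraction(n - i - 1, n - i + 1)
    return bound
```

For instance, `karger_success_lower_bound(10)` equals `Fraction(1, 45)`, matching $\frac{2}{n(n-1)}$ with $n=10$.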

Thus, the probability of success, $\mathrm{Pr}[A]$, is $\geq \frac{1}{\binom{n}{2}}=\frac{2}{n(n-1)}\geq \frac{2}{n^{2}}$. In fact, this bound scales linearly with the number of valid minimum cuts, i.e. the number of cuts $C$ with the minimum number of crossing edges, because the events of returning two different cuts are disjoint. So if the graph has $c$ minimum cuts, the probability of returning some minimum cut is $\geq\frac{c}{\binom{n}{2}}$. In general, though, $c$ is not known a priori, so the lower bound on the probability of success that we can rely on is still $\frac{2}{n^{2}}$. Therefore, it suffices to run this algorithm $O(n^{2})$ times, and take the smallest cut seen across all runs. For instance, repeating $20n^{2}$ times ensures that the probability a minimum cut is not seen in any run is at most $\left( 1-\frac{2}{n^{2}} \right)^{20n^{2}}< 0.01$ [Source]. So, the total runtime for Karger's Algorithm (to derive a minimum cut with high probability) is effectively $O(n^{2})\cdot O(n^{2})=O(n^{4})$.
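As a worked check of the repetition bound, one can use the standard inequality $1-x\leq e^{-x}$ with $x=\frac{2}{n^{2}}$:

\begin{align*} \left( 1-\frac{2}{n^{2}} \right)^{20n^{2}} &\leq \left( e^{-2/n^{2}} \right)^{20n^{2}} \\ &= e^{-40} \approx 4.2\times 10^{-18} \end{align*}

So $20n^{2}$ repetitions are far more than the $0.01$ target needs. By the same inequality, $cn^{2}$ repetitions fail with probability at most $e^{-2c}$, so any constant $c\geq \frac{1}{2}\ln 100\approx 2.31$ already suffices.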