Logo

11. Randomized Algorithms

Types of Random Algorithms

Las Vegas: An algorithm that will always produce a correct result, but whose runtime will vary.
Monte Carlo: An algorithm that may or may not produce a correct result, but whose runtime is constant.

Primality Testing

Fermat Test

A naive solution would be to simply factor a number into its prime factors to determine if it's prime. However, the fastest algorithm for this, the General Number Field Sieve, runs in O(Cn1/3log(n)2/3)O(C^{n^{1/3}\log(n)^{2/3}}), i.e. exponential time. Can we do better?

Recall Fermat's Little Theorem:

If NN is prime, then aN11  (mod  N)a^{N-1}\equiv1\; (\text{mod} \; N ), a{1,,N1}\forall a\in \{ 1,\dots,N-1 \}.

We can thus develop a very simple test known as the Fermat Test for testing if a number is prime.

  1. Pick a{1,,N1}a\in \{ 1,\dots,N-1 \} uniformly at random
  2. If aN11  (mod  N)a^{N-1}\equiv1\; (\text{mod} \; N ), output prime. Otherwise, output composite.

However, primality testing is not as simple as just choosing an aa and checking if aN11  (mod  N)a^{N-1}\equiv1\; (\text{mod} \; N ). If NN is composite, this congruence could also be true, for certain values of aa. In essence, the Fermat Test guarantees that, if NN is prime, it will output prime, but if NN is composite, it may output prime or composite.

Thus, the Fermat Test will not always be correct. In fact, you may notice that this seems to fit the definition of a Monte Carlo randomized algorithm! Let's examine this further, though. In particular, we will determine an upper bound on the probability of some aa "passing" the Fermat Test (i.e. outputting prime) when NN is composite, for all NN.

Fermat Witness

Any aa that fails the Fermat Test when NN is composite, i.e. outputs the correct result / aN1≢1  (mod  N)a^{N-1}\not\equiv1\; (\text{mod} \; N ), is known as a Fermat witness. If aa is coprime to NN, we further describe is as a nontrivial Fermat witness, since if aa is not coprime to NN, it is trivially a Fermat witness. (See the proof below for an explanation).

Carmichael Numbers

Note that there do exist composite numbers that pass the Fermat Test for all coprime aa, i.e. for all nontrivial Fermat witnesses. For simplicity, we will ignore these.

Claim:
Let NN be composite, and S={1,,N1}S=\{ 1,\dots,N-1 \}. Suppose there exists at least one aSa\in S such that aa is a nontrivial Fermat witness (i.e. NN is not a Carmichael number). Let FF be the set of Fermat witnesses. Then F12S\lvert F \rvert\geq \frac{1}{2}\lvert S \rvert.

Proof:
We partition SS into four disjoint subsets:

A={aSaN11  (mod  N), gcd(a,N)=1}B={bSbN1≢1  (mod  N), gcd(b,N)=1}C={cScN1≢1  (mod  N), gcd(c,N)1}D={dSdN11  (mod  N), gcd(d,N)1}\begin{align*} A &= \{ a\in S\mid a^{N-1}\equiv 1\; (\text{mod} \; N ),\ \mathrm{gcd}(a,N)=1 \} \\ B &= \{ b\in S\mid b^{N-1}\not\equiv 1\; (\text{mod} \; N ),\ \mathrm{gcd}(b,N)=1 \} \\ C &= \{ c\in S\mid c^{N-1}\not\equiv 1\; (\text{mod} \; N ),\ \mathrm{gcd}(c,N)\neq 1 \} \\ D &= \{ d\in S\mid d^{N-1}\equiv 1\; (\text{mod} \; N ),\ \mathrm{gcd}(d,N)\neq 1 \} \\ \end{align*}

Consider DD. As aforementioned, any number that is not coprime to NN is trivially a Fermat witness. Why? Let k=gcd(d,N)1k=\mathrm{gcd}(d,N)\neq1. If dN11  (mod  N)d^{N-1}\equiv1\; (\text{mod} \; N ), then qZ\exists q\in \mathbb{Z} such that dN11=qNd^{N-1}-1=qN. However, kdN1k\mid d^{N-1} and kN    kqNk\mid N\implies k\mid qN, which implies k1k\mid1, which is contradictory. Thus, D=D=\emptyset.

Meanwhile, BB represents the nontrivial Fermat witnesses, CC represents the trivial Fermat witnesses, and AA represents the non Fermat witnesses. Note that F=BCF=B\cup C. Also note that A,B,CA,B,C\neq \emptyset since 1A1\in A, BB is nonempty as given by the claim, and CC is nonempty since NN is composite.

Assume for contradiction that F=BC<12S\lvert F \rvert=\lvert B\cup C \rvert< \frac{1}{2}\lvert S \rvert. Then A>12S\lvert A \rvert> \frac{1}{2}\lvert S \rvert. Let A={a1,,ak}A=\{ a_{1},\dots,a_{k} \}, where k>12Sk> \frac{1}{2}\lvert S \rvert. Choose an arbitrary bBb\in B, and consider AbAb, i.e.

Ab={a1b  (mod  N), , akb  (mod  N)}Ab=\{ a_{1}b\; (\text{mod} \; N ),\ \dots ,\ a_{k}b\; (\text{mod} \; N ) \}

AbAb has two useful properties that we will prove.

  1. AbBAb\subseteq B.
  2. The elements of AbAb are distinct, i.e. Ab=A\lvert Ab \rvert=\lvert A \rvert.

(1)(1) (aib)N1aiN1bN1bN1≢1  (mod  N)(a_{i}b)^{N-1}\equiv a_{i}^{N-1}b^{N-1}\equiv b^{N-1}\not\equiv1\; (\text{mod} \; N ). Moreover, gcd(aib,N)=1\mathrm{gcd}(a_{i}b,N)=1 since gcd(ai,N)=1\mathrm{gcd}(a_{i},N)=1 and gcd(b,N)=1\mathrm{gcd}(b,N)=1. Note that aib  (mod  N)a_{i}b\; (\text{mod} \; N ) is also clearly in SS, due to modular arithmetic. Therefore, AbBAb\subseteq B.

(2)(2) Suppose for contradiction that aib  (mod  N)ajb  (mod  N)a_{i}b\; (\text{mod} \; N )\equiv a_{j}b\; (\text{mod} \; N ) for iji\neq j. This implies that (aiaj)b0  (mod  N)(a_{i}-a_{j})b\equiv0\; (\text{mod} \; N ), i.e. N(aiaj)bN\mid(a_{i}-a_{j})b. Since gcd(b,N)=1\mathrm{gcd}(b,N)=1, we must have N(aiaj)N\mid(a_{i}-a_{j}). However, ai,ajSa_{i},a_{j}\in S implies that ai,aj<Na_{i},a_{j}<N, i.e. aiaj=0    ai=aj    i=ja_{i}-a_{j}=0\implies a_{i}=a_{j}\implies i=j. This yields a contradiction, and thus the elements of AbAb must be distinct.

The above two properties thus imply that, if BB\neq \emptyset, then there are at least kk elements in BB, where k>12Sk> \frac{1}{2}\lvert S \rvert. This produces a contradiction, however, and therefore it must be true that F12S\lvert F \rvert\geq \frac{1}{2}\lvert S \rvert.

\square

Proof Source

Therefore, given an integer NN with unknown primality, if we run the Fermat Test for kk rounds and no Fermat witness is identified, then either nn is one of the (very rare) Carmichael numbers or nn is prime with probability 1(12)k\geq1-\left( \frac{1}{2} \right)^{k}. Thus, we have a nice, randomized Monte Carlo test for primality! (Effective for all but the Carmichael numbers).

Miller-Rabin Test

The Miller-Rabin test is a more robust test than the Fermat Test, in the sense that it works on all numbers, including Carmichael numbers. It is a Monte Carlo algorithm, like the Fermat test, and always outputs "prime" if nn is prime and outputs "composite" if nn is composite with probability >34> \frac{3}{4}. Thus, it returns the correct answer with probability >1(14)k> 1- (\frac{1}{4})^{k} in kk rounds.

The Miller-Rabin test relies on a different witness of compositeness, based on the following theorem.

If pp is prime, then all integer roots of x21  (mod  p)x^{2}\equiv 1\; (\text{mod} \; p ) satisfy x1  (mod  p)x\equiv1\; (\text{mod} \; p ) or x1  (mod  p)x\equiv-1\; (\text{mod} \; p ).

Proof:

x21  (mod  p)    p(x21)    p(x1)(x+1)\begin{align*} x^{2}\equiv 1\; (\text{mod} \; p ) &\implies p \mid (x^{2}-1) \\ & \implies p \mid(x-1)(x+1) \end{align*}

Since pp is prime, either px1p\mid x-1, px+1p\mid x+1, or both. This means x10  (mod  p)x-1\equiv0\; (\text{mod} \; p ), x+10  (mod  p)x+1\equiv 0\; (\text{mod} \; p ), or both. Thus, the only possible roots are x=±1x=\pm1. Finally, we note that these are not extraneous, i.e. (±1)21  (mod  p)(\pm1)^{2}\equiv1\; (\text{mod} \; p ) is in fact true.

\square

Corollary: Let n,xZn,x \in \mathbb{Z} such that x21  (mod  n)x^{2}\equiv1\; (\text{mod} \; n ). If x≢±1  (mod  n)x\not\equiv\pm 1\; (\text{mod} \; n ), then nn must be composite.

Now, we can introduce some similar notation as for the Fermat Test. For a composite number nn, we denote xx as a root witness of nn's compositeness if x21  (mod  n)x^{2}\equiv 1\; (\text{mod} \; n ) and x≢±1  (mod  n)x\not\equiv \pm 1\; (\text{mod} \; n ). By definition, a root witness is a nontrivial root.

We'll subsequently derive the intuition and logic behind the Miller-Rabin test.

In essence, the Miller-Rabin test determines that nn is composite by using both Fermat and root witnesses. Let us assume n>2n>2 and is odd (otherwise it is immediately composite), and let aa be chosen uniformly at random from S={1,,n1}S=\{1,\dots,n-1 \}.

Since nn is odd, n1n-1 is even. Let n1=2rdn-1=2^{r}\cdot d, where d1  (mod  2)d\equiv1\; (\text{mod} \; 2 ). We can substitute this into the Fermat Test's equation to produce

an1a2rd  (mod  n)a^{n-1}\equiv a^{2^{r}\cdot d}\; (\text{mod} \; n )

Suppose that a2rd1  (mod  n)a^{2^{r}\cdot d}\equiv1\; (\text{mod} \; n ), i.e. aa is not a Fermat witness. Consider rewriting the expression as

(a2r1d)21  (mod  n)(a^{2^{r-1}\cdot d})^{2} \equiv 1\; (\text{mod} \; n )

Then, if

a2r1d≢{1,1}  (mod  n)a^{2^{r-1}\cdot d} \not\equiv \{ 1,-1 \}\; (\text{mod} \; n )

We have successfully identified a root witness of compositeness, and thus nn is correctly identified as composite. However, there's still more to this!

If a2r1d1  (mod  n)a^{2^{r-1}\cdot d}\equiv-1\; (\text{mod} \; n ), then we can do nothing further. However, if a2r1d1  (mod  n)a^{2^{r-1}\cdot d}\equiv1\; (\text{mod} \; n ), we can form yet another equation of the form x21  (mod  n)x^{2}\equiv1\; (\text{mod} \; n ). In particular,

(a2r2d)21  (mod  n)(a^{2^{r-2}\cdot d})^{2}\equiv 1\; (\text{mod} \; n )

Thus, as long as we get a2rαd1  (mod  n)a^{2^{r-\alpha}\cdot d} \equiv 1 \; (\text{mod} \; n ) and rα>0r-\alpha>0, we can continue trying to find a root witness.

Therefore, we can now define the steps of the Miller-Rabin algorithm. For simplicity, we consider only a single round. (Note: any output immediately returns from the function).

  1. Express n1=2rdn-1=2^{r}\cdot d for some odd dd
  2. Choose aS={1,,n1}a\in S=\{ 1,\dots,n-1 \} uniformly at random
  3. If a2rd≢1  (mod  n)a^{2^{r}\cdot d}\not\equiv 1 \; (\text{mod} \; n ), output "composite (Fermat)"
  4. For y=r1y=r-1 to y=0y=0,
    1. If a2yd≢{1,1}  (mod  n)a^{2^{y}\cdot d}\not\equiv \{ 1,-1 \}\; (\text{mod} \; n ), output "composite (Root)"
    2. If a2yd1  (mod  n)a^{2^{y}\cdot d} \equiv -1 \; (\text{mod} \; n ), output "probably prime"
  5. output "probably prime"

Actually, though, this algorithm is a little more computationally expensive than necessary, since it must compute a2rda^{2^{r}\cdot d}. We can instead start from ada^{d} and square this quantity rr times in the for loop. Thus, a more optimized version looks like this.

  1. Express n1=2rdn-1=2^{r}\cdot d for some odd dd
  2. Choose aS={1,,n1}a\in S=\{ 1,\dots,n-1 \} uniformly at random
  3. Let y=0y=0
    1. If a2yd1  (mod  n)a^{2^{y}\cdot d} \equiv 1 \; (\text{mod} \; n ), output "probably prime"
      Why? All future squares will be 11, so there are no root witnesses.
    2. If a2yd1  (mod  n)a^{2^{y}\cdot d} \equiv -1 \; (\text{mod} \; n ), output "probably prime"
      Why? Same reason as 3.1.
  4. For y=1y=1 to y=r1y=r-1
    1. If a2yd1  (mod  n)a^{2^{y}\cdot d}\equiv 1\; (\text{mod} \; n ), output "composite (Root)"
    2. If a2yd1  (mod  n)a^{2^{y}\cdot d} \equiv -1 \; (\text{mod} \; n ), output "probably prime"
      Why? Same reason as 3.1.
  5. Output "composite"
Why stop at y=r1y=r-1?

Suppose we continued to y=ry=r. Then it must have been the case that a2r1d≢{1,1}  (mod  n)a^{2^{r-1}\cdot d} \not\equiv \{ 1,-1 \}\; (\text{mod} \; n ). Now, consider a2rda^{2^{r}\cdot d}. If it is 1  (mod  n)1\; (\text{mod} \; n ), we have a root witness. Otherwise, if it is ≢1  (mod  n)\not\equiv1\; (\text{mod} \; n ), we have a Fermat witness (recall a2rd=an1a^{2^{r}\cdot d}=a^{n-1}).

Finally, it can be shown that the single-round Miller-Rabin primality test will output composite for a composite nn for a randomly chosen aa with probability >34> \frac{3}{4}. The proof of this is nontrivial, and thus is excluded from these notes. In conclusion, though, after running the Miller-Rabin test for kk rounds on a composite nn, it will find a witness of compositeness with probability >1(14)k>1-\left( \frac{1}{4} \right)^{k}.

Agrawal-Kayal-Saxena (AKS)

This is a deterministic algorithm for primality test in O(n12)O(n^{12}) (actually, it's been shortened to O(n6)O(n^{6}) now). This is effectively a derandomized version of the Miller-Rabin algorithm. For brevity, this algorithm's details are left out.

Minimum Cut

We previously saw an algorithm in Section 7 that solved for the maximum flow / minimum cut for a pair of vertices s,ts,t deterministically in O(nm2)O(nm^{2}), or O(n5)O(n^{5}) for dense graphs. However, we now consider a related, but slightly different problem—given an unweighted, undirected graph G=(V,E)G=(V,E), we seek the minimum cut across all possible pairs s,ts,t, i.e. the cut with the globally minimum number of crossing edges. There actually exists a Monte Carlo algorithm for this problem that runs in O(n2)O(n^{2}): Karger's Algorithm.

Karger's Algorithm proceeds simply as follows.

  1. For i=1,,n2i=1,\dots,n-2
    1. Pick a uniformly random edge eie_{i}
    2. Contract eie_{i}, i.e. merge its vertices (u,v)(u,v) into a "supervertex"

At the end of the loop, we are left with two supervertices, and we return the value of the cut between these two supervertices, which effectively represent the two sets of vertices we've divided the graph into.

...what? How does this work??

The critical idea behind this seemingly nonsensically-simple algorithm is that contracting is much less likely to contract the edges crossing the minimum cut. Why? Because the minimum cut, by definition, has the least number of crossing edges of any cut. Thus, it is by far more likely that an edge that does not cross the minimum cut is chosen, considering that there are many edges in the graph and the minimum cut edges form only a minuscule subset.

Astute readers may notice that the idea of this algorithm is very similar to the randomized algorithm we briefly discussed in Section 5. :)

In fact, that algorithm is better!

Let's not stop here though—we'd like to formalize this intuition, and derive the probability that Karger's algorithm does, in fact, output the minimum cut. In particular, we will prove the following theorem.

Theorem: Let C=(S,S)C=(S,\overline{S}) be a minimum cut of size kk. Pr[Karger’s Algorithm Result=C]=1(n2)=2n(n1)\mathrm{Pr}[\text{Karger's Algorithm Result}=C]=\geq \frac{1}{\binom{n}{2}}=\frac{2}{n(n-1)}.

Proof:
First, let's outline some definitions. Let GiG_{i} be the graph at the beginning of the iith iteration of the algorithm, with base case G1=GG_{1}=G, and let HiH_{i} be a "happy" (^w^) event, i.e. the event in which the iith contracted edge eie_{i} doesn't cross the cut CC. Let AA denote the event that Karger’s Algorithm Result=C\text{Karger's Algorithm Result}=C.

Then,

Pr[A]=Pr[H1H2Hn2]=Pr[H1]Pr[H2H1]Pr[H3H1,H2]Pr[Hn2H1,,Hn3]n2nn3n113=2n(n1)\begin{align*} \mathrm{Pr}[A] &= \mathrm{Pr}[H_{1}\land H_{2}\land\dots \land H_{n-2}] \\ &= \mathrm{Pr}[H_{1}]\cdot \mathrm{Pr}[H_{2}\mid H_{1}]\cdot \mathrm{Pr}[H_{3}\mid H_{1},H_{2}] \dots \mathrm{Pr}[H_{n-2}\mid H_{1},\dots ,H_{n-3}] \\ &\geq \frac{n-2}{n}\cdot \frac{n-3}{n-1} \dots \frac{1}{3} \\ &= \frac{2}{n(n-1)} \end{align*}

The inequality follows from the following

Pr[HiH1,,Hi1]=1Pr[Contract edge in CH1,,Hi1]=1# crossing edges of C# edges in Gi=1k12k(ni+1)=12ni+1=ni1ni+1\begin{align*} \mathrm{Pr}[H_{i}\mid H_{1},\dots ,H_{i-1}] &= 1 - \mathrm{Pr}[\text{Contract edge in }C\mid H_{1},\dots ,H_{i-1}] \\ &= 1- \frac{\text{\# crossing edges of }C}{\text{\# edges in }G_{i}} \\ &= 1- \frac{k}{\frac{1}{2}k(n-i+1)} = 1-\frac{2}{n-i+1}= \frac{n-i-1}{n-i+1} \end{align*}

Where the number of edges in GiG_{i} follows from the following sequence of facts

  1. MinCut(Gi)k\mathrm{MinCut}(G_{i})\geq k (since any cut in GiG_{i} corresponds precisely to some cut in GG)
  2. Number of vertices in GiG_{i}, Gi\lvert G_{i} \rvert, is n(i1)=ni+1n-(i-1)=n-i+1, since i1i-1 vertices have been removed
  3. Degree of each vertex in GikG_{i}\geq k (otherwise, if deg  v<k\mathrm{deg\;}v<k, minimum cut is less than kk with S=vS=v)
  4. Number of edges in GiG_{i} is 12vGideg  v12vGik=12kGi12k(ni+1)\frac{1}{2}\underset{ v\in G_{i} }{ \sum }\mathrm{deg\;}v\geq \frac{1}{2}\underset{ v\in G_{i} }{ \sum }k= \frac{1}{2}k\lvert G_{i} \rvert\geq \frac{1}{2}k(n-i+1).

Thus, the probability of success, Pr[A]\mathrm{Pr}[A], is 1(n2)2n2\geq \frac{1}{\binom{n}{2}}\approx \frac{2}{n^{2}}. In fact, this scales linearly with the number of valid minimum cuts, i.e. the number of cuts CC with the minimum number of crossing edges. So the probability of returning a minimum cut with cc minimum cuts in the graph is c(n2)\frac{c}{\binom{n}{2}}. In general, though this is not known a priori—thus, the lower bound on the probability of success is still 2n2\frac{2}{n^{2}}. Therefore, it suffices to run this algorithm O(n2)O(n^{2}) times, and take the minimum cut seen across all runs. For instance, repeating 20n220n^{2} times ensures that the probability a minimum cut is not seen in any run is (12n2)20n2<0.01\left( 1-\frac{2}{n^{2}} \right)^{20n^{2}}< 0.01 [Source]. So, the total runtime for Karger's Algorithm (to derive a minimum cut with high probability) is effectively O(n4)O(n^{4}).