Logo

3. Decompositions of Graphs

3.1 Graph Representations

Adjacency matrix vs. adjacency list yap... not too relevant for an Advanced Algorithms class, in my opinion.

3.2 (Undirected) Depth-First Search

Pseudocode:

dfs(G, v):
	visited(v) = true
	previsit(v)
	for edge (v, u) in E:
		if !visited(u): dfs(u)
	postvisit(v)

Asymptotic Analysis

Two steps:

  1. Some fixed amount of work--marking the vertex as visited and then the time complexity of pre/postvisit work.
  2. Recurse on adjacent edges if not visited.
    The total work done in step 1 across all vertices is O(V)O(\lvert V \rvert) (assuming pre/postvisit work is O(1)O(1)). The total work done in step 22 is O(E)O(\lvert E \rvert), as the algorithm will traverse all edges for each vertex. Thus, the time complexity is O(V+E)O(\lvert V \rvert+\lvert E \rvert); essentially O(E)O(\lvert E \rvert).

Connected Components

Just define previsit to be

previsit(v):
	ccnum[v] = cc

Where cccc is the identifying index of the CC passed around by DFS.

Pre-order and Post-order

previsit(v):
	pre[v] = clock
	clock++
	
postvisit(v):
	post[v] = clock
	clock++

Note that, for any two nodes u,vu,v, for the intervals [pre[u],post[u]][pre[u],post[u]] and [pre[v],post[v]][pre[v], post[v]], either

  • they are entirely disjoint
  • one is entirely within the other

3.3 (Directed) DFS

[u[v]v]u    Tree/Forward[v[u]u]v    Back[v]v[u]u    Cross\begin{align*} \underset{ u } { [ } && \underset{ v }{ [ } && \underset{ v }{ ] } && \underset{ u }{ ] } &&&\implies \text{Tree/Forward} \\ \underset{ v } { [ } && \underset{ u }{ [ } && \underset{ u }{ ] } && \underset{ v }{ ] } &&&\implies\text{Back} \\ \underset{ v }{ [ } && \underset{ v }{ ] } && \underset{ u }{ [ } && \underset{ u }{ ] } &&&\implies \text{Cross} \end{align*}

There is a special type of directed graph known as a directed acyclic graph (DAG). This just denotes a directed graph without any cycle. It is possible to test for acylicity in linear time, with a single depth-first search.

A directed graph has a cycle     \iff its DFS reveals a back edge.

In other words, if the DFS reaches a vertex it has already visited before, the directed graph has a cycle.

DAGs are highly applicable to many real world situations. Frequently, it's desirable to linearize or topologically sort a DAG. This means identifying an ordering of the vertices such that each edge goes from an earlier to a later vertex in the sequence. All DAGs can be linearized! This is because of a very nice property that derives from the fact that DAGs cannot have back edges:

In a DAG, every edge leads to a vertex with a lower post number.

Therefore, we have an algorithm that runs in linear time that orders the nodes of a DAG (toposort). We can also conclude that the vertices with the smallest post number must be sinks. Similarly, the vertices with the highest post number must be sources. This gives us one more property.

Every DAG has at least one source and one sink.

Thus, an alternate approach to linearization is actually iteratively finding a source, appending it to the sequence, and then deleting it.

3.4 Strongly Connected Components

Description

Strongly connected components apply only to directed graphs. For directed graphs, we state that two nodes u,vu,v are connected if there exists a path from uu to vv and a path from vv to uu. This definition allows us to partition a directed graph into disjoint sets that are denoted as strongly connected components. In each set, the vertices are pairwise connected.

Now, if we shrink each SCC into a single "meta-node," and draw an edge from each meta-node to another if there is an edge in the same direction between their underlying components, we can create a "meta-graph" that is, in fact, a DAG. That is,

The meta-graph constructed by the strongly connected components of any directed graph is always a DAG.

Proof:
Let U,VU,V be two SCCs and let uUu\in U and vVv\in V. If there exists a bidirectional (back) edge between UU and VV, then there exists a path from uu to vv and vv to uu. This is because, since UU and VV are SCCs, it is possible for uu to traverse to the vertex with an edge to VV and then traverse to vv from there. The analogue can be said for vv to uu. This would mean UVU\cup V is an SCC, a contradiction. Therefore, the meta-graph must be a DAG.

\square

This gives insight into the greater structure of a directed graph. In particular, at the top level, there exists a simpler DAG of the graph. Underlying this DAG, though, is the actual directed graph with SCCs in each of the DAG's nodes.

Strongly Connected Components Decomposition

We can actually decompose a directed graph into its SCCs in linear time.

Property 1: If DFS is started at node uu, it will terminate precisely when all nodes reachable from uu have been visited.

This means that, if we start DFS on a node that lies within an SCC that is a sink on the meta-graph, it will visit exactly the SCC, and nothing more.

Property 2: The node that receives the highest post number in a DFS must lie in a source SCC.

This is a result of a more general property:

Property 3: if CC and CC' are SCCs, and there exists an edge from a node in CC to a node in CC', then the highest post number in CC is larger than the highest post number in CC'.

Proof: There are two cases to consider. If DFS visits component CC before CC', then all of CC and CC' will be traversed before the procedure ends. By the properties of DFS, the post number of the first visited node in CC will be higher than the highest post number in CC'. Otherwise, if CC' is visited first, DFS will end after visiting all of CC' but before visiting any of CC, in which case the property follows immediately.

\square

We can restate Property 3 as "the SCCs can be linearized by arranging them in decreasing order of their highest post numbers."

Now, consider the reverse graph GRG^{R}, which is the same as the original graph GG but with all edges reversed. GRG^{R} has the same connected components as GG, but the meta-graph has its edges reversed! By Property 2, if we perform DFS on GRG^{R}, the node with the highest post number will be a source SCC for the meta-graph of GRG^{R}, but will be a sink SCC for the meta-graph of the original graph GG.

Once we have identified a sink SCC of GG, we then delete the sink from the graph. Then, by Property 3, the node with the highest post number among the remaining nodes will belong to another sink SCC of GG. Therefore, we can iteratively delete the sink from the graph until there are no nodes left. Thus, we have a linear time algorithm to decompose the graph into SCC DAGs.

More formally, we describe Kosajaru's Algorithm for decomposing a directed graph into its condensation graph, i.e. the DAG of its SCCs. The following is taken from cp-algorithms,

  1. Run DFS 1 or more times on GG, computing postorder number for each vertex and yielding a list of the vertices sorted by postorder number.
  2. Build the transpose graph GTG^{T}, and repeatedly run DFS on the vertices in descending postorder. By the properties described above, the node with the highest postorder number is part of a source SCC of GG, but a sink SCC of GTG^{T}. Therefore, running DFS on the node with highest postorder number will visit only the nodes of this sink SCC of GTG^{T}. This essentially extends, to the rest of the nodes, as the algorithm visits the SCCs of GTG^{T} in reverse topological order. This means it visits the SCCs of GG in topological order.