6. Dynamic Programming
6.1 Shortest paths in DAGs
Dynamic programming solutions are, implicitly, taking advantage of the linearization of a hidden DAG.
6.2 Longest Increasing Subsequences (LIS)
Naive DP
Memoize the length of the LIS ending at each index. Array is already a linearized DAG as is! Iterate forward through array, and find the longest LIS among previous indices such that this element could extend the LIS.
int n;
vector<int> v(n);
vector<int> dp(n);
for (int i = 0; i < n; i++) {
dp[i] = 1;
for (int j = 0; j < i; j++) {
if (v[i] > v[j])
dp[i] = max(dp[i], dp[j] + 1);
}
}
It's also possible to maintain an array to track ancestors, to recover the LIS. This has a time complexity of .
Optimized DP
We can improve upon the previous algorithm. The idea behind this algorithm is that we only really need to maintain one "candidate LIS." In essence, we maintain a DP array that tracks the LIS, except, for each iteration (iterating forward), we either (1) append to the end of the candidate LIS or (2) replace an element in the LIS by binary searching for the smallest element in the LIS that is the current element. This candidate LIS does not really track the actual elements of the LIS, as one might notice. However, it does efficiently calculate the length of the LIS.
It's difficult to reason about why this works. The main intuition is that, for each iteration, all elements in the candidate LIS are elements before the current element in the array. And, critically, each element essentially records the LIS ending at that element. Thus, replacing an element in the candidate LIS with the current element promises to potentially lead to an LIS that is at least as good as the current one. Let the element at index in the candidate LIS, and let the current element under consideration be . Let be replacing in the candidate LIS. By definition, and is greater than the element before in the candidate LIS. Consider an LIS of which is a member of, but not the end of. Then, the information of this LIS is already tracked by a further element in the candidate LIS. Now, consider an LIS of which is the end of. note that any future LIS constructed by extending this LIS must use elements later than in the array; thus, it is clearly optimal to greedily choose to replace with in the candidate LIS, as can only ever lead to a better LIS than .
C++ code:
int n = arr.size();
vector<int> dp;
vector<int> arr(n);
dp.push_back(arr[0]);
for (int i = 1; i < n; i++) {
if (arr[i] > dp.back()) {
// new longest sequence
dp.push_back(arr[i]);
} else {
int x = lower_bound(dp.begin(), dp.end(), arr[i]) - dp.begin();
dp[x] = arr[i];
}
}
return dp.size();
6.3 Edit Distance
Given two strings, the algorithm can perform 3 operations:
- Insert a character into one of the strings
- Delete a character from a string
- Substitute a character in a string with another character
The goal is to find the minimum number of operations to make the two strings equal; i.e., the edit distance.
We can solve this problem with a 2D dynamic programming approach. Let denote the current index in and denote the current index in , where are the input strings. Let denote the edit distance of the two strings and . Then we can note that is equivalent to the minimum of the following subproblems:
- , i.e. the equivalent of adding a character to or deleting a character from after equalizing the two strings.
- , i.e. the same as (1) but the two strings are switched.
- if , i.e. substituting with , or vice versa.
- if , i.e. the cost of equalizing and , since is already equal to .
Therefore, it suffices to construct a DP table for this problem. This clearly is in time complexity.
6.4 Knapsack
Problem: Given items with a weight and value and a total weight capacity , determine the maximum value that can be achieved without exceeding .
With Repetition
In this version of the problem, it assumes there is an infinite number of each item, i.e. the same item can be chosen multiple times.
Let denote the maximum value that can be achieved without exceeding the capacity . Let denote the weight and value of item . Let denote the set of items. Then, it follows that
In other words, at each capacity, simply consider the addition of each item to a subproblem whose capacity is reduced by that item's weight. Hopefully this makes sense intuitively!
Of course, recursively finding this is computationally expensive. Instead, we memoize our results while iterating forwards through the possible capacities . Additionally, at each step, it's necessary to check subproblems. Thus, the total time complexity is .
Without Repetition
In this version, it assumes there is exactly one instance of each item, i.e. an item can only be chosen once.
With this problem, the strategy is somewhat similar. However, we reverse the order. Instead of iterating through the set of capacities and iterating through all items for each capacity, we iterate through all items and iterate through the set of capacities for each item. Additionally, we iterate the set of capacities in reverse, starting from . This change helps maintain the requirement that, for every member of the memoized DP array, it will only have considered each element once. (Iterating forward the capacities result in being reused later for ).
Thus, since the iterations are essentially the same, the time complexity of this algorithm is too.
6.5 Chain Matrix Multiplication
There are, in fact, different costs to the order in which one multiplies a sequence of matrices . (Where one chooses to place parentheses, since matrix multiplication is associative). (Also, recall that the sizes of the matrices must be , , , , resulting in the different costs of each matrix multiplication).
The greedy approach, to always perform the cheapest matrix, fails. We consider a dynamic programming approach instead. Note first that the cost of multiplying an matrix by a matrix is approximately .
Now, think about the problem like divide and conquer first. Our problem is multiplying . In divide and conquer, we might choose a matrix , , such that we then compute the problems and separately, and then combining afterwards. However, we must also consider all possible , as it is not guaranteed that our split is optimal.
To write out the subproblems, we first generalize this conclusion by replacing with and with , just to make notation easier. Then, let denote the cost of multiplying , where , with an obvious base case of . Then,
This is sufficient to solve the problem, and we simply compute this for all , memoizing to avoid recomputation. This is a 2D table of size , and calculating each entry requires looping through, at most, other entries. Thus, the total runtime is .
6.6 Shortest Paths
Shortest Reliable Paths
The shortest reliable path is defined as the shortest path from to that uses at most edges.
Let be the length of the shortest path from to that uses edges, where . Let denote the weight of the edge from to . We can then define
Thus, we have a DP relation. We can then essentially run a modified version of Bellman-Ford's, where we iterate times and maintain a 2D DP array of size . Since we need to check each edge at most twice for each iteration, the total runtime is .
All-Pairs Shortest Paths
The Floyd-Warshall algorithm finds the shortest path between all pairs of vertices in a graph, and works even in the presence of negative edges. The pseudocode is as follows:
for (int k = 0; k < n; ++k) {
for (int i = 0; i < n; ++i) {
for (int j = 0; j < n; ++j) {
if (d[i][k] < INF && d[k][j] < INF)
d[i][j] = min(d[i][j], d[i][k] + d[k][j]);
}
}
}
In essence, Floyd-Warshall's maintains a 2-d matrix that keeps track of the distance from node and node in . Before the th step, stores the minimum distance path whose intermediate nodes are only within the set . At each step, either is still the minimum or, if and has been found, then there may be a new minimum path traversing through node .
The runtime of Floyd-Warshall's is clearly .
Traveling Salesman Problem
The Traveling Salesman Problem (TSP) is a known NP-hard problem. Essentially, given a graph of nodes and pairwise distances between nodes, it asks for the best order of traversing the nodes starting from some node such that each node is visited exactly once and the path begins and ends at . In essence, the minimum cost Hamiltonian tour.
The brute force approach evaluates every single tour, which gives a time complexity of . With dynamic programming, it's possible to reduce this to exponential time complexity. Consider how we might construct a subproblem. An intuitive idea is to consider the subproblem of a prefix of the tour, i.e. a path that does not fully complete the tour. More formally, for a subset of nodes that includes and the node , let be the length of the shortest path visiting each node in exactly once, starting at and ending at . In essence, a Hamiltonian path for a subset of the graph. Note that since the path should not start and end at unless is precisely the entire set of nodes.
Now, we calculate based on its subproblems. This involves essentially just iterating through all possible choices for the second-to-last node. In other words,
There are at most subproblems, and each takes linear time to combine. Thus, the total runtime is .
6.7 Independent Sets in Trees
We denote a subset of nodes as an independent set of a graph if there are no edges between any nodes in . One problem is to find the largest independent set of a graph. This problem is, in general, intractable. However, in a tree, the problem can be solved in linear time using dynamic programming.
Let denote the size of the largest independent set of the subtree rooted at . We consider the possible subproblems. Assume first that we include the root node in the set. Then, we cannot include any of the children of the root node. However, we can include the union of the largest independent sets of each of its grandchildren. Assume next that we don't include the root node. Then, we can include the union of the largest independent sets of each of its children. Therefore, we have the following dp relation
Thus, we can root the node at an arbitrary node and simply perform DFS with memoization (calculating in the postorder routine). Thus, the overall runtime is .