Algorithm II
6. Dynamic Programming I
WU Xiaokun 吴晓堃
xkun.wu [at] gmail
Greedy. Process the input in some myopic order, making irrevocable decisions.
Divide-and-conquer. Break up a problem into independent sub-problems; solve each sub-problem; combine solutions to sub-problems to form solution to original problem.
Dynamic programming. Break up a problem into a series of overlapping sub-problems; combine solutions to smaller sub-problems to form solution to larger sub-problems.
Weighted interval scheduling. Consider n jobs sharing a single resource. Job j starts at s_j, finishes at f_j, and has weight w_j > 0; two jobs are compatible if they don't overlap.
Goal: find max-weight subset of mutually compatible jobs.
[Figure: timeline 0–9 with three jobs: one of weight 9 spanning most of the interval, and two of weight 1 that together cover it.]
Earliest-finish-time-first.
Recall. Greedy algorithm is correct if all weights are 1.
Observation. Greedy algorithm fails for weighted version.
[Figure: same instance; earliest-finish-time-first selects the two weight-1 jobs (total weight 2) instead of the single weight-9 job.]
Convention. Jobs are in ascending order of finish time: f_1 \le f_2 \le \cdots \le f_n.
Def. p(j) = largest index i < j such that job i is compatible with j.
[Figure: timeline 0–9 with three jobs (weights 1, 9, 1) annotated with their p values: p(1) = 0, p(2) = 0, p(3) = 1.]
Def. OPT(j) = max weight of any subset of mutually compatible jobs (for subproblem consisting only of jobs 1, 2, ..., j).
Goal. OPT(n) = max weight of any subset of mutually compatible jobs.
Case 1. OPT(j) does not select job j: must include an optimal solution to the sub-problem consisting of jobs 1, \ldots, j − 1.
Case 2. OPT(j) selects job j: collect profit w_j; jobs p(j) + 1, \ldots, j − 1 are incompatible; must include an optimal solution to the sub-problem consisting of jobs 1, \ldots, p(j).
Bellman equation.
OPT(j) = \left\{ \begin{array}{lcl} 0 & \text{if} & j = 0 \\ \max \{OPT(j-1),\ w_j + OPT(p(j))\} & \text{if} & j > 0 \\ \end{array}\right.
BRUTE-FORCE(n, s_1, \ldots, s_n, f_1, \ldots, f_n, w_1, \ldots, w_n)
  Sort jobs by finish time and renumber so that f_1 \le f_2 \le \cdots \le f_n.
  Compute p[1], p[2], \ldots, p[n] via binary search.
  RETURN COMPUTE-OPT(n).

COMPUTE-OPT(j)
  IF (j = 0) RETURN 0.
  ELSE RETURN max { COMPUTE-OPT(j − 1), w_j + COMPUTE-OPT(p[j]) }.

What is the running time of COMPUTE-OPT(n) in the worst case?
A. \Theta(n \log n)
B. \Theta(n^2)
C. \Theta(1.618^n)
D. \Theta(2^n)
Discussed next.
Observation. Recursive algorithm is spectacularly slow because of overlapping sub-problems \Rightarrow exponential-time algorithm.
Ex. # recursive calls for family of “layered” instances grows like Fibonacci sequence.
[Figure: "layered" instance of four staggered jobs on timeline 0–5; here p(j) = j − 2, so the recursion tree of COMPUTE-OPT grows like the Fibonacci sequence.]
Key insight. Avoid repeated computations using memory.
Top-down dynamic programming (memoization).
TOP-DOWN(n, s_1, \ldots, s_n, f_1, \ldots, f_n, w_1, \ldots, w_n)
  Sort jobs by finish time and renumber so that f_1 \le f_2 \le \cdots \le f_n.
  Compute p[1], p[2], \ldots, p[n] via binary search.
  M[0] ← 0.
  RETURN M-COMPUTE-OPT(n).

M-COMPUTE-OPT(j)
  IF (M[j] is uninitialized)
    M[j] ← max { M-COMPUTE-OPT(j − 1), w_j + M-COMPUTE-OPT(p[j]) }.
  RETURN M[j].

Claim. Memoized version of algorithm takes O(n \log n) time.
Pf.
Sort by finish time: O(n \log n).
Compute p[j] for each j: O(n \log n) via binary search.
M-COMPUTE-OPT(j): each invocation takes O(1) time and either
(i) returns an initialized value M[j], or
(ii) initializes M[j] and makes two recursive calls.
Progress measure \Phi = number of initialized entries among M[1..n]: initially \Phi = 0 and throughout \Phi \le n; each call of type (ii) increases \Phi by 1 \Rightarrow at most 2n recursive calls.
Overall running time of M-COMPUTE-OPT(n) is O(n).

Q. DP algorithm computes optimal value. How to find optimal solution?
A. Make a second pass, i.e., backtrace.
FIND-SOLUTION(j)
  IF (j = 0) RETURN ∅.
  ELSE IF (w_j + M[p[j]] > M[j − 1]) RETURN { j } ∪ FIND-SOLUTION(p[j]).
  ELSE RETURN FIND-SOLUTION(j − 1).

Bottom-up dynamic programming. Unwind recursion.
BOTTOM-UP(n, s_1, \ldots, s_n, f_1, \ldots, f_n, w_1, \ldots, w_n)
  Sort jobs by finish time and renumber so that f_1 \le f_2 \le \cdots \le f_n.
  Compute p[1], p[2], \ldots, p[n] via binary search.
  M[0] ← 0.
  FOR j = 1 TO n
    M[j] ← max { M[j − 1], w_j + M[p[j]] }.
Running time. The bottom-up version takes O(n \log n) time.
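As a concrete sketch, the bottom-up table plus the FIND-SOLUTION backtrace can be written in Python (the function name and the (start, finish, weight) tuple representation are this sketch's own conventions):

```python
from bisect import bisect_right

def max_weight_schedule(jobs):
    """Weighted interval scheduling, jobs = [(start, finish, weight)].
    Returns (max total weight, list of chosen jobs)."""
    jobs = sorted(jobs, key=lambda j: j[1])          # sort by finish time
    finish = [f for _, f, _ in jobs]
    # p[j] = # of jobs whose finish time is <= start of job j+1 (1-indexed p)
    p = [bisect_right(finish, jobs[j][0]) for j in range(len(jobs))]
    n = len(jobs)
    M = [0] * (n + 1)                                # M[j] = OPT(j)
    for j in range(1, n + 1):
        w = jobs[j - 1][2]
        M[j] = max(M[j - 1], w + M[p[j - 1]])        # Bellman equation
    # second pass: backtrace (FIND-SOLUTION)
    sol, j = [], n
    while j > 0:
        w = jobs[j - 1][2]
        if w + M[p[j - 1]] > M[j - 1]:               # job j is in an optimum
            sol.append(jobs[j - 1])
            j = p[j - 1]
        else:
            j -= 1
    return M[n], sol[::-1]
```

On the weight-9 instance from the motivating example, the DP correctly picks the single heavy job that greedy misses.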
Goal. Given an array x of n integers (positive or negative), find a contiguous sub-array whose sum is maximum.
Ex.
sum | 12 | 5 | −1 | 31 | −61 | 59 | 26 | −53 | 58 | 97 | −93 | −23 | 84 | −15 | 6 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
187 | | | | | | + | + | + | + | + | | | | | |
Applications. Computer vision, data mining, genomic sequence analysis, technical job interviews, etc.
Brute-force algorithm. Try all \Theta(n^2) sub-arrays; summing each one naively costs O(n^3) overall.
Apply "cumulative sum" trick: precompute prefix sums so each sub-array sum takes O(1) time, for O(n^2) overall.
Def. OPT(i) = max sum of any sub-array of x whose rightmost index is i.
Goal. \max_i OPT(i)
Bellman equation.
OPT(i) = \left\{ \begin{array}{lcl} x_1 & \text{if} & i = 1 \\ \max \{x_i, x_i + OPT(i-1)\} & \text{if} & i > 1 \\ \end{array}\right.
Running time. O(n).
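A minimal Python sketch of this linear-time recurrence (commonly known as Kadane's algorithm; the function name is illustrative):

```python
def max_subarray_sum(x):
    """O(n) DP: opt = max sum of a sub-array whose rightmost index is i."""
    best = opt = x[0]
    for v in x[1:]:
        opt = max(v, v + opt)      # Bellman equation: restart at v or extend
        best = max(best, opt)      # answer is max_i OPT(i)
    return best
```

On the example array above it returns 187, the sum of the marked entries 59 + 26 − 53 + 58 + 97.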
Goal. Given an n-by-n matrix A, find a rectangle whose sum is maximum.
−2 | 5 | 0 | −5 | −2 | 2 | −3 |
---|---|---|---|---|---|---|
4 | −3 | −1 | 3 | 2 | 1 | −1 |
−5 | 6 | 3 | −5 | −1 | −4 | −2 |
−1 | −1 | 3 | −1 | 4 | 1 | 1 |
3 | −3 | 2 | 0 | 3 | −3 | −2 |
−2 | 1 | −2 | 1 | 1 | 3 | −1 |
2 | −4 | 0 | 1 | 0 | −3 | −1 |

[In the original figure, the maximum-sum rectangle is highlighted.]
Applications. Databases, image processing, maximum likelihood estimation, technical job interviews, etc.
Assumption. Suppose you knew the left and right column indices j and jʹ (in the example below, j = 3 and jʹ = 5).

−2 | 5 | 0 | −5 | −2 | 2 | −3 |
---|---|---|---|---|---|---|
4 | −3 | −1 | 3 | 2 | 1 | −1 |
−5 | 6 | 3 | −5 | −1 | −4 | −2 |
−1 | −1 | 3 | −1 | 4 | 1 | 1 |
3 | −3 | 2 | 0 | 3 | −3 | −2 |
−2 | 1 | −2 | 1 | 1 | 3 | −1 |
2 | −4 | 0 | 1 | 0 | −3 | −1 |

Summing the entries of each row between columns j and jʹ collapses the matrix to a 1-D array x = (−7, 4, −3, 6, 5, 0, 1); a maximum rectangle spanning columns j..jʹ corresponds to a maximum sub-array of x.
An O(n^3) algorithm. For each of the O(n^2) column pairs (j, jʹ), maintain the row-sum array x incrementally as jʹ increases and solve the 1-D maximum sub-array problem on x in O(n) time.
Open problem. O(n^{3−\epsilon}) for any constant \epsilon > 0.
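A Python sketch of the O(n^3) approach for an n-by-n matrix: fix the left column, extend the right column one step at a time while updating the row-sum vector x, and run the 1-D algorithm on x (names are illustrative):

```python
def max_rectangle_sum(A):
    """O(n^3): for each column pair, reduce to 1-D maximum sub-array."""
    n = len(A)
    best = A[0][0]
    for j in range(n):                   # left column
        x = [0] * n                      # row sums over columns j..j2
        for j2 in range(j, n):           # right column
            for i in range(n):
                x[i] += A[i][j2]         # extend each row sum by one column
            opt = cur = x[0]             # 1-D maximum sub-array on x
            for v in x[1:]:
                cur = max(v, v + cur)
                opt = max(opt, cur)
            best = max(best, opt)
    return best
```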
Least squares. Foundational problem in statistics: given n points (x_1, y_1), \ldots, (x_n, y_n), find a line y = ax + b that minimizes the sum of the squared errors
SSE = \sum_{i=1}^n (y_i - ax_i - b)^2
Solution. Calculus \Rightarrow min error is achieved when
a = \frac{n \sum_i x_i y_i - (\sum_i x_i) (\sum_i y_i)}{n \sum_i x_i^2 - (\sum_i x_i)^2}, b = \frac{\sum_i y_i - a \sum_i x_i}{n}
Segmented least squares. Points lie roughly on a sequence of several line segments; partition the points into contiguous segments and fit a least-squares line to each.
Q. What is a reasonable choice for f(x) to balance accuracy and parsimony?
Goal. Minimize f(x) = E + c L for some constant c > 0, where
E = sum of the sums of the squared errors in each segment.
L = number of lines.
Notation.
OPT(j) = minimum cost for points p_1, p_2, \ldots, p_j.
e_{ij} = SSE for the points p_i, p_{i+1}, \ldots, p_j.
To compute OPT(j): the last segment uses points p_i, p_{i+1}, \ldots, p_j for some i \le j; this segment costs e_{ij} + c, leaving sub-problem OPT(i − 1).
Bellman equation.
OPT(j) = \left\{ \begin{array}{lcl} 0 & \text{if} & j = 0 \\ \min_{1 \le i \le j} \{e_{ij} + c + OPT(i-1)\} & \text{if} & j > 0 \\ \end{array}\right.
SEGMENTED-LEAST-SQUARES(n, p_1, \ldots, p_n, c)
  FOR j = 1 TO n
    FOR i = 1 TO j
      Compute the SSE e_{ij} for the points p_i, p_{i+1}, \ldots, p_j.
  M[0] ← 0.
  FOR j = 1 TO n
    M[j] ← min_{1 \le i \le j} { e_{ij} + c + M[i − 1] }.
  RETURN M[n].
Theorem. [Bellman 1961] DP algorithm solves the segmented least squares problem in O(n^3) time and O(n^2) space.
Pf. Bottleneck = computing e_{ij} for each of the O(n^2) pairs (i, j): O(n) per pair using the closed-form coefficients
a_{ij} = \frac{ n \sum_k x_k y_k - (\sum_k x_k)(\sum_k y_k) }{ n \sum_k x_k^2 - (\sum_k x_k)^2 }, b_{ij} = \frac{ \sum_k y_k - a_{ij} \sum_k x_k }{n}
Remark. Can be improved to O(n^2) time.
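A Python sketch of the DP above, with the closed-form line fit as a helper. The function names, and the convention that a segment of at most 2 points has zero error, are assumptions of this sketch:

```python
def sse(pts):
    """Closed-form least-squares fit y = ax + b; returns the SSE."""
    n = len(pts)
    if n <= 2:
        return 0.0                       # a line fits <= 2 points exactly
    sx = sum(x for x, _ in pts)
    sy = sum(y for _, y in pts)
    sxy = sum(x * y for x, y in pts)
    sxx = sum(x * x for x, _ in pts)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return sum((y - a * x - b) ** 2 for x, y in pts)

def segmented_least_squares(points, c):
    """points = [(x, y), ...] in increasing x order; c = per-segment cost.
    Returns the minimum total cost E + c*L."""
    n = len(points)
    # e[i][j] = SSE of best-fit line through points i..j (1-indexed)
    e = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(i, n + 1):
            e[i][j] = sse(points[i - 1:j])
    M = [0.0] * (n + 1)                  # M[j] = OPT(j)
    for j in range(1, n + 1):
        M[j] = min(e[i][j] + c + M[i - 1] for i in range(1, j + 1))
    return M[n]
```

For four collinear points the optimum is a single segment, so the cost is exactly one segment penalty c.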
Goal. Given n items, where item i has value v_i > 0 and weight w_i > 0, and a knapsack with weight limit W, pack the knapsack so as to maximize the total value of items taken.
Assumption. All values and weights are integral.
Ex. The subset { 1, 2, 5 } has value 35 (and weight 10).
Ex. The subset { 3, 4 } has value 40 (and weight 11).
i | 1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|---|
v_i | 1 | 6 | 18 | 22 | 28 |
w_i | 1 | 2 | 5 | 6 | 7 |
weight limit W = 11
Which algorithm solves the knapsack problem?
A. Greedy-by-value: repeatedly add item with maximum v_i.
B. Greedy-by-weight: repeatedly add item with minimum w_i.
C. Greedy-by-ratio: repeatedly add item with maximum ratio v_i / w_i.
D. None of the above.
Which sub-problems?
A. OPT(w) = optimal value of knapsack problem with weight limit w.
B. OPT(i) = optimal value of knapsack problem with items 1, \ldots, i.
C. OPT(i, w) = optimal value of knapsack problem with items 1, \ldots, i subject to weight limit w.
D. Any of the above.
A and B do not eliminate any conflicts, and thus reduce to brute force; C is the right choice.
Def. OPT(i, w) = optimal value of knapsack problem with items 1, \ldots, i, subject to weight limit w.
Goal. OPT(n, W).
Case 1. OPT(i, w) does not select item i: must include an optimal solution using items 1, \ldots, i − 1 with weight limit w.
Case 2. OPT(i, w) selects item i: collect value v_i; must include an optimal solution using items 1, \ldots, i − 1 with weight limit w − w_i.
Bellman equation.
OPT(i, w) = \left\{ \begin{array}{ll} 0 & \text{if}\ i = 0 \\ OPT(i-1, w) & \text{if}\ w_i > w \\ \max \{ OPT(i-1, w), v_i + OPT(i-1, w-w_i) \} & \text{otherwise} \\ \end{array}\right.
KNAPSACK(n, W, w_1, \ldots, w_n, v_1, \ldots, v_n)
  FOR w = 0 TO W
    M[0, w] ← 0.
  FOR i = 1 TO n
    FOR w = 0 TO W
      IF (w_i > w) M[i, w] ← M[i − 1, w].
      ELSE M[i, w] ← max { M[i − 1, w], v_i + M[i − 1, w − w_i] }.
  RETURN M[n, W].
Knapsack size W = 6, items w_1 = 2, w_2 = 2, w_3 = 3.
Theorem. The DP algorithm solves the knapsack problem with n items and maximum weight W in \Theta(n W) time and \Theta(n W) space.
Pf. The table has \Theta(n W) entries, and each is computed in O(1) time from the Bellman equation.
Remarks. The running time is not polynomial in the input size (W is encoded in only \log W bits): the algorithm is pseudo-polynomial. The decision version of the knapsack problem is NP-complete.
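A Python sketch of the table-filling algorithm with a trace-back pass (item i is taken iff M[i][w] > M[i−1][w]); the (weight, value) item representation and names are illustrative:

```python
def knapsack(W, items):
    """0/1 knapsack, items = [(weight, value)] with integral weights.
    Returns (max total value, sorted 1-indexed list of items taken)."""
    n = len(items)
    # M[i][w] = optimal value using items 1..i with weight limit w
    M = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        wi, vi = items[i - 1]
        for w in range(W + 1):
            if wi > w:
                M[i][w] = M[i - 1][w]            # item i does not fit
            else:
                M[i][w] = max(M[i - 1][w], vi + M[i - 1][w - wi])
    # trace back: item i is taken iff including it improved the value
    taken, w = [], W
    for i in range(n, 0, -1):
        if M[i][w] > M[i - 1][w]:
            taken.append(i)
            w -= items[i - 1][0]
    return M[n][W], sorted(taken)
```

On the lecture's instance (W = 11) it recovers the subset { 3, 4 } of value 40.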
Problem. Given n coin denominations \{ d_1, d_2, \ldots, d_n \} and a target value V, find the fewest coins needed to make change for V (or report impossible).
Recall. Greedy cashier’s algorithm is optimal for U.S. coin denominations, but not for arbitrary coin denominations.
Ex. { 1, 10, 21, 34, 70, 100, 350, 1295, 1500 }.
Optimal. 140¢ = 70 + 70.
Def. OPT(v) = min number of coins to make change for v.
Goal. OPT(V).
Multiway choice. To compute OPT(v), select a coin of denomination d_i for some i; the number of coins is then 1 + OPT(v − d_i).
Bellman equation.
OPT(v) = \left\{ \begin{array}{ll} \infty & \text{if}\ v < 0 \\ 0 & \text{if}\ v = 0 \\ \min_{1 \le i \le n} \{ 1 + OPT(v-d_i) \} & \text{if}\ v > 0 \\ \end{array}\right.
Running time. O(n V).
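A Python sketch of this recurrence, iterating bottom-up over target values 0..V (the name is illustrative; it returns None when no combination of coins makes v):

```python
def min_coins(V, denoms):
    """Fewest coins to make change for V from the given denominations."""
    INF = float("inf")
    opt = [0] + [INF] * V                # opt[v] = OPT(v)
    for v in range(1, V + 1):
        for d in denoms:                 # multiway choice over denominations
            if d <= v and opt[v - d] + 1 < opt[v]:
                opt[v] = opt[v - d] + 1
    return None if opt[V] == INF else opt[V]
```

On the example denominations, 140¢ takes 2 coins (70 + 70), beating the greedy cashier's answer.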
RNA. String B = b_1 b_2 \ldots b_n over alphabet { A, C, G, U }.
Secondary structure. RNA is single-stranded so it tends to loop back and form base pairs with itself. This structure is essential for understanding behavior of molecule.
Secondary structure. A set of pairs S = \{ (b_i, b_j) \} that satisfy:
[Watson–Crick] S is a matching and each pair in S is a Watson–Crick complement: A–U, U–A, C–G, or G–C.
[No sharp turns] The ends of each pair are separated by at least 4 intervening bases. If (b_i, b_j) \in S, then i < j – 4.
[Non-crossing] If (b_i, b_j) and (b_k, b_l) are two pairs in S, then we cannot have i < k < j < l.
Free-energy hypothesis. RNA molecule will form secondary structure with minimum total free energy.
Goal. Given an RNA molecule B = b_1 b_2 \ldots b_n, find a secondary structure S that maximizes number of base pairs.
Is the following a secondary structure? [figure omitted]
A. Yes.
B. No, violates Watson–Crick condition.
C. No, violates no-sharp-turns condition.
D. No, violates no-crossing condition.
Which sub-problems?
A. OPT(j) = max number of base pairs in secondary structure of the substring b_1 b_2 \ldots b_j.
B. OPT(j) = max number of base pairs in secondary structure of the substring b_j b_{j+1} \ldots b_n.
C. Either A or B.
D. Neither A nor B.
First attempt. OPT(j) = max number of base pairs in secondary structure of the substring b_1 b_2 \ldots b_j.
Goal. OPT(n).
Choice. Match bases b_t and b_j.
Difficulty. Results in two sub-problems: a secondary structure for b_1 \ldots b_{t-1} (this is OPT(t − 1)) and one for b_{t+1} \ldots b_{j-1} (wrong form: it does not start at b_1).
Def. OPT(i, j) = maximum number of base pairs in a secondary structure of the substring b_i b_{i+1} \ldots b_j.
Case 1. If i \ge j − 4: OPT(i, j) = 0 by the no-sharp-turns condition.
Case 2. Base b_j is not involved in a pair: OPT(i, j) = OPT(i, j − 1).
Case 3. Base b_j pairs with b_t for some i \le t < j − 4, where b_t is a Watson–Crick complement of b_j: non-crossing splits the rest into two independent sub-problems, so OPT(i, j) = \max_t \{ 1 + OPT(i, t − 1) + OPT(t + 1, j − 1) \}.
In which order to compute OPT(i, j)?
A. Increasing i, then j.
B. Increasing j, then i.
C. Either A or B.
D. Neither A nor B.

Answer. B.

Q. In which order to solve the sub-problems?
A. Do shortest intervals first: increasing order of |j − i|.
Ex. RNA sequence ACCGGUAGU.
RNA-SECONDARY-STRUCTURE(n, b_1, \ldots, b_n)
  FOR k = 5 TO n − 1
    FOR i = 1 TO n − k
      j ← i + k.
      Compute M[i, j] using the Bellman equation.
  RETURN M[1, n].
Theorem. The DP algorithm solves the RNA secondary structure problem in O(n^3) time and O(n^2) space.
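A Python sketch of this interval DP, solving shorter substrings first (0-indexed; the pair set is the Watson–Crick complements from the conditions above, excluding G–U wobble pairs):

```python
def rna_max_pairs(b):
    """Max number of base pairs in a secondary structure of RNA string b.
    O(n^2) table, O(n^3) time; intervals solved in order of length."""
    pair = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}
    n = len(b)
    M = [[0] * n for _ in range(n)]      # M[i][j] = OPT for b[i..j]
    for k in range(5, n):                # interval length j - i (>= 5)
        for i in range(n - k):
            j = i + k
            best = M[i][j - 1]           # case 2: b_j unpaired
            for t in range(i, j - 4):    # case 3: b_j pairs with b_t
                if (b[t], b[j]) in pair:
                    left = M[i][t - 1] if t > i else 0
                    best = max(best, 1 + left + M[t + 1][j - 1])
            M[i][j] = best
    return M[0][n - 1] if n else 0
```

On the example sequence ACCGGUAGU the optimum forms 2 nested pairs.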
Outline. Weighted interval scheduling; maximum subarray; segmented least squares; knapsack; coin changing; RNA secondary structure.
Techniques. Binary choice (weighted interval scheduling); multiway choice (segmented least squares); adding a new variable (knapsack); DP over intervals (RNA secondary structure).
Top-down vs. bottom-up DP: recursive with memoization vs. iterative table-filling.