Algorithm II
8. Intractability III
WU Xiaokun 吴晓堃
xkun.wu [at] gmail
Q. Suppose I need to solve an NP-hard problem. What should I do?
A. Sacrifice one of three desired features.
Coping strategies.
Independent set on trees. Given a tree, find a max-cardinality subset of nodes such that no two are adjacent.
Fact. A tree has at least one leaf node (degree = 1).
Key observation. If node v is a leaf, there exists a max-cardinality independent set containing v.
Pf. [exchange argument]
INDEPENDENT-SET-IN-A-FOREST(F)
Theorem. The greedy algorithm finds a max-cardinality independent set in forests (and hence trees).
Remark. Can implement in O(n) time by maintaining nodes of degree 1.
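The greedy procedure can be sketched in Python (the function name and edge-list representation are illustrative, not from the slides); it maintains the set of degree-\le 1 nodes so each node is touched O(1) times:

```python
from collections import defaultdict

def independent_set_forest(n, edges):
    """Greedy max-cardinality independent set in a forest.

    Repeatedly take a leaf (degree <= 1) into S, then delete it and
    its neighbor. Nodes are 0..n-1; edges is a list of pairs.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    alive = set(range(n))
    S = set()
    # Maintain nodes of degree <= 1 (includes isolated nodes).
    stack = [v for v in alive if len(adj[v]) <= 1]
    while stack:
        v = stack.pop()
        if v not in alive:
            continue
        S.add(v)                         # exchange argument: safe to take v
        alive.discard(v)
        for u in list(adj[v]):           # v has at most one live neighbor
            adj[u].discard(v)
            if u in alive:
                alive.discard(u)         # delete v's neighbor u
                for w in list(adj[u]):
                    adj[w].discard(u)
                    if w in alive and len(adj[w]) <= 1:
                        stack.append(w)  # w just became a leaf
    return S
```

On a 5-node path the greedy set has the optimal size 3.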
How might the greedy algorithm fail if the graph is not a tree/forest?
A. Might get stuck.
B. Might take exponential time.
C. Might produce a suboptimal independent set.
D. Any of the above.
A. The greedy algorithm relies on finding a leaf node; a general graph may have no node of degree 1 (e.g., a cycle), so the algorithm can get stuck.
Weighted independent set on trees. Given a tree and node weights w_v \ge 0, find an independent set S that maximizes \sum_{v \in S} w_v.
Greedy algorithm can fail spectacularly.
Dynamic-programming solution. Root tree at some node, say r.
Bellman equation.
\begin{aligned} OPT_{in} (u) & = w_u + \sum_{v \in children(u)} OPT_{out} (v) \\ OPT_{out} (u) & = \sum_{v \in children(u)} \max \{ OPT_{in} (v), OPT_{out} (v) \} \\ \end{aligned}
In which order to solve the subproblems?
A. Preorder.
B. Postorder.
C. Level order.
D. Any of the above.
B. Postorder ensures a node is processed after all of its descendants.
WEIGHTED-INDEPENDENT-SET-IN-A-TREE(T)
Theorem. The DP algorithm computes max weight of an independent set in a tree in O(n) time.
Note: can also find independent set itself (not just value)
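The Bellman equation can be evaluated in one postorder pass; a minimal Python sketch (names and representation are ours) returning the optimal value:

```python
from collections import defaultdict

def weighted_independent_set_tree(n, edges, w, root=0):
    """DP for max-weight independent set in a tree (postorder).

    OPT_in[u]  = best value of a set in u's subtree containing u.
    OPT_out[u] = best value of a set in u's subtree excluding u.
    """
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    # Iterative DFS: `order` lists every parent before its children.
    order, parent, seen = [], [-1] * n, [False] * n
    stack = [root]
    while stack:
        u = stack.pop()
        seen[u] = True
        order.append(u)
        for v in adj[u]:
            if not seen[v]:
                parent[v] = u
                stack.append(v)
    OPT_in, OPT_out = [0] * n, [0] * n
    for u in reversed(order):            # children before parents = postorder
        kids = [v for v in adj[u] if parent[v] == u]
        OPT_in[u] = w[u] + sum(OPT_out[v] for v in kids)
        OPT_out[u] = sum(max(OPT_in[v], OPT_out[v]) for v in kids)
    return max(OPT_in[root], OPT_out[root])
```

Recording, at each node, which of the two options achieved the max would recover the set itself, per the note above.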
Independent set on trees. Tractable because we can find a node that breaks the communication among the subproblems in different subtrees.
Def. A graph is planar if it can be embedded in the plane in such a way that no two edges cross.
Applications. VLSI circuit design, computer graphics, etc.
Theorem. [Hopcroft-Tarjan 1974] There exists an O(n) time algorithm to determine whether a graph is planar.
Fact 0. Many graph problems can be solved faster in planar graphs.
Ex. Shortest paths, max flow, MST, matchings, etc.
Fact 1. Some NP-complete problems become tractable in planar graphs.
Ex. MAX-CUT, ISING, CLIQUE, GRAPH-ISOMORPHISM, 4-COLOR, etc.
Fact 2. Other NP-complete problems become easier in planar graphs.
Ex. INDEPENDENT-SET, VERTEX-COVER, TSP, STEINER-TREE, etc.
PLANAR-3-COLOR. Given a planar graph, can it be colored using 3 colors so that no two adjacent nodes have the same color?
PLANAR-MAP-3-COLOR. Given a planar map, can it be colored using 3 colors so that no two adjacent regions have the same color?
Theorem. PLANAR-3-COLOR \equiv_P PLANAR-MAP-3-COLOR.
Pf sketch.
Theorem. PLANAR-3-COLOR \in NP-complete.
Pf.
Lemma. W is a planar graph such that: (i) in any 3-coloring of W, opposite corners have the same color; (ii) any assignment of colors to the corners in which opposite corners agree extends to a 3-coloring of W.
Pf. The only 3-colorings (modulo permutations) of W are shown below.
Construction. Given instance G of 3-COLOR, draw G in plane, letting edges cross. Form planar G' by replacing each edge crossing with planar gadget W.
Lemma. G is 3-colorable iff G' is 3-colorable.
Theorem. [Appel-Haken 1976] Every planar map is 4-colorable.
Remarks.
Trees. VERTEX-COVER, INDEPENDENT-SET, LONGEST-PATH, GRAPH-ISOMORPHISM, etc.
Bipartite graphs. VERTEX-COVER, INDEPENDENT-SET, 3-COLOR, EDGE-COLOR, etc.
Planar graphs. MAX-CUT, ISING, CLIQUE, GRAPH-ISOMORPHISM, 4-COLOR, etc.
Bounded treewidth. HAM-CYCLE, INDEPENDENT-SET, GRAPH-ISOMORPHISM, etc.
Small integers. SUBSET-SUM, KNAPSACK, PARTITION, etc.
\rho-approximation algorithm.
Ex. Given a graph G, can find a vertex cover that uses \le 2 \cdot OPT(G) vertices in O(m + n) time.
Challenge. Need to prove a solution’s value is close to optimum value, without even knowing what optimum value is!
VERTEX-COVER. Given a graph G = (V, E), find a min-size vertex cover.
GREEDY-VERTEX-COVER(G)
Running time. Can be implemented in O(m + n) time.
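The greedy algorithm (repeatedly pick an uncovered edge, take both endpoints) can be sketched as follows; the picked edges form a matching, which is what drives the approximation bound below. Names are illustrative:

```python
def greedy_vertex_cover(edges):
    """2-approximate vertex cover.

    Scan the edges once; whenever an edge is still uncovered, add BOTH
    of its endpoints to the cover S. The chosen edges M are pairwise
    disjoint, i.e., a matching, and |S| = 2 |M|. Runs in O(m + n) time.
    """
    S = set()
    M = []
    for u, v in edges:
        if u not in S and v not in S:    # edge (u, v) not yet covered
            S.add(u)
            S.add(v)
            M.append((u, v))
    return S, M
```

On a star graph the cover has 2 vertices while the optimum has 1, matching the factor-2 guarantee.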
Given a graph G, let M be any matching and let S be any vertex cover. Which of the following must be true?
A. |M| \le |S|
B. |S| \le |M|
C. |S| = |M|
D. None of the above.
A. Every edge of M must be covered by some vertex of S, and the edges of M share no endpoints, so each vertex of S covers at most one edge of M; hence |M| \le |S|. The inequality is strict whenever some vertex of S is unmatched.
Pf. Each vertex can cover at most one edge in any matching.
Theorem. Let S^\ast be a minimum vertex cover. Then, the greedy algorithm computes a vertex cover S with |S| \le 2 |S^\ast| (i.e., a 2-approximation algorithm).
Pf.
Corollary. Let M^\ast be a maximum matching. Then, the greedy algorithm computes a matching M with |M| \ge ½ |M^\ast|.
Pf. |M| = ½ |S| \ge ½ |M^\ast|.
Theorem. [Dinur-Safra 2004] If P \ne NP, then no \rho-approximation for VERTEX-COVER for any \rho < 1.3606.
Open research problem. Close the gap (1.3606, 2).
Conjecture. No \rho-approximation for VERTEX-COVER for any \rho < 2 (this would follow from the unique games conjecture).
Knapsack problem.
Ex: \{ 3, 4 \} has value 40.
item | value | weight |
---|---|---|
1 | 1 | 1 |
2 | 6 | 2 |
3 | 18 | 5 |
4 | 22 | 6 |
5 | 28 | 7 |
KNAPSACK. Given a set X, weights w_i ≥ 0, values v_i ≥ 0, a weight limit W, and a target value V, is there a subset S \subseteq X such that:
\sum_{i \in S} w_i \le W, \quad \sum_{i \in S} v_i \ge V
SUBSET-SUM. Given a set X, values u_i ≥ 0, and an integer U, is there a subset S \subseteq X whose elements sum to exactly U?
Theorem. SUBSET-SUM \le_P KNAPSACK.
Pf. Given instance (u_1, .., u_n, U) of SUBSET-SUM, create KNAPSACK instance:
\begin{aligned} v_i = w_i = u_i && \sum_{i \in S} u_i & \le U \\ V = W = U && \sum_{i \in S} u_i & \ge U \\ \end{aligned}
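The reduction can be written out and sanity-checked against a brute-force KNAPSACK decider (helper names are ours; the brute force is for small instances only):

```python
from itertools import combinations

def subset_sum_to_knapsack(u, U):
    """SUBSET-SUM (u, U) -> KNAPSACK (values, weights, W, V):
    set v_i = w_i = u_i and V = W = U, so that sum w_i <= W and
    sum v_i >= V together force the subset to sum to exactly U."""
    return list(u), list(u), U, U

def knapsack_yes(values, weights, W, V):
    """Brute-force KNAPSACK decision version (exponential; demo only)."""
    n = len(values)
    return any(sum(weights[i] for i in S) <= W and
               sum(values[i] for i in S) >= V
               for k in range(n + 1) for S in combinations(range(n), k))
```

For instance, ({1, 2, 7}, 9) is a yes-instance of SUBSET-SUM (2 + 7 = 9) and maps to a yes-instance of KNAPSACK, while ({1, 2, 7}, 5) maps to a no-instance.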
Def. OPT(i, w) = max value subset of items 1, .., i with weight limit w.
Case 1. OPT does not select item i.
Case 2. OPT selects item i.
OPT(i, w) = \left\{ \begin{array}{ll} 0 & \text{if}\ i = 0 \\ OPT(i-1, w) & \text{if}\ w_i > w \\ \max \{ OPT(i-1, w), v_i + OPT(i-1, w-w_i) \} & \text{otherwise} \\ \end{array}\right.
Theorem. The DP algorithm computes the optimal value OPT(n, W) in O(n W) time.
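The recurrence translates directly into a bottom-up table (a sketch; the function name is ours):

```python
def knapsack_value(values, weights, W):
    """Bottom-up DP: OPT[i][w] = max value of items 1..i with limit w.
    Runs in O(n W) time -- pseudo-polynomial, since W is a number,
    not the input length."""
    n = len(values)
    OPT = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for w in range(W + 1):
            OPT[i][w] = OPT[i - 1][w]            # Case 1: skip item i
            if weights[i - 1] <= w:              # Case 2: take item i
                OPT[i][w] = max(OPT[i][w],
                                values[i - 1] + OPT[i - 1][w - weights[i - 1]])
    return OPT[n][W]
```

On the slides' example instance with weight limit 11, the optimum is the set {3, 4} with value 40.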
Def. OPT(i, v) = min weight of a knapsack for which we can obtain a solution of value \ge v using a subset of items 1, .., i.
Note. Optimal value is the largest value v such that OPT(n, v) \le W.
Case 1. OPT does not select item i.
Case 2. OPT selects item i.
OPT(i, v) = \left\{ \begin{array}{ll} 0 & \text{if}\ v \le 0 \\ \infty & \text{if}\ i = 0\ \text{and}\ v > 0 \\ \min \{ OPT(i-1, v), w_i + OPT(i-1, v-v_i) \} & \text{otherwise} \\ \end{array}\right.
Theorem. Dynamic programming algorithm II computes the optimal value in O(n^2 v_{max}) time, where v_{max} is the maximum value of any item.
Pf.
Remark 1. Not polynomial in input size! (pseudo-polynomial)
Remark 2. Polynomial time if values are small integers.
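DP algorithm II, indexed by value instead of weight, can be sketched as follows (names are ours); since \sum_i v_i \le n v_{max}, the table has O(n^2 v_{max}) entries:

```python
def knapsack_min_weight(values, weights, W):
    """DP II: OPT[i][v] = min weight achieving value >= v with items 1..i.
    The optimal value is the largest v with OPT[n][v] <= W.
    Assumes integer values; O(n * sum(values)) = O(n^2 v_max) time."""
    n = len(values)
    INF = float("inf")
    V = sum(values)
    OPT = [[0] * (V + 1) for _ in range(n + 1)]
    for v in range(1, V + 1):
        OPT[0][v] = INF                  # value v > 0 unreachable with no items
    for i in range(1, n + 1):
        for v in range(1, V + 1):
            # Case 2 clips v - v_i at 0: overshooting the target is free.
            take = weights[i - 1] + OPT[i - 1][max(0, v - values[i - 1])]
            OPT[i][v] = min(OPT[i - 1][v], take)
    return max(v for v in range(V + 1) if OPT[n][v] <= W)
```

It returns the same optimum as DP algorithm I, but its running time now depends on the values, which is exactly what the rounding scheme below exploits.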
Intuition for approximation algorithm.
item | value | weight |
---|---|---|
1 | 934221 | 1 |
2 | 5956342 | 2 |
3 | 17810013 | 5 |
4 | 21217800 | 6 |
5 | 27343199 | 7 |
item | value | weight |
---|---|---|
1 | 1 | 1 |
2 | 6 | 2 |
3 | 18 | 5 |
4 | 22 | 6 |
5 | 28 | 7 |
Round up all values:
\bar{v_i} = \lceil \frac{v_i}{\theta} \rceil \theta, \hat{v_i} = \lceil \frac{v_i}{\theta} \rceil
Observation. Optimal solutions to problem with \bar{v} are equivalent to optimal solutions to problem with \hat{v}.
Intuition. \bar{v} close to v so optimal solution using \bar{v} is nearly optimal; \hat{v} small and integral so dynamic programming algorithm II is fast.
Theorem. If S is solution found by rounding algorithm and S^\ast is any other feasible solution satisfying weight constraint, then (1 + \epsilon) \sum_{i \in S} v_i \ge \sum_{i \in S^\ast} v_i.
Pf.
\begin{aligned} \sum_{i \in S^\ast} v_i & \le \sum_{i \in S^\ast} \bar{v_i} && \text{round up}\\ & \le \sum_{i \in S} \bar{v_i} && \text{optimality} \\ & \le \sum_{i \in S} (v_i + \theta) && \text{rounding gap} \\ & \le \sum_{i \in S} v_i + n\theta && |S| \le n \\ & = \sum_{i \in S} v_i + \frac{1}{2} \epsilon v_{max} && \theta = \epsilon v_{max} / 2n \\ & \le (1 + \epsilon) \sum_{i \in S} v_i && v_{max} \le 2 \sum_{i \in S} v_i \\ \end{aligned}
Theorem. For any \epsilon > 0, the rounding algorithm computes a feasible solution whose value is within a (1 + \epsilon) factor of the optimum in O(n^3 / \epsilon) time.
Pf.
\hat{v}_{max} = \lceil \frac{v_{max}}{\theta} \rceil = \lceil \frac{2n}{\epsilon} \rceil
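The full rounding algorithm, combining \theta = \epsilon v_{max} / 2n with DP II on the scaled values \hat{v}, can be sketched as follows (a sketch with illustrative names; the DP here also tracks the chosen items so the true value can be reported):

```python
import math

def knapsack_fptas(values, weights, W, eps):
    """Rounding scheme from the slides: theta = eps * v_max / (2n),
    v_hat_i = ceil(v_i / theta); run the value-indexed DP on v_hat.
    Returns (true value of chosen subset, chosen item indices)."""
    n = len(values)
    theta = eps * max(values) / (2 * n)
    v_hat = [math.ceil(v / theta) for v in values]
    V = sum(v_hat)
    INF = float("inf")
    # best[v] = (min weight achieving scaled value >= v, items used)
    best = [(0, [])] + [(INF, None)] * V
    for i in range(n):
        new = list(best)                 # copy = "skip item i" option
        for v in range(1, V + 1):
            pw, pset = best[max(0, v - v_hat[i])]
            if pw + weights[i] < new[v][0]:
                new[v] = (pw + weights[i], pset + [i])
        best = new
    v_star = max(v for v in range(V + 1) if best[v][0] <= W)
    S = best[v_star][1]
    return sum(values[i] for i in S), S
```

With \epsilon = 0.01 on the small example instance, the rounding error n\theta is far below 1, so the algorithm recovers the exact optimum 40.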
Complexity theory deals with worst-case behavior.
“ For every polynomial-time algorithm you have, there is an exponential algorithm that I would rather run.” — Alan Perlis
Brute force. Given a 3-SAT instance with n variables and m clauses, the brute-force algorithm takes O((m + n) 2^n) time.
Pf.
A recursive framework. A 3-SAT formula \Phi is either empty or the conjunction of a clause (l_1 \vee l_2 \vee l_3) and a 3-SAT formula \Phi' with one fewer clause.
\begin{aligned} \Phi &= (l_1 \vee l_2 \vee l_3) \wedge \Phi' \\ &= (l_1 \wedge \Phi') \vee (l_2 \wedge \Phi') \vee (l_3 \wedge \Phi') \\ &= (\Phi' | l_1 = true) \vee (\Phi' | l_2 = true) \vee (\Phi' | l_3 = true) \\ \end{aligned}
Notation. \Phi | x = true is the simplification of \Phi by setting x to true.
Ex.
\begin{aligned} \Phi &= (x \vee y \vee \neg z) &\wedge (x \vee \neg y \vee z) &\wedge (w \vee y \vee \neg z) &\wedge (\neg x \vee y \vee z) \\ \Phi' &= &\wedge (x \vee \neg y \vee z) &\wedge (w \vee y \vee \neg z) &\wedge (\neg x \vee y \vee z) \\ (\Phi' | x = true) &= & &\wedge (w \vee y \vee \neg z) &\wedge (y \vee z) \\ \end{aligned}
A recursive framework. A 3-SAT formula \Phi is either empty or the conjunction of a clause (l_1 \vee l_2 \vee l_3) and a 3-SAT formula \Phi' with one fewer clause.
3-SAT(\Phi)
IF \Phi is empty RETURN true.
(l_1 \vee l_2 \vee l_3) \wedge \Phi' \leftarrow \Phi.
IF 3-SAT(\Phi' | l_1 = true) RETURN true.
IF 3-SAT(\Phi' | l_2 = true) RETURN true.
IF 3-SAT(\Phi' | l_3 = true) RETURN true.
RETURN false.
Theorem. The brute-force 3-SAT algorithm takes O(poly(n) 3^n) time.
Pf. T(n) \le 3 T(n - 1) + poly(n).
Key observation. The three cases above are not mutually exclusive. But every satisfying assignment of a formula containing clause (l_1 \vee l_2 \vee l_3) must fall into one of 3 mutually exclusive classes: l_1 = true; l_1 = false, l_2 = true; l_1 = l_2 = false, l_3 = true.
3-SAT(\Phi)
IF \Phi is empty RETURN true.
(l_1 \vee l_2 \vee l_3) \wedge \Phi' \leftarrow \Phi.
IF 3-SAT(\Phi' | l_1 = true) RETURN true.
IF 3-SAT(\Phi' | l_1 = false, l_2 = true) RETURN true.
IF 3-SAT(\Phi' | l_1 = false, l_2 = false, l_3 = true) RETURN true.
RETURN false.
Theorem. The brute-force algorithm takes O(1.84^n) time.
Pf. T(n) \le T(n - 1) + T(n - 2) + T(n - 3) + O(m + n). (The largest root of r^3 = r^2 + r + 1 is \approx 1.84.)
Theorem. [Moser and Scheder 2010] There exists a O(1.33334^n) deterministic algorithm for 3-SAT.
DPLL algorithm. Highly effective backtracking procedure.
Chaff. State-of-the-art SAT solver.
Given the locations of n Pokémon, find shortest tour to collect them all.
TSP. Given a set of n cities and a pairwise distance function d(u, v), is there a tour of length \le D ?
13,509 cities in the United States
HAM-CYCLE. Given an undirected graph G = (V, E), does there exist a cycle that visits every node exactly once?
Theorem. HAM-CYCLE \le_P TSP.
Pf.
d(u, v) = \left\{ \begin{array}{ll} 1 & \text{if}\ (u, v) \in E \\ 2 & \text{if}\ (u, v) \notin E \\ \end{array}\right.
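The distance function from the proof can be built mechanically (a sketch; the helper name is ours): G has a Hamiltonian cycle iff the resulting TSP instance has a tour of length \le n, so the target is D = n.

```python
def ham_cycle_to_tsp(n, edges):
    """HAM-CYCLE -> TSP reduction: d(u, v) = 1 if (u, v) in E, else 2.
    A tour of length exactly n uses only distance-1 hops, i.e., only
    edges of G, so it is a Hamiltonian cycle; returns (d, D) with D = n."""
    E = {frozenset(e) for e in edges}
    d = [[0 if u == v else (1 if frozenset((u, v)) in E else 2)
          for v in range(n)] for u in range(n)]
    return d, n
```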
Theorem. [Held-Karp, Bellman 1962] TSP can be solved in O(n^2 2^n) time.
Pf. [dynamic programming]
c(s, v, X) = \left\{ \begin{array}{ll} c(v, s) & \text{if}\ |X| = 2 \\ \min_{u \in X\backslash\{s,v\}} c(s,u,X\backslash\{v\}) + c(u,v) & \text{if}\ |X| > 2 \\ \end{array}\right.
Remark. Exhaustively checking all tours of a 22-city TSP instance would take about 1,000 years!
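The Bellman equation above can be sketched with bitmask subsets (a sketch; names are ours, and the start city is fixed to 0, which is without loss of generality for a tour):

```python
from itertools import combinations

def held_karp(dist):
    """Held-Karp DP for TSP on an n x n distance matrix.

    C[(X, v)] = min cost of a path that starts at city 0, visits exactly
    the cities in bitmask X (which contains 0 and v), and ends at v.
    O(n^2 2^n) time, O(n 2^n) space."""
    n = len(dist)
    C = {(1 | (1 << v), v): dist[0][v] for v in range(1, n)}
    for size in range(3, n + 1):
        for subset in combinations(range(1, n), size - 1):
            X = 1
            for v in subset:
                X |= 1 << v
            for v in subset:
                C[(X, v)] = min(C[(X ^ (1 << v), u)] + dist[u][v]
                                for u in subset if u != v)
    full = (1 << n) - 1
    # Close the tour by returning to city 0.
    return min(C[(full, v)] + dist[v][0] for v in range(1, n))
```

On a 4-city instance laid out as a unit square, the optimal tour around the perimeter has length 4.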
Concorde TSP solver. [Applegate-Bixby-Chvátal-Cook]
Remarkable fact. Concorde has solved all 110 TSPLIB instances.
Euclidean TSP. Given n points in the plane and a real number L, is there a tour that visits every city exactly once and has length \le L?
Fact. 3-SAT \le_P EUCLIDEAN-TSP.
Remark. Not known to be in \mathcal{NP}.
Theorem. [Arora 1998, Mitchell 1999] Given n points in the plane, for any constant \epsilon > 0: there exists a poly-time algorithm to find a tour whose length is at most (1 + \epsilon) times that of the optimal tour.
Pf recipe. Structure theorem + divide-and-conquer + dynamic programming.