Algorithm II
11. Approximation Algorithms
WU Xiaokun 吴晓堃
xkun.wu [at] gmail
Q. Suppose I need to solve an NP-hard problem. What
should I do?
A. Sacrifice one of three desired features: (i) solve the problem to optimality, (ii) solve the problem in polynomial time, (iii) solve arbitrary instances of the problem.
\rho-approximation algorithm. Runs in polynomial time; solves arbitrary instances of the problem; finds a solution whose value is within ratio \rho of the true optimum.
Challenge. Need to prove a solution’s value is close to optimum value, without even knowing what optimum value is!
Input. m identical machines; n \ge m jobs, job j has processing time t_j.
Def. Let S[i] be the subset of jobs assigned to machine i. The load of machine i is L[i] = \sum_{j \in S[i]} t_j.
Def. The makespan is the maximum load on any machine L = \max_i L[i].
Load balancing. Assign each job to a machine to minimize makespan.
[Figure: a schedule of jobs a–g on two machines; machine 1 has load 6, machine 2 has load 7, so the makespan is 7.]
Claim. Load balancing is hard even if m = 2 machines.
Pf. PARTITION \le_P LOAD-BALANCE.
Number Partitioning Problem. [Exercise 8.26] You are given positive integers x_1, .., x_n; you want to decide whether the numbers can be partitioned into two sets S_1 and S_2 with the same sum: \sum_{x_i \in S_1} x_i = \sum_{x_j \in S_2} x_j.
[Figure: a balanced schedule of the same jobs; both machines have load 6, so the makespan is 6.]
List-scheduling algorithm. Consider the n jobs in some fixed order; assign each job to the machine with the smallest current load.

LIST-SCHEDULING (m, n, t_1, t_2, .., t_n)
FOR i = 1 TO m: L[i] ← 0; S[i] ← ∅.
FOR j = 1 TO n:
  i ← argmin_k L[k].  // machine with smallest load
  S[i] ← S[i] ∪ { j }; L[i] ← L[i] + t_j.  // assign job j to machine i
RETURN S[1], .., S[m].
Implementation. O(n \log m) using a priority queue for loads L[k].
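The O(n \log m) priority-queue implementation can be sketched in Python; heapq keeps the least-loaded machine on top (function and variable names here are mine, not from the lecture):

```python
import heapq

def list_scheduling(m, times):
    """Greedy list scheduling: assign each job, in given order,
    to the machine with the currently smallest load."""
    heap = [(0, i) for i in range(m)]   # (load, machine) pairs
    heapq.heapify(heap)
    assignment = [[] for _ in range(m)]
    for j, t in enumerate(times):
        load, i = heapq.heappop(heap)   # least-loaded machine
        assignment[i].append(j)
        heapq.heappush(heap, (load + t, i))
    makespan = max(load for load, _ in heap)
    return assignment, makespan
```

On the tight example below (m = 3: six unit jobs followed by one job of length 3), this yields makespan 2m - 1 = 5 versus the optimum m = 3.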
Theorem. [Graham 1966] Greedy algorithm is a 2-approximation.
Lemma 1. For all k: the optimal makespan L^\ast \ge t_k.
Pf. Some machine must process the most time-consuming job.
Lemma 2. The optimal makespan L^\ast \ge \frac{1}{m} \sum_{k} t_k.
Pf. The total processing time is \sum_k t_k; one of the m machines must do at least a 1/m fraction of the total work.
Bottleneck machine. Machine that has highest load after dispatching.
Theorem. Greedy algorithm is a 2-approximation.
Pf. Consider the load L[i] of bottleneck machine i. Let j be the last job scheduled on machine i. When job j was assigned, machine i had the smallest load, so L[i] - t_j \le L[k] for all 1 \le k \le m. Averaging over k and applying Lemma 2:
L[i] - t_j \le \frac{1}{m} \sum_{k} L[k] = \frac{1}{m} \sum_{k} t_k \le L^\ast
By Lemma 1, t_j \le L^\ast; hence L[i] = (L[i] - t_j) + t_j \le 2 L^\ast. ∎
Q. Is our analysis tight?
A. Essentially yes.
Ex: m machines; first m(m - 1) jobs have length 1, last job has length m. List scheduling gives makespan 2m - 1 (each machine gets m - 1 unit jobs, then the long job is added), while the optimum is m.
Longest processing time (LPT). Sort n jobs in decreasing order of processing times; then run list scheduling algorithm.
LPT-LIST-SCHEDULING (m, n, t_1, t_2, .., t_n)
SORT jobs and renumber so that t_1 \ge t_2 \ge .. \ge t_n.
[then proceed exactly as in LIST-SCHEDULING]
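The LPT rule differs from plain list scheduling only in the initial sort; a minimal self-contained Python sketch (names are mine):

```python
import heapq

def lpt_scheduling(m, times):
    """LPT rule: sort jobs by decreasing processing time,
    then greedily assign each to the least-loaded machine."""
    heap = [(0, i) for i in range(m)]   # (load, machine)
    heapq.heapify(heap)
    assignment = [[] for _ in range(m)]
    for j, t in sorted(enumerate(times), key=lambda jt: -jt[1]):
        load, i = heapq.heappop(heap)
        assignment[i].append(j)
        heapq.heappush(heap, (load + t, i))
    return assignment, max(load for load, _ in heap)
```

On the example that was tight for list scheduling (six unit jobs plus one job of length 3 on m = 3 machines), LPT finds the optimal makespan 3.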
Observation. If bottleneck machine i has only 1 job, then the schedule is optimal.
Pf. Any solution must schedule that job.
Lemma 3. If there are more than m jobs, L^\ast \ge 2 t_{m+1}.
Pf. Consider the processing times of the first m+1 jobs t_1 \ge t_2 \ge .. \ge t_{m+1}. Each takes at least t_{m+1} time, and by pigeonhole some machine must process two of them.
Theorem. LPT rule is a 3/2-approximation algorithm.
Pf. [ similar to proof for list scheduling ] If bottleneck machine i has at least 2 jobs, its last job j has index \ge m+1 (the first m jobs go to distinct machines), so t_j \le t_{m+1} \le \frac{1}{2} L^\ast by Lemma 3. Hence L[i] = (L[i] - t_j) + t_j \le L^\ast + \frac{1}{2} L^\ast.
Q. Is our 3/2 analysis tight?
A. No.
Theorem. [Graham 1969] LPT rule is a 4/3-approximation.
Pf. More sophisticated analysis of same algorithm.
Q. Is Graham’s 4/3 analysis tight?
A. Essentially yes.
Input. Set of n sites s_1, .., s_n and an integer k > 0.
Center selection problem. Select set of k centers C so that maximum distance r(C) from a site to nearest center is minimized.
Notation. dist(x, y) = distance between x and y; dist(s_i, C) = \min_{c \in C} dist(s_i, c) = distance from s_i to its closest center; r(C) = \max_i dist(s_i, C) = smallest covering radius.
Goal. Find set of centers C that minimizes r(C), subject to |C| = k.
Distance function properties: dist(x, x) = 0 (identity); dist(x, y) = dist(y, x) (symmetry); dist(x, y) \le dist(x, z) + dist(z, y) (triangle inequality).
Ex: each site is a point in the plane, a center can be any point in the plane, dist(x, y) = Euclidean distance.
Remark: search can be infinite!
Greedy algorithm. Put the first center at the best possible location for a single center, and then keep adding centers so as to reduce the covering radius each time by as much as possible.
Remark: arbitrarily bad!
Repeatedly choose next center to be site farthest from any existing center.
GREEDY-CENTER-SELECTION (k, n, s_1, s_2, .., s_n)
C ← ∅.
REPEAT k times:
  Select a site s_i with maximum distance dist(s_i, C).
  C ← C ∪ { s_i }.
RETURN C.
(the first selected center is an arbitrary site)
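The farthest-point greedy in Python; dist can be any metric function, and the names below are mine:

```python
def greedy_center_selection(k, sites, dist):
    """Farthest-point greedy for k-center: repeatedly pick the
    site farthest from the current set of centers."""
    centers = [sites[0]]          # first center: arbitrary site
    while len(centers) < k:
        # site maximizing the distance to its nearest chosen center
        s = max(sites, key=lambda s: min(dist(s, c) for c in centers))
        centers.append(s)
    covering_radius = max(min(dist(s, c) for c in centers)
                          for s in sites)
    return centers, covering_radius
```

E.g. for the points 0, 1, 8, 9 on a line with k = 2, the greedy picks 0 and 9 and achieves covering radius 1.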
Property. Upon termination, all centers in C are pairwise at least r(C) apart.
Pf. By construction, each newly added center is the site farthest from the centers chosen so far, so its distance to every previously selected center is at least the covering radius at that moment, which is \ge r(C) (the covering radius only decreases as centers are added).
Lemma. Let C^\ast be an optimal set of centers. Then r(C) \le 2 r(C^\ast).
Pf. [by contradiction] Assume r(C^\ast) < \frac{1}{2} r(C). For each c \in C there is an optimal center c^\ast \in C^\ast with dist(c, c^\ast) \le r(C^\ast). Since the centers in C are pairwise \ge r(C) > 2 r(C^\ast) apart, these optimal centers are distinct, so c \mapsto c^\ast is a bijection between C and C^\ast (both have size k). Now take any site s: it is within r(C^\ast) of some c^\ast \in C^\ast, which is within r(C^\ast) of its matched c \in C. Hence dist(s, C) \le 2 r(C^\ast) < r(C), contradicting the definition of r(C).
Theorem. Greedy algorithm is a 2-approximation for center selection problem.
Remark. Greedy algorithm always places centers at sites, but is still within a factor of 2 of best solution that is allowed to place centers anywhere.
Question. Is there hope of a 3/2-approximation? 4/3?
Theorem. Unless \mathcal{P} = \mathcal{NP}, there is no \rho-approximation for the center selection problem for any \rho < 2.
Pf. We show how we could use a (2 - \epsilon)-approximation algorithm for CENTER-SELECTION to solve DOMINATING-SET in poly-time.
DOMINATING-SET. Each vertex is adjacent to at least one member of the DOMINATING-SET, as opposed to each edge being incident to at least one member of the VERTEX-COVER.
Definition. Given a graph G = (V, E), a vertex cover is a set S \subseteq V such that each edge in E has at least one end in S.
Weighted vertex cover. Given a graph G = (V, E) with vertex weights w_i \ge 0, find a vertex cover of minimum weight.
How to define “progress” in this setting?
Option 1. w_i/|S_i|: “cost per element covered”.
Option 2. w_i/|S_i \cap R|: we are only concerned with elements still left uncovered.
Greedy algorithm. Assignment.
Greedy analysis. O(\log d^\ast)-approximation, d^\ast = \max_i |S_i|. Assignment.
Pricing method. Each edge must be covered by some vertex.
Edge e = (i, j) pays price p_e \ge 0 to use both vertex i and j.
Fairness. Edges incident to vertex i should pay \le w_i in total.
Fairness lemma. For any vertex cover S and any fair prices p_e: \sum_e p_e \le w(S).
Pf. \sum_{e \in E} p_e \le \sum_{i \in S} \sum_{e = (i, j)} p_e \le \sum_{i \in S} w_i = w(S). The first inequality holds because each edge has at least one endpoint in S (so is counted at least once); the second is fairness.
WEIGHTED-VERTEX-COVER (G, w)
S ← ∅.
FOREACH e ∈ E: p_e ← 0.
WHILE there exists an edge (i, j) such that neither i nor j is tight:
  Select such an edge e = (i, j).
  Increase p_e as much as possible until i or j becomes tight.
S ← set of all tight nodes.
RETURN S.

Def. Node i is tight if \sum_{e = (i, j)} p_e = w_i.
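A sketch of the pricing method in Python. A single pass over the edges suffices, since once an edge has a tight endpoint it never loses it (names are mine):

```python
def pricing_vertex_cover(n, edges, w):
    """Pricing (primal-dual) method for weighted vertex cover:
    raise the price of an edge with no tight endpoint until one
    endpoint becomes tight; return the set of tight nodes."""
    price = {e: 0 for e in edges}
    paid = [0] * n                    # total price paid toward each vertex
    tight = lambda i: paid[i] == w[i]
    for (i, j) in edges:
        if not tight(i) and not tight(j):
            # largest increase that keeps both endpoints fair
            delta = min(w[i] - paid[i], w[j] - paid[j])
            price[(i, j)] += delta
            paid[i] += delta
            paid[j] += delta
    return [i for i in range(n) if tight(i)]
```

On a triangle with weights (1, 10, 10) this returns the cover {0, 1} of weight 11, which here equals the optimum (and is in general within a factor of 2).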
Theorem. Pricing method is a 2-approximation for
WEIGHTED-VERTEX-COVER.
Pf.
\begin{aligned} w(S) & = \sum_{i \in S} w_i & \text{all nodes in}\ S\ \text{tight} \\ & = \sum_{i \in S} \sum_{e=(i,j)} p_e & \text{def. of tight} \\ & \le \sum_{i \in V} \sum_{e=(i,j)} p_e & S \subseteq V,\ \text{prices} \ge 0 \\ & = 2 \sum_{e \in E} p_e & \text{each edge counted twice} \\ & \le 2 w(S^\ast) & \text{fairness lemma} \\ \end{aligned}
Weighted vertex cover. Integer linear programming formulation.
\begin{aligned} \text{(ILP)}\ & \min & \sum_{i \in V} w_i x_i & & \\ & \text{s.t.} & x_i + x_j & \ge 1 & (i, j) \in E \\ & & x_i & \in \{0, 1\} & i \in V \\ \end{aligned}
Observation. If x^\ast is optimal solution to ILP, then S = \{ i \in V : x_i^\ast = 1\} is a min-weight vertex cover.
Given integers a_{ij}, b_i, c_j, find integers x_j that satisfy:
\begin{aligned} \min c^T x & \\ \text{s.t.}\ Ax & \ge b \\ x & \ge 0 \\ x & \ \text{is integral} \\ \end{aligned}
\begin{aligned} \min \sum_{j=1}^n c_j x_j & & \\ \text{s.t.}\ \sum_{j=1}^n a_{ij} x_j & \ge b_i & 1 \le i \le m \\ x_j & \ge 0 & 1 \le j \le n \\ x_j & \ \text{is integral} & 1 \le j \le n \\ \end{aligned}
Observation. Vertex cover formulation proves that INTEGER-PROGRAMMING is an NP-hard optimization problem.
Given integers a_{ij}, b_i, c_j, find real numbers x_j that satisfy:
\begin{aligned} \min c^T x & \\ \text{s.t.}\ Ax & \ge b \\ x & \ge 0 \\ \end{aligned}
\begin{aligned} \min \sum_{j=1}^n c_j x_j & & \\ \text{s.t.}\ \sum_{j=1}^n a_{ij} x_j & \ge b_i & 1 \le i \le m \\ x_j & \ge 0 & 1 \le j \le n \\ \end{aligned}
Linear. No x^2, xy, \arccos(x), x(1 - x), etc.
Simplex algorithm. [Dantzig 1947] Can solve LP in practice.
Ellipsoid algorithm. [Khachiyan 1979] Can solve LP in poly-time.
Interior point algorithms. [Karmarkar 1984, Renegar 1988, … ] Can solve LP both in poly-time and in practice.
LP geometry in 2D.
Linear programming relaxation.
\begin{aligned} \text{(ILP)}\ & \min & \sum_{i \in V} w_i x_i & & \\ & \text{s.t.} & x_i + x_j & \ge 1 & (i, j) \in E \\ & & x_i & \in \{0, 1\} & i \in V \\ \end{aligned}
\begin{aligned} \text{(LP)}\ & \min & \sum_{i \in V} w_i x_i & & \\ & \text{s.t.} & x_i + x_j & \ge 1 & (i, j) \in E \\ & & x_i & \ge 0 & i \in V \\ \end{aligned}
Observation. Optimal value of LP is \le optimal value of ILP, i.e., at least as good.
Pf. LP has fewer constraints.
Note. LP solution x^\ast may not correspond to a vertex cover. (even if all weights are 1)
Q. How can solving LP help us find a low-weight vertex cover?
A. Solve LP and round fractional values in x^\ast.
Lemma. If x^\ast is an optimal solution to LP, then S = \{ i \in V : x_i^\ast \ge \frac{1}{2} \} is a vertex cover whose weight is at most twice the min possible weight.
Pf. [ S is a vertex cover ] Consider an edge (i, j) \in E. Since x_i^\ast + x_j^\ast \ge 1, either x_i^\ast \ge \frac{1}{2} or x_j^\ast \ge \frac{1}{2}, so edge (i, j) is covered.
Pf. [ S has desired weight ] Let S^\ast be an optimal vertex cover. Then \sum_{i \in S^\ast} w_i \ge \sum_{i \in V} w_i x_i^\ast \ge \frac{1}{2} \sum_{i \in S} w_i, where the first inequality is the LP bound and the second holds since x_i^\ast \ge \frac{1}{2} for all i \in S.
Theorem. The rounding algorithm is a 2-approximation algorithm.
Pf. Lemma + fact that LP can be solved in poly-time.
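As an illustration, here is the rounding step alone; the fractional solution x^\ast is taken as given (e.g. produced by any LP solver), and the function name is mine:

```python
def round_lp_solution(n, edges, x_star):
    """Round a fractional LP vertex-cover solution: keep every
    vertex i with x*_i >= 1/2."""
    S = {i for i in range(n) if x_star[i] >= 0.5}
    # sanity check: S covers every edge, since x*_i + x*_j >= 1
    # forces max(x*_i, x*_j) >= 1/2
    assert all(i in S or j in S for (i, j) in edges)
    return S
```

On a triangle the LP optimum (all weights 1) is x^\ast = (1/2, 1/2, 1/2), and rounding yields the whole vertex set, twice the optimal cover size, matching the factor-2 analysis.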
Theorem. [Dinur–Safra 2004] If \mathcal{P} \ne \mathcal{NP}, then no \rho-approximation algorithm for WEIGHTED-VERTEX-COVER for any \rho < 1.3606 (even if all weights are 1).
Open research problem. Close the gap.
Theorem. [Khot–Regev 2008] If Unique Games Conjecture is true, then no (2 - \epsilon)-approximation algorithm for WEIGHTED-VERTEX-COVER for any \epsilon > 0.
Open research problem. Prove the Unique Games Conjecture.
Input. Set of m machines M; set of n jobs J. Job j takes t_j time and may only run on an authorized subset of machines M_j \subseteq M.
Def. Let J_i be the subset of jobs assigned to machine i. The load of machine i is L_i = \sum_{j \in J_i} t_j.
Def. The makespan is the maximum load on any machine = \max_i L_i.
Generalized load balancing. Assign each job to an authorized machine to minimize makespan.
ILP formulation. x_{ij} = time that machine i spends processing job j.
\begin{aligned} \text{(ILP)}\ & \min & L & & \\ & \text{s.t.} & \sum_i x_{ij} & = t_j & \forall j \in J \\ & & \sum_j x_{ij} & \le L & \forall i \in M \\ & & x_{ij} & \in \{0, t_j\} & \forall j \in J, i \in M_j \\ & & x_{ij} & = 0 & \forall j \in J, i \notin M_j \\ \end{aligned}
LP relaxation.
\begin{aligned} \text{(LP)}\ & \min & L & & \\ & \text{s.t.} & \sum_i x_{ij} & = t_j & \forall j \in J \\ & & \sum_j x_{ij} & \le L & \forall i \in M \\ & & x_{ij} & \ge 0 & \forall j \in J, i \in M_j \\ & & x_{ij} & = 0 & \forall j \in J, i \notin M_j \\ \end{aligned}
Lemma 1. The optimal makespan L^\ast \ge \max_j t_j.
Pf. Some machine must process the most time-consuming job.
Lemma 2. Let L be the optimal value to the LP. Then, the optimal makespan L^\ast \ge L.
Pf. LP has fewer constraints than ILP formulation.
Lemma 3. Let x be a solution to LP. Let G(x) be the graph with an edge between machine i and job j if x_{ij} > 0. Then G(x) can be made acyclic.
Pf. (deferred)
Q. Why can a job connect to multiple machines?
A. The LP may split a job fractionally across several machines: x_{ij} > 0 for more than one machine i.
Rounded solution. Find LP solution x where G(x) is a forest. Root forest G(x) at some arbitrary machine node r.
Lemma 4. Rounded solution only assigns jobs to authorized machines.
Pf. If job j is assigned to machine i, then x_{ij} > 0. LP solution can only assign positive value to authorized machines.
Lemma 5. If job j is a leaf node and machine i = parent(j), then x_{ij} = t_j.
Pf. Since j is a leaf, x_{kj} = 0 for every machine k other than i. The LP constraint \sum_i x_{ij} = t_j then forces x_{ij} = t_j.
Lemma 6. At most one non-leaf job is assigned to a machine.
Pf. The only possible non-leaf job assigned to machine i is parent(i).
Theorem. Rounded solution is a 2-approximation.
Pf. Let J(i) be the set of leaf jobs assigned to machine i. Then:
\begin{aligned} \sum_{j \in J(i)} t_j & = \sum_{j \in J(i)} x_{ij} & \text{LEMMA 5} \\ & \le \sum_{j \in J} x_{ij} \le L & \text{LP} \\ & \le L^\ast & \text{LEMMA 2} \\ \end{aligned}
By Lemma 6, the only other job machine i can receive is its parent job, whose processing time is \le L^\ast by Lemma 1. Hence L_i \le 2 L^\ast.
Flow formulation of LP.
\begin{aligned} \sum_i x_{ij} & = t_j & \forall j \in J \\ \sum_j x_{ij} & \le L & \forall i \in M \\ x_{ij} & \ge 0 & \forall j \in J, i \in M_j \\ x_{ij} & = 0 & \forall j \in J, i \notin M_j \\ \end{aligned}
Observation. Solution to feasible flow problem with value L are in 1-to-1 correspondence with LP solutions of value L.
Lemma 3. Let (x, L) be a solution to LP. Let G(x) be the graph with an edge from machine i to job j if x_{ij} > 0. We can find another solution (x', L) such that G(x') is acyclic.
Pf. Let C be a cycle in G(x). Augment the flow along C: alternately increase and decrease the x_{ij} around the cycle by the same amount, preserving all LP constraints, until some x_{ij} drops to 0. This removes at least one edge from G(x); repeat until no cycle remains.
Running time. The bottleneck operation in our 2-approximation is solving one LP with mn + 1 variables.
Remark. Can solve LP using flow techniques on a graph with m + n + 1 nodes: given L, find feasible flow if it exists. Binary search to find L^\ast.
Extensions: unrelated parallel machines. [Lenstra–Shmoys–Tardos 1990]
PTAS. (1 + \epsilon)-approximation algorithm for any constant \epsilon > 0.
Consequence. PTAS produces arbitrarily high quality solution, but trades off accuracy for time.
This section. PTAS for knapsack problem via rounding and scaling.
Knapsack problem. Given n objects and a knapsack of capacity W; item i has value v_i > 0 and weight w_i > 0. Goal: fill the knapsack so as to maximize total value.
Ex: \{ 3, 4 \} has value 40 (and weight 11).
item | value | weight |
---|---|---|
1 | 1 | 1 |
2 | 6 | 2 |
3 | 18 | 5 |
4 | 22 | 6 |
5 | 28 | 7 |
SUBSET-SUM. Given a set X, values u_i ≥ 0, and an integer U, is there a subset S \subseteq X whose elements sum to exactly U?
KNAPSACK. Given a set X, weights w_i ≥ 0, values v_i ≥ 0, a weight limit W, and a target value V, is there a subset S \subseteq X such that:
\sum_{i \in S} w_i \le W, \sum_{i \in S} v_i \ge V
Theorem. SUBSET-SUM \le_P KNAPSACK.
Pf. Given instance (u_1, .., u_n, U) of SUBSET-SUM, create KNAPSACK instance with v_i = w_i = u_i and V = W = U. The weight limit forces \sum_{i \in S} u_i \le U and the value target forces \sum_{i \in S} u_i \ge U, so a feasible S sums to exactly U.
Def. OPT(i, w) = max value subset of items 1, .., i with weight limit w.
Case 1. OPT does not select item i.
Case 2. OPT selects item i.
OPT(i, w) = \left\{ \begin{array}{ll} 0 & \text{if}\ i = 0 \\ OPT(i-1, w) & \text{if}\ w_i > w \\ \max \{ OPT(i-1, w), v_i + OPT(i-1, w-w_i) \} & \text{otherwise} \\ \end{array}\right.
Theorem. Dynamic programming algorithm I computes the optimal value in O(n W) time.
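The recurrence above translates directly into a table-filling implementation (a sketch; names are mine):

```python
def knapsack_dp(values, weights, W):
    """DP over weight limits: OPT(i, w) = max value using items
    1..i with weight limit w. O(n W) time and space."""
    n = len(values)
    OPT = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        v, wt = values[i - 1], weights[i - 1]
        for w in range(W + 1):
            OPT[i][w] = OPT[i - 1][w]              # case 1: skip item i
            if wt <= w:                            # case 2: take item i
                OPT[i][w] = max(OPT[i][w], v + OPT[i - 1][w - wt])
    return OPT[n][W]
```

On the example table (values 1, 6, 18, 22, 28; weights 1, 2, 5, 6, 7) with W = 11, the optimum is 40, achieved by items {3, 4}.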
Def. OPT(i, v) = min weight of a knapsack for which we can obtain a solution of value \ge v using a subset of items 1, .., i.
Note. Optimal value is the largest value v such that OPT(n, v) \le W.
Case 1. OPT does not select item i.
Case 2. OPT selects item i.
OPT(i, v) = \left\{ \begin{array}{ll} 0 & \text{if}\ v \le 0 \\ \infty & \text{if}\ i = 0\ \text{and}\ v > 0 \\ \min \{ OPT(i-1, v), w_i + OPT(i-1, v-v_i) \} & \text{otherwise} \\ \end{array}\right.
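Dynamic programming algorithm II, sketched with a rolling value-indexed array (iterating v downward keeps each item 0/1; names are mine):

```python
def knapsack_dp_by_value(values, weights, W):
    """DP over target values: OPT(i, v) = min weight achieving
    value >= v with items 1..i; answer = largest v with
    OPT(n, v) <= W."""
    n, INF = len(values), float("inf")
    total = sum(values)
    # OPT[v] = min weight of a subset with value >= v
    OPT = [0] + [INF] * total
    for i in range(n):
        vi, wi = values[i], weights[i]
        for v in range(total, 0, -1):   # downward: item i used once
            take = wi + OPT[max(0, v - vi)]
            if take < OPT[v]:
                OPT[v] = take
    return max(v for v in range(total + 1) if OPT[v] <= W)
```

It returns the same optimum as the weight-indexed DP, but its running time scales with the total value rather than with W.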
Theorem. Dynamic programming algorithm II computes the optimal value in O(n^2 v_{max}) time, where v_{max} is the maximum value of any item.
Pf. The table has O(n) rows and O(n v_{max}) columns (any solution's total value is at most n v_{max}); each entry takes O(1) time to compute.
Remark 1. Not polynomial in input size! (pseudo-polynomial)
Remark 2. Polynomial time if values are small integers.
Intuition for approximation algorithm.
item | value | weight |
---|---|---|
1 | 934221 | 1 |
2 | 5956342 | 2 |
3 | 17810013 | 5 |
4 | 21217800 | 6 |
5 | 27343199 | 7 |
item | value | weight |
---|---|---|
1 | 1 | 1 |
2 | 6 | 2 |
3 | 18 | 5 |
4 | 22 | 6 |
5 | 28 | 7 |
Round up all values:
\bar{v_i} = \lceil \frac{v_i}{\theta} \rceil \theta, \hat{v_i} = \lceil \frac{v_i}{\theta} \rceil
Observation. Optimal solutions to problem with \bar{v} are equivalent to optimal solutions to problem with \hat{v}.
Intuition. \bar{v} close to v so optimal solution using \bar{v} is nearly optimal; \hat{v} small and integral so dynamic programming algorithm II is fast.
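Putting rounding and dynamic programming algorithm II together gives a sketch of the scheme (assumes 0 < \epsilon \le 1 and that every item fits alone; floating-point error in computing \theta is ignored here; names are mine):

```python
import math

def knapsack_fptas(values, weights, W, eps):
    """Rounding-and-scaling sketch: scale values by
    theta = eps * v_max / (2n), round up to small integers v_hat,
    then run the value-indexed DP on v_hat."""
    n = len(values)
    theta = eps * max(values) / (2 * n)
    v_hat = [math.ceil(v / theta) for v in values]
    INF = float("inf")
    total = sum(v_hat)
    # min_w[v] = (weight, item list) of lightest subset whose
    # rounded value is >= v
    min_w = [(0, [])] + [(INF, None)] * total
    for i in range(n):
        for v in range(total, 0, -1):      # downward: 0/1 items
            prev_w, prev_items = min_w[max(0, v - v_hat[i])]
            if prev_w + weights[i] < min_w[v][0]:
                min_w[v] = (prev_w + weights[i], prev_items + [i])
    best_v = max(v for v in range(total + 1) if min_w[v][0] <= W)
    return min_w[best_v][1]
```

The returned subset is optimal for the rounded values \hat{v}, hence within a (1 + \epsilon) factor of the true optimum by the theorem below.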
Theorem. If S is the solution found by the rounding algorithm and S^\ast is any other feasible solution satisfying the weight constraint, then (1 + \epsilon) \sum_{i \in S} v_i \ge \sum_{i \in S^\ast} v_i.
Pf.
\begin{aligned} \sum_{i \in S^\ast} v_i & \le \sum_{i \in S^\ast} \bar{v_i} && \text{round up}\\ & \le \sum_{i \in S} \bar{v_i} && \text{optimality} \\ & \le \sum_{i \in S} (v_i + \theta) && \text{rounding gap} \\ & \le \sum_{i \in S} v_i + n\theta && |S| \le n \\ & = \sum_{i \in S} v_i + \frac{1}{2} \epsilon v_{max} && \theta = \epsilon v_{max} / 2n \\ & \le (1 + \epsilon) \sum_{i \in S} v_i && v_{max} \le 2 \sum_{i \in S} v_i \\ \end{aligned}
Theorem. For any \epsilon > 0, the rounding algorithm computes a feasible solution whose value is within a (1 + \epsilon) factor of the optimum in O(n^3 / \epsilon) time.
Pf. Dynamic programming algorithm II on the rounded instance runs in O(n^2 \hat{v}_{max}) = O(n^3 / \epsilon) time, since
\hat{v}_{max} = \lceil \frac{v_{max}}{\theta} \rceil = \lceil \frac{2n}{\epsilon} \rceil