Gready Alg and Local Search:Minimum-Degree Spanning Tree

 

This blog is still talking about greedy algorithm and local search. And this blog will focus on the minimum-degree Spanning Tree problem.

Statement of Problem

Minimum-degree spanning trees:

  • Instance: Graph $G = (V, E)$ with $\vert V\vert = n$.

  • Solution: Find a spanning tree $T$ of $G$.

  • Objective: minimize the maximum degree of vertices in $T$.

Note: A spanning tree has maximum degree two iff it is a Hamiltonian path. And deciding if a graph has a Hamiltonian path is NP complete.

Theorem: Given a graph it is NP-complete to decide whether it has a spanning tree of maximum degree two.

Main idea of Local Search Algorithm

Start with arbitrary spanning tree $T$

To reduce degree of vertex $u$ with high ${d_T(u)}$, find all edge ${{v, w}}$ s.t. ${T + {v, w}}$ create a cycle with $u$. If $\max{d_T(v), d_T(w)} \leq d_T(u) - 2$; We can pick up an edge $e$ in this cycle which is also incident to node ${u}$. Then we create $T’ = T - e + (v, w)$ (we call it a local move).

Note: $d_{T’}(u) = d_T(u) - 1$ and $\max{d_T(v), d_T(w)} \leq d_T(u) - 2 + 1 = d_T(u) - 1$. That means we indeed reduce the degree in this local part of the spanning tree.

In polynomial time, we can do local move on very high degree vertices.

Algorithm and Analysis

Local Search Algorithm:

  • Step 1: Start with arbitrary spanning tree $T$. Let ${\Delta(T) = \max_{u \in V} d_T(u); l = \left\lceil \log n \right\rceil}$

  • Step 2: pick a node $u$ with degree at least $\Delta(T) - l$ and try to reduce its degree by local move.

  • Step 3: Stop if no improvement possible for any node $u$ with degree at least $\Delta(T) - l$.

Approximation Analysis

Let’s first analysis the approximation of algorithm.

Theorem: Let $T^*$ be a locally optimal tree. Then ${\Delta(T^*) \leq 2OPT + l, l = \left\lceil \log n \right\rceil}$

  1. We can first find the lower bound of $OPT$. Let’s remove $k$ edges from $T$; this creates $k + 1$ connected components.
  • Let’s denote the set $S$ of nodes such that each $e \in E$ connecting two of the $k + 1$ components is incident on $S$. (In another word, we find all the edge in $G$ that connected any pair of the components, and we pick up the nodes of these edge in set $S$.)

  • For any spanning tree of $G$, it must have at least $k$ edges with endpoints in these $k+1$ different components.

  • For any spanning tree of $G$ average degree (here means the degree in the the spanning tree instead of the degree in $G$) of a node in $S$ is at least $\frac{k}{\vert S \vert }$. (If we want to connect to component, we have to add an edge with an endpoint in $S$, so the sum of degree of each note in $S$ is at least $k$, which implies the average degree is at least $\frac{k}{\vert S \vert }$.)

  • Hence we have $OPT \geq \max_{s \in S}{d_T(s)} \geq \frac{k}{\vert S \vert}$.

  1. Let $S_i$ be the set of nodes with degree at least $i$ in $T^*$.

Claim 1: For each $S_i$, with $i \geq \Delta(T) - l + 1$:

a) there are at least $(i - 1)\vert S_i \vert + 1$ distinct edges of $T^*$ incident on $S_i$.

b) after removing these edges, any edge connecting two different components are incident on $S_{i-1}$.

Claim 2: $\exists i \geq \Delta(T) - l + 1$ s.t. $\vert S_{i-1} \vert \leq 2\vert S_i \vert$.

By claim 2, we can find such $i$ and take $S = S_{i-1}$, then we have

\[OPT \geq \frac{(i-1)\vert S_i\vert + 1}{\vert S_{i-1} \vert } \geq \frac{(i-1)\vert S_i \vert + 1}{2 \vert S_i \vert } \geq \frac{i-1}{2} + \frac{1}{2 \vert S_i \vert} \geq (\Delta(T)-l)/2 \geq (\Delta(T^*)-l)/2\]

Note, here we don’t know exactly the edge set that will generate set $S_{i-1}$. But, we can observe the node set $S_{i}$, if we remove all the edges of $T^*$ that are incident on $S_i$, we can get $S_{i-1}$. In this process, we know $k \geq (i - 1) \vert S_i \vert + 1$. Thus, we can use the conclusion $OPT \geq \frac{k}{\vert S \vert}$. Additionally, we do local move for these high degree nodes, so $\Delta(T) \geq \Delta(T^*)$.

Hence, we prove the theorem. $\square$

Now, I will show the proof of above claims that are not trivial.

Claim 1: For each $S_i$, with $i \geq \Delta(T) - l + 1$:

a) there are at least $(i - 1)\vert S_i\vert + 1$ distinct edges of $T^*$ incident on $S_i$.

b) after removing these edges, any edge connecting two different components are incident on $S_{i-1}$.

Proof.

a) there are at least $(i - 1)\vert S_i\vert + 1$ distinct edges of $T^*$ incident on $S_i$.

There are at least $i\vert S_i \vert$ edges of $T^*$ incident on nodes of $S_i$ as each vertex of $S_i$ has degree at least $i$. Because $T^*$ is a spanning tree, so there at most $\vert S_i \vert - 1$ join two vertices in $S_i$ (In another word, the edges that linked two nodes in $S_i$ are count twice in $i \vert S_i \vert $.) Hence, there at least $i \vert S_i \vert - \vert S_i \vert - 1 = (i-1)\vert S_i \vert +1 $ distinct edges of $T^*$ are incident on $S_i$.

b) after removing these edges, any edge connecting two different components are incident on $S_{i-1}$.

After removing all the edges in $T^*$ incident on $S_i$, we will break $T^*$ into several components. Now, we focus on the edges (in $G$) that connect two components (after removing above edges).

Case 1: The edge incident to $S_i$, because $S_i \subseteq S_{i-1}$. Hence this kind of edges are also incident to $S_{i-1}$.

Case 2: The edge is not incident to $S_i$, but we know it can connect two component. And the two components are created by removing the edges in $T^*$ incident to $S_i$. So, there exists a node $u$ belonging to $S_i$ in one of the two components. So, this edge $e={s,t}$ creates a cycle in $T^*+e$, and the cycle includes node $u$. Because $T^*$ is locally optimal, and $d_{T^*}(u) \geq i \geq \Delta(T) - \ell + 1$, and we know the degree of $s,t$ cannot be both $ \leq i -2$, otherwise, their degree in $T^*$ will must be $\leq d_{T^*}(u)-2$ and we have can do local move for $u$ by $T^*+e$ ($d_{T^*}(u) \geq \Delta(T) - \ell$, we can pick up $u$ in algorithm) . Hence, one of endpoints in $e$ (w.l.g we denote as $s$) will have $d_{T^*}(s) > i-2 $ i.e. $d_{T^*}(s) \geq i- 1$. Thus, we get $s \in S_{i-1}$. That means $e$ is incident on $S_{i-1}$. $\square$

Claim 2: $\exists i \geq \Delta(T) - l + 1$ s.t. $\vert S_{i-1} \vert \leq 2\vert S_i \vert$.

Proof. Suppose, for any $i\geq \Delta(T) - l + 1$, we have $\vert S_{i-1} \vert > 2\vert S_i \vert$. Let $i = \Delta(T) - l + 1$, we have \(\vert S_{\Delta(T) -\ell} \vert > 2 \vert S_{\Delta(T) -\ell+1} \vert \cdots 2^\ell \vert S_{\Delta(T)} \vert > 2^{\lceil \lg n \rceil} \vert S_{\Delta(T)} \vert > n \vert S_{\Delta(T)} \vert > n\)

Note, here $\vert S_{\Delta(T)}\vert \geq 1$, otherwise, it cannot be a spanning tree. The above conclusion contradicts to the fact that there are $n$ vertices in the graph $G$. $\square$

Review the idea of proof for approx ratio

The algorithm starts with an arbitrary spanning tree $T$ and using $\Delta(T)$ as parameter to conduct the local search, and terminates resulting in $T^*$. We define a series set $S_i$ (where $i$ relates to $\Delta(T)$), and we prove that if we moving all the edges (denote as $E_i$) in $T^*$ incident $S_i$, we find any edges that connect two components (generating by removing $E_i$) are incident to $S_{i-1}$, and we can lower bound (which relates to $\vert S_i \vert$) the size of $\vert E_i \vert $ . By claim 2, we can find a specific $i-1$, and consider this specific set $S_{i-1}$, that can have a upper bound (relates to size of $\vert S_i\vert$). Through the relation between average degree in $T^*$ of $S_{i-1}$ and $OPT$, we can get the lower bound of $OPT$ is $(\Delta(T)-\ell) /2$, which will not less than $(\Delta(T^*)-\ell) /2$.

Running time analysis

Then, let’s analysis the running time. In this proof, we will learn the strategy of potential function.

Theorem: The algorithm finds a locally optimal tree polynomial time. For tree $T$ define the potential function $\phi(T) = \sum_{v \in V} 3^{d_T(v)}$.

  • Initially: $\phi(T) \leq n^3\Delta(T) \leq n3^n$

  • Lowest: for Hamiltonian path: potential is $2\cdot3 + (n - 2)3^2 > n$

Claim 1: At each step potential function drops by $\left( 1 - \frac{2}{27n^3} \right)$.

Then, we know after $\frac{27}{2} n^4 \ln 3$ local moves the potential function is:

\[\left(1 - \frac{2}{27n^3}\right)^{\frac{27}{2} n^4 \ln 3} \cdot (n3^n)\leq e^{-n \ln 3} \cdot (n3^n) = n\]

As lower bound of potential function of spanning tree is $n$, so after $O(n^4)$ moves no local move possible and we must terminates the algorithm. $\square$

Now, I will show the decrease ration mentioned in claim 1.

Claim 1: At each step potential function drops by $\left( 1 - \frac{2}{27n^3} \right)$.

Proof. Denote the current spanning tree is $T’$. Suppose the algorithm reduces the degree of a vertex $u$ from $i$ to $i - 1$, where $i \geq \Delta(T) - \ell$, and adds an edge $(v, w)$. We know the degree of $v,w$ at most $i-2$ before local move. And we know the function $f(x) = 3^x - 3^{x-1} = 2 \cdot 3^{x-1} $ is a monotonically increasing.

Then the increase in the potential function due to increasing the degree of $v$ and $w$ is at most $2 \cdot (3^{i-1} - 3^{i-2}) = 4 \cdot 3^{i-2}$. The decrease in the potential function due to decreasing the degree of $u$ is $3^i - 3^{i-1} = 2 \cdot 3^{i-1}$. And we have

\[3^\ell = 3^{\lceil \log_2 n\rceil} \leq 3 \cdot 3^{\log_2 n} \leq 3 \cdot 2^{2\log_2 n} = 3n^2\]

Therefore, the overall decrease in the potential function is at least

\[\begin{aligned} 2 \cdot 3^{i-1} - 4 \cdot 3^{i-2} &= 2 \cdot 3^{i} \geq 2 \cdot 3^{\Delta(T) - \ell} \\ &= \frac{2\cdot 3^{\Delta(T)}}{3^\ell}\\ & \geq \frac{2\cdot 3^{\Delta(T)}}{3n^2} \\ & = \frac{2\cdot n 3^{\Delta(T)}}{3n^3} \\ & \geq \frac{2\cdot n 3^{\Delta(T')}}{3n^3} \\ & \geq \frac{2\cdot \sum_{v\in V} 3^{d_{T'}(v)}}{27n^3} \\ & = \frac{2}{27n^3} \cdot \phi(T') \\ \end{aligned}\]

Thus, for the resulting tree $T’’$ we have that $\phi(T’’) \leq \left( 1 - \frac{2}{27n^3} \right) \phi(T’)$. $\square$