Randomized Algs:Permutation Routing Problem

 

This blog, we still talk about randomized algorithms. Let’s consider the parallel computation problem: A network of parallel processors is modeled by a directed graph ${G = (N, E)}$. The nodes ${N}$ represent the processors and the edges ${E}$ model the communication links between the processors. All communication between processors occurs in synchronous steps. Each link can carry at most one unit message (packet) in one step. During a step, a processor can send at most one packet to each of its neighbors. Each processor is uniquely identified by a number between 1 and ${N}$.

Permutation Routing Problem

We can abstract the above problem to following statement.

Given a directed graph on ${N}$ nodes, where each node ${i}$ initially contains one packet destined for some node ${d(i)}$, s.t. ${d(\cdot)}$ is a permutation. In each step, every edge can carry a single packet. A node that may send a packet on each outgoing edge (if it has the packets).

A route for a packet is a list of edges it can follow from its source to its destination.

If two packets want to use the same edge, one may have to wait. The queueing discipline for an algorithm is how it decides which packet goes first (Like FIFO, first in first out).

An oblivious algorithm for the permutation routing problem satisfies the following property: if the route followed by the packet starting at $v_i$ depends only on $d(i)$, not on $d(j)$ for any $j \neq i$. (Note: Oblivious routing algorithms are attractive for their simplicity of implementation : the communication hardware at each node in the network can determine the next link on its route, simply by looking at the source and destination information carried by a packet.)

Lower bound of oblivious algorithm

Theorem 1.1 For any deterministic oblivious permutation routing algorithm on a network of ${N}$ nodes each of out-degree ${d}$, there is an instance of permutation routing requiring $\Omega \left( \sqrt{\frac{N}{d}} \right)$ steps 1.

Its proof will not be shown in this blog.

Consider the implications of this theorem for the case when the network is the Boolean hypercube, a popular network for parallel processing. The Boolean hypercube has ${N=2^n}$ nodes connected in the following manner:

If $(i_0, \ldots, i_{n-1})$ and $(j_0, \ldots, j_{n-1})$ are the (ordered) binary representations of node $i$ and node $j$ respectively, then there exists a directed edge $e_{ij} \in E$ and a directed edge $e_{ji} \in E$ if and only if $(i_0, \ldots, i_{n-1})$ and $(j_0, \ldots, j_{n-1})$ differ in exactly one position. Note that the maximum number of transitions is bounded by $\sqrt{2^n/n}$, which is an amazing huge number.

Bit Fixing Strategy

  1. Given that the source and destination addresses are $n$-bit vectors, consider the following simple choice of route to send $v_i$ from $i$ to the node $\sigma(i)$:

  2. Scan the bits of $\sigma(i)$ from left to right, and compare them with the address of the current location of $v_i$.

  3. Send $v_i$ out of the current node along the edge corresponding to the leftmost bit in which the current position and $\sigma(i)$ differ.

The strategy is best explained through an example. In order to go from $(1, 1, 0, 1)$ to $(0, 0, 0, 0)$, the packet takes the following path:

\[(1, 1, 0, 1) \rightarrow (0, 1, 0, 1) \rightarrow (0, 0, 0, 1) \rightarrow (0, 0, 0, 0).\]

Randomized Routing

Algorithm

For each packet ${v_i}$, independently, define its route as follows:

Phase I Pick random ${\sigma(i)}$ from ${{1, \ldots, N}}$. Packet ${v_i}$ travels to ${\sigma(i)}$ using bit-fixing strategy.

Phase II Packet ${v_i}$ travels from ${\sigma(i)}$ to ${d(i)}$, using bit-fixing strategy.

Queueing discipline: Arbitrary (FIFO).

Analysis

Let ${\text{delay}(v_i)}$ denote the number of steps ${v_i}$ spends in queues waiting for other packets to move during Phase I. Total #steps for ${v_i}$ in Phase I is at most ${n + \text{delay}(v_i)}$. Now, we hope to bound the ${\text{delay}(v_i)}$.

**Lemma: ** Let ${p_i = (e_1, \ldots, e_k)}$ be the route for ${v_i}$, and let ${S_i}$ be the set of other paths intersecting ${p_i}$. Then $\text{delay}(v_i)\leq \vert {S_i}\vert$.

Proof.

To prove this lemma, we need to prove the following points:

Point 1: If the bit fixing algorithm is used to route a packet ${v_i}$ from $i$ to ${\sigma(i)}$ and ${v_j}$ from $j$ to ${\sigma(j)}$, then their routes do not rejoin after they separate.

Suppose $k$ be the node at which the two paths separate and $l$ be the node at which they rejoin. Note that the route determined by the bit fixing scheme for ${v_i}$ and ${v_j}$ from $k$ to $l$ depends only on the bit representations of $k$ and $l$, therefore ${v_i}$ and ${v_j}$ must follow the same route. A contradiction to the assumption that $k$ is the last node before the paths separate. That means, if ${p_i,p_j}$ are intersected with an edge, their overlap may start from some ${k}$ and end at some.

Point 2: The lag of packet ${v_i}$ is at most ${\vert S_i \vert}$.

If a packet is ready to follow edge ${e_j}$ at time ${t}$, we define its lag at time ${t}$ to be ${t-j}$. The lag of ${v_i}$ is initially zero, and the delay incurred by ${v_i}$ is its lag when it traversers ${e_k}$ (the end of the route).

Note: 1) At beginning, we set time ${t=0}$. If ${v_i}$ passes the route without any wait, in the end ${t-k = k - k = 0}$. Hence, we can use ${t-j}$ to measure the delay before time ${t}$. 2) Additionally, there may be more than one packets that are ready to pass edge ${e_j}$ in time ${t}$, there lags in time ${t}$ are same, that is ${t-j}$.

Goal: We will show that each step at which the lag of ${v_i}$ increases by one can be charged to a distinct member of ${S_i}$.

The way of charging: ** We will charge each increase ${\textbf{lag}(v_i)}$ to a packet in ${S_i}$ **leaving with lag as ${\ell}$.

Claim 1 (Existence of lag ${\ell}$): ** When **lag of ${v_i}$ increases from $l$ to $l+1$. There exists a packet that is waiting in the queue ready to pass $e_j$ at next time $t$, that means there exists a packet with lag as ${t-j= \textbf{lag}(v_i) = \ell}$.

Claim 2 (Leaving): Let $t’$ be the last time step at which any packet in $S$ has a lag $l$. Note that there are only a finite number of time steps. (Each packed in ${S}$ may have a series of lags in the some time steps, now we consider the packet with lag as ${\ell}$ and with max time step ${t’}$, and we denote the packet as ${v}$ and ready to follow ${e_{j’}}$ at time ${t’}$ such that $t’ - j’ = l$.) Since $v$ demands ${e_j’}$, we assume a packet $\omega$ actually travels on ${e_j’}$ at time $t’$ (there may be many packet waiting in ${e_{j’}}$ at time ${t’}$, we denote the real issue one as ${\omega}$, it’s possible ${v = \omega}$). Then we have that $\omega$ must leave ${p_i}$ at time $t’$. Otherwise, ${\omega}$ would demand ${e_{j’+1}}$ at time $t’ + 1$ (because it just pass ${e_{j’}}$ at time ${t’}$) implying some packet must actually follow ${e_{j’+1}}$ at time $t’ + 1$, which violates the minimality of $t’$. By first item proved in this lemma, we know $\omega$ will never returns to ${p_i}$. Therefore, we this increase in lag to a packet ${\omega}$.

Hence, we charge every increase to a distinct packet in ${S_i}$, which implies ${\text{delay}(v_i) \leq \vert S_i \vert }$. $\square$

Theorem 1.1 With probability at least $1 - 2^{-5n}$, the packet ${v_i}$ reaches ${\sigma(i)}$ in $7n$ or fewer steps.

**Proof. ** Define a random variable ${H_{ij} = 1}$ if ${p_i}$ and ${p_j}$ share at least one edge, and 0 otherwise. Then the total delay incurred by ${v_i}$ is at most ${\sum_{j=1}^N H_{ij}}$.

Since, the ${\sigma(\cdot)}$ of various packets are chosen independently at random, the ${H_{ij}}$’s are independent Poisson trials for ${j \neq i}$. Thus, to bound the delay of packet ${v_i}$ from above using the Chernoff bound, it is enough to obtain an upper bound on the ${\sum_{j=1}^N H_{ij}}$. To do this we first bound ${E\left[\sum_{j=1}^N H_{ij}\right]}$.

For an edge ${e}$ of the hypercube, let the random variable ${T(e)}$ be the number of routes that pass through ${e}$. If ${p_i = (e_1, \ldots, e_k)}$, then

\[\sum_{j=1}^N H_{ij} \leq \sum_{\ell=1}^k T(e_{\ell})\]

This follows from the fact that the total delay is bound by the total number of routes that pass through this route, and therefore

\[E\left[\sum_{j=1}^N H_{ij}\right] \leq E\left[\sum_{\ell=1}^k T(e_{\ell})\right]\]

The total number of edges in the directed graph is ${Nn}$ counting ${n}$ edges per node. The expected length of each route is ${\frac{n}{2}}$ (each bit with probability ${\frac{1}{2}}$ to be different, so the expected Hamming distance is ${\frac{n}{2}}$) and hence the expected length of the total route length summed over all packets is ${\frac{Nn}{2}}$. That means ${\mathbb{E}[\sum_{e\in E}T(e)] = \frac{Nn}{2}}$. Since, all edges in the hypercube are symmetric, ${E[T(e_l)] = E[T(e_m)]}$ for any two edges ${e_l}$ and ${e_m}$, we have \(E[T(e)] =\frac{1}{\vert E \vert} \mathbb{E}\left[\sum_{e\in E}T(e)\right] = \frac{N \cdot \frac{n}{2}}{Nn} = \frac{1}{2}\)

Thus, for ${p_i = (e_1, \ldots, p_k)}$,

\[E\left[\sum_{j=1}^N H_{ij}\right] \leq \sum_{l=1}^k E[T(e_l)] = \frac{k}{2} \leq \frac{n}{2}\]

Now we can apply our first Chernoff bound to get

\[\begin{aligned} \Pr\left[\sum_{j=1}^N H_{ij} > 6n\right] &= \Pr\left[\sum_{j=1}^N H_{ij} > (1+11)\frac{n}{2}\right] \\ & < \Pr\left[\sum_{j=1}^N H_{ij} > (1+11)\mu \right] \\ & < \left(\frac{e^{11}}{(1+11)^{1+11}}\right)^{\mu}\\ & < \left(\frac{e^{11}}{(1+11)^{1+11}}\right)^{\frac{n}{2}}\\ & < 2^{-\left(1+11\right)\frac{n}{2}} = 2^{-6n} \end{aligned}\]

Since ${\text{delay}(v_i) \leq \sum_{j=1}^N H_{ij}}$, this gives

\[\Pr[\text{delay}(v_i) > 6n] < 2^{-6n}\]

Since the probability of a union of events is at most the sum of probabilities, the probability that any delay exceeds $6n$ is at most

\[\Pr[\max_i \text{delay}(v_i) > 6n] \leq \sum_{i=1}^N \Pr[\text{delay}(v_i) > 6n] < N \cdot 2^{-6n} = 2^n \cdot 2^{-6n} = 2^{-5n}\]

Thus, since each path has length at most $n$

\[\Pr[\#Steps \text{ in Phase I} > 7n] = \Pr[(n + \max_i \text{delay}(v_i) > 7n] < 2^{-5n} \quad \quad \square\]

Theorem 1.2 With a probability at least $1 - 2^{1-5n}$, every packet reaches its destination in $14n$ or fewer steps.

Proof. The Phase 2 of this scheme is identical to Phase 1, if the roles of the destination and source are interchanged. The above analysis is hence valid for the second phase also. Therefore, the probability that any packet fails to reach its target in either phase is less than $2 \times 2^{1-5n}= 2^{1-5n}$. ${\square}$