Lecture 15: Development of the simplex method

 

This post discusses how far we can move along a basic feasible direction, i.e. the step size ${ \theta }$, and the termination of the simplex method.

Recall our assumptions: LP problem in standard form

$$ \begin{aligned} & \text{minimize} & \bar{c}^\top \bar{x} \\ & \text{subject to} & A\bar{x} = \bar{b} \\ & & \bar{x} \geq 0 \\ \end{aligned} $$

where matrix ${A}$ has full row rank.

We introduce an additional assumption (which will be relaxed later on):

Every b.f.s. is nondegenerate.

Given a b.f.s. ${\bar{x}}$ (with associated basis matrix $B$) of the feasible polyhedron ${P = \{ \bar{x} \in \mathbb{R}^n \mid A\bar{x} = \bar{b} \text{ and } \bar{x} \geq \bar{0} \} } $,

EITHER: the reduced costs ${\tilde{c}_j}$ of all nonbasic variables are nonnegative, in which case ${\bar{x}}$ is an optimal solution (by the optimality criterion) and the simplex algorithm stops;

OR: the reduced cost ${\tilde{c}_j}$ of some nonbasic variable ${x_j}$ is negative, and the ${j^{th}}$ basic direction at ${\bar{x}}$ is a feasible direction of cost decrease.

The ${j^{th}}$ basic direction is the vector ${\bar{d}}$ with

$$ \bar{d}: \begin{cases} d_{j} = 1 & \\ d_{i} = 0 & \text{for } i \notin \{B(1),\dots,B(m), j\} \\ \bar{d}_{B} = -B^{-1}\bar{A}_j & \\ \end{cases} $$
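To make this concrete, here is a minimal numpy sketch (not from the lecture; the function name `basic_direction` and its data layout are our own) that assembles the ${j^{th}}$ basic direction:

```python
import numpy as np

def basic_direction(A, basic_idx, j):
    """j-th basic direction at the b.f.s. whose basis columns are basic_idx.

    d_j = 1, d_i = 0 for the other nonbasic indices, d_B = -B^{-1} A_j.
    Assumes j is nonbasic and B = A[:, basic_idx] is invertible.
    """
    m, n = A.shape
    B = A[:, basic_idx]                          # basis matrix
    d = np.zeros(n)
    d[j] = 1.0
    d[basic_idx] = -np.linalg.solve(B, A[:, j])  # d_B = -B^{-1} A_j
    return d
```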

In order to lower the cost as much as possible, we then move away from the b.f.s. $\bar{x}$ along the path $\theta \mapsto \bar{x} + \theta \bar{d}$ (for $\theta \geq 0$) as far as possible, i.e., until we reach the point $\bar{x} + \theta^*\bar{d}$ with

$$ \begin{aligned} \theta^* &= \max \{ \theta \in [0, \infty) \mid \bar{x} + \theta \bar{d} \in P \} \\ &= \max \{ \theta \in [0, \infty) \mid \bar{x} + \theta \bar{d} \geq \bar{0} \}. \end{aligned} $$

Note: ${ A(\bar{x} + \theta \bar{d}) = \bar{b} }$ is satisfied for all ${ \theta \geq 0 }$, since ${ A\bar{x} = \bar{b} }$ and ${ A\bar{d} = \bar{0} }$; only the nonnegativity constraints can limit the move.

Doing so, the nonbasic variable $x_j$ becomes positive while all other nonbasic variables remain zero — we say that $x_j$ (or $A_j$) enters or is brought into the basis.

The reduction in cost resulting from the move from $\bar{x}$ to $\bar{x} + \theta \bar{d}$ is

$$ \bar{c}^\top(\bar{x}+\theta \bar{d}) - \bar{c}^\top \bar{x} = \theta\,\bar{c}^\top \bar{d} = \theta\,(c_j + \bar{c}_B^\top \bar{d}_B) = \theta\,(c_j - \bar{c}_B^\top B^{-1} \bar{A}_j) = \theta\, \tilde{c}_j $$
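As a sanity check of this identity, we can continue the sketch above with made-up data (the matrix, costs, and basis below are purely illustrative):

```python
A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, 0.0, 0.0, 1.0]])
b = np.array([2.0, 1.0])
c = np.array([-1.0, 2.0, 0.0, 0.0])
basic_idx = [2, 3]   # hypothetical basis {x3, x4}: B = I, x_B = b = (2, 1)
j = 0                # entering nonbasic index

d = basic_direction(A, basic_idx, j)
B = A[:, basic_idx]
reduced_cost = c[j] - c[basic_idx] @ np.linalg.solve(B, A[:, j])  # here -1
assert np.allclose(A @ d, 0)            # A d = 0: equalities preserved
assert np.isclose(c @ d, reduced_cost)  # c^T d = reduced cost
```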

How to compute ${\theta^*}$?

The path $\theta \mapsto \bar{x}+\theta \bar{d}$ (with $\theta \geq 0$) exits the feasible set $P$ when a nonnegativity constraint is about to be violated.

EITHER $\bar{d} \geq \bar{0}$, and then $\bar{x}+\theta \bar{d} \geq \bar{0}$ for all $\theta \in [0,\infty)$, so that $\bar{x}+\theta \bar{d}$ never exits $P$ and we set $\theta^* = \infty$;

OR $\bar{d}_i < 0$ for some $i$ and then

$\theta^* = \max \{ \theta \in [0,\infty) \mid \bar{x}+\theta \bar{d} \geq 0 \}$

  • For each $k$ s.t. $\bar{d}_k \geq 0$, we have $\bar{x}_k+\theta \bar{d}_k \geq 0$ for all $\theta \geq 0$

  • For each $k$ s.t. $\bar{d}_k < 0$, we need $\theta \leq -\frac{\bar{x}_k}{\bar{d}_k}$ to keep $\bar{x}_k + \theta \bar{d}_k \geq 0$

$$ \begin{aligned} \theta^* &= \min_{\substack{k \in \{1,2,\cdots,n\} \\ \text{ s.t. } \bar{d}_k <0 }} \left( -\frac{\bar{x}_k}{\bar{d}_k} \right)\\ &= \min_{\substack{i \in \{1,2,\cdots,m\} \\ \text{ s.t. } \bar{d}_{B(i)} <0 }} \left( -\frac{\bar{x}_{B(i)}}{\bar{d}_{B(i)}} \right) \end{aligned} $$

(The minimum can be restricted to the basic indices because the nonbasic components of $\bar{d}$ equal $0$ or $1$ and are never negative.)

Conclusion:

$$ \theta^* = \begin{cases} \infty & \text{if } \bar{d} \geq \bar{0} \\ \min_{\substack{i \in \{1,2,\cdots,m\} \\ \text{ s.t. } \bar{d}_{B(i)} <0 }} \left( -\frac{\bar{x}_{B(i)}}{\bar{d}_{B(i)}} \right) & \text{otherwise} \end{cases} $$

Thus, if $\theta^*$ is finite, $\theta^* = -\frac{\bar{x}_{B(i)}}{\bar{d}_{B(i)}}$ for some $i \in \{1,\dots,m\}$ (with $\bar{d}_{B(i)} < 0$), and we have $\bar{x}_{B(i)} + \theta^* \bar{d}_{B(i)} = 0$.

The variable $\bar{x}_{B(i)}$, which was basic at the b.f.s. $\bar{x}$, has become zero at $\bar{x}' := \bar{x} + \theta^* \bar{d}$, whereas the variable $x_j$, which was nonbasic at the b.f.s. $\bar{x}$, has become positive at $\bar{x}'$.

This suggests that ${x_j}$ should replace ${\bar{x}_{B(i)}}$ in the basis.
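In code, the ratio test is just a minimum over the basic components with negative direction. A sketch continuing the snippets above:

```python
def ratio_test(x_B, d_B):
    """theta* = min over {i : d_B[i] < 0} of -x_B[i] / d_B[i].

    Returns (theta*, position attaining the minimum);
    theta* = inf when d_B >= 0, i.e. the path never exits P.
    """
    neg = np.where(d_B < 0)[0]
    if neg.size == 0:
        return np.inf, None
    ratios = -x_B[neg] / d_B[neg]
    return ratios.min(), neg[np.argmin(ratios)]

# With the toy data above: x_B = (2, 1), d_B = (-1, -1).
theta_star, l = ratio_test(np.array([2.0, 1.0]), np.array([-1.0, -1.0]))
# theta_star = 1.0, l = 1: the second basic variable (x4) hits zero first.
```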

How is the basis matrix updated?

Let $B = \begin{bmatrix}\bar{A}_{B(1)} & \bar{A}_{B(2)} & \cdots &\bar{A}_{B(\ell-1)} & \bar{A}_{B(\ell)} &\bar{A}_{B(\ell+1)} &\cdots &\bar{A}_{B(m)} \end{bmatrix}$ be the basis matrix corresponding to the b.f.s. $\bar{x}$.

Let $B' = \begin{bmatrix}\bar{A}_{B(1)} & \bar{A}_{B(2)} & \cdots &\bar{A}_{B(\ell-1)} & \bar{A}_{j} &\bar{A}_{B(\ell+1)} &\cdots &\bar{A}_{B(m)} \end{bmatrix}$ be the matrix obtained from ${ B }$ by replacing the column ${ \bar{A}_{B(\ell)} }$ with ${ \bar{A}_{j} }$.

Theorem:

(i) The columns of ${B'}$ are linearly independent and, therefore, ${B'}$ is a basis matrix.

(ii) The vector ${\bar{x}' = \bar{x} + \theta^* \bar{d}}$ is the basic feasible solution associated with the basis matrix ${B'}$.

Proof.

(1) ${B^{-1} \begin{bmatrix} \bar{A}_{B(1)} & \bar{A}_{B(2)} & \cdots & \bar{A}_{B(m)} \end{bmatrix} = B^{-1} B = I = \begin{bmatrix} \bar{e}_1 & \bar{e}_2 & \cdots & \bar{e}_m \end{bmatrix}}$ and ${B^{-1}\bar{A}_j = -\bar{d}_B}$ implies that

${B^{-1} B' = B^{-1} \begin{bmatrix}\bar{A}_{B(1)} & \bar{A}_{B(2)} & \cdots &\bar{A}_{B(\ell-1)} & \bar{A}_{j} &\bar{A}_{B(\ell+1)} &\cdots &\bar{A}_{B(m)} \end{bmatrix} = \begin{bmatrix} \bar{e}_1 & \cdots & -\bar{d}_B & \cdots & \bar{e}_m \end{bmatrix}}$

where $-\bar{d}_B$ occupies the $\ell$-th column. The $\ell$-th entry of this column is $-\bar{d}_{B(\ell)}$, which is nonzero: $\ell$ was selected as a minimizer in ${ \min_{\substack{i \in \{1,2,\cdots,m\} \\ \text{ s.t. } \bar{d}_{B(i)} <0 }} \left( -\frac{\bar{x}_{B(i)}}{\bar{d}_{B(i)}} \right) }$, which requires ${ \bar{d}_{B(\ell)} < 0 }$.

So ${ B^{-1} B' }$ is invertible (its determinant equals $-\bar{d}_{B(\ell)} \neq 0$, since the matrix agrees with the identity outside the $\ell$-th column), which means ${ B' }$ is invertible and the columns of ${ B' }$ are linearly independent.

(2) By the very construction of $\bar{x}'$, we have

\[\begin{aligned} A\bar{x}' &= b \\ \bar{x}' &\geq 0 \\ \bar{x}_i' &= 0 \text{ for } i \notin \{B(1), B(2), \ldots, B(m)\} \end{aligned}\]

and the columns of the matrix $B'$ have just been shown to be linearly independent. Therefore $\bar{x}'$ is the b.f.s. associated with the basis matrix $B'$. ${ \square }$
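A quick numeric check of part (i), reusing the hypothetical data from the sketches above (entering column of $x_1$, exiting the second basic column, the one of $x_4$):

```python
B = A[:, [2, 3]]               # basis matrix before the pivot
B_new = A[:, [2, 0]]           # column of x4 replaced by the column of x1
M = np.linalg.solve(B, B_new)  # M = B^{-1} B'
# M agrees with the identity except in the exiting column, which holds -d_B,
# so det(M) equals the (diagonal) entry of -d_B there, which is nonzero.
assert not np.isclose(np.linalg.det(M), 0.0)  # B' is again a basis matrix
```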

Iteration of the simplex algorithm

This procedure for moving from a b.f.s. $\bar{x}$ to a distinct b.f.s. $\bar{x}'$ with strictly smaller cost is a typical iteration of the simplex algorithm, a.k.a. a pivot.

  1. Start from a b.f.s. $\bar{x}$ with associated basis matrix
\[B = \begin{bmatrix}\bar{A}_{B(1)} & \bar{A}_{B(2)} & \cdots & \bar{A}_{B(m)}\end{bmatrix}\]
  2. Compute the reduced costs $\tilde{c}_j = c_j - \bar{c}_B^\top B^{-1} \bar{A}_j$ (at $\bar{x}$) for all nonbasic indices $j$.

If they are all nonnegative, the b.f.s. $\bar{x}$ is optimal and the algorithm terminates.

Otherwise, choose a nonbasic index $j$ for which $\tilde{c}_j < 0$. (Doing so, we are essentially choosing a feasible direction $\bar{d}$ of cost decrease.)

  3. Compute $\bar{\mu} = B^{-1} \bar{A}_j$ ($= -\bar{d}_B$). If no component of $\bar{\mu}$ is positive, we have $\theta^* = \infty$, the optimal cost is $-\infty$, and the algorithm terminates.

If some component of $\bar{\mu}$ is positive, let $\theta^* = \min\limits_{\substack{i \in \{1,\dots,m\} \\ \text{ s.t. } \bar{\mu}_i > 0}} \left( \frac{\bar{x}_{B(i)}}{\bar{\mu}_i} \right)$.

  4. $\theta^* = \frac{\bar{x}_{B(\ell)}}{\bar{\mu}_\ell}$ for some $\ell \in \{1,2,\dots,m\}$ (with $\bar{\mu}_\ell > 0$).

Replace the column $\bar{A}_{B(\ell)}$ by $\bar{A}_j$ in the basis matrix $B$ so as to obtain the new basis matrix $B'$.

The new b.f.s. $\bar{x}'$ satisfies

\[\begin{aligned} \bar{x}_j' &= \theta^* = \bar{x}'_{B'(\ell)} \\ \bar{x}_{B(i)}' &= \bar{x}_{B(i)} - \theta^* \bar{\mu}_i = \bar{x}'_{B'(i)} \quad \text{(for $i \neq \ell$)} \\ \bar{x}_k' &= 0 \quad \text{for $k \notin \{B'(1), B'(2), ..., B'(m)\}$}. \end{aligned}\]
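Putting the four steps together, here is a compact sketch of one full pivot. It is a didactic implementation under the nondegeneracy assumption (with the smallest-subscript choice of entering column, one option among the pivoting rules discussed below); real solvers update a factorization of $B$ instead of re-solving:

```python
def simplex_iteration(A, b, c, basic_idx):
    """One pivot of the simplex method on min c^T x s.t. Ax = b, x >= 0.

    Returns ('optimal', basic_idx), ('unbounded', None),
    or ('pivoted', new basic index list).
    """
    n = A.shape[1]
    B = A[:, basic_idx]
    x_B = np.linalg.solve(B, b)               # basic part of the current b.f.s.

    # Step 2: reduced costs of the nonbasic variables.
    y = np.linalg.solve(B.T, c[basic_idx])    # y^T = c_B^T B^{-1}
    entering = [k for k in range(n)
                if k not in basic_idx and c[k] - y @ A[:, k] < 0]
    if not entering:
        return 'optimal', basic_idx           # all reduced costs nonnegative
    j = min(entering)                         # smallest-subscript choice

    # Step 3: mu = B^{-1} A_j = -d_B, then the ratio test.
    mu = np.linalg.solve(B, A[:, j])
    if np.all(mu <= 0):
        return 'unbounded', None              # theta* = inf, optimal cost -inf
    pos = np.where(mu > 0)[0]
    l = pos[np.argmin(x_B[pos] / mu[pos])]    # exiting position attains theta*

    # Step 4: replace column B(l) by A_j in the basis.
    new_basic = list(basic_idx)
    new_basic[l] = j
    return 'pivoted', new_basic
```

With the toy data above, `simplex_iteration(A, b, c, [2, 3])` returns `('pivoted', [2, 0])`: $x_1$ enters and $x_4$ exits.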

Does the algorithm terminate after finitely many steps?

The simplex algorithm is initialized with an arbitrary b.f.s. $\bar{x}$, which, for LP problems in standard form with a nonempty feasible set, is guaranteed to exist.

Good news: in the nondegenerate case, the simplex algorithm works (i.e. solves the LP problem) and terminates after a finite number of iterations.

Theorem. Provided the feasible set is nonempty and every b.f.s. is nondegenerate, the simplex algorithm terminates after a finite number of iterations and returns

  1. either an optimal basis and an associated b.f.s. which is optimal

  2. or a vector $\bar{d}$ such that $A\bar{d} = \bar{0}$, $\bar{d} \geq \bar{0}$, and $\bar{c}^\top\bar{d} < 0$, showing that the optimal cost is $-\infty$.

Proof. At each iteration, the algorithm moves by a positive amount $\theta^*$ along a direction $\bar{d}$ that satisfies $\bar{c}^\top\bar{d} < 0$. Therefore, the cost of every successive b.f.s. visited by the algorithm is strictly less than the cost of the previous one, and no b.f.s. can be visited twice. Since there are only finitely many b.f.s., the algorithm must eventually stop.

  • If the algorithm terminates because all reduced costs at the b.f.s. $\bar{x}$ are nonnegative, then the optimality criterion has been met, the current basis matrix is optimal and the current b.f.s. is optimal.

  • If the algorithm terminates due to the stopping criterion in step 3, then we have discovered a nonbasic variable $x_j$ such that $\tilde{c}_j < 0$ and such that the corresponding basic direction $\bar{d}$ satisfies $A\bar{d} = \bar{0}$ and $\bar{d} \geq \bar{0}$. In particular, $\bar{x}+\theta \bar{d} \in P$ for all $\theta > 0$. Since $\bar{c}^\top\bar{d} < 0$, by taking $\theta$ arbitrarily large, the cost can be made arbitrarily negative, and the optimal cost is $-\infty$. ${ \square }$
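This finite-termination guarantee is what allows a plain driver loop around the pivot sketch above; the iteration cap below is only a safeguard against the degenerate behavior discussed next:

```python
def simplex(A, b, c, basic_idx, max_iters=10_000):
    """Run pivots until optimality or unboundedness (nondegenerate case)."""
    for _ in range(max_iters):
        status, basic_idx = simplex_iteration(A, b, c, basic_idx)
        if status != 'pivoted':
            return status, basic_idx
    raise RuntimeError("iteration cap hit -- possible cycling under degeneracy")
```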

Degeneracy in Simplex Method

In the following, we focus on the behavior of the simplex algorithm in the presence of degenerate b.f.s.

Suppose the simplex algorithm is used in the presence of degeneracy. Then the following new possibilities may be encountered in the course of the algorithm:

  1. If the current b.f.s. $\bar{x}$ is degenerate, it can happen that $\theta^* = 0$, and then $\bar{x}' = \bar{x}$. (We stay at the same corner of the polyhedron.) This happens if some basic component $\bar{x}_{B(i)}$ of the b.f.s. $\bar{x}$ is equal to zero and the corresponding component $\bar{d}_{B(i)}$ of the direction vector $\bar{d}$ is negative. (See the numeric illustration below.)

While $\bar{x}' = \bar{x}$, the basis matrix changes: the column $\bar{A}_{B(i)}$ of $B$ is replaced with $\bar{A}_j$ in order to get $B'$.

  2. If $\theta^* > 0$, it may happen that more than one of the original basic variables becomes zero at $\bar{x}' = \bar{x} + \theta^* \bar{d}$. Since only one of them exits the basis, the other(s) remain in the basis at zero level, and the new b.f.s. $\bar{x}'$ is degenerate.

Thus, in the presence of degeneracy, basis changes while staying at the same b.f.s. can occur.

A sequence of such basis changes may lead to the eventual discovery of a cost-reducing feasible direction. However, it might instead lead back to the initial basis, in which case the algorithm may loop indefinitely. This undesirable phenomenon is called cycling.

We will see later that cycling can be avoided by choosing judiciously the variables that enter or exit the basis.
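A tiny illustration of possibility 1 with made-up numbers: at a degenerate b.f.s. some basic component is already zero, and if its direction component is negative, the ratio test from the earlier sketch returns $\theta^* = 0$, so the pivot changes the basis without moving:

```python
x_B = np.array([1.0, 0.0])    # degenerate b.f.s.: second basic variable is zero
d_B = np.array([-1.0, -2.0])  # both basic components decrease along d
theta_star, l = ratio_test(x_B, d_B)
# theta_star = 0.0, l = 1: the zero basic variable exits, the basis changes,
# but x' = x + 0 * d = x -- we stay at the same corner.
```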

Freedom of choice / pivoting rules

At step 2 of the simplex iteration, we are free to choose any nonbasic index $j$ whose reduced cost $\tilde{c}_j$ is negative. (Choice of entering column.)

At step 4 of the simplex iteration, there may be several indices $\ell$ that attain the minimum in the definition of $\theta^*$ and we are free to choose any one of them. (Choice of exiting column.)

Rules for making such choices are called pivoting rules.

Some natural possible pivoting rules:

  1. Choose a column $\bar{A}_j$ with $\tilde{c}_j < 0$ whose reduced cost is the most negative. (Requires the computation of the reduced cost of every nonbasic variable.)

  2. Choose a column $\bar{A}_j$ with $\tilde{c}_j < 0$ for which the corresponding cost decrease $\theta^* \vert \tilde{c}_j \vert$ is largest. (Heavier computational burden!)

  3. Smallest subscript rule: Choose the smallest $j$ for which $\tilde{c}_j < 0$. (Simple rule / lower computational burden.)

    Likewise, for the choice of the exiting column, out of all the basic variables eligible to exit the basis, choose the one with the smallest subscript.

    Remark: It turns out that by following the smallest subscript rule for both the entering and the exiting column, cycling can be avoided. (See later.)

  4. Steepest edge rule: Choose the column $\bar{A}_j$ with $\tilde{c}_j < 0$ so as to maximize the angle $\varphi$ between the corresponding basic direction of cost decrease $\bar{d}$ and the cost vector $\bar{c}$:

$$ \cos \varphi = \frac{\bar{c}^\top \bar{d}}{\Vert \bar{c} \Vert \cdot \Vert \bar{d} \Vert} $$
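A sketch of the steepest edge choice in the same hypothetical setup: among the columns with negative reduced cost, pick the one whose basic direction makes $\cos \varphi$ most negative, i.e. whose angle with $\bar{c}$ is largest:

```python
def steepest_edge_entering(A, c, basic_idx):
    """Entering index j minimizing cos(phi) = c^T d / (||c|| ||d||)."""
    n = A.shape[1]
    B = A[:, basic_idx]
    y = np.linalg.solve(B.T, c[basic_idx])
    best_j, best_cos = None, 0.0
    for k in range(n):
        if k in basic_idx or c[k] - y @ A[:, k] >= 0:
            continue                          # keep only candidates with negative reduced cost
        d = basic_direction(A, basic_idx, k)
        cos_phi = (c @ d) / (np.linalg.norm(c) * np.linalg.norm(d))
        if cos_phi < best_cos:                # most negative cosine = largest angle
            best_j, best_cos = k, cos_phi
    return best_j                             # None when all reduced costs >= 0
```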