Lecture 9:Corners of a polyhedron

This blog is talking about three different characterizations of the corners of a polyhedron, say ${ P }$, and then show that all three characterizations are equivalent.

Extreme Points

Definition: An extreme point of a polyhedron ${ P }$ is a vector ${ \bar{x} \in P }$ such that one cannot find two vectors ${ \bar{y}, \bar{z} \in P }$ both different from ${ \bar{x} }$ and a scalar ${ \lambda \in [0,1] }$ such that ${ \bar{x} = (1-\lambda)\bar{y} + \lambda \bar{z} }$.

Vertices

Definition: A vector ${ \bar{x} \in P }$ is a vertex of the polyhedron ${ P }$ if there exists some vector ${ \bar{c} }$ such that ${ \bar{c}^ \top \bar{x} < \bar{c}^\top \bar{y} }$ for all ${ \bar{y} \in P, \bar{y} \neq \bar{x} }$

From ${ \bar{c}^ \top \bar{x} < \bar{c}^\top \bar{y} }$, we have ${ \bar{c}^\top (\bar{y} - \bar{x}) > 0 }$. That means the angle between the vectors ${ \bar{c} }$ and ${ \bar{y} - \bar{x} }$ is less than ${ 90 ^\circ }$, for all ${ \bar{y} \in P, \bar{y} \neq \bar{x} }$.

Hence, the polyhedron is fully containd in a halfspace of ${ \mathbb{R}^n }$. And the halfspace takes halfplane ${ \{\bar{y}\in \mathbb{R}^n | \bar{c}^\top \bar{y} = \bar{c}^\top \bar{x}\} }$ as boundary and ${ \bar{x} }$ is the only point of ${ P }$ belonging to that hyperplane.

In other words, ${ \bar{x} }$ is a vertex of ${ P}$ if and only if ${ P }$ is on one side of a hyperplane which meets ${ P }$ only at the point ${ \bar{x} }$

Basic Feasible Solution

Active constraints

Consider a polyhedron ${ P \in \mathbb{R}^n }$ defined by a family of linear equality and inequality constraints:

$$ \begin{equation} \begin{aligned} \bar{a_i}^\top \bar{x} &\leq b_i, &i \in L \\ \bar{a_j}^\top \bar{x} &= b_j, &j \in E \\ \bar{a_k}^\top \bar{x} &\geq b_k, &k \in G \end{aligned} \end{equation} $$

Definition: If a vector ${ \bar{x}^* }$ statisfies ${ \bar{a_i}^\top \bar{x}^* = b_i }$ for some ${ i \in L, E }$ or ${ G }$, we say that corresponding constraint is active or binding at ${ \bar{x}^* }$.

* When do the constraints active at a point charaterize that point uniquely?

Theorem: Fix a point ${ \bar{x}^* \in \mathbb{R}^n }$ and consider

the set ${ I_{\bar{x}^*} = \{i \in L \cup E \cup G| \bar{a_i}^\top \bar{x}^* = b_i\}}$ of indices of constraints that are active at ${ \bar{x}^* }$
and the set ${ \{ \bar{a_i} \in \mathbb{R}^n | i \in I_{\bar{x}^*} \} }$ of normal vector to the hyperplanes corresponding to constraints active at ${ \bar{x}^* }$

Then following are equivalent:

(1) There exists ${ n }$ linearly independent vectors in the set ${ \{ \bar{a_i} \in \mathbb{R}^n | i \in I_{\bar{x}^*} \} }$

(2) Every element of ${ \mathbb{R}^n }$ can be represented as a linear combination of ${ \{ \bar{a_i} \in \mathbb{R}^n | i \in I_{\bar{x}^*} \} }$. In other words, ${\text{span} \{ \bar{a_i} \in \mathbb{R}^n | i \in I_{\bar{x}^*} \} = \mathbb{R}^n}$

(3) The system of linear equations

\[\bar{a_i}^\top \bar{x} = b_i, \quad \text{for } i \in I_{\bar{x}^*}\]

has a unique solution ${ \bar{x}^*}$.

Proof.

(2) $\Rightarrow$ (1)

Suppose ${\text{span}\{\bar{a}_i \mid i \in I_{x^*}\} = \mathbb{R}^n}$. Then, one can extract a basis for $\mathbb{R}^n$ from the set ${\{\bar{a}_i \mid i \in I_{x^*}\}}$. The ${n}$ vectors that are constituting basis are necessarily linearly independent (by the definition of basis.)

(1) $\Rightarrow$ (2)

Suppose $\bar{a}_{i_1}, \bar{a}_{i_2}, …, \bar{a}_{i_n}$ are ${n}$ linearly independent vectors in the set ${\{\bar{a}_i \mid i \in I_{x^*}\}}$. Then ${\text{span}\{\bar{a}_{i_1}, \bar{a}_{i_2}, …, \bar{a}_{i_n}\}}$ is an $n$-dimensional linear subspace of $\mathbb{R}^n$. Thus

\[\text{span}\{\bar{a}_{i_1}, \bar{a}_{i_2}, ..., \bar{a}_{i_n}\} = \mathbb{R}^n\]

However, ${\{\bar{a}_{i_1}, \bar{a}_{i_2}, …, \bar{a}_{i_n}\} \subset \{\bar{a}_i \mid i \in I_{x^*}\} \subset \mathbb{R}^n}$, which implies

\[\text{span}\{\bar{a}_{i_1}, \bar{a}_{i_2}, ..., \bar{a}_{i_n}\} \subset \text{span}\{\bar{a}_i | i \in I^{n*}\} \subset \mathbb{R}^n\]

It follows that ${\text{span}\{\bar{a}_i \mid i \in I^{n*}\} = \mathbb{R}^n}$.

(2) $\Rightarrow$ (3)

Suppose the system $\bar{a}_i^{\top} x = b_i$ for $i \in I_{x*}$ has multiple solutions, say $x_1$ and $x_2$, then the (nonzero) vector $\bar{d} = x_2 - x_1$ satisfies $\bar{a}_i^{\top} \bar{d} = 0$ for all $i \in I_{x^*}$. But then $\bar{d}$ (being orthogonal to every vector $\bar{a}_i$ with $i \in I_{x^*}$) and the vector $\bar{d}$ is not a linear combination of those vectors, which contradicts that vectors $\bar{a}_i$ with $i \in I_{x^*}$ do not span $\mathbb{R}^n$.

(3) $\Rightarrow$ (2)

Suppose the vectors $\bar{a}_i$ with $i \in I_{x^*}$ do not span $\mathbb{R}^n$, we can choose a nonzero vector ${\bar{d} \in (\text{span}\{\bar{a}_i \mid i \in I_{x^*}\})^\perp}$. Then, we have $\bar{a}_i^{\top} x^* = b_i,\forall i \in I_{x^*}$ and ${\bar{d} \in (\text{span}\{\bar{a}_i \mid i \in I_{x^*}\})^\perp \iff \bar{a}_i^{\top} \bar{d} = 0,\forall i \in I_{x^*}}$.

Therefore, $\bar{a}_i^{\top}(x^* + t\bar{d}) = b_i, \forall i \in I_{x^*}$ (the parameter $t$ can be any real number we want).

Thus, the linear system $\bar{a}_i^{\top} x = b_i$ for all $i \in I_{x^*}$ admits infinitely many solutions, which contradicts to that $x^*$ is the only solution of the system $\bar{a}_i^{\top} x = b_i$ for all $i \in I_{x^*}$. $\square$

Basic feasible solution

Definition: Consider a family of linear equality and inequality constraints describing a polyhedron ${ P\in \mathbb{R}^n }$

(1) A vector ${ \bar{x}^* \in \mathbb{R}^n }$ is said to be a basic solution if all equality constraints are active at ${ \bar{x}^* }$ and if there are ${ n }$ of the active constraints that are linearly independent.

(2) A basic solution that satisfies all constraints is called a basic feasible solution

Equivalence of Three Characterizations

Theorem: Let ${ \bar{x}^* }$ be a point in a non-empty polyhedron ${ P }$. Then following are equivalent

(1) ${ \bar{x}^* }$ is a vertex of ${ P }$

(2) ${ \bar{x}^* }$ is an extreme point of ${ P }$

(3) ${ \bar{x}^* }$ is a basic feasible solution of ${ P }$

Proof. Without loss of generality, we may assume that $P$ is represented by constraints of the form $\bar{a}_i^{\top} x \geq b_i$ and $\bar{a}_i^{\top} x = b_i$ exclusively.

(1) $\Rightarrow$ (2)

Suppose $x^*$ is a vertex of $P$. Then, by definition, $\exists \bar{c} \in \mathbb{R}^n, \bar{c} \neq \bar{0}$ such that

\[\forall \bar{y} \in P, \bar{y} \neq \bar{x}^* \Rightarrow \bar{c}^{\top} \bar{x}^* < \bar{c}^{\top}y.\]

Pick any two $\bar{y}, \bar{z} \in P$ with $\bar{y} \neq \bar{x}^*$ and $\bar{z} \neq \bar{x}^*$and any scalar $\lambda \in [0,1]$. We necessarily have

\[\bar{c}^{\top} \bar{x}^* < \bar{c}^{\top}\bar{y} \quad \text{and} \quad \bar{c}^{\top} \bar{x}^* < \bar{c}^{\top}\bar{z}\] \[\begin{aligned} \bar{c}^{\top}(\bar{y} - \bar{x}^*) &> 0 \quad \text{and} \quad \bar{c}^{\top}(\bar{z} - \bar{x}^*) > 0 \end{aligned}\]

$\forall \lambda \in [0,1]$, we have

\[(1-\lambda) \bar{c}^{\top}(\bar{y}-\bar{x}^*) + \lambda \bar{c}^{\top}(\bar{z}-\bar{x}^*) > 0\] \[\begin{aligned} (1-\lambda) \bar{c}^{\top}\bar{y} + \lambda \bar{c}^{\top}\bar{z} > & (1-\lambda) \bar{c}^{\top} \bar{x}^* + \lambda \bar{c}^{\top} \bar{x}^* \\ \\ \bar{c}^{\top} \bar{x}^* < & \bar{c}^{\top}((1-\lambda)\bar{y} + \lambda \bar{z}) \end{aligned}\]

Therefore

\[\bar{x}^* \neq (1-\lambda)\bar{y} + \lambda \bar{z}\]

Thus $\bar{x}^*$ cannot be expressed as a convex combination of two any other elements $\bar{y}, \bar{z}$ of $P$, which means $\bar{x}^*$ extreme point of $P$.

(2) $\Rightarrow$ (3)

Suppose $\bar{x}^* \in P$ is not a basic feasible solution. Let ${I_{\bar{x}^*} = \{ i \mid \bar{a}_i^{\top} \bar{x}^* = b_i \}}$ be the set of indices of constraints that are active at $\bar{x}^*$.

Since $\bar{x}^*$ is not a basic feasible solution, there do not exist $n$ linearly independent vectors in ${\{ \bar{a}_i \mid i \in I_{\bar{x}^*} \}}$. Hence ${\text{span} \{ \bar{a}_i \mid i \in I_{\bar{x}^*} \}}$ is a proper subspace of $\mathbb{R}^n$, and the dimension of ${(\text{span} \{ \bar{a}_i \mid i \in I_{\bar{x}^*} \})^\perp}\geq 1$, and we can pick a nonzero vector ${\bar{d} \in (\text{span} \{ \bar{a}_i \mid i \in I_{\bar{x}^*} \})^\perp}$.

In other words, there exists $\bar{d} \in \mathbb{R}^n$ such that $\bar{a}_i^{\top} \bar{d} = 0, \forall i \in I_{x^*}$. Consider the points of the form $\bar{x}^* + \epsilon \bar{d}$ where $\epsilon$ is a small positive number.

For all $i \in I_{\bar{x}^*}$, we have

\[\bar{a}_i^{\top}(\bar{x}^* + \epsilon \bar{d}) = \bar{a}_i^{\top} \bar{x}^* + \epsilon \bar{a}_i^{\top} \bar{d} = b_i + \epsilon \cdot 0 = b_i\]

The constraints that are active at $\bar{x}^*$ are also active at $\bar{x}^* + \epsilon \bar{d}$. Furthermore, for $i \notin I_{\bar{x}^*}$, we have

\[\bar{a}_i^{\top}(x^* \pm \epsilon \bar{d}) = (\bar{a}_i^{\top} \bar{x}^*) \pm \epsilon (\bar{a}_i^{\top} \bar{d})\]

Here, we know $\bar{a}_i \bar{x}^* \geq b_i$ (because $\bar{x}^*$ is a feasible solution, $\bar{x}^* \in P$) and $\bar{a}_i \bar{x}^* \neq b_i$ (because $\bar{x^*}$ is not active at this constraint, $i \notin I_{\bar{x}^*}$), which implies $\bar{a}_i \bar{x}^* > b_i$. Therefore, by choosing $\epsilon$ sufficiently small (such that $\epsilon \vert \bar{a}_i^{\top} \bar{d} \vert < \bar{a}_i \bar{x}^* - b_i$).

In other word, we construct two points $\bar{y}:=\bar{x}^* + \epsilon \bar{d}$ and $\bar{z}:= \bar{x}^* - \epsilon \bar{d}$ both belong to $P$. Since $\bar{x}^* = \frac{1}{2} \bar{y} + \frac{1}{2} \bar{z}$, which contradicts to that $\bar{x}^*$ is an extreme points of $P$.

(3) $\Rightarrow$ (1)

Suppose $\bar{x}^*$ is a basic feasible solution of $P$ and let ${I_{\bar{x}^*} = \{ i \mid \bar{a}_i^{\top} \bar{x}^* = b_i \}}$ be the set of indices of constraints that are active at $\bar{x}^*$. Consider the vector $\bar{c} = \sum_{i \in I_{\bar{x}^*}} \bar{a}_i$. We have

\[\bar{c}^{\top} \bar{x}^* = \sum_{i \in I_{\bar{x}^*}} \bar{a}_i^{\top} \bar{x}^* = \sum_{i \in I_{\bar{x}^*}} b_i\]

Furthermore, for any element $x \in P$, we have $\bar{a}i^{\top} x \geq b_i, \forall i \in I{\bar{x}^*}$ and thus

\[\bar{c}^{\top} \bar{x} = \sum_{i \in I_{\bar{x}^*}} \bar{a}_i^{\top} \bar{x} \geq \sum_{i \in I_{\bar{x}^*}} b_i \quad \quad \quad (*)\]

This shows that $x^*$ is an optimal solution to the problem of minimizing the function $x \mapsto \bar{c}^{\top} x$ over the polyhedron $P$.

Equality holds in $(\ast)$ if and only if $\bar{a}_i^{\top} \bar{x} = b_i, \forall i \in I_{\bar{x}^*}$. However, since $\bar{x}^*$ is a basic feasible solution, there are $n$ linearly independent constraints that are active at $\bar{x}^*$ and $\bar{x}^*$ is therefore the unique solution of the linear system $\bar{a}_i^{\top} \bar{x} = b_i, i \in I_{\bar{x}^*}$ according to above theorem. $\square$

PREVIOUSLec 8:Polyhedra and Convex Set

NEXTLec 10:Polyhedra in Standard form