This blog is talking about three different characterizations of the corners of a polyhedron, say ${ P }$, and then show that all three characterizations are equivalent.
Extreme Points
Definition: An extreme point of a polyhedron ${ P }$ is a vector ${ \bar{x} \in P }$ such that one cannot find two vectors ${ \bar{y}, \bar{z} \in P }$ both different from ${ \bar{x} }$ and a scalar ${ \lambda \in [0,1] }$ such that ${ \bar{x} = (1-\lambda)\bar{y} + \lambda \bar{z} }$.
Vertices
Definition: A vector ${ \bar{x} \in P }$ is a vertex of the polyhedron ${ P }$ if there exists some vector ${ \bar{c} }$ such that ${ \bar{c}^ \top \bar{x} < \bar{c}^\top \bar{y} }$ for all ${ \bar{y} \in P, \bar{y} \neq \bar{x} }$
From ${ \bar{c}^ \top \bar{x} < \bar{c}^\top \bar{y} }$, we have ${ \bar{c}^\top (\bar{y} - \bar{x}) > 0 }$. That means the angle between the vectors ${ \bar{c} }$ and ${ \bar{y} - \bar{x} }$ is less than ${ 90 ^\circ }$, for all ${ \bar{y} \in P, \bar{y} \neq \bar{x} }$.
Hence, the polyhedron is fully containd in a halfspace of ${ \mathbb{R}^n }$. And the halfspace takes halfplane ${ \{\bar{y}\in \mathbb{R}^n | \bar{c}^\top \bar{y} = \bar{c}^\top \bar{x}\} }$ as boundary and ${ \bar{x} }$ is the only point of ${ P }$ belonging to that hyperplane.
In other words, ${ \bar{x} }$ is a vertex of ${ P}$ if and only if ${ P }$ is on one side of a hyperplane which meets ${ P }$ only at the point ${ \bar{x} }$
Basic Feasible Solution
Active constraints
Consider a polyhedron ${ P \in \mathbb{R}^n }$ defined by a family of linear equality and inequality constraints:
Definition: If a vector ${ \bar{x}^* }$ statisfies ${ \bar{a_i}^\top \bar{x}^* = b_i }$ for some ${ i \in L, E }$ or ${ G }$, we say that corresponding constraint is active or binding at ${ \bar{x}^* }$.
* When do the constraints active at a point charaterize that point uniquely?
Theorem: Fix a point ${ \bar{x}^* \in \mathbb{R}^n }$ and consider
-
the set ${ I_{\bar{x}^*} = \{i \in L \cup E \cup G| \bar{a_i}^\top \bar{x}^* = b_i\}}$ of indices of constraints that are active at ${ \bar{x}^* }$
-
and the set ${ \{ \bar{a_i} \in \mathbb{R}^n | i \in I_{\bar{x}^*} \} }$ of normal vector to the hyperplanes corresponding to constraints active at ${ \bar{x}^* }$
Then following are equivalent:
(1) There exists ${ n }$ linearly independent vectors in the set ${ \{ \bar{a_i} \in \mathbb{R}^n | i \in I_{\bar{x}^*} \} }$
(2) Every element of ${ \mathbb{R}^n }$ can be represented as a linear combination of ${ \{ \bar{a_i} \in \mathbb{R}^n | i \in I_{\bar{x}^*} \} }$. In other words, ${\text{span} \{ \bar{a_i} \in \mathbb{R}^n | i \in I_{\bar{x}^*} \} = \mathbb{R}^n}$
(3) The system of linear equations
\[\bar{a_i}^\top \bar{x} = b_i, \quad \text{for } i \in I_{\bar{x}^*}\]has a unique solution ${ \bar{x}^*}$.
Proof.
- (2) $\Rightarrow$ (1)
Suppose ${\text{span}\{\bar{a}_i \mid i \in I_{x^*}\} = \mathbb{R}^n}$. Then, one can extract a basis for $\mathbb{R}^n$ from the set ${\{\bar{a}_i \mid i \in I_{x^*}\}}$. The ${n}$ vectors that are constituting basis are necessarily linearly independent (by the definition of basis.)
- (1) $\Rightarrow$ (2)
Suppose $\bar{a}_{i_1}, \bar{a}_{i_2}, …, \bar{a}_{i_n}$ are ${n}$ linearly independent vectors in the set ${\{\bar{a}_i \mid i \in I_{x^*}\}}$. Then ${\text{span}\{\bar{a}_{i_1}, \bar{a}_{i_2}, …, \bar{a}_{i_n}\}}$ is an $n$-dimensional linear subspace of $\mathbb{R}^n$. Thus
\[\text{span}\{\bar{a}_{i_1}, \bar{a}_{i_2}, ..., \bar{a}_{i_n}\} = \mathbb{R}^n\]However, ${\{\bar{a}_{i_1}, \bar{a}_{i_2}, …, \bar{a}_{i_n}\} \subset \{\bar{a}_i \mid i \in I_{x^*}\} \subset \mathbb{R}^n}$, which implies
\[\text{span}\{\bar{a}_{i_1}, \bar{a}_{i_2}, ..., \bar{a}_{i_n}\} \subset \text{span}\{\bar{a}_i | i \in I^{n*}\} \subset \mathbb{R}^n\]It follows that ${\text{span}\{\bar{a}_i \mid i \in I^{n*}\} = \mathbb{R}^n}$.
- (2) $\Rightarrow$ (3)
Suppose the system $\bar{a}_i^{\top} x = b_i$ for $i \in I_{x*}$ has multiple solutions, say $x_1$ and $x_2$, then the (nonzero) vector $\bar{d} = x_2 - x_1$ satisfies $\bar{a}_i^{\top} \bar{d} = 0$ for all $i \in I_{x^*}$. But then $\bar{d}$ (being orthogonal to every vector $\bar{a}_i$ with $i \in I_{x^*}$) and the vector $\bar{d}$ is not a linear combination of those vectors, which contradicts that vectors $\bar{a}_i$ with $i \in I_{x^*}$ do not span $\mathbb{R}^n$.
- (3) $\Rightarrow$ (2)
Suppose the vectors $\bar{a}_i$ with $i \in I_{x^*}$ do not span $\mathbb{R}^n$, we can choose a nonzero vector ${\bar{d} \in (\text{span}\{\bar{a}_i \mid i \in I_{x^*}\})^\perp}$. Then, we have $\bar{a}_i^{\top} x^* = b_i,\forall i \in I_{x^*}$ and ${\bar{d} \in (\text{span}\{\bar{a}_i \mid i \in I_{x^*}\})^\perp \iff \bar{a}_i^{\top} \bar{d} = 0,\forall i \in I_{x^*}}$.
Therefore, $\bar{a}_i^{\top}(x^* + t\bar{d}) = b_i, \forall i \in I_{x^*}$ (the parameter $t$ can be any real number we want).
Thus, the linear system $\bar{a}_i^{\top} x = b_i$ for all $i \in I_{x^*}$ admits infinitely many solutions, which contradicts to that $x^*$ is the only solution of the system $\bar{a}_i^{\top} x = b_i$ for all $i \in I_{x^*}$. $\square$
Basic feasible solution
Definition: Consider a family of linear equality and inequality constraints describing a polyhedron ${ P\in \mathbb{R}^n }$
(1) A vector ${ \bar{x}^* \in \mathbb{R}^n }$ is said to be a basic solution if all equality constraints are active at ${ \bar{x}^* }$ and if there are ${ n }$ of the active constraints that are linearly independent.
(2) A basic solution that satisfies all constraints is called a basic feasible solution
Equivalence of Three Characterizations
Theorem: Let ${ \bar{x}^* }$ be a point in a non-empty polyhedron ${ P }$. Then following are equivalent
(1) ${ \bar{x}^* }$ is a vertex of ${ P }$
(2) ${ \bar{x}^* }$ is an extreme point of ${ P }$
(3) ${ \bar{x}^* }$ is a basic feasible solution of ${ P }$
Proof. Without loss of generality, we may assume that $P$ is represented by constraints of the form $\bar{a}_i^{\top} x \geq b_i$ and $\bar{a}_i^{\top} x = b_i$ exclusively.
- (1) $\Rightarrow$ (2)
Suppose $x^*$ is a vertex of $P$. Then, by definition, $\exists \bar{c} \in \mathbb{R}^n, \bar{c} \neq \bar{0}$ such that
\[\forall \bar{y} \in P, \bar{y} \neq \bar{x}^* \Rightarrow \bar{c}^{\top} \bar{x}^* < \bar{c}^{\top}y.\]Pick any two $\bar{y}, \bar{z} \in P$ with $\bar{y} \neq \bar{x}^*$ and $\bar{z} \neq \bar{x}^*$and any scalar $\lambda \in [0,1]$. We necessarily have
\[\bar{c}^{\top} \bar{x}^* < \bar{c}^{\top}\bar{y} \quad \text{and} \quad \bar{c}^{\top} \bar{x}^* < \bar{c}^{\top}\bar{z}\] \[\begin{aligned} \bar{c}^{\top}(\bar{y} - \bar{x}^*) &> 0 \quad \text{and} \quad \bar{c}^{\top}(\bar{z} - \bar{x}^*) > 0 \end{aligned}\]$\forall \lambda \in [0,1]$, we have
\[(1-\lambda) \bar{c}^{\top}(\bar{y}-\bar{x}^*) + \lambda \bar{c}^{\top}(\bar{z}-\bar{x}^*) > 0\] \[\begin{aligned} (1-\lambda) \bar{c}^{\top}\bar{y} + \lambda \bar{c}^{\top}\bar{z} > & (1-\lambda) \bar{c}^{\top} \bar{x}^* + \lambda \bar{c}^{\top} \bar{x}^* \\ \\ \bar{c}^{\top} \bar{x}^* < & \bar{c}^{\top}((1-\lambda)\bar{y} + \lambda \bar{z}) \end{aligned}\]Therefore
\[\bar{x}^* \neq (1-\lambda)\bar{y} + \lambda \bar{z}\]Thus $\bar{x}^*$ cannot be expressed as a convex combination of two any other elements $\bar{y}, \bar{z}$ of $P$, which means $\bar{x}^*$ extreme point of $P$.
- (2) $\Rightarrow$ (3)
Suppose $\bar{x}^* \in P$ is not a basic feasible solution. Let ${I_{\bar{x}^*} = \{ i \mid \bar{a}_i^{\top} \bar{x}^* = b_i \}}$ be the set of indices of constraints that are active at $\bar{x}^*$.
Since $\bar{x}^*$ is not a basic feasible solution, there do not exist $n$ linearly independent vectors in ${\{ \bar{a}_i \mid i \in I_{\bar{x}^*} \}}$. Hence ${\text{span} \{ \bar{a}_i \mid i \in I_{\bar{x}^*} \}}$ is a proper subspace of $\mathbb{R}^n$, and the dimension of ${(\text{span} \{ \bar{a}_i \mid i \in I_{\bar{x}^*} \})^\perp}\geq 1$, and we can pick a nonzero vector ${\bar{d} \in (\text{span} \{ \bar{a}_i \mid i \in I_{\bar{x}^*} \})^\perp}$.
In other words, there exists $\bar{d} \in \mathbb{R}^n$ such that $\bar{a}_i^{\top} \bar{d} = 0, \forall i \in I_{x^*}$. Consider the points of the form $\bar{x}^* + \epsilon \bar{d}$ where $\epsilon$ is a small positive number.
For all $i \in I_{\bar{x}^*}$, we have
\[\bar{a}_i^{\top}(\bar{x}^* + \epsilon \bar{d}) = \bar{a}_i^{\top} \bar{x}^* + \epsilon \bar{a}_i^{\top} \bar{d} = b_i + \epsilon \cdot 0 = b_i\]The constraints that are active at $\bar{x}^*$ are also active at $\bar{x}^* + \epsilon \bar{d}$. Furthermore, for $i \notin I_{\bar{x}^*}$, we have
\[\bar{a}_i^{\top}(x^* \pm \epsilon \bar{d}) = (\bar{a}_i^{\top} \bar{x}^*) \pm \epsilon (\bar{a}_i^{\top} \bar{d})\]Here, we know $\bar{a}_i \bar{x}^* \geq b_i$ (because $\bar{x}^*$ is a feasible solution, $\bar{x}^* \in P$) and $\bar{a}_i \bar{x}^* \neq b_i$ (because $\bar{x^*}$ is not active at this constraint, $i \notin I_{\bar{x}^*}$), which implies $\bar{a}_i \bar{x}^* > b_i$. Therefore, by choosing $\epsilon$ sufficiently small (such that $\epsilon \vert \bar{a}_i^{\top} \bar{d} \vert < \bar{a}_i \bar{x}^* - b_i$).
In other word, we construct two points $\bar{y}:=\bar{x}^* + \epsilon \bar{d}$ and $\bar{z}:= \bar{x}^* - \epsilon \bar{d}$ both belong to $P$. Since $\bar{x}^* = \frac{1}{2} \bar{y} + \frac{1}{2} \bar{z}$, which contradicts to that $\bar{x}^*$ is an extreme points of $P$.
- (3) $\Rightarrow$ (1)
Suppose $\bar{x}^*$ is a basic feasible solution of $P$ and let ${I_{\bar{x}^*} = \{ i \mid \bar{a}_i^{\top} \bar{x}^* = b_i \}}$ be the set of indices of constraints that are active at $\bar{x}^*$. Consider the vector $\bar{c} = \sum_{i \in I_{\bar{x}^*}} \bar{a}_i$. We have
\[\bar{c}^{\top} \bar{x}^* = \sum_{i \in I_{\bar{x}^*}} \bar{a}_i^{\top} \bar{x}^* = \sum_{i \in I_{\bar{x}^*}} b_i\]Furthermore, for any element $x \in P$, we have $\bar{a}i^{\top} x \geq b_i, \forall i \in I{\bar{x}^*}$ and thus
\[\bar{c}^{\top} \bar{x} = \sum_{i \in I_{\bar{x}^*}} \bar{a}_i^{\top} \bar{x} \geq \sum_{i \in I_{\bar{x}^*}} b_i \quad \quad \quad (*)\]This shows that $x^*$ is an optimal solution to the problem of minimizing the function $x \mapsto \bar{c}^{\top} x$ over the polyhedron $P$.
Equality holds in $(\ast)$ if and only if $\bar{a}_i^{\top} \bar{x} = b_i, \forall i \in I_{\bar{x}^*}$. However, since $\bar{x}^*$ is a basic feasible solution, there are $n$ linearly independent constraints that are active at $\bar{x}^*$ and $\bar{x}^*$ is therefore the unique solution of the linear system $\bar{a}_i^{\top} \bar{x} = b_i, i \in I_{\bar{x}^*}$ according to above theorem. $\square$