Divide-and-Conquer: Strassen, Fibonacci, Polynomial Multiplication

 

This lecture is about “Divide-and-Conquer”, a very powerful algorithm design technique. Its main paradigm is the following:

  • Divide: If we are given a big problem that we don’t know how to solve efficiently, we can try to split it into smaller subproblems.

  • Conquer: Then we solve each subproblem recursively.

  • Combine: Sometimes we need to combine the solutions of the subproblems into a solution of the original problem.

Powering a number

Given a number ${ x \in \mathbb{R}}$ and an integer ${ n \geq 0 }$, compute ${ x^n }$.

We can use the “Divide-and-Conquer” strategy to solve this problem. For ${ x^n }$, we can treat it as ${ x^{n/2} \cdot x^{n/2} }$. So we only need to compute ${ x^{n/2} }$ and do one more multiplication to get ${ x^n }$. That’s the big picture of the idea; let’s show it in detail.

$$ \begin{equation} x^n = \begin{cases} x^{n/2} \cdot x^{n/2}, \text{ if } n \text{ is even, } \\ x^{\frac{n-1}{2}} \cdot x^{\frac{n-1}{2}} \cdot x, \text{ if } n \text{ is odd}. \\ \end{cases} \end{equation} $$

Therefore, we get the recurrence for ${ T(n)}$:

$$ \begin{equation} T(n) = \begin{cases} T(n/2) + 1, \text{ if } n \text{ is even, } \\ T(\lfloor n/2 \rfloor) + 2, \text{ if } n \text{ is odd}. \end{cases} \end{equation} $$

In either case ${ T(n) = T(\lfloor n/2 \rfloor) + \Theta(1) }$, so by the Master Theorem ${ T(n) = \Theta(\lg n) }$.
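To make the idea concrete, here is a minimal sketch of recursive squaring in Python (the function name `power` and the small driver line are mine, not from the lecture):

```python
def power(x, n):
    # Compute x**n with Theta(lg n) multiplications by recursive squaring.
    if n == 0:
        return 1
    half = power(x, n // 2)       # solve the subproblem x^(floor(n/2)) once
    if n % 2 == 0:
        return half * half        # n even: x^n = x^(n/2) * x^(n/2)
    return half * half * x        # n odd:  x^n = x^((n-1)/2) * x^((n-1)/2) * x

print(power(2, 10))  # 1024
```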

Fibonacci numbers

The definition of the Fibonacci numbers:

$$ \begin{equation} F_n = \begin{cases} 0, \text{ if } n = 0 \\ 1, \text{ if } n = 1 \\ F_{n-1}+F_{n-2} , \text{ if } n \geq 2 \end{cases} \end{equation} $$

Naive recursive alg.

We can calculate ${ F_n }$ directly by recursion. Drawing the recursion tree, we find that the problem size decreases by only ${ 1 }$ or ${ 2 }$ at each step. Therefore, the running time of this algorithm satisfies ${ T(n) = \Omega(2^{n/2}) }$ and ${ T(n) = O(2^n) }$.
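For reference, a minimal sketch of this naive recursion (the name `naive_fib` is mine):

```python
def naive_fib(n):
    # Direct translation of the definition; runs in exponential time.
    if n < 2:
        return n                  # F_0 = 0, F_1 = 1
    return naive_fib(n - 1) + naive_fib(n - 2)
```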

In fact, we can prove ${ T(n) = \Theta(\varphi^n), \varphi = \frac{1+\sqrt{5}}{2}}$. The recurrence is ${ T(n)=T(n-1)+T(n-2)+\Theta(1) }$, and ${ T(0),T(1) }$ are trivial. If we draw the recursion tree and align it with the Fibonacci sequence, we can count its nodes.

For ${ i \geq 1 }$, let ${ N(i) }$ be the number of times the subproblem ${ F_{n-i+1} }$ appears in the tree. Each occurrence of ${ F_{n-i+1} }$ is spawned by an occurrence of ${ F_{n-i+2} }$ or of ${ F_{n-i+3} }$, therefore ${ N(i) = N(i-1)+N(i-2) }$: the counts ${ N(i) }$ form a Fibonacci sequence themselves, namely ${ N(i) = F(i) }$. The leaves are the occurrences of ${ F_1 }$ and ${ F_0 }$, so there are ${ F(n) + F(n-1) = F(n+1) }$ of them, and the inner nodes are the occurrences of ${ F_2, \ldots, F_n }$, so there are ${ S(n-1) }$ of them, where ${ S(n) = \sum_{i=1}^n F(i)}$ denotes the partial sum of the Fibonacci sequence. Let's derive a closed form for ${ S(n) }$.
$$ \begin{equation} \begin{aligned} 2S(n) &= (F(1)+F(2)+\cdots+F(n-1)+F(n))+(F(1)+F(2)+\cdots+F(n-1)+F(n)) \\ & = F(1)+(F(1)+F(2)) + (F(2)+F(3)) + (F(3)+F(4)) + \cdots + (F(n-1)+F(n)) + F(n) \\ & = F(1) + F(3) + F(4) + \cdots + F(n+1) + F(n) \\ &= S(n) + F(n+1) + F(n) - F(2) \\ &= S(n) + F(n+2) - F(2) \end{aligned} \end{equation} $$
Then we get ${ S(n) = F(n+2) - 1 }$, and in particular the number of inner nodes is ${ S(n-1) = F(n+1) - 1 }$. Therefore, ${ T(n) = F(n+1)\cdot \Theta(1) + S(n-1)\cdot \Theta(1) = \Theta(F(n))}$. In the last part of this lecture we derive the closed form ${ F(n)=\frac{1}{\sqrt {5}} \left(\frac{1+ \sqrt {5}}{2}\right)^n - \frac{1}{\sqrt {5}} \left(\frac{1 - \sqrt {5}}{2}\right)^n }$, and since ${ \left|\frac{1-\sqrt5}{2}\right|^n < 1 }$ we get
$$ \frac{1}{\sqrt {5}} \left(\left(\frac{1+ \sqrt {5}}{2}\right)^n - 1\right) < F(n) < \frac{1}{\sqrt {5}} \left(\left(\frac{1+ \sqrt {5}}{2}\right)^n +1\right) $$
Therefore, ${T(n) = \Theta(\varphi^n), \varphi = \frac{1+\sqrt{5}}{2} }$

But we want a polynomial-time algorithm, so let’s move on.

Bottom-up algorithm

In the algorithm above we recompute the same subproblems many times, which is unnecessary. We can simply compute the sequence bottom-up, starting from ${ F_1 }$!

Bottom_up_Fibonacci(n):
    if n == 0
        return 0
    if n == 1 or n == 2
        return 1
    else
        a = 1               // invariant: a = F_{i+1}
        b = 1               // invariant: b = F_{i+2}
        i = 0
        while i < n-2
            c = b
            b = a + b       // advance the recurrence by one step
            a = c
            i++
        return b            // b = F_n

It’s easy to check the total running time is ${ T(n) = \Theta(n) }$

${ \require{cancel} \bcancel{\text{Naive recursive squaring} } }$

From the General Term Formula (derived in a later section), we have ${ F_n = [\varphi^n / \sqrt{5}] ,\varphi = \frac{1+\sqrt{5}}{2}}$, where ${ [\cdot] }$ denotes rounding to the nearest integer.

Ideally, we could use the algorithm from the “Powering a number” section to compute ${ \varphi^n / \sqrt{5}}$ and round it to the nearest integer, all in ${ \Theta(\lg n) }$ time.

However, this does not work on a real machine, because the computer has to represent numbers like ${ \varphi }$ and ${ \sqrt{5}}$ as floating-point numbers. With only a fixed number of bits of precision, there is no guarantee that rounding to the nearest integer gives the correct answer!
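A small experiment illustrates the problem. The sketch below (the names `fib_exact` and `fib_float` are mine) compares the rounded floating-point formula against an exact bottom-up computation and reports the first disagreement; with ordinary IEEE-754 doubles a mismatch typically shows up well before ${ n = 100 }$, though the exact cutoff depends on the machine and library.

```python
from math import sqrt

def fib_exact(n):
    # Exact bottom-up computation (Python integers have arbitrary precision).
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def fib_float(n):
    # The closed-form formula evaluated with double-precision floats, then rounded.
    phi = (1 + sqrt(5)) / 2
    return round(phi ** n / sqrt(5))

n = 0
while fib_float(n) == fib_exact(n):
    n += 1
print("floating point first fails at n =", n)
```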

Recursive squaring (${ \checkmark }$)

Thm: ${ \left( \matrix{F_{n+1}&F_n\\F_n&F_{n-1}} \right) = \left( \matrix{1&1\\1&0} \right)^n }$. We will prove it by induction. First, we check the base case
$$ \left( \matrix{F_{2}&F_1\\F_1&F_{0}} \right) = \left( \matrix{1&1\\1&0} \right)^1 $$
Then, assuming the statement is true for ${ n-1 }$, let's check it for ${ n }$:
$$ \begin{equation} \begin{aligned} \left( \matrix{F_{n}&F_{n-1}\\F_{n-1}&F_{n-2}} \right)\left( \matrix{1&1\\1&0} \right) &= \left( \matrix{F_{n}+F_{n-1}&F_{n}+0\cdot F_{n-1}\\F_{n-1}+F_{n-2}&F_{n-1}+0\cdot F_{n-2}} \right) =\left( \matrix{F_{n+1}&F_n\\F_n&F_{n-1}} \right) \\ \end{aligned} \end{equation} $$
By assumption ${ \left( \matrix{F_{n}&F_{n-1}\\F_{n-1}&F_{n-2}} \right) = \left( \matrix{1&1\\1&0} \right)^{n-1} }$, we can get
$$ \left( \matrix{F_{n+1}&F_n\\F_n&F_{n-1}} \right) = \left( \matrix{1&1\\1&0} \right)^{n} $$
By induction, the theorem is proved. Q.E.D.

Therefore, we can compute ${ F_n }$ from the ${ n }$th power of that 2-by-2 matrix. Multiplying two 2-by-2 matrices takes only constant time, so we can compute ${ \left( \matrix{1&1\\1&0} \right)^n }$ in ${ \Theta(\lg n) }$ time using the strategy from the “Powering a number” section.
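Here is a minimal sketch of this approach in Python (the helper names `mat_mult`, `mat_pow`, and `fib` are mine); it combines the 2-by-2 matrix identity with the recursive-squaring trick from the “Powering a number” section.

```python
def mat_mult(X, Y):
    # Multiply two 2x2 matrices stored as tuples (a, b, c, d) = [[a, b], [c, d]].
    a, b, c, d = X
    e, f, g, h = Y
    return (a*e + b*g, a*f + b*h, c*e + d*g, c*f + d*h)

def mat_pow(M, n):
    # Compute M^n (n >= 1) with Theta(lg n) 2x2 multiplications.
    if n == 1:
        return M
    half = mat_pow(M, n // 2)
    sq = mat_mult(half, half)
    return mat_mult(sq, M) if n % 2 else sq

def fib(n):
    # [[1,1],[1,0]]^n = [[F_{n+1}, F_n], [F_n, F_{n-1}]], so read off the off-diagonal entry.
    return 0 if n == 0 else mat_pow((1, 1, 1, 0), n)[1]

print([fib(i) for i in range(10)])  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```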

General Term Formula

For the Fibonacci sequence, we want to find its general term formula. We look for an appropriate coefficient ${ \lambda }$ such that ${A_n = F_n - \lambda F_{n-1} }$ becomes a geometric sequence:

$$ F_n = F_{n-1} + F_{n-2} \Rightarrow F_n - \lambda F_{n-1} = (1-\lambda)F_{n-1} + F_{n-2} $$

We can solve the equation ${ \frac{1}{-\lambda} =\frac{1-\lambda}{1}}$ to get ${ \lambda = \frac{1\pm \sqrt {5}}{2} }$. Select ${ \lambda = \frac{1- \sqrt {5}}{2} }$; then the common ratio ${ q }$ of ${ A_n }$ is ${ \frac{1+ \sqrt {5}}{2} }$, that is,

$$ F_n - \frac{1- \sqrt {5}}{2} \cdot F_{n-1} = \frac{1+ \sqrt {5}}{2}(F_{n-1} - \frac{1- \sqrt {5}}{2} \cdot F_{n-2}) $$

So we get ${ A_n = q^{n-2}A_2}$. Hence, we can solve for ${ F_n }$ from ${ F_n -\frac{1- \sqrt {5}}{2} \cdot F_{n-1}= q^{n-2}A_2 }$, that is (note ${ \lambda \cdot q = -1 }$, ${ q - \lambda = \sqrt5 }$, and ${ A_2 = F_2 - \lambda F_1 = q }$):

$$ \begin{equation} \begin{aligned} F_n &=\lambda \cdot F_{n-1} + q^{n-2}A_2 \\ &= \lambda (\lambda \cdot F_{n-2} + q^{n-3}A_2 )+q^{n-2}A_2 \\ &= \lambda^2 \cdot F_{n-2} + \lambda \cdot q^{n-3}A_2 + q^{n-2}A_2\\ & = \lambda^3 \cdot F_{n-3} + \lambda^2 \cdot q^{n-4}A_2+\lambda \cdot q^{n-3}A_2 + q^{n-2}A_2\\ & = \cdots \\ & = \lambda^{n-1}F_1 + A_2(\lambda^{n-2}q^0 +\lambda^{n-3}q^1 +\cdots + \lambda^2 \cdot q^{n-4}+\lambda \cdot q^{n-3} + q^{n-2}) \\ & = \lambda^{n-1}F_1 + A_2\left(\sum_{i=0}^{n-2}q^{n-2}\cdot \left(\frac{\lambda}{q}\right)^i\right)\\ & = \lambda^{n-1}F_1 + A_2 \left(\frac{q^{n-2}(1-\left(\frac{\lambda}{q}\right)^{n-1})}{1-\frac{\lambda}{q}}\right)\\ &= \lambda^{n-1} + A_2 \left( \frac{q^{n-1}-\lambda^{n-1}}{q-\lambda}\right)\\ &= (1-\frac{A_2}{q-\lambda})\lambda^{n-1} + \frac{A_2}{q-\lambda} q^{n-1} \\ & = \frac{1}{\sqrt {5}} \left(\frac{1+ \sqrt {5}}{2}\right)^n - \frac{1}{\sqrt {5}} \left(\frac{1 - \sqrt {5}}{2}\right)^n \end{aligned} \end{equation} $$
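As a quick sanity check (not part of the original derivation), with ${ q = \frac{1+\sqrt5}{2} }$ and ${ \lambda = \frac{1-\sqrt5}{2} }$ the formula reproduces the first two terms:
$$ F_1 = \frac{q - \lambda}{\sqrt5} = \frac{\sqrt5}{\sqrt5} = 1, \qquad F_2 = \frac{q^2 - \lambda^2}{\sqrt5} = \frac{(q+\lambda)(q-\lambda)}{\sqrt5} = \frac{1\cdot\sqrt5}{\sqrt5} = 1 $$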

Matrix multiplication

Let’s recap the definition of matrix multiplication. Suppose we have matrices ${ A=[a_{ij}]\in\mathbb{R}^{n\times n} }$ and ${ B=[b_{ij}]\in\mathbb{R}^{n\times n} }$. Then ${ C= A\cdot B=[c_{ij}]\in\mathbb{R}^{n\times n} }$, where

$$ c_{ij} = \sum_{k=1}^n a_{ik} \cdot b_{kj} $$

Standard Algorithm

We can write down an algorithm directly from the definition; the pseudocode is as follows.

Matrix_Multiplication(A,B):
    n = size_of(A)
    let C be a new n*n matrix
    for i = 1 to n
        for j = 1 to n
            c_ij = 0
            for k = 1 to n
                c_ij = c_ij + a_ik * b_kj
    return C

It’s easy to check that the above algorithm takes ${ \Theta (n^3)}$ running time.

Divide-and-conquer alg.

If we apply the “Divide-and-conquer” strategy, we get the following idea:

For an ${ n\times n }$ matrix, we can treat it as a ${ 2\times 2 }$ block matrix of ${ n/2 \times n/2 }$ submatrices (here we suppose ${ n = 2^k, k\in \mathbb{N}^+ }$). We can show it as follows:

$$ A = \left[ \matrix{A_{11}& A_{12} \\ A_{21} &A_{22} }\right], B = \left[ \matrix{B_{11}& B_{12} \\ B_{21} &B_{22} }\right], C = \left[ \matrix{C_{11}& C_{12} \\ C_{21} &C_{22} }\right] $$

And, we can check

$$ \left[ \matrix{C_{11}& C_{12} \\ C_{21} &C_{22} }\right] = \left[ \matrix{A_{11}& A_{12} \\ A_{21} &A_{22} }\right] \cdot \left[ \matrix{B_{11}& B_{12} \\ B_{21} &B_{22} }\right], $$

That means, to calculate ${ C }$, we recursively perform the following operations:

$$ \begin{align} C_{11} = A_{11}\cdot B_{11} + A_{12}\cdot B_{21} \\ C_{12} = A_{11}\cdot B_{12} + A_{12} \cdot B_{22} \\ C_{21} = A_{21} \cdot B_{11} + A_{22}\cdot B_{21} \\ C_{22} = A_{21} \cdot B_{12} + A_{22}\cdot B_{22} \end{align} $$

Looking at the operations above, we see that we need ${ 8 }$ recursive multiplications of ${ n/2 \times n/2 }$ matrices and ${ 4 }$ additions of ${ n/2 \times n/2 }$ matrices. Each addition sums the entries of two matrices position by position, which takes ${ n^2/4 }$ scalar additions. So the recurrence for the cost is

$$ T(n) = 8 T(n/2) + \Theta(n^2) $$

By the Master Theorem, the recurrence solves to ${ T(n) = \Theta (n^{\log_2 8}) = \Theta (n^3) }$. That’s a disappointing result: we haven’t improved the running time at all. BUT! The recurrence can give us some inspiration.
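A minimal sketch of this block-recursive multiplication in Python with NumPy (the function name `block_mult` is mine; as in the text, ${ n }$ is assumed to be a power of 2):

```python
import numpy as np

def block_mult(A, B):
    # Divide-and-conquer multiplication: 8 recursive products of n/2 x n/2 blocks.
    n = A.shape[0]
    if n == 1:
        return A * B
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    return np.block([
        [block_mult(A11, B11) + block_mult(A12, B21),
         block_mult(A11, B12) + block_mult(A12, B22)],
        [block_mult(A21, B11) + block_mult(A22, B21),
         block_mult(A21, B12) + block_mult(A22, B22)],
    ])
```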

Strassen Algorithm

The whole idea of the Strassen Algorithm is to make the coefficient in front of ${ T(n/2) }$ smaller, that is, to decrease the number of recursive multiplications. To do this, we don’t mind the extra addition operations, because they only cost ${ \Theta(n^2) }$ time. Define the following seven products:

$$ \begin{align} P_{1} &= A_{11}\cdot (B_{12}-B_{22}) \\ P_{2} &= (A_{11}+A_{12})\cdot B_{22} \\ P_{3} &= (A_{21}+A_{22})\cdot B_{11} \\ P_{4} &= A_{22}\cdot (B_{21}-B_{11}) \\ P_{5} &= (A_{11}+A_{22})\cdot (B_{11}+B_{22}) \\ P_{6} &= (A_{12}-A_{22})\cdot (B_{21}+B_{22}) \\ P_{7} &= (A_{11}-A_{21})\cdot (B_{11}+B_{12}) \\ \end{align} $$

Then, we can use the above matrices ${ P_1,\cdots ,P_7 }$ to calculate ${ C_{11},C_{12},C_{21},C_{22} }$

$$ \begin{align} C_{11} &= P_5+P_4-P_2+P_6 \\ C_{12} &= P_1 + P_2 \\ C_{21} &= P_3 + P_4 \\ C_{22} &= P_5+P_1-P_3-P_7 \\ \end{align} $$
It's straightforward to verify the correctness by expanding the products:
$$ \begin{equation} \begin{aligned} C_{11} &= P_5+P_4-P_2+P_6 \\ &= (A_{11}+A_{22})\cdot (B_{11}+B_{22}) + A_{22}\cdot (B_{21}-B_{11}) - (A_{11}+A_{12})\cdot B_{22} + (A_{12}-A_{22})\cdot (B_{21}+B_{22}) \\ &= A_{11}\cdot B_{11} + \bcancel{A_{11} \cdot B_{22}} +\cancel{A_{22}\cdot B_{11}} + \xcancel{A_{22}\cdot B_{22}} + \bcancel{A_{22}\cdot B_{21}} - \cancel{A_{22}\cdot B_{11}}-\bcancel{A_{11}\cdot B_{22}} - \cancel{A_{12} \cdot B_{22}} \\ & \bcancel{- A_{22}\cdot B_{21}} - \xcancel{A_{22} \cdot B_{22}} + \cancel{A_{12}\cdot B_{22}} + A_{12} \cdot B_{21} \\ &= A_{11}\cdot B_{11} + A_{12} \cdot B_{21} \end{aligned} \end{equation} $$
$$ \begin{equation} \begin{aligned} C_{12} &= P_1 + P_2 \\ &= A_{11}\cdot (B_{12}-B_{22}) + (A_{11}+A_{12})\cdot B_{22} \\ &= A_{11}\cdot B_{12} + A_{12}\cdot B_{22} \end{aligned} \end{equation} $$
$$ \begin{equation} \begin{aligned} C_{21} &= P_3 + P_4 \\ &= (A_{21}+A_{22})\cdot B_{11} + A_{22}\cdot (B_{21}-B_{11}) \\ &= A_{21}\cdot B_{11} + A_{22}\cdot B_{21} \end{aligned} \end{equation} $$
$$ \begin{equation} \begin{aligned} C_{22} &= P_5 + P_1-P_3-P_7 \\ &= (A_{11}+A_{22})\cdot (B_{11}+B_{22}) + A_{11}\cdot (B_{12}-B_{22}) - (A_{21}+A_{22})\cdot B_{11} - (A_{11}-A_{21})\cdot (B_{11}+B_{12}) \\ &= A_{21} \cdot B_{12} + A_{22}\cdot B_{22} \end{aligned} \end{equation} $$

Notice that the above computation uses only ${ 7 }$ multiplications. So the recurrence is ${ T(n) = 7 T(n/2) + \Theta(n^2)= \Theta(n^{\log_2 7})=O(n^{2.81}) }$.
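The same skeleton with the seven products gives a Strassen sketch (again assuming ${ n }$ is a power of 2; `strassen` is my name for it, and a small cutoff falls back to the ordinary product):

```python
import numpy as np

def strassen(A, B):
    # Strassen multiplication: 7 recursive products instead of 8.
    n = A.shape[0]
    if n <= 2:                      # cutoff: use the standard product on tiny blocks
        return A @ B
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    P1 = strassen(A11, B12 - B22)
    P2 = strassen(A11 + A12, B22)
    P3 = strassen(A21 + A22, B11)
    P4 = strassen(A22, B21 - B11)
    P5 = strassen(A11 + A22, B11 + B22)
    P6 = strassen(A12 - A22, B21 + B22)
    P7 = strassen(A11 - A21, B11 + B12)
    C11 = P5 + P4 - P2 + P6
    C12 = P1 + P2
    C21 = P3 + P4
    C22 = P5 + P1 - P3 - P7
    return np.block([[C11, C12], [C21, C22]])
```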

PS: what if ${ n }$ is odd?

In practice, we will meet the situation where ${ n }$ is odd, which causes trouble for the Strassen Algorithm as described. But don’t worry! We can pad the matrices with zeros to handle it.

We will use the following fact

$$ \left[\matrix{A&0\\0&0}\right] \cdot \left[\matrix{B&0\\0&0}\right] = \left[\matrix{AB&0\\0&0}\right] $$

Here the two padded matrices are ${ N\times N }$. In other words, we pad ${ N-n }$ columns and ${ N-n }$ rows of zeros into original matrices.

In practice, we can pad the matrices recursively: whenever ${ n }$ is odd, pad the matrices to ${ (n+1) \times (n+1) }$ and call the Strassen Algorithm on the padded matrices.
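As a sketch of the padding trick (the wrapper name `multiply_padded` is mine; instead of padding by one row and column at every odd level as described above, this simpler variant pads once up to the next power of two, which changes only the constant factor, not the ${ \Theta(n^{\log_2 7}) }$ bound):

```python
import numpy as np

def multiply_padded(A, B, multiply):
    # Pad n x n inputs with zeros up to the next power of two, multiply, then crop.
    n = A.shape[0]
    N = 1
    while N < n:
        N *= 2
    Ap = np.zeros((N, N)); Ap[:n, :n] = A
    Bp = np.zeros((N, N)); Bp[:n, :n] = B
    return multiply(Ap, Bp)[:n, :n]   # e.g. multiply = strassen from above
```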

VLSI layout Problem.

The VLSI (Very Large-Scale Integration) layout problem asks us to embed a complete binary tree with ${ n }$ leaves into a grid with minimum area.

naive embedding

We can first come up with a naive embedding, as in the figure: the root at the top, with the two subtrees of ${ n/2 }$ leaves laid out side by side below it. We can then write down recurrences for the height ${ h(n) }$ and the width ${ w(n) }$: ${ h(n) = h(n/2) + \Theta(1)=\Theta(\lg n) }$ and ${ w(n) = 2w(n/2) + O(1) =\Theta(n) }$. Therefore the area is ${ \Theta(n \lg n) }$.

Can we do better? Maybe we can try ${ \Theta(\sqrt {n}) \cdot \Theta(\sqrt {n}) }$!!

So, what kind of recurrence gives ${ \Theta(\sqrt {n})}$? We can guess ${ T(n) = 2T(n/4) + O(n^{1/2-\varepsilon}) }$.
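For completeness, here is the Master Theorem step behind this guess (case 1, spelled out):
$$ T(n) = 2T(n/4) + O(n^{1/2-\varepsilon}): \quad n^{\log_4 2} = n^{1/2}, \quad f(n) = O(n^{1/2-\varepsilon}) \;\Rightarrow\; T(n) = \Theta(\sqrt{n}). $$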

H layout

Why do we take an “H” shape? Because we want to divide the ${ n }$ leaves into ${ 4 }$ parts, each holding ${ n/4 }$ leaves, with the root node in the middle! It’s easy to check that ${ h(n) = 2h(n/4) + \Theta(1)= \Theta(\sqrt {n}) }$ and ${ w(n) = 2w(n/4) + \Theta(1) = \Theta(\sqrt {n})}$.

So the total area is ${ \Theta(\sqrt{n}) \cdot \Theta(\sqrt{n}) = \Theta(n) }$.