From today, I will post my study notes of Introduction to Algorithms (following SMA 5503). In the meantime, I am going to finish the assignments in this book and post my solutions coded in both python and C++ (share in my github repository). Besides, I also want to become familiar with C++ during this process.
The web page and videos are available on https://ocw.mit.edu/courses/6-046j-introduction-to-algorithms-sma-5503-fall-2005/video_galleries/video-lectures/
The assignments are available on https://ocw.mit.edu/courses/6-046j-introduction-to-algorithms-sma-5503-fall-2005/pages/assignments/. And the solutions is available on https://walkccc.me/CLRS/.
Just in case, I didn’t figure out how to solve some problems, I will refer to other solutions. C++ versions are at https://github.com/walkccc/CLRS-cpp and https://github.com/Yangjiaxi/ItA. Python version is at https://mitpress.mit.edu/9780262046305/introduction-to-algorithms/
Analysis of Algorithms
The theoretical study of computer-program performance and resources usage.
Let’s start out with a simple problem, sorting problem.
Input: A sequence of ${n}$ numbers ${<a_1,a_2,\cdots,a_n>}$ Output: A permutation (reordering) ${<a’_1,a’_2,\cdots,a’_n>}$, such that ${a’_1 \leq a’_2 \leq \cdots \leq a’_n}$
Insertion Sort
1
2
3
4
5
6
7
8
Insertion-sort(A) // sorts A[1...n]
for j <- 2 to n
key <- A[j]
i <- j-1
while i>0 and A[i] > key
A[i+1] <- A[i]
i <- i - 1
A[i+1] <- key
For each time the part of array before the ${j^{th}}$ element (${key}$) is sorted. And the goal for each loop is to increase the length of (add one) the sorted part. And the way we do that is that we copy the ${key}$ to move forward to find an appropriate place for ${key}$ and we insert it in that place.
Now, we will analysis the Insertion Problem. First is about Running time.
Running time
Runing time depends on a lot of thing.
-
Depends on input situation (eg. the array is already sorted or is a reverse sorted array)
-
Depends on input size (eg. 6 elements vs 6000 elements)
perameterize in input size (treat the time as a function of the input size)
-
Want know the upper bonds of the running time.
Kinds of analysis
Worst-case analysis
${T(n)}$ is defined as the maximum time on any input of size n.
Average-case analysis
${T(n)}$ is defined as the expected time over all input of size n. (need assumption of statistical distribution, like uniform distribution).
Best-case analysis (bogus)
Worst-Time of Insertion sort
BIG IDEA!! is asymptotic analysis
-
Ignore machine-dependent constants
-
Look the growth of the running time ${T(n), n \rightarrow \infty }$
Asymptotic notation
${\Theta}$ notation: Drop the low-order terms and ignore leading constants.
Ex. ${3n^3 + 90n^2 -5n + 60 = \Theta(n^3)}$
As ${n\rightarrow \infty}$, ${\Theta(n^2)}$ alg always beats a ${\Theta(n^3)}$ alg.
Come back to Insertion sort analysis
Worst-case: input is reverse sorted
${T(n)=\sum_{j=2}^{n} \Theta(j)=\Theta(n^2)}$
the above formula is a arithmetric series
Merge sort
1
2
3
4
5
Merge-sort(A) // sorts A[1...n]
1. If n=1, done // constants time
2. Recursively sort // 2T(n/2)
A[1...[n/2]] and A[[n/2]-1...n]
3. Merge 2 sorted lists // linear time (see note)
Note: here [x] means ${\lceil x \rceil}$. And the thrid step of merge is doing as follow:
Suppose we have two sorted list ${A_1}$ and ${A_2}$. We can just compare the first element of ${A_1}$ and the first element of ${A_2}$, and we will get the smallest elemen. Then we delete the smallest element from original list. Then, for the second-smallest element, we also compare the the first element of ${A_1}$ and the first element of ${A_2}$. So we only need ${n}$ operations to solve the problem.
So we can get the Recurrence equation:
Recursion tree technique:
Here we use ${cn, c>0}$ to represent the ${\Theta(n)}$, so ${T(n)=2T(n/2)+cn}$. And now we write the formula as a tree (see following).
And we keep doing that, we end up with this:
First, what’s the height (${h}$) of the tree? Actually, this question means How many times do I divide ${ n }$ by 2 to get 1, that is
So, the number of halving of ${ n }$ until we get 1 is ${\lg n}$, that is the height of the tree is ${\lg n +1}$. And, the leaves of the tree is ${n}$. So, we can calculate the total time of the Merge Sort, that is ${T(n)}$, which equals the sum of all notes in the tree.