We must know, we will know !

万物并作,吾以观复

Lecture 1:Knuth-Morris-Pratt (KMP) algorithm

At the beginning of our bioinformatic algorithms journal, we will focus on the similarity of strings. We we touch following aspects,

  • Equality
    • Find occurrences of string $S$ in a given string $T$.
  • Edit distance
    • Find edit distance between strings $S$ and $T$ (global alignment)
    • Find substring of $T$ with minimal edit distance to given string $S$ (local alignment)
  • Probability
    • Construct probability model that best describes a given set of strings.
    • Check if a given strings is likely to be of the same type.

Today, we will introduce the algorithm for first problem: find all occurrences of a string $P$ (called pattern string) in a given string $T$ as a substring.

Read more

Prerequisite:Logic and Number Theory

In this series, I will introduce the Fundamental Contents in Computer Theory and Algorithms. If you hope to learn more advanced algorithms and techniques, please move to my another series Algorithm Design and Analysis.

In this blog, I try to do some preparation for following contents. I prefer to introduce the algorithms in a more rigorous and strict way. In theoretical computer science and algorithms, we will encounter lot of statements, like the correctness and running time for some specific algorithm. The way to assert a statement is true is Proof, then the statement will become a theorem (or lemma, claim, corollary anyway). Hence, we start from Logic which is very beginning of our story (may not be romantic but must be interesting and attractive).

Another part of this blog is number theory, which is powerful and common tool in our journey. Off course, we hope we can master more math, which are always the enrich toolbox for computer scientists.

Read more

Virtual Memory

This blog talks about Virtual Memory. For each process (the running program), we need to allocate memory to it. We have two goals

(1) User use “memory” like a large and uniform (continuous) array.

(2) And each process has private memory space for safety.

So, we need virtual memory, which is a simple abstraction provided to users. CPU and OS can help us to translate efficient address (virtual address) into physical address.

Basically, we can use the strategy like “Cache”, Then we have some terms

  • A virtual memory “Block” is called a “Page”.
  • A virtual memory address translation “Miss” is called a “Page fault”.

Read more

Randomized Algs:Permutation Routing Problem

This blog, we still talk about randomized algorithms. Let’s consider the parallel computation problem: A network of parallel processors is modeled by a directed graph ${G = (N, E)}$. The nodes ${N}$ represent the processors and the edges ${E}$ model the communication links between the processors. All communication between processors occurs in synchronous steps. Each link can carry at most one unit message (packet) in one step. During a step, a processor can send at most one packet to each of its neighbors. Each processor is uniquely identified by a number between 1 and ${N}$.

Read more