Course Overview

 

Starting today, I will post my study notes on Introduction to Computer Systems (following CMU 15-213). Along the way, I will work through the assignments in the textbook and post my solutions.

The course page is available at https://www.cs.cmu.edu/~213/index.html. The textbook is:

Randal E. Bryant and David R. O’Hallaron, Computer Systems: A Programmer’s Perspective, Third Edition, Pearson, 2016

In most of our work, we just type some text (code) into a little box, and the machine then produces the behavior we intended our program to have. But we don’t know why or how.

The purpose of this course is to give you enough understanding of what that “box” is doing when it executes your code.

Let’s look at the first simple program that most beginners write in C, the hello program:

#include <stdio.h>

int main()
{
    printf("hello, world\n");
    return 0;
}

We begin our study of systems by tracing the lifetime of the hello program. The following sections briefly introduce the main content of this course (and its textbook).

Information = Bits + Context

The hello program begins its life as a source program (or source file), saved in a text file called hello.c. In fact, the source file is a sequence of bits, organized in 8-bit chunks called bytes. Each byte represents some text character in the program.

Most computer systems represent text characters using the ASCII standard. For example, the newline character ‘\n’ is represented by the integer value 10 (0000 1010 in binary, or 0A in hexadecimal). The ASCII representation of hello.c is shown in the following figure.

The ASCII text representation of hello.c. [1]

This example illustrates a fundamental idea of computer systems:

  1. All information stored in memory is represented as a bunch of bits.

  2. The only thing that distinguishes different data objects is the context in which we view them.

Translating the Source File into Machine Code

As a high-level C program, hello can be read and understood by human beings, but not by the machine. Therefore, the individual C statements must be translated by other programs into a sequence of low-level machine-language instructions. These instructions are then packaged in a form called an executable object program and stored as a binary disk file. Object programs are also referred to as executable object files.

On a Unix system, the translation from source file to object file is performed by a compiler driver:

linux> gcc -o hello hello.c

With this command, the GCC compiler driver reads the source file hello.c and translates it into an executable object file hello. The translation is performed in four phases, shown below. The programs that perform the four phases (preprocessor, compiler, assembler, and linker) are known collectively as the compilation system.

Compilation system. [1]

  1. Preprocessing phase.

    The preprocessor (cpp) modifies the original C program according to directives that begin with the ‘#’ character. For example, ‘#include <stdio.h>’ tells the preprocessor to read the contents of the header file stdio.h and insert them directly into the program text. The result is another C program (still a text file), typically with the ‘.i’ suffix.

  2. Compilation phase.

    The compiler (cc1) translates the text file hello.i into the text file hello.s, which contains an assembly-language program. Assembly language is useful because it provides a common output language for different compilers for different high-level languages.

  3. Assembly phase.

    Next, the assembler (as) translates hello.s into machine-language instructions, packages them as a relocatable object program, and stores the result in the object file hello.o, which is a binary file and hard for us to read.

  4. Linking phase.

    Source files often call functions whose bodies are stored in other files. Our hello.c calls the printf function, which is part of the standard C library provided by every C compiler. The printf function resides in a separate precompiled object file called printf.o. To use this function, we need to merge it into our hello.o program. The linker (ld) handles this merging. The result is the hello file, an executable object file (or simply executable) that is ready to be loaded into memory and executed by the system.

Processors Read and Execute Instructions

After the four phases of translation, the source file hello.c has been translated into an executable object file called hello. To run it, we can type the following command:

linux> ./hello
hello, world
linux> 

On a Unix system, the shell is an application program that acts as a command-line interpreter. We type a command line, and the shell performs it. If the first word of the command line is not a built-in shell command, the shell assumes it is the name of an executable file, which it then loads and runs. In this case, the shell loads and runs the hello program and waits for it to terminate. The hello program prints the message ‘hello, world’ to the screen and then terminates. Finally, the shell prints a prompt and waits for the next command.

Hardware Organization of a System

In this section, we will introduce the hardware organization of a typical system, shown below.

Hardware Organization. CPU: central processing unit, ALU: arithmetic/logic unit, PC: program counter, USB: Universal Serial Bus. [1]

  1. Buses.

    Running throughout the system is a collection of electrical conduits called buses, which carry bytes of information back and forth between the components. Buses are typically designed to transfer fixed-size chunks of bytes known as words. The number of bytes in a word (the word size) is a fundamental system parameter that varies across systems. Most machines today have word sizes of either 4 bytes (32 bits) or 8 bytes (64 bits).

  2. I/O Devices.

    Input/output (I/O) devices are the system’s connection to the external world. Examples include a keyboard and mouse for user input, a display for user output, and a disk drive (or simply disk) for long-term storage of data and programs. Initially, the executable hello program resides on the disk.

    Each I/O device is connected to the I/O bus by either a controller or an adapter. The distinction between the two is mainly one of packaging.

    • A controller is a chip in the device itself or on the system’s main printed circuit board (often called the motherboard).

    • An adapter is a card that plugs into a slot on the motherboard.

    Either way, the function of each is to transfer information between an I/O device and the I/O bus.

  3. Main Memory.

    The main memory is a temporary storage device that holds both a program and the data it manipulates while the processor is executing the program. Physically, main memory consists of a collection of dynamic random access memory (DRAM) chips. Logically, memory is organized as a linear array of bytes, each with its own unique address (array index) starting at zero. The machine instructions of a program are stored in main memory, and so are its variables of every kind (int, float, etc.).

  4. Processor.

    The central processing unit (CPU), or simply processor, is the engine that interprets (or executes) instructions stored in main memory. At its core is a word-size storage device (or register) called the program counter (PC). At any point in time, the PC points at (contains the address of) some machine-language instruction in main memory.

    As long as the system is powered on, the processor repeatedly executes the instruction pointed at by the PC and updates the PC to point to the next instruction. The next instruction may or may not be contiguous in memory to the instruction that was just executed.

    A processor appears to operate according to a very simple instruction execution model, defined by its instruction set architecture. The processor reads the instruction from memory pointed at by the program counter (PC), interprets the bits in the instruction, performs some simple operation dictated by the instruction, and then updates the PC to point to the next instruction.

    There are only a few of these simple operations, and they revolve around main memory, the register file, and the arithmetic/logic unit (ALU).

    • The register file is a small storage device that consists of a collection of word-size registers, each with its own unique name.

    • The ALU computes new data and address values.

    Here are some examples of the simple operations that the CPU might carry out at the request of an instruction:

    • Load:

      Copy a byte or a word from main memory into a register, overwriting the previous contents of the register.

    • Store:

      Copy a byte or a word from a register to a location in main memory, overwriting the previous contents of that location.

    • Operate:

      Copy the contents of two registers to the ALU, perform an arithmetic operation on the two words, and store the result in a register, overwriting the previous contents of that register.

    • Jump:

      Extract a word from the instruction itself and copy that word into the program counter (PC), overwriting the previous value of the PC.

Running the hello Program

In this section, let’s take a general and simple view of what happens when we run our hello program.

When we type the command ./hello at the keyboard, the shell program reads each character into a register and then stores it in memory, as shown below.

Reading the hello command from the keyboard. [1]

After we hit the enter key, the shell loads the executable hello file by executing a sequence of instructions that copies the code and data in the hello object file from disk to main memory. The data includes the string of characters hello, world\n that will eventually be printed out.

Using a technique known as direct memory access (DMA), the data travels directly from disk to main memory, without passing through the processor. This step is shown in the following figure.

Loading the executable from disk into main memory. [1]

Once the code and data in the hello object file are loaded into memory, the processor begins executing the machine-language instructions in the hello program’s main routine. These instructions copy the bytes in the hello, world\n string from memory to the register file, and from there to the display device, where they are displayed on the screen, as shown below.

Writing the output string from memory to the display. [1]