
Lecture 4 Scribe Notes

By Tom Carpel, Neil Huang and Sean MacIntyre

Processes: Abstraction and Implementation

Process

A process is considered one of the following:

Two concepts that are essential in understanding processes are Abstraction and Implementation.

Von Neumann Computer Architecture

The Von Neumann computer architecture closely represents the computers that we operate. It has four components:

  1. ALU/CU [Arithmetic Logic Unit / Control Unit]
  2. Registers (Example: Instruction Pointer)
  3. Primary memory
  4. I/O Devices

ALU/CU (CPU: Central Processing Unit) and Registers

Virtualization

We can imagine an arcade machine emulation, implemented as such:

struct arcade_regs {
	int instruction_ptr;
	/* ... */
};
void cpu (struct arcade_regs *r) {
	while (1) {
		/* fetch the next emulated instruction */
		int i = get_next_instruction(r);
		if (i == JMP)
			r->instruction_ptr = r->accum;
		else if (i == ADD)
			r->reg0 = r->reg0 + r->reg1;
		/* ... */
	}
}

Advantage:

Disadvantage:

For example, emulating an arcade game:

1 emulated CPU instruction  =  10 real CPU instructions

To improve on this, we want to cut down the ratio (# real instructions executed)/(# emulated instructions executed). Ideally the ratio is 1, as when running on the actual hardware.
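The cost of emulation is easier to see in a complete, if tiny, interpreter. The following is a minimal sketch, assuming a made-up three-opcode instruction set (OP_ADD, OP_JMP, OP_HALT) and a program stored as an array; these names and the extra register fields are illustrative and do not come from any real arcade machine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes, for illustration only. */
enum opcode { OP_ADD, OP_JMP, OP_HALT };

struct arcade_regs {
    int instruction_ptr;  /* index of the next instruction in the program */
    int accum;            /* jump target used by OP_JMP */
    int reg0, reg1;       /* general-purpose registers */
};

/* Fetch the next instruction and advance the instruction pointer. */
static enum opcode get_next_instruction(struct arcade_regs *r,
                                        const enum opcode *program, size_t len)
{
    assert(r->instruction_ptr >= 0 && (size_t)r->instruction_ptr < len);
    return program[r->instruction_ptr++];
}

/* Run until OP_HALT; return the number of emulated instructions executed. */
static long cpu(struct arcade_regs *r, const enum opcode *program, size_t len)
{
    long executed = 0;
    for (;;) {
        enum opcode i = get_next_instruction(r, program, len);
        executed++;
        if (i == OP_JMP)
            r->instruction_ptr = r->accum;
        else if (i == OP_ADD)
            r->reg0 = r->reg0 + r->reg1;
        else /* OP_HALT */
            return executed;
    }
}

/* Example program: ADD, ADD, HALT, starting with reg0 = 1 and reg1 = 2. */
static int run_example(void)
{
    const enum opcode program[] = { OP_ADD, OP_ADD, OP_HALT };
    struct arcade_regs r = { 0, 0, 1, 2 };
    cpu(&r, program, 3);
    return r.reg0;
}
```

Every emulated instruction pays for a fetch, a bounds check, and a chain of comparisons on the host, which is where the several-to-one overhead comes from.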

Abstraction

We use Hardware!

Implementation

Maintain process isolation. How? By using kernel & architecture.

Main Motivation: Increases performance. But we can't necessarily use the real architecture directly!

MIPS X instruction:

0xCFC00000 hsc – Halt and Spontaneously Combust
Reg(31) <- PC

This is an example of a Dangerous Instruction! This is an actual instruction from the MIPS instruction manual (although it's probably just a joke). csl-tr-86-289.pdf

Solution: Change the architecture to prevent access to dangerous instructions! Architecture: (in x86 terms)

CPL 0 -> all privilege (kernel)
CPL 3 -> application privilege/user privilege

Kernel: It sets its own CPL to 0, and when it runs an application it sets the CPL to 3. An application must not be allowed to change its own CPL, since that would violate the process isolation principle.
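The privilege check can be modeled in a few lines. This is a simplified simulation, not real x86 behavior: the instruction classes (INSN_ADD, INSN_HALT_AND_COMBUST, INSN_SET_CPL) and the sim_cpu structure are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

enum { CPL_KERNEL = 0, CPL_USER = 3 };

/* Hypothetical instruction classes, for illustration only. */
enum insn { INSN_ADD, INSN_HALT_AND_COMBUST, INSN_SET_CPL };

struct sim_cpu {
    int cpl;       /* current privilege level */
    bool faulted;  /* set when the simulated hardware refuses an instruction */
};

/* Dangerous instructions are those only the kernel may execute. */
static bool is_dangerous(enum insn i)
{
    return i == INSN_HALT_AND_COMBUST || i == INSN_SET_CPL;
}

/* Returns true if the instruction may execute, false if it faults. */
static bool execute(struct sim_cpu *cpu, enum insn i)
{
    assert(cpu->cpl == CPL_KERNEL || cpu->cpl == CPL_USER);
    if (is_dangerous(i) && cpu->cpl != CPL_KERNEL) {
        cpu->faulted = true;  /* real hardware would trap into the kernel */
        return false;
    }
    return true;
}
```

In this model an application at CPL 3 can still run ordinary instructions at full speed; only the dangerous ones are refused.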

However, this raises a question: how can applications now execute kernel code and make system calls? The solution is Protected Control Transfer.

Protected Control Transfer

Protected Control Transfer allows applications to jump into selected kernel entry points. It needs:

  1. Application’s instructions for PCT
  2. Architecture’s implementation
  3. Kernel’s implementation

Example for 1):

int $48
int <- “interrupt”
$48 <- system call number

Example for 2):

The Interrupt Descriptor Table.

The Processor performs:
  - Looks up IDT entry.
  - Saves old registers onto a new stack.
  - Sets the new CPL.
  - Jumps to the routine.

Example for 3):

The Kernel performs:
  - Sets up interrupt descriptor table.
  - Loads the IDT register with lidt (the instruction that loads the IDT register, itself a dangerous instruction).
  - Implements the system call.
The kernel must also save all of the registers from the process so it can restore them later.
It saves them in a Process Descriptor (one descriptor for each process).
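The three pieces of a protected control transfer can be put together in a user-space simulation. This is a sketch under simplifying assumptions: the IDT is an array of function pointers, the register set is reduced to a few fields, the register save goes straight into the process descriptor (collapsing the hardware and kernel save steps into one), and system call 48 is arbitrarily taken to return the caller's process ID. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { CPL_KERNEL = 0, CPL_USER = 3, IDT_SIZE = 256 };

struct regs { int instruction_ptr; int reg0; int reg1; int cpl; };

struct process_descriptor {
    int pid;
    struct regs saved;  /* all registers, saved on kernel entry */
};

typedef int (*handler_fn)(struct process_descriptor *p);

static handler_fn idt[IDT_SIZE];  /* simulated interrupt descriptor table */

/* Hypothetical system call 48: return the caller's process ID. */
static int sys_getpid(struct process_descriptor *p) { return p->pid; }

/* Kernel side: set up the interrupt descriptor table. */
static void kernel_init(void) { idt[48] = sys_getpid; }

/* Processor side: what the simulated `int $n` does. */
static int raise_interrupt(struct regs *cpu, struct process_descriptor *p, int n)
{
    handler_fn h;
    int result;

    assert(n >= 0 && n < IDT_SIZE);
    h = idt[n];              /* look up the IDT entry */
    if (h == NULL)
        return -1;           /* no entry point: the application cannot jump here */
    p->saved = *cpu;         /* save the old registers */
    cpu->cpl = CPL_KERNEL;   /* set the new CPL */
    result = h(p);           /* jump to the kernel routine */
    *cpu = p->saved;         /* on return, restore registers (and CPL 3) */
    cpu->reg0 = result;      /* assumption: result is handed back in reg0 */
    return result;
}
```

The application never chooses the address it jumps to; it only supplies an index, and the kernel decides which indices have handlers.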

Primary Memory

Abstraction

Implementation

Main Motivation: Increases performance

The processor prevents a process (consider a single application) from modifying some parts of memory, hence, we have an isolated address space.

Applications can Read/Write:

Applications cannot Read/Write:

If the process was able to freely modify all parts of memory, specifically the kernel memory, it could potentially change its own CPL (current privilege level) from application privileges to kernel (or all) privileges, which could be exploited maliciously. The process could also possibly modify the interrupt descriptor table and cause the kernel to jump to the wrong memory location.

Process Address Space: A typical process address space, a 4GB block of byte-addressed memory, is shown below.

The stack and heap both expand as the process's program executes. If they collide, we have an overflow and the process dies. Because each process has its own address space, an overflow kills only the process in which it occurred.
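The collision condition can be illustrated with a small model in which the heap grows upward and the stack grows downward within one address space. This is only a sketch: the structure and function names are hypothetical, and a real kernel detects the overflow through the memory hardware rather than explicit checks like these.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct address_space {
    size_t heap_top;      /* first free address above the heap (grows up) */
    size_t stack_bottom;  /* lowest address used by the stack (grows down) */
};

/* Free space remaining between the heap and the stack. */
static size_t gap(const struct address_space *as)
{
    assert(as->heap_top <= as->stack_bottom);
    return as->stack_bottom - as->heap_top;
}

/* Grow the heap by n bytes; fail if it would run into the stack. */
static bool grow_heap(struct address_space *as, size_t n)
{
    if (n > gap(as))
        return false;
    as->heap_top += n;
    return true;
}

/* Grow the stack by n bytes; fail if it would run into the heap. */
static bool grow_stack(struct address_space *as, size_t n)
{
    if (n > gap(as))
        return false;
    as->stack_bottom -= n;
    return true;
}
```

Since each process would have its own address_space value, a failed growth affects only that process.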

The infamous Blue Screen of Death is due to a lack of process isolation: one process overwrites another's memory and the system crashes. We see fewer of these cases in versions of Windows built on the NT kernel because it includes process isolation, which the earlier kernels lacked.

Multiple Processes

For multiple processes, we have multiple Isolated Address Spaces, each of which contains the sections described in the memory layout above.

Old School Implementation: Segmentation

New School Implementation: Virtual Memory

Process Descriptor Overview

Threads: A thread is a virtual processor. That is, there is a virtual execution context that contains:

  1. CPU
  2. Registers
  3. Local Variables/Return Addresses

We would like to be able to support a process with multiple threads. That is, we have multiple register sets and multiple stacks (since the threads are executing different functions) for a single process. This enables better utilization, but can introduce complicated race conditions, which will be discussed later.

I/O

Abstraction

Implementation

Main Motivation: Generality & Simplicity

I/O devices encompass a large range, from data storage, such as hard disks, USB drives, CDs, and DVDs, to peripherals like keyboards, mice, digital cameras, LCD and CRT monitors, and printers. To handle such a wide variety, I/O devices must be abstracted in a very broad and general manner. The abstraction must also be general enough to accommodate additional devices as new products are created and developed. This kind of generalization is the motivation behind Unix's BIG PICTURE:

EVERYTHING is a file

A file descriptor (fd) is just a number, assigned by the operating system, that refers to a specific file structure:

int fd = open(const char * name, int mode);

Access to I/O devices is implemented through system calls, such as the open() function above, which are built into the operating system. The implementation takes place in software rather than hardware because these I/O devices are not very performance sensitive, and system calls give applications a simple interface to them. For example, a programmer who wants to write to a hard disk need not understand anything about the hardware itself, e.g. the sector size; the only necessary information is the name of the device (const char * name) and the file access mode (int mode), e.g. read only or write only.

The file descriptors used by a particular process are kept in the process descriptor structure, as seen below. Each of these file descriptors points to a file structure, which includes information about that file such as file type, size, and offset.
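As a concrete illustration of the file descriptor interface, here is a minimal sketch using the POSIX calls open(), write(), lseek(), read(), close(), and unlink(). The helper name write_then_read and the idea of a scratch file are assumptions made for the example; the mode argument in the notes corresponds to flags such as O_RDONLY, O_WRONLY, or O_RDWR.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Write msg to the file at path, read it back into out, then delete the
 * file. Returns the number of bytes read back, or -1 on failure. */
static ssize_t write_then_read(const char *path, const char *msg,
                               char *out, size_t outlen)
{
    size_t len = strlen(msg);
    ssize_t n = -1;
    int fd;

    assert(out != NULL);
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    /* The file structure tracks the offset, so rewind before reading. */
    if (write(fd, msg, len) == (ssize_t)len && lseek(fd, 0, SEEK_SET) == 0)
        n = read(fd, out, outlen);
    close(fd);
    unlink(path);
    return n;
}
```

The caller only ever handles the integer fd; nothing about sector sizes or the device underneath appears in the code.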

Starting a New Process

Implementation:

  1. Allocate process descriptor
  2. Initialize process descriptor
    1. Set process ID to an unused process ID
    2. Set CPL to 3
    3. Set new address space

However, problems such as an infinite loop can occur:

L1:  jmp L1
Interrupt
Timer Interrupt
  1. The hardware timer triggers an interrupt (as if executing “int”) every 1ms
  2. Kernel gets control
  3. Kernel runs different process by using a Scheduler
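The effect of the timer interrupt can be shown with a small simulation in which one process loops forever and another has a finite amount of work. This is a sketch with hypothetical names: each loop iteration stands for one 1ms time slice, and the scheduler is assumed to be simple round robin, which the notes do not specify.

```c
#include <assert.h>

enum { NPROC = 2, FOREVER = -1 };

struct proc {
    int work_left;  /* time slices still needed; FOREVER for "L1: jmp L1" */
    int ticks_run;  /* time slices this process has received */
};

/* Round robin: pick the next process after `current` that still has work. */
static int schedule(const struct proc *procs, int current)
{
    for (int step = 1; step <= NPROC; step++) {
        int next = (current + step) % NPROC;
        if (procs[next].work_left != 0)
            return next;
    }
    return -1;  /* nothing left to run */
}

/* Simulate `ticks` timer interrupts, starting with process 0. */
static void run(struct proc *procs, int ticks)
{
    int current = 0;
    for (int t = 0; t < ticks && current >= 0; t++) {
        struct proc *p = &procs[current];
        /* The process runs until the timer fires. */
        p->ticks_run++;
        if (p->work_left > 0)
            p->work_left--;
        /* Timer interrupt: the kernel gets control and picks a process. */
        current = schedule(procs, current);
    }
}
```

Even though process 0 never gives up the CPU voluntarily, process 1 still finishes its work, because the kernel regains control at every tick.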

Process Isolation

In Process Isolation, we want:

So we need to set up the process environment prior to running a process.

fork() - We use fork() to create a copy of the current process.

execvp() - We use execvp() to replace the program running in the current process with a new one, while keeping the I/O devices (file descriptors).

pid_t p = fork();
if (p == 0) {
	/* new (child) process */
} else {
	/* old (parent) process */
}
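Putting fork() and execvp() together gives the usual pattern for starting a new program. The following is a minimal sketch using the POSIX calls fork(), execvp(), waitpid(), and _exit(); the helper name run_program and the choice of 127 as the failure code are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the program named by argv[0] in a new process and wait for it.
 * Returns the program's exit status, or -1 on failure. */
static int run_program(char *const argv[])
{
    int status;
    pid_t p;

    assert(argv != NULL && argv[0] != NULL);
    p = fork();
    if (p < 0)
        return -1;            /* fork failed: no new process was created */
    if (p == 0) {
        /* New process: replace this copy with the requested program.
         * Open file descriptors are kept across execvp(). */
        execvp(argv[0], argv);
        _exit(127);           /* reached only if execvp() failed */
    }
    /* Old process: wait for the new one to finish. */
    if (waitpid(p, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, calling run_program with the argument vector { "sh", "-c", "exit 3", NULL } returns 3 on a system where sh is on the PATH.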