You are expected to understand this. CS 111 Operating Systems Principles, Spring 2006

Lecture 11 notes

Topics:

  1. Memory management
  2. Virtual Memory: Part 0

Memory Management

Address Space

An address space maps a name (an address or a pointer value) to a memory slot holding a byte of data. The size of the address space is determined by the length of an address. For example, a 32-bit machine can address 2^32 bytes = 4 GB of memory. Address spaces are usually much larger than the amount of actual (physical) memory in the machine.

Physical address space is the address space defined by the memory controller, which connects the bus to the physical memory. The diagram below shows address 0 mapped onto the physical memory.

.:scribe-note4345.jpg

How do we run a program?

A process's address space contains the following:

         code/text
         data/global variables
         stack
         heap
         (kernel)

The data and global variables are divided into two sections: data and BSS. Data holds the variables that are nonzero on startup. BSS (Block Started by Symbol) holds the ones that are zero on startup.

Segments are contiguous portions of the address space. Of the above, the text, stack, heap, data, and BSS are segments.
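As a rough illustration of where C objects typically end up, here is a small sketch. The names (initialized_counter, zeroed_table, segments_demo) are made up for this example, and the exact placement depends on the compiler and linker, so treat the comments as the usual case rather than a guarantee.

```c
#include <stdlib.h>

int initialized_counter = 42;   /* nonzero initializer: typically in the data section */
int zeroed_table[1024];         /* no initializer, zero on startup: typically in BSS */

/* The machine code for this function lives in the text segment. */
int segments_demo(void)
{
    int on_stack = 7;                         /* local variable: stack */
    int *on_heap = malloc(sizeof *on_heap);   /* dynamic allocation: heap */
    if (on_heap == NULL)
        return -1;
    *on_heap = initialized_counter + zeroed_table[0] + on_stack;
    int result = *on_heap;
    free(on_heap);
    return result;
}
```

Because BSS variables are known to be zero, the executable file only needs to record their total size, not their contents.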

When we attempt to run an executable (in this example, threados-app), a loader is run to take the various segments off a disk file (the threados-app executable, not the source file threados-app.c) and place them into memory. The result looks something like this:

.:scribe-note.jpg

How WeensyOS works (worked)

In WeensyOS, each process was given a preassigned address space in which to place its code, data, heap, and stack. The memory ended up looking something like this:

.:scribe-note2.jpg

There are several problems with this scheme:

  • The processes are not isolated from other processes. One process could read/write another process's space.
  • Because processes are assigned address spaces ahead of time, small processes that don't use up all the assigned space waste memory.
  • Another problem is premature exhaustion: the inability to run a new process even though there might be enough free memory for it. For example, suppose three programs are already in memory: Word at 0-2GB, Powerpoint at 2-3GB, and Excel at 3-4GB. Even if a program isn't doing anything, or is actually using much less than its reserved space, the address space has already been reserved for it, so other programs cannot start.

The Solution: Dynamic Loading (to be discussed later).

Heap Allocation Strategies

Within a single process, how does malloc() work?

Imagine a single process. If you can't, here's a picture: A single process

Make the following assumptions:

  1. Assume that every allocation uses less than or equal to 4KB of memory. (p = malloc(s) -> s<=4096B)
  2. After every free(p), the freed memory should be reusable by later allocations.
  3. There is 4MB allocated for the heap.

First Strategy: The Free Bitmap

In this strategy, we use a free-page bitmap. Every bit in the bitmap represents one 4096 byte block in the heap. If the bit is 0, the block has been allocated. If the bit is 1, then it is free.

.:scribe-note3.jpg

In this case, the free page bitmap is 1024 bits wide: 4MB = 4096KB => 4096KB/4096B = 1024.

The following is pseudocode for malloc and free.

malloc(s)

  1. Find a bit that is equal to 1, indicating a free block.
  2. Set the bit to 0, indicating that we've allocated the block.
  3. Change the bit number -> address and return the address.

free(p)

  1. Change address -> bit number
  2. Set bit to 1

Problems NOT handled by this method:

  • The bit is already 1 on a free.
  • The address passed to free is not a multiple of 4096.

This method also has a performance issue: malloc is O(N) because of the need to search for a free bit, but free is O(1).
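The bitmap strategy can be sketched in C roughly as follows. This is a minimal sketch under the assumptions above (4MB heap, 4096-byte blocks); the names heap, free_bitmap, bitmap_init, bitmap_malloc, and bitmap_free are invented for the example. Like the pseudocode, it does not detect double frees or misaligned addresses.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 4096u
#define HEAP_SIZE  (4u * 1024u * 1024u)
#define NBLOCKS    (HEAP_SIZE / BLOCK_SIZE)   /* 4MB / 4096B = 1024 blocks */

static unsigned char heap[HEAP_SIZE];
static uint32_t free_bitmap[NBLOCKS / 32];    /* one bit per block: 1 = free, 0 = allocated */

void bitmap_init(void)
{
    memset(free_bitmap, 0xFF, sizeof free_bitmap);   /* every block starts out free */
}

void *bitmap_malloc(size_t s)
{
    if (s > BLOCK_SIZE)
        return NULL;
    for (size_t i = 0; i < NBLOCKS; i++) {            /* O(N) scan for a free bit */
        uint32_t mask = UINT32_C(1) << (i % 32);
        if (free_bitmap[i / 32] & mask) {
            free_bitmap[i / 32] &= ~mask;             /* mark the block allocated */
            return heap + i * BLOCK_SIZE;             /* bit number -> address */
        }
    }
    return NULL;                                      /* no free block left */
}

void bitmap_free(void *p)
{
    size_t i = (size_t)((unsigned char *)p - heap) / BLOCK_SIZE;   /* address -> bit number */
    free_bitmap[i / 32] |= UINT32_C(1) << (i % 32);                /* O(1): set the bit to 1 */
}
```

Call bitmap_init() once before the first allocation. The loop in bitmap_malloc is the O(N) search; bitmap_free does a constant amount of work.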

How can we achieve O(1) for malloc? Let's use a freelist!

Strategy One Point Five: Free List

A free list is a linked list of freeblock structures.

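struct freeblock 
{
   struct freeblock *next;   /* next free block, or NULL at the end of the list */
   char garbage[4092];       /* rest of the 4096-byte block (assumes 4-byte pointers) */
};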

.:untitled23423.jpg

In the data portion of the process, we store a pointer (freelist) to the first freeblock structure. The next pointer of each structure points to the next free block.

With this in mind, we can update the malloc and free algorithms.

malloc(s)

  if (freelist == NULL)
    return NULL;
  p = freelist;
  freelist = p->next;
  return p;

free(p)

  p->next = freelist;
  freelist = p;

This algorithm is good if blocks are all the same size. There are no constraints on the ordering of the blocks, since all blocks are identical. For example, given that we have the following situation:

   p1 = alloc
   p2 = alloc
   p3 = alloc
   free (p2)
   p4 = alloc
   free (p1)

After all the allocations and frees, any new allocations for blocks with size up to 4096 bytes can happen in any free block.
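Here is a compilable sketch of the free-list allocator, offered as one possible rendering of the pseudocode rather than a reference implementation. The names heap_blocks, freelist_init, fl_malloc, and fl_free are invented for the example, and the garbage array is sized with sizeof so that each block is 4096 bytes on both 32-bit and 64-bit machines.

```c
#include <stddef.h>

#define BLOCK_SIZE 4096
#define NBLOCKS    1024

struct freeblock {
    struct freeblock *next;
    char garbage[BLOCK_SIZE - sizeof(struct freeblock *)];
};

static struct freeblock heap_blocks[NBLOCKS];   /* the 4MB heap, as 1024 blocks */
static struct freeblock *freelist;              /* first free block, or NULL */

/* Thread every block onto the free list, lowest address first. */
void freelist_init(void)
{
    freelist = NULL;
    for (size_t i = NBLOCKS; i-- > 0; ) {
        heap_blocks[i].next = freelist;
        freelist = &heap_blocks[i];
    }
}

void *fl_malloc(size_t s)
{
    if (s > BLOCK_SIZE || freelist == NULL)
        return NULL;
    struct freeblock *p = freelist;   /* pop the head of the list: O(1) */
    freelist = p->next;
    return p;
}

void fl_free(void *ptr)
{
    struct freeblock *p = ptr;        /* push the block back on the list: O(1) */
    p->next = freelist;
    freelist = p;
}
```

Running the p1..p4 sequence above against this sketch, p4 receives the block that p2 just released, since the most recently freed block sits at the head of the list.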

Problem: Internal Fragmentation

What if we only need to allocate 1 byte of memory many times?

Internal fragmentation is the loss of memory that occurs when N bytes are allocated but only k (< N) are needed. "Internal fragmentation" literally means that there is wasted space within allocated blocks.

In our example, the maximum internal fragmentation is 4095 KB (when we only use 1 byte in each of 1024 blocks) and the minimum is 0 (when everything is used).
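The arithmetic can be checked with a tiny helper. This is only a sketch; internal_fragmentation is a name made up for this example, and it assumes every request is at most one block.

```c
#include <stddef.h>

#define BLOCK_SIZE 4096u

/* Bytes wasted when `count` allocations of `requested` bytes each
 * (requested <= BLOCK_SIZE) are every one rounded up to a full block. */
size_t internal_fragmentation(size_t count, size_t requested)
{
    return count * (BLOCK_SIZE - requested);
}
```

With 1024 one-byte allocations the waste is 1024 * 4095 bytes, which is 4095 KB; with 1024 full 4096-byte allocations it is 0.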

We can (attempt to) solve this problem by making variable sized allocations.

Strategy Two: Variable Size Allocations

Variable-size allocation, as you might imagine, allows memory to be allocated in variable sizes. To allow this, we have to change how we allocate memory to the user: instead of having many fixed-size blocks to hand out, we start with one big block of memory and split it as necessary to make smaller blocks.

To do this, we change our definition of a freeblock to include a size variable.

struct freeblock 
{
  struct freeblock *next;
  size_t size;
};
Note that we need to reserve 8 bytes in each free block for this structure (a 4-byte pointer plus a 4-byte size on a 32-bit machine).

And we also change how malloc and free work.

p = malloc(s)
- search the free list for a block with at least size s
- If s < block->size, split the block
- remove the block from the list
- return the address

free(p)
- Add block to the free list.

Problems:

  • When we free a block, we need to know the size of the block.
  • A block must be at least 8 bytes (to hold the pointer and the size once it is freed). This makes the maximum internal fragmentation 7/8 of the total memory (in this case, 3.5MB), reached when every 8-byte block holds a 1-byte allocation.
  • External Fragmentation

Strategy Two Point Five: Preserving Size

In this method, instead of giving the whole block to the user, we use a small trick to keep the size and pointers in the block, while still giving the allocated amount of memory. Basically, what we will be doing is allocating 8 more bytes than necessary for the block and keeping the pointer and size in the first 8 bytes. First we update our freeblock structure a little:

struct freeblock 
{
  struct freeblock *next;
  size_t size;
  char data[];
};

Then we change the malloc and free algorithms a little:

malloc(s)

  1. search the free list for a block with size >= s + 8
  2. split the block if necessary
  3. return &(block->data[0])

free(p)

  1. p' = p - 8 bytes. Now p'->size is the size of the block.
  2. free p' as normal.
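The steps above can be sketched as a first-fit allocator. Several details here are assumptions of the sketch, not something the notes specify: the names (vs_init, vs_malloc, vs_free, HEADER_SIZE), the choice of first fit, rounding requests up to a multiple of the header size to keep headers aligned, and using offsetof for the header size (8 bytes on a 32-bit machine, 16 on a typical 64-bit one). The size field counts the whole block, header included.

```c
#include <stddef.h>

struct freeblock {
    struct freeblock *next;
    size_t size;              /* total bytes in this block, header included */
    char data[];              /* the part handed to the user */
};

#define HEADER_SIZE offsetof(struct freeblock, data)
#define HEAP_BYTES  (4u * 1024u * 1024u)

/* The union only forces suitable alignment for the headers. */
static union { void *p; size_t n; unsigned char bytes[HEAP_BYTES]; } heap;
static struct freeblock *freelist;

void vs_init(void)
{
    freelist = (struct freeblock *)heap.bytes;   /* one big free block */
    freelist->next = NULL;
    freelist->size = HEAP_BYTES;
}

void *vs_malloc(size_t s)
{
    if (s > HEAP_BYTES)
        return NULL;
    /* Header plus the request, rounded up so later headers stay aligned. */
    size_t need = HEADER_SIZE + (s + HEADER_SIZE - 1) / HEADER_SIZE * HEADER_SIZE;

    for (struct freeblock **link = &freelist; *link != NULL; link = &(*link)->next) {
        struct freeblock *b = *link;
        if (b->size < need)
            continue;
        if (b->size - need > HEADER_SIZE) {
            /* Split: the tail of the block takes b's place on the free list. */
            struct freeblock *rest = (struct freeblock *)((unsigned char *)b + need);
            rest->size = b->size - need;
            rest->next = b->next;
            b->size = need;
            *link = rest;
        } else {
            *link = b->next;                      /* too small to split: use it all */
        }
        return b->data;
    }
    return NULL;
}

void vs_free(void *p)
{
    /* Step back over the header to recover the block and its size. */
    struct freeblock *b = (struct freeblock *)((unsigned char *)p - HEADER_SIZE);
    b->next = freelist;
    freelist = b;
}
```

This sketch never merges neighboring free blocks, so after enough allocations and frees the heap ends up chopped into small pieces.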

The more serious problem is external fragmentation.

Problem: External Fragmentation

External fragmentation is wasted space between allocations.

The following example illustrates external fragmentation. Suppose we go through the following steps on the 4 MB heap:
p = alloc(2 MB)
alloc(1 KB)
free(p)
alloc(3 MB)

.:untitled3423423.jpg

As shown, we cannot allocate the 3 MB chunk contiguously because of the 1 KB block in the middle, even though more than 3 MB is free in total. The request could only be satisfied by dividing the 3 MB into smaller chunks.
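To put numbers on this example, the sketch below compares the total free space with the largest contiguous free region. It assumes the 4 MB heap from earlier and that the 1 KB block was placed immediately after the 2 MB block; struct extent, total_free, and largest_free are names invented for the illustration.

```c
#include <stddef.h>

#define KB ((size_t)1024)
#define MB (1024 * KB)

struct extent { size_t start, length; };   /* one contiguous free region */

/* Sum of all free bytes, contiguous or not. */
size_t total_free(const struct extent *f, size_t n)
{
    size_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += f[i].length;
    return sum;
}

/* Size of the biggest single free region: the largest request that can succeed. */
size_t largest_free(const struct extent *f, size_t n)
{
    size_t best = 0;
    for (size_t i = 0; i < n; i++)
        if (f[i].length > best)
            best = f[i].length;
    return best;
}
```

After free(p), the free regions are { 0, 2 * MB } and { 2 * MB + KB, 2 * MB - KB }. total_free reports 4 MB minus 1 KB, but largest_free reports only 2 MB, so a contiguous 3 MB request fails.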

The usual solution to external fragmentation is compaction: shifting allocated blocks together so that the free space becomes contiguous. The problem is that compaction is bug-prone and expensive.

Virtual Memory

Virtual Memory refers to the ability to map physical memory locations to virtual addresses that are accessible only by the process that owns the memory and the kernel.

The idea of virtual memory was born when people started to run more than one process at a time on their computers. Before then, a process could access any memory address it needed. Since it was the only process running, it would never run into any problems. With multiple processes running, processes needed to be protected from one another, or else they would just run all over each other.

This problem has had several solutions over the years, implemented in both new hardware and new software.

A History of Virtual Memory Solutions

Dynamic Loading

Dynamic Loading is a software solution to the virtual memory problem that was introduced around 1970.

Dynamic loading basically allows a process to run at a different location in memory by using relocation symbols. These relocation symbols tell the loader which addresses and references to change so that the process can be loaded at a different address. For example, a dynamic loader might have loaded threados-app like this:

DYNAMIC-O!

The dynamic loader, in this example, would change an instruction like movl $0x19000,%eax to movl $0x29000,%eax.
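A minimal sketch of that patching step follows. The relocation format here (a plain array of byte offsets, each naming a 32-bit absolute address inside the loaded image) is an assumption made for illustration, as is the function name apply_relocations; real object formats carry more information per relocation.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Add `delta` to the 32-bit absolute address stored at each listed
 * byte offset inside the loaded image. */
void apply_relocations(unsigned char *image,
                       const size_t *reloc_offsets, size_t nrelocs,
                       uint32_t delta)
{
    for (size_t i = 0; i < nrelocs; i++) {
        uint32_t addr;
        memcpy(&addr, image + reloc_offsets[i], sizeof addr);   /* works for unaligned offsets */
        addr += delta;
        memcpy(image + reloc_offsets[i], &addr, sizeof addr);
    }
}
```

For the instruction above, the immediate 0x19000 follows the one-byte movl opcode, so a relocation at offset 1 of the instruction with delta 0x10000 rewrites it to 0x29000.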

The dynamic loader, of course, doesn't actually provide any isolation at all by doing this: A process that references some memory that it shouldn't have access to will still be able to access that memory.

Segmentation Registers

To simplify later implementations of the dynamic loader, hardware manufacturers around 1975 changed the processor to support a new kind of register: the segmentation register.

Processors now add a base address to any address they are given in an instruction. This base address is held in the segmentation register. This means that a processor can transform any address into a physical memory address by adding it to the segmentation register. Different processes have different values in their segmentation registers, which gives each process a different address space. This brings about the concept of a Virtual Address Space: a process's address space is distinct from the physical address space, so what a process accesses at a certain address in its own address space may or may not be at that address in physical memory.

Segmentation registers alone still don't provide isolation: even though the processes' address spaces are different, if you know where another process is relative to your base address (or if you access memory randomly), you can still overwrite other processes' memory.

Segmentation Length

To try and provide isolation, we can add another register called Segmentation Length. As you might guess, segmentation length contains the length of the segment of the process. Any addresses that try to access memory outside of the segmentation length are illegal.
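In software terms, the check the hardware performs on every memory access looks roughly like this. It is a sketch only: struct segment_regs and seg_translate are hypothetical names, and 32-bit addresses are assumed.

```c
#include <stdbool.h>
#include <stdint.h>

struct segment_regs {
    uint32_t base;     /* segmentation register: added to every address */
    uint32_t length;   /* segmentation length: size of the segment in bytes */
};

/* Returns true and stores the physical address if `va` lies inside the
 * segment; returns false (an illegal access) otherwise. */
bool seg_translate(const struct segment_regs *seg, uint32_t va, uint32_t *pa)
{
    if (va >= seg->length)
        return false;
    *pa = seg->base + va;
    return true;
}
```

With base 0x20000 and length 0x10000, address 0x9000 translates to physical address 0x29000, while address 0x10000 is rejected.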

In order to completely isolate processes, both of the segmentation registers must be protected. The instructions that change the registers must be protected instructions. In order to do this, we also need a concept of protected and unprotected code space. This brings about the creation of a new register: the Current Privilege Level or CPL register. The CPL register works in the following way:

CPL Value    Privilege Level     Protected Instructions
0            Kernel privilege    OK
1-3          User privilege      NOT OK

Current Privilege Level is also a protected register. When a user-level process makes a system call or when an interrupt occurs, CPL is set to zero.
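The privilege check can be modeled as follows. This is a toy model, not real processor behavior in detail: struct cpu, load_segment_regs, and enter_kernel are invented names, and a fault is represented simply by returning false.

```c
#include <stdbool.h>
#include <stdint.h>

struct cpu {
    unsigned cpl;          /* current privilege level: 0 = kernel, 1-3 = user */
    uint32_t seg_base;     /* segmentation register */
    uint32_t seg_length;   /* segmentation length */
};

/* Model of a protected instruction that loads the segmentation registers.
 * Returns false (a fault) unless the CPU is at kernel privilege. */
bool load_segment_regs(struct cpu *c, uint32_t base, uint32_t length)
{
    if (c->cpl != 0)
        return false;
    c->seg_base = base;
    c->seg_length = length;
    return true;
}

/* Entering the kernel through a system call or interrupt sets CPL to 0. */
void enter_kernel(struct cpu *c)
{
    c->cpl = 0;
}
```

A process running at CPL 3 cannot widen its own segment; only code that runs after enter_kernel can change the registers.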

There is one problem with this: by providing the length of a segment (that has to be contiguous and has no maximum size), we introduce the possibility of External Fragmentation (which is very bad).

Paged Virtual Memory

The way that most modern operating systems fix this is through Paged Virtual Memory.

The operating system works with fixed-size units of memory (pages), while the user is still allowed to make variable-sized allocations. Fixed-size units mean that every page of memory is interchangeable, which makes external fragmentation disappear. It also means that a process's pages can end up anywhere in physical memory, so we need some way of translating between virtual and physical addresses.

Operating System addresses also will no longer be the same as application addresses, so there also needs to be support for changing OS addresses into application addresses.

Hardware supports a binding (page map) function, known by the moniker "Beta" (β). For now, we will assume that β works by mapping a virtual address (va) to a physical address (pa) like this:

β(va) => pa
β(va) => β(pagenum * 4096 + offset) => β(pagenum * 4096) + offset => physpage * 4096 + offset
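The mapping can be sketched with a flat lookup table. This is an illustration under stated assumptions, not how any particular hardware stores its page map: pagemap, NPAGES, and the function name beta are invented here, the address space is a tiny 16 pages, and the caller must pass a va below NPAGES * PAGE_SIZE.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u
#define NPAGES    16u     /* tiny hypothetical virtual address space */

/* pagemap[pagenum] holds the physical page number for that virtual page. */
static uint32_t pagemap[NPAGES];

/* Translate a virtual address; requires va < NPAGES * PAGE_SIZE. */
uint32_t beta(uint32_t va)
{
    uint32_t pagenum = va / PAGE_SIZE;   /* which virtual page */
    uint32_t offset  = va % PAGE_SIZE;   /* position inside the page, unchanged */
    return pagemap[pagenum] * PAGE_SIZE + offset;
}
```

If virtual page 1 maps to physical page 7, then beta(1 * 4096 + 0x10) is 7 * 4096 + 0x10: only the page number changes, and the offset is carried across as is.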
 
2006spring/notes/lec11.txt · Last modified: 2006/09/26 11:42 (external edit)
 