You are expected to understand this. CS 111 Operating Systems Principles, Fall 2006

Lecture 11 notes

Paul Torrez, Tim Hallner, Kiran Mathrani
12 November 2006

modified by Anu Bhaskar
10 December 2006

Topics

  • Virtual Memory

Virtual Memory


Why use Virtual Memory?

  • If applications require access to more information than can be stored in physical memory
  • Makes the job of the application developer much easier by providing access to virtually unlimited memory
  • Simulates main memory in a way that is invisible to user applications

What is Paged Virtual Memory & Why use it?

  • The most common method of implementing Virtual (and Physical) Memory
  • Paged Virtual Memory saves space: one table entry maps a whole page rather than a single address
  • A page consists of a range of consecutive addresses in Virtual Memory
    • Each address corresponds to an address in physical memory
  • A page table is used to store the mapping of virtual to physical addresses
  • Implemented by a mapping function: addrmap(va) = pa, which at page granularity reduces to pmap(vpn) = ppn

What is pmap & How does the processor implement it?

  • pmap is the mapping of virtual addresses to physical addresses
  • The simplest idea is to use a giant lookup table.

Single Level Page Table
Calling the pmap function with a virtual page number as an argument will return the physical page number associated with it:

pmap(vpn) {
  return PTABLE[vpn];
}

single-page-table.jpg

  • The address of the top of the PTABLE in memory is stored in the %cr3 register on x86 machines.
  • On an x86 machine:
    • 2^32 addresses / 2^12 addresses per page = 2^20 pages
    • Assuming each page table entry takes 4 bytes to store, the table occupies 4 * 2^20 bytes = 4 MB of memory

2-Level Page Table
There is now a hierarchy of two different levels of tables, the first encountered being Level 1 and the second Level 0.

pmap(vpn) {
  uvpn = vpn / 2^10;
  lvpn = vpn % 2^10;
  l0pt = PTABLE[uvpn];
  if (l0pt == FAULT)
    return FAULT;
  return l0pt[lvpn];
}

2level.jpg (Level 1 table on the left; Level 0 tables on the right)

For example, if the virtual address = 0x00002003 = 0 * 2^22 + 2 * 2^12 + 3, then
offset = 3
uvpn = 0
lvpn = 2

Assuming we only have a small program running that only uses one physical page number (49):

  • pmap(0x00002) = 49
  • addrmap(0x00002003) = 49 * 2^12 + 3

To implement this scenario with a 2-level page table we would need only one Level 1 page table and one Level 0 page table. Because each table takes 4 KB to store, our entire page table takes only 8 KB to store versus 4 MB in the Single Level case. 2-Level page tables save space in a sparse address space. x86 machines use a 2-Level page table.

How is the pmap function set up?

  • Not set up by the process, because that would break isolation
  • The KERNEL sets it up!
    • Memory that holds page tables is accessible only if the current protection level or cpl = KERNEL
    • The instruction that changes PTABLE(%cr3) can only be executed if cpl = KERNEL

On a context switch, the kernel loads a new pmap function:

  • pmap(va) can give different results depending on:
    • current protection level (kernel or user)
    • access type (read or write)

pmap either returns a physical address or FAULT, depending on the mapping's permissions:

  • Read-only mapping

typical-address-space.jpg

<code>
if (access_type == write)
  return fault;
else
  return pa;
</code>
  • Kernel-only mapping
<code>
if (cpl == user) // only the kernel may touch kernel-only pages; this preserves isolation
  return fault;
else
  return pa;
</code>

On a fault, the processor causes a trap, the kernel gets control, and the kernel runs the pfault function.

pfault(va, cpl, atype) { // page fault handler
  - either: install a mapping and restart process at faulting instruction
  - or: kill the process (segmentation fault)
}

Wait a second...what is a fault?

  • A page fault is the exception raised when a process accesses a page that is not mapped in physical memory
  • More generally, a fault represents any illegal access the processor catches
    • e.g. writing to a read-only mapping, or touching a kernel-only mapping while cpl is user

Robustness through paged virtual memory

Problems solved by virtual memory

  • We have 4 GB of addresses but the machine has only 512 MB of primary memory
  • good_fairy() calls malloc(4 KB) and wicked_witch() calls malloc(512 MB): how can they both run?
    • If the wicked witch runs first she could prevent the good fairy from running, but this would violate the principle of isolation!
  • What if we use primary memory as a cache of a larger memory that is actually stored on disk?
    • Once we do that, it is hard for the wicked witch to affect the good fairy. Let's see how we do this...

Multi-level memory

  • Treat memory as a cache of a larger memory stored on disk
  • Cache miss policy?
  • Cache hit vs. cache miss?
  • pmap function only lists pages in primary memory
  • pfault handler handles swapping:
    • responds to cache misses by reading in pages from disk
  • New kernel function:
  swapmap(proc, va) {
    - either: return disk address
    - or: return FAIL
  } 
pfault(va, cpl, atype) {
  if (swapmap(current, va) != FAIL) {
    <process, va'> = removal_policy(); // process = process to steal page from, va' = which page to steal
    pa = process->pmap(va');
    write page pa to disk @ swapmap(process, va');
    set process->pmap(va') to FAULT;
    read disk @ swapmap(current, va) into page pa;
    set current->pmap(va) to pa;
    return;
  }
  else
    kill;
}

memory-access.jpg

Let's see how this process works.

Removal Policy

  • Why do we need to remove pages?

If we didn't, then we would keep accumulating more and more pages in memory as more applications are run...

   What would this result in?

Goal: Remove a page that won't be used soon, and avoid "thrashing" - a state where a high portion of accesses require a swap (swaps are slow)

Note: If all memory accesses are uniformly random, there is no good removal policy. We rely on Locality of Reference:

  • When N consecutive accesses to memory touch far fewer than N distinct addresses, the accesses are clustered
Belady's Optimal Algorithm
  • Replace the page that will be accessed farthest in the future. (This requires knowing the future, which is impossible, but we can try to approximate it!)
Least Recently Used (LRU)
  • Replace the page whose most recent access was furthest in the past
Simpler Algorithm (FIFO)
  • Replace the page that was swapped in farthest in the past
Evaluating page replacement
  • Reference String (a sequence of page accesses) + Physical Memory Size => # of swaps (smaller is better)
Examples

  • FIFO (3 pgs) = 9 swaps

  FIFO  1  2  3  4  1  2  5  1  2  3  4  5
  P1    1  1  1  4  4  4  5  5  5  5  5  5
  P2       2  2  2  1  1  1  1  1  3  3  3
  P3          3  3  3  2  2  2  2  2  4  4

  • BELADY'S OPTIMAL = 7 swaps

  Belady's  1  2  3  4  1  2  5  1  2  3  4  5
  P1        1  1  1  1  1  1  1  1  1  3  3  3
  P2           2  2  2  2  2  2  2  2  2  4  4
  P3              3  4  4  4  5  5  5  5  5  5

  • Least Recently Used = 10 swaps

  LRU   1  2  3  4  1  2  5  1  2  3  4  5
  P1    1  1  1  4  4  4  5  5  5  3  3  3
  P2       2  2  2  1  1  1  1  1  1  4  4
  P3          3  3  3  2  2  2  2  2  5  5

  • FIFO (4 pgs) = 10 swaps
    • With 4 pages this reference string suffers from Belady's anomaly, where "more pages cause more swaps"

  FIFO  1  2  3  4  1  2  5  1  2  3  4  5
  P1    1  1  1  1  1  1  5  5  5  5  4  4
  P2       2  2  2  2  2  2  1  1  1  1  5
  P3          3  3  3  3  3  3  2  2  2  2
  P4             4  4  4  4  4  4  3  3  3

  • LRU (4 pgs) = 8 swaps

LRU and Belady's Optimal don't suffer from Belady's anomaly: they are "stack algorithms," and stack algorithms never suffer from Belady's anomaly.

Optimizations: if a page hasn't been modified since it was swapped in, there is no need to write it back to disk when evicting it; and with demand paging, pages are read in only when first accessed.

 
2006fall/notes/lec11.txt · Last modified: 2007/09/28 00:25 (external edit)
 