====== Lecture 15 Scribe Notes: Virtual Memory II ======
//By Chang-Hung Chiang, Chris Clark, and Albert Chern//
===== Overview =====
Last lecture we began discussing virtual memory. Let's recap and see how it works.
The diagram below illustrates how a virtual address is stored and located in physical memory. We are given the virtual address of //0x00901050//. The way we interpret this value is by dividing it into three distinct indices:
* 10-bit Page Directory Index
* 10-bit Page Table Index
* 12-bit Page Offset
This is analogous to the idea of a book. The Page Directory Index (pdindex) can be interpreted as the chapter that you must turn to in your book. The Page Table Index (ptindex) is interpreted as the page to turn to within the given chapter. Finally, the Page Offset represents the specific line on the page to start reading from.
We begin by looking in the //%cr3// register to locate the page directory. We then index (using the pdindex from our virtual address) from that location to grab the entry for the corresponding page table (here it is //0x00102000//). Notice that we have thrown away the //0x007//. These low 12 bits are used as accessibility and validity flags, and play no part in locating the page table.
Now we have found our page table. We can index from here (using the ptindex from our virtual address) to grab the location of the actual page in memory. Again we throw out similar flags, and get a value of //0x3000//. Finally, we use the offset in the virtual address to index from //0x3000// and grab our true data.
{{scribe_pic2.gif?720x540}}
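The address split described above can be written as a few shifts and masks. This is a minimal sketch, assuming 32-bit addresses with the 10/10/12 layout from the notes; the helper names (''pdindex'', ''ptindex'', ''pgoffset'', ''entry_addr'') are illustrative, not taken from any particular kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Field widths from the notes: 10-bit directory index,
 * 10-bit table index, 12-bit page offset. */
uint32_t pdindex(uint32_t va)  { return va >> 22; }
uint32_t ptindex(uint32_t va)  { return (va >> 12) & 0x3FF; }
uint32_t pgoffset(uint32_t va) { return va & 0xFFF; }

/* Drop the 12 flag bits of a directory or table entry,
 * leaving the address of the page it points to. */
uint32_t entry_addr(uint32_t entry) { return entry & ~0xFFFu; }

/* Self-check using the values from the walkthrough above. */
void check_example(void) {
    assert(pdindex(0x00901050) == 2);
    assert(ptindex(0x00901050) == 0x101);
    assert(pgoffset(0x00901050) == 0x050);
    assert(entry_addr(0x00102007) == 0x00102000);
}
```

For //0x00901050// this gives directory index 2, table index //0x101//, and offset //0x050//.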
==== Abstraction ====
So now that we have the actual algorithm for referencing and looking up virtual addresses, how can we abstract these actions in a user friendly way for the kernel to use?
We create two functions, //pgdir_get// and //pgdir_set// to retrieve and set contents in the two-level page table structure. The function headers are shown below:
  ptentry_t pgdir_get(ptentry_t *pgdir, uintptr_t va);
  void pgdir_set(ptentry_t *pgdir, uintptr_t va, ptentry_t pte);
Note: The //ptentry_t// type defines a single page table entry.
The //pgdir_get// function takes the given page directory and a virtual address, and returns the //ptentry_t// value that //va// refers to. The //pgdir_set// function takes the given page directory and places //pte// in the location specified by the virtual address //va//. In some cases this requires allocating additional memory.
For instance, consider the diagram above. Now suppose we were to execute the following line:
''pgdir_set(pgdir, //0x00803000//, //0x00010007//)''
This execution requires no additional memory allocation. We look at the page directory index (the top 10 bits of the address, which in this case is //0x002//), and we notice that the page directory entry already exists. Therefore, we can follow it to the page table and place //0x00010007// at the page table index (in this case //0x003//).
However, now consider execution of the following line:
''pgdir_set(pgdir, //0x01803000//, //0x07010007//)''
We follow the page directory to the pdindex value (here //6//) and notice that it is empty. Therefore, we must now allocate a brand new page table to reference items under this index. Only then can we input the page table entry.
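The two cases above (existing page table versus empty directory slot) can be sketched as follows. This is only an illustration under stated assumptions: physical memory is simulated by a small array of frames, ''frame_alloc'' and ''frame_ptr'' are hypothetical helpers, ''ptentry_t'' is assumed to be a 32-bit integer, and an entry of 0 is treated as "empty". The lecture's actual implementation may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t ptentry_t;      /* assumed: one 32-bit page table entry */

#define NENTRIES  1024           /* entries per page directory or page table */
#define PTE_FLAGS 0x007u         /* flag bits used in the notes' examples */
#define NFRAMES   8              /* size of the simulated physical memory */

/* Simulated physical memory: frame i has "physical address" i << 12. */
static ptentry_t frames[NFRAMES][NENTRIES];
static uint32_t frames_used = 0;

/* Hypothetical allocator: zero the next free frame and return its address. */
uint32_t frame_alloc(void) {
    assert(frames_used < NFRAMES);
    memset(frames[frames_used], 0, sizeof frames[frames_used]);
    return frames_used++ << 12;
}

/* Translate a simulated physical address into a usable pointer. */
ptentry_t *frame_ptr(uint32_t pa) { return frames[pa >> 12]; }

ptentry_t pgdir_get(ptentry_t *pgdir, uintptr_t va) {
    ptentry_t pde = pgdir[(va >> 22) & 0x3FF];
    if (pde == 0)
        return 0;                /* no page table here: nothing is mapped */
    return frame_ptr(pde & ~0xFFFu)[(va >> 12) & 0x3FF];
}

void pgdir_set(ptentry_t *pgdir, uintptr_t va, ptentry_t pte) {
    ptentry_t *pde = &pgdir[(va >> 22) & 0x3FF];
    if (*pde == 0)               /* empty directory slot: allocate a page table */
        *pde = frame_alloc() | PTE_FLAGS;
    frame_ptr(*pde & ~0xFFFu)[(va >> 12) & 0x3FF] = pte;
}
```

With a directory obtained from ''frame_ptr(frame_alloc())'', two calls to ''pgdir_set'' with addresses that share a page directory index use a single page table, while a call with a new page directory index (such as 6 for //0x01803000//) consumes one more frame.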
===== Virtual Memory Benefits=====
So far this seems fairly complicated. What are the benefits of virtual memory? If we look back to the 5 operating system design goals, we see that there are benefits all around:
* __Utilization__:
* As described in the previous section, virtual memory organizes memory into fixed size **pages** which reduces external fragmentation of physical memory.
* Programs are no longer limited to the size of physical memory because VM uses the disk to make memory appear much larger than is actually available. However, using the disk introduces its own set of problems because the disk is //very// slow compared with physical memory accesses.
* __Robustness__: Virtual memory can enforce process isolation by giving each process its own virtual address space.
* __Simplicity__: The hardware and OS implement virtual memory. Application programs and their programmers don't have to worry about their size or memory allocation issues.
* __Versatility__: The virtual memory abstractions described previously allow for a variety of applications, such as performance improvements.
* __Performance__: Thanks to the versatility of the virtual memory interface, application performance can be improved with **demand paging** and **shared libraries**.
===== Utilization: Swapping =====
Let us first consider the different types of utilization issues that can arise within memory:
* Applications that are idle, but take up space in memory
* Parts of application code that are NOT in use, but still placed in memory
* Parts of application data that are NOT in use, but still placed in memory
* Memory leaks in applications
We can improve utilization of primary memory by moving unused data to disk!
**Swapping** is when pages are moved from memory to disk and vice versa. Swapping is the responsibility of the OS and we devote a section to discussing //eviction algorithms// which decide which pages to swap out.
HOW does the OS implement its swapping mechanism?
The basic idea is that the architecture MUST inform the kernel when an application tries to access a "swapped" address. This is handled through a //Page Fault Trap//. The architecture raises an interrupt when an application attempts an illegal access, and the kernel handles the interrupt with a PAGE FAULT HANDLER, which we will define below.
To allocate a page when memory is full, we must:
- Choose a physical page to reuse
- Write the physical page's contents to the //swaparea// (a portion of disk used to store primary memory pages temporarily)
- For every process //p//, virtual address //va// accessing page: ''pgdir_set(p->pgdir, va, 0);''
- Use physical page for new data
Below is a general outline of the //page_fault_handler// used to handle the page fault interrupts:
  page_fault_handler(uintptr_t va, int cpl /* protection level */, int write) {
      if (va was swapped out) {
          allocate physical page;
          read swapped page from disk;
          pgdir_set(...);
          run(current);
      }
  }
But we have another issue: where is the page stored in the //swaparea//? We need to record this information:
  typedef struct {
      process_t *p;
      uintptr_t va;
      bool swapped;
      off_t swaploc;   // location of the page in the swap area
  } vpageinfo_t;
The //swapped// member lets us know whether or not a given virtual page has been swapped out, and the //swaploc// is the location of the page in the swaparea.
So now we can express the handler in much more detail:
  void page_fault_handler(uintptr_t va, int cpl, int write) {
      vpageinfo_t *vp = vpageinfo(current, va);
      if (vp && vp->swapped) {
          vpageinfo_t *evict = eviction_policy();    // find a victim page
          evict->p->state = BLOCKED;                 // block the victim's owner during the swap
          ptentry_t pte = pgdir_get(evict->p->pgdir, evict->va);
          /* pseudocode: write pte's physical page to the swaparea at evict->swaploc */
          pgdir_set(evict->p->pgdir, evict->va, 0);  // unmap the page from the victim's owner
          evict->swapped = true;                     // the victim page now lives on disk
          evict->p->state = RUNNABLE;                // release the victim's owner
          /* pseudocode: read the swaparea page at vp->swaploc into pte's physical page */
          pgdir_set(current->pgdir, va, pte);
          vp->swapped = false;                       // the faulting page is back in memory
          run(current);
      }
  }
To get a better understanding of what is happening here, let us assume we have two processes, A and B, which each have their own virtual address space. Also, let us assume that they have completely filled up all of primary memory with their contents.
{{swapping1.gif?}}
Now suppose that process B decides to allocate a new page of memory during execution. According to B's virtual address space, there is plenty of room to allocate more memory. So by the concept of process isolation, B should be able to allocate this memory and continue with execution. Behind the scenes, however, we have no additional physical space to allocate. What do we do?
Following the swapping algorithm above, we choose a page in primary memory to swap out (in this case we've chosen a page from A's memory). We write its existing contents to the //swaparea// on disk. Then, we overwrite the contents in primary memory to do our memory allocation. In the background, we update the //swapped// and //swaploc// members of the corresponding vpageinfo_t objects.
{{swapping2.gif?}}
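The write-out and read-in steps can be tried on a toy model. The sketch below is an illustration only: it uses tiny 16-byte "pages", a two-frame memory, and a simplified ''vpageinfo_t'' with a hypothetical ''frame'' member in place of the real page tables. None of these sizes or names come from the lecture's kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define PAGE_SIZE 16   /* tiny pages keep the toy readable */
#define NFRAMES   2    /* frames of primary memory */
#define NVPAGES   4    /* virtual pages across all processes */

static char memory[NFRAMES][PAGE_SIZE];     /* primary memory */
static char swaparea[NVPAGES][PAGE_SIZE];   /* one swap slot per virtual page */

/* Simplified bookkeeping: `frame` is a hypothetical stand-in for the
 * mapping that the page tables would normally hold. */
typedef struct {
    bool swapped;   /* true if the page currently lives in the swap area */
    int  swaploc;   /* slot of this page in swaparea */
    int  frame;     /* frame number while resident, -1 otherwise */
} vpageinfo_t;

static vpageinfo_t vpages[NVPAGES];

/* Pages 0 and 1 start out resident; pages 2 and 3 start out on disk. */
void setup(void) {
    for (int v = 0; v < NVPAGES; v++) {
        vpages[v].swaploc = v;
        vpages[v].swapped = (v >= NFRAMES);
        vpages[v].frame   = (v < NFRAMES) ? v : -1;
    }
    strcpy(memory[0], "A0");
    strcpy(memory[1], "B0");
    strcpy(swaparea[2], "B1");
}

/* Evict `victim` to its swap slot, then load `wanted` into the freed frame. */
void swap_in(int wanted, int victim) {
    int f = vpages[victim].frame;
    assert(f >= 0 && vpages[wanted].swapped);
    memcpy(swaparea[vpages[victim].swaploc], memory[f], PAGE_SIZE);  /* write out */
    vpages[victim].swapped = true;
    vpages[victim].frame   = -1;
    memcpy(memory[f], swaparea[vpages[wanted].swaploc], PAGE_SIZE);  /* read in */
    vpages[wanted].swapped = false;
    vpages[wanted].frame   = f;
}
```

After ''setup()'', calling ''swap_in(2, 0)'' moves page 0's contents to its swap slot and brings page 2 into frame 0, mirroring the picture above where one of A's pages makes room for B.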
==== Thrashing ====
One of the problems that can occur with swapping is **thrashing**. Thrashing occurs when swapping activity is so high that the system spends most of its time moving pages between memory and disk. Since the disk is so slow compared to the CPU, thrashing can grind performance to a halt. The easiest solution is to buy more physical memory. The more complex solution is to carefully design page replacement algorithms, or eviction policies, that select which pages to swap out such that they are unlikely to be swapped back in. This is the topic of the next section.
===== Utilization: Eviction Policies =====
In the previous section we discussed HOW pages are swapped out to disk. Now we need to tackle the question of WHICH pages the OS should swap out. This must be done carefully so we can avoid evil thrashing.
Here is a series of examples that demonstrate the effects of different eviction policies, using one carefully designed **reference string** //(a list of page accesses)//.
==== Optimal Page Replacement Policy ====
What's the optimal policy? The most desirable choice is to evict the page whose next use lies furthest in the future. Given the reference string, we have the following table:
^**Reference String**^ start ^ **--1--** ^ **--2--** ^ **--3--** ^ **--4--** ^ **--1--** ^ **--2--** ^ **--5--** ^ **--1--** ^ **--2--** ^ **--3--** ^ **--4--** ^ **--5--** ^
^Physical Page 1| 8 | //**(1)**// | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | //**(3)**// | 3 | 3 |
^Physical Page 2| 9 | 9 | //**(2)**// | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | //**(4)**// | 4 |
^Physical Page 3| 10 | 10 | 10 | //**(3)**// | //**(4)**// | 4 | 4 | //**(5)**// | 5 | 5 | 5 | 5 | 5 |
This theoretical eviction algorithm is also known as [[http://en.wikipedia.org/wiki/Belady%27s_Min|Belady's MIN]], devised by [[http://en.wikipedia.org/wiki/Laszlo_Belady|Laszlo Belady]] in 1966[Wikipedia]. In this case there are a total of 7 swaps out of 12 page accesses, with only three physical pages. However, this algorithm can't be implemented in practice because it requires an oracle to predict the future.
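To check the count of 7, the policy can be simulated offline, since a simulator (unlike a kernel) does get to see the whole reference string. The following is a minimal sketch; the function name and the convention that ''frames[]'' holds the initial page numbers are assumptions made for illustration. When several pages are never used again, it evicts the first such frame, so individual eviction choices may differ from the table while the fault count stays the same.

```c
#include <assert.h>

/* Count page faults under Belady's MIN.
 * frames[] holds the initial contents and is updated in place. */
int count_faults_optimal(const int *refs, int nrefs, int *frames, int nframes) {
    assert(nframes > 0);
    int faults = 0;
    for (int i = 0; i < nrefs; i++) {
        int hit = 0;
        for (int f = 0; f < nframes; f++)
            if (frames[f] == refs[i]) hit = 1;
        if (hit) continue;

        /* Miss: pick the frame whose page is next used furthest in the future. */
        int victim = 0, victim_next = -1;
        for (int f = 0; f < nframes; f++) {
            int next = nrefs;                 /* nrefs means "never used again" */
            for (int j = i + 1; j < nrefs; j++)
                if (refs[j] == frames[f]) { next = j; break; }
            if (next > victim_next) { victim = f; victim_next = next; }
        }
        frames[victim] = refs[i];
        faults++;
    }
    return faults;
}
```

Called with the reference string 1 2 3 4 1 2 5 1 2 3 4 5 and initial frames 8, 9, 10, it reports 7 faults.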
==== First-In First-Out ====
^**Reference String**^ start ^ **--1--** ^ **--2--** ^ **--3--** ^ **--4--** ^ **--1--** ^ **--2--** ^ **--5--** ^ **--1--** ^ **--2--** ^ **--3--** ^ **--4--** ^ **--5--** ^
^Physical Page 1| 8 | //**(1)**// | 1 | 1 | //**(4)**// | 4 | 4 | //**(5)**// | 5 | 5 | 5 | 5 | 5 |
^Physical Page 2| 9 | 9 | //**(2)**// | 2 | 2 | //**(1)**// | 1 | 1 | 1 | 1 | //**(3)**// | 3 | 3 |
^Physical Page 3| 10 | 10 | 10 | //**(3)**// | 3 | 3 | //**(2)**// | 2 | 2 | 2 | 2 | //**(4)**// | 4 |
As in scheduling, aside from the optimal policy that //**requires magic**//, the first policy we come up with is simply FIFO. In the example table above, we evict the page that was brought in furthest in the past, and have a total of 9 swaps out of 12 page accesses. Only slightly worse than the optimal policy? Looks good. It seems that the only problem is that, on the 5th and 6th page accesses, we had swapped out pages right before we needed them. Let's add another physical page and see if it helps:
^**Reference String**^ start ^ **--1--** ^ **--2--** ^ **--3--** ^ **--4--** ^ **--1--** ^ **--2--** ^ **--5--** ^ **--1--** ^ **--2--** ^ **--3--** ^ **--4--** ^ **--5--** ^
^Physical Page 1| 8 | //**(1)**// | 1 | 1 | 1 | 1 | 1 | //**(5)**// | 5 | 5 | 5 | //**(4)**// | 4 |
^Physical Page 2| 9 | 9 | //**(2)**// | 2 | 2 | 2 | 2 | 2 | //**(1)**// | 1 | 1 | 1 | //**(5)**// |
^Physical Page 3| 10 | 10 | 10 | //**(3)**// | 3 | 3 | 3 | 3 | 3 | //**(2)**// | 2 | 2 | 2 |
^Physical Page 4| 11 | 11 | 11 | 11 | //**(4)**// | 4 | 4 | 4 | 4 | 4 | //**(3)**// | 3 | 3 |
Now we have... a total of 10 swaps!? Even more than when we have only three physical pages! This is outrageous!
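The surprising counts are easy to reproduce. The sketch below simulates FIFO under one assumption: the initial pages were loaded in frame order, so the frames can simply be refilled round-robin. The function name is illustrative.

```c
#include <assert.h>

/* Count page faults under FIFO.
 * frames[] holds the initial contents (assumed loaded in frame order)
 * and is updated in place. */
int count_faults_fifo(const int *refs, int nrefs, int *frames, int nframes) {
    assert(nframes > 0);
    int faults = 0;
    int oldest = 0;   /* frames are refilled round-robin, so this is the oldest */
    for (int i = 0; i < nrefs; i++) {
        int hit = 0;
        for (int f = 0; f < nframes; f++)
            if (frames[f] == refs[i]) hit = 1;
        if (hit) continue;            /* a hit does not change FIFO order */
        frames[oldest] = refs[i];     /* evict the page that arrived first */
        oldest = (oldest + 1) % nframes;
        faults++;
    }
    return faults;
}
```

On the reference string 1 2 3 4 1 2 5 1 2 3 4 5 it reports 9 faults with three frames and 10 faults with four.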
=== Belady's Anomaly ===
In software systems, things are called [[http://en.wikipedia.org/wiki/Anomaly_in_software|anomalies]] when results come out differently from expectation. [[http://en.wikipedia.org/wiki/Belady's_anomaly|Belady's Anomaly]] basically says that even if we increase the physical page count, page faults using FIFO can //increase//! This is not desirable at all, so we will look at another eviction policy.
==== Least Recently Used(LRU) First Policy ====
As with scheduling, we started with the straightforward FIFO policy; now we turn to a policy that uses the past to predict future accesses. This relies on the //**locality of reference**// concept: storage locations that are close (in space or in time) to recently accessed ones are likely to be accessed soon. In page replacement, we say that the least recently used page is also the least likely to be used in the near future.
^**Reference String**^ start ^ **--1--** ^ **--2--** ^ **--3--** ^ **--4--** ^ **--1--** ^ **--2--** ^ **--5--** ^ **--1--** ^ **--2--** ^ **--3--** ^ **--4--** ^ **--5--** ^
^Physical Page 1| 8 | //**(1)**// | 1 | 1 | //**(4)**// | 4 | 4 | //**(5)**// | 5 | 5 | //**(3)**// | 3 | 3 |
^Physical Page 2| 9 | 9 | //**(2)**// | 2 | 2 | //**(1)**// | 1 | 1 | 1 | 1 | 1 | //**(4)**// | 4 |
^Physical Page 3| 10 | 10 | 10 | //**(3)**// | 3 | 3 | //**(2)**// | 2 | 2 | 2 | 2 | 2 | //**(5)**// |
With four physical pages:
^**Reference String**^ start ^ **--1--** ^ **--2--** ^ **--3--** ^ **--4--** ^ **--1--** ^ **--2--** ^ **--5--** ^ **--1--** ^ **--2--** ^ **--3--** ^ **--4--** ^ **--5--** ^
^Physical Page 1| 8 | //**(1)**// | 1 | 1 | 1 ^ 1 ^ 1 | 1 | 1 | 1 | 1 | 1 | //**(5)**// |
^Physical Page 2| 9 | 9 | //**(2)**// | 2 | 2 | 2 ^ 2 | 2 | 2 | 2 | 2 | 2 | 2 |
^Physical Page 3| 10 | 10 | 10 ^ //**(3)**// ^ 3 ^ 3 ^ 3 | //**(5)**// | 5 | 5 | 5 | //**(4)**// | 4 |
^Physical Page 4| 11 | 11 | 11 | 11 ^ //**(4)**// ^ 4 ^ 4 | 4 | 4 | 4 | //**(3)**// | 3 | 3 |
This yields 10 swaps with 3 physical pages, and 8 with 4 physical pages. The LRU policy does not suffer from Belady's Anomaly, which means we can always improve (or at least not worsen) performance by adding more physical memory. Variants of LRU are widely used in contemporary operating systems.
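Both counts can be reproduced with a small simulation. This is a sketch under stated assumptions: each frame remembers the time of its last use, the initial pages are treated as if they had been touched in frame order before the reference string starts, and the function name and ''MAX_FRAMES'' limit are illustrative.

```c
#include <assert.h>

#define MAX_FRAMES 16

/* Count page faults under LRU.
 * frames[] holds the initial contents and is updated in place. */
int count_faults_lru(const int *refs, int nrefs, int *frames, int nframes) {
    assert(nframes > 0 && nframes <= MAX_FRAMES);
    int last_used[MAX_FRAMES];
    /* Assumption: initial pages were touched in frame order before time 0. */
    for (int f = 0; f < nframes; f++)
        last_used[f] = f - nframes;

    int faults = 0;
    for (int i = 0; i < nrefs; i++) {
        int slot = -1;
        for (int f = 0; f < nframes; f++)
            if (frames[f] == refs[i]) slot = f;
        if (slot < 0) {
            /* Miss: evict the frame with the oldest last-use time. */
            slot = 0;
            for (int f = 1; f < nframes; f++)
                if (last_used[f] < last_used[slot]) slot = f;
            frames[slot] = refs[i];
            faults++;
        }
        last_used[slot] = i;   /* hit or miss, the page was just used */
    }
    return faults;
}
```

On the reference string 1 2 3 4 1 2 5 1 2 3 4 5 it reports 10 faults with three frames and 8 with four, and the final frame contents match the last column of each table.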
==== Tracking Page Accesses ====
While discussing the eviction policies, you might have wondered: how does the kernel know which pages have been accessed?
There is a software method and a hardware method for attacking this problem.
== Software ==
The software trick is to //force// page faults so that the kernel can track pages. It works as follows:
* Periodically (during every timer interrupt), reset every process to a blank address space by marking all of its pages inaccessible
* On page fault:
    if (page in memory) {
        move page to front of LRU list
        mark page as accessible
        return
    } else {
        swap the page in
    }
Note: Every forced page fault traps into the kernel, which is expensive, so this method should be used with care.
== Hardware ==
In the hardware method, once a page is accessed, the processor sets the "accessed" bit in the corresponding page table entry. The kernel can check this bit to determine page usage.
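As a rough illustration of how a kernel might consume this bit, the sketch below scans one page table, counts the entries touched since the previous scan, and clears their accessed bits. It assumes 32-bit x86 page table entries, where the accessed flag is bit 5 (''0x020''); the function name is illustrative, and a real kernel would also have to deal with stale TLB entries after clearing the bit.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t ptentry_t;   /* assumed: one 32-bit page table entry */

#define PTE_A 0x020u          /* "accessed" flag: bit 5 in 32-bit x86 entries */

/* Scan one page table: count the entries touched since the last scan and
 * clear their accessed bits so the next scan sees only new accesses. */
int scan_accessed(ptentry_t *pt, int nentries) {
    assert(pt != 0);
    int touched = 0;
    for (int i = 0; i < nentries; i++) {
        if (pt[i] & PTE_A) {
            touched++;
            pt[i] &= ~PTE_A;
        }
    }
    return touched;
}
```

Running such a scan periodically gives the kernel an approximation of recency, which is one reason practical systems use LRU variants instead of exact LRU.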
===== Robustness =====
Aside from utilization, virtual memory helps with robustness by enforcing process isolation. The operating system takes care of this by doing the following:
- Each process is given its own virtual address space. We know that virtual memory frees programmers from memory size constraints by giving processes the illusion of a large memory space. But because each process gets its own address space, processes also cannot interfere with or access the code of other processes. Virtual address 0x100000 for process A is not the same as virtual address 0x100000 for process B. You probably noticed this when completing the [[http://www.cs.ucla.edu/~kohler/class/07f-osp/weensyos2.html|2nd minilab]].
- The page directories and tables are stored in the kernel portion of memory so that processes may not alter them. By keeping the page tables "off limits" to processes, processes cannot modify their mappings to gain access to unauthorized parts of physical memory.
===== Performance =====
So far we've seen that virtual memory has helped with utilization and robustness. The party doesn't end there though. Virtual memory can improve performance as well, especially application startup performance.
==== Demand Paging ====
What percentage of the features do most people use on huge applications such as Adobe Photoshop? Probably no more than 20%, and as a result large portions of binary code sit in memory unused. This is not only bad utilization, it's bad performance if the //entire// program must be loaded. For a program like Photoshop, this can take more time than we have patience for!
The solution is [[http://en.wikipedia.org/wiki/Demand_paging|demand paging]]. Demand paging is when the OS only loads process code at the moment it's needed. We implement this by setting the swaploc to point to the locations of the binary executable on disk. When a process needs certain code that hasn't been loaded, it causes a page fault and the OS swaps in the portion from disk. With a buffer cache, the benefits of demand paging are even greater.
==== Shared Libraries ====
Nearly all software programs today use software [[http://en.wikipedia.org/wiki/Library_%28computing%29|libraries]] to make program development easier. However, this means every running process probably has some repetitive code in it. This is again bad utilization and can negatively affect application startup performance. Shared libraries save the day on this issue. Essentially only one copy of a library shared by processes is stored in memory instead of each process having its own copy. This can introduce some thorny issues because processes are accessing pages that aren't theirs. How do we deal with this? We need to make sure the ptentry flags mark the page as read only. See the [[notes:lec16|next lecture]] to read more about it!
===== Summary =====
Virtual memory provides benefits to all areas of OS interface design. Basically virtual memory takes advantage of the disk to provide applications with the illusion of large physical memory and to enforce process isolation with virtual address spaces.
In particular, the following key concepts were covered:
* [[lec15#Overview|Virtual addresses]] - With the help of page tables, a virtual address is //mapped// to a physical address in memory.
* [[lec15#Utilization: Swapping|Swapping]] - Swapping is when the operating system takes a page from memory and places it in the disk's swap area so that it can load a different page.
* [[lec15#Thrashing|Thrashing]] - Evil! This is when the operating system spends more time swapping pages between disk and memory rather than actually letting processes do useful work.
* [[lec15#Utilization: Eviction Policies|Eviction Policies]]:
* [[lec15#First-In First-Out|First in first out]] - Evict the page that was swapped in furthest in the past.
* [[lec15#Least Recently Used|Least recently used]] - Evict the page that was last used furthest in the past.
* [[lec15#Optimal Replacement|Optimal Replacement]] - Evict page that will be accessed farthest in the future. The best eviction policy, but impossible to implement perfectly without a crystal ball.