You are expected to understand this. CS 111 Operating Systems Principles, Winter 2011

Lecture 15 Scribe Notes

Buffer Cache and Virtual Memory

The buffer cache is the primary cache for file system data. It stores, among other things, dallying and prefetched data. To do its job, the buffer cache needs:

  • File system data
  • Location of the data on the disk: Map cache memory to a disk location. (The cache is useless if it does not keep exact track of what data it is caching).
  • A mapping of file locations to pages: This requires the use of the fmap function.

The second and third items must be implemented for the buffer cache to work correctly. In addition, memory utilization decreases if multiple copies of the same page are stored in the buffer cache; this wastes space, since only one copy is required. Thankfully, fmap() ensures that each block has at most one location in the buffer cache.
The fmap() interface is as follows:

fmap(file, offset)

  • Returns either the primary location or address of a page, or a 0 if such a page is not available.
  • File: This argument may be thought of as an inode.
  • Offset: This argument
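A minimal sketch of how fmap() might be backed by a lookup table. It assumes that the file is identified by an inode number and that offsets are rounded down to a page boundary; the BufferCache class, its install() method, and the 4096-byte page size are hypothetical choices made only for illustration.

```python
PAGE_SIZE = 4096  # assumed page size


class BufferCache:
    """Toy buffer cache: at most one cached page per (inode, offset)."""

    def __init__(self):
        self._pages = {}  # (inode, page-aligned offset) -> page address

    @staticmethod
    def _key(inode, offset):
        # Round the offset down to the start of its page.
        return (inode, offset - offset % PAGE_SIZE)

    def install(self, inode, offset, address):
        # Hypothetical helper: record where a file page lives in memory.
        key = self._key(inode, offset)
        if key in self._pages:
            raise ValueError("page is already cached")  # never keep duplicates
        self._pages[key] = address

    def fmap(self, inode, offset):
        # Return the page's address, or 0 if the page is not cached.
        return self._pages.get(self._key(inode, offset), 0)
```

Because install() refuses a second entry for the same key, each file page has at most one location in this toy cache.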

Another problem that may arise when using a buffer cache is a possible violation of the safety property. For example, let two processes, A and B, write to a file named a.txt. A writes to the file by executing echo "foo" >> a.txt, while B executes echo "bar" >> a.txt. The following diagram depicts this scenario; the result is ambiguous because it depends on how a.txt is held in the cache:

If the write commands above are executed and sequential consistency is maintained, the only possible contents of a.txt are “foobar” or “barfoo”. However, if there are multiple copies of a.txt in the cache and each process happens to write to a different copy, a.txt may end up containing only “foo” or only “bar”, because the disk is updated from just one of the cached copies.
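The lost update can be reproduced with a small simulation. This is only a sketch: the run() function is hypothetical, the file starts empty, the trailing newline that echo would add is ignored, and B's copy is arbitrarily chosen as the one written back.

```python
def run(shared_copy):
    """Simulate A and B appending to a.txt through the buffer cache."""
    disk = b""                       # a.txt starts empty on disk
    copy_a = bytearray(disk)         # cached copy used by process A
    # With a single cached copy, B works on the very same object as A.
    copy_b = copy_a if shared_copy else bytearray(disk)

    copy_a.extend(b"foo")            # A: echo "foo" >> a.txt
    copy_b.extend(b"bar")            # B: echo "bar" >> a.txt

    # Write-back: only one cached copy ends up on disk (here, B's).
    return bytes(copy_b)


print(run(shared_copy=True))   # b'foobar'
print(run(shared_copy=False))  # b'bar'  (A's write is lost)
```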

Virtual Memory

Process Virtual Memory and Physical Memory typically look like this:

Process Virtual Memory

Physical Memory

The operating system provides each process with virtual memory so each process believes it has access to all the physical memory. The operating system also tries to make the buffer cache as large as possible without hurting performance.

The process code in virtual memory is typically fixed-size and read-only. Both the process data and the read-only data are initialized from disk. Since the read-only data never changes, it is wise to try to keep all of it in the buffer cache. This is accomplished with one simple step:

  • Map the process code (typically read-only, unless dynamic) and read-only data from the buffer cache, which is achieved with the following sequence.

This step yields the next illustration for the buffer cache.

It makes sense to map the code of each process in virtual memory to the same location in the physical memory buffer cache, if possible, to improve the utilization of physical memory. Because the sharing is read-only, processes A and B remain isolated: neither can alter the state of the other. To maintain read-only sharing and process isolation, the processor must generate an exception on a write to a read-only page (as it normally should).

Running Out of Space

In case the buffer cache runs out of memory, an EVICTION (flushing a portion of the buffer cache) must take place. Two things need to take place when performing an eviction:

  1. Write changed data to disk (Optional)
  2. Mark the memory space as free/reusable

If the data has changed since the last read, it will be written to disk. A structure is necessary to track this, or the cache data would have to be checked against disk data for each eviction.
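One common way to track this is a per-page dirty flag. The sketch below assumes that design; the CachedPage class, the evict() helper, and the dictionary standing in for the disk are all hypothetical.

```python
class CachedPage:
    """A cached page that remembers whether it differs from the disk copy."""

    def __init__(self, data):
        self.data = bytearray(data)
        self.dirty = False           # clean: identical to the disk copy

    def write(self, pos, value):
        self.data[pos] = value
        self.dirty = True            # must be written back before reuse


def evict(page, disk, block):
    """Write the page back only if it is dirty; return True if it was written."""
    wrote = page.dirty
    if wrote:
        disk[block] = bytes(page.data)
        page.dirty = False
    return wrote
```

With the flag in place, evicting a clean page costs no disk write and no comparison against the disk contents.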

After an eviction, fmap() must change its return value to show that the page is no longer in the cache, and the mapping of cache memory to a disk location must be updated in the same way. Another important consideration is that process data should not be changed out from under a process that might be using it. If a process needs access to data that has been evicted, the processor must generate an exception. To address this concern, a page table is required.

Page Table

The page table is a processor-interpreted data structure mapping virtual addresses to physical addresses.

It is important to generate an exception if a process attempts to access a page that is not present in the cache. The page can then be loaded from disk into physical memory, and the process can resume where it left off. The processor therefore needs an additional rule for dealing with pages. The previous rule and the new rule are stated below:

  • The processor must generate an exception on a write to a read-only page.
  • The processor must generate an exception on access to a non-present page.

The page table can then be utilized with the following function:

pagetable(pid_t, address)

  • Returns the physical address of a page, its permissions (R, RW, Kernel Only), or a 0 if no such page exists.
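The two processor rules and the pagetable() lookup can be sketched together. This is an illustration under assumptions: the table contents, the PageFault exception, the access() helper, and the 4096-byte page size are hypothetical, and the kernel-only permission is left out for brevity.

```python
PAGE_SIZE = 4096          # assumed page size
R, RW = "R", "RW"         # permissions (kernel-only omitted for brevity)


class PageFault(Exception):
    """Stands in for the processor exception described above."""


# Hypothetical contents: (pid, virtual page number) -> (physical page, permission)
table = {(1, 0): (0x8000, R), (1, 1): (0x9000, RW)}


def pagetable(pid, address):
    # Return (physical page address, permissions), or 0 if no such page exists.
    return table.get((pid, address // PAGE_SIZE), 0)


def access(pid, address, write=False):
    """Translate a virtual address the way the processor would."""
    entry = pagetable(pid, address)
    if entry == 0:
        raise PageFault("access to a non-present page")
    physical, perm = entry
    if write and perm == R:
        raise PageFault("write to a read-only page")
    return physical + address % PAGE_SIZE
```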


Recall that the buffer cache may be full when a process requires a page that is not currently present in the cache. It may therefore be necessary to remove a page from the buffer cache and to find out the address at which to store the new page. This requires a buffer cache process map. The function bcpmap() works in the following manner:

bcpmap(pid_t, address)

  • Returns a zero if target is not found. Otherwise it provides information about the virtual page. The virtual page information consists of the file, the offset, a mapped variable (indicates if page is in buffer cache), the mapped address (mapped_addr), and whether it is a copy on write (is_it_copy_on_write, more on this later).
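The record returned by bcpmap() might look like the following sketch. The field names mapped_addr and is_it_copy_on_write come from the description above; the VirtualPageInfo class, the table contents, and the page size are assumptions made for illustration.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size


@dataclass
class VirtualPageInfo:
    file: int                    # inode backing this virtual page
    offset: int                  # offset of the page within the file
    mapped: bool                 # True if the page is in the buffer cache
    mapped_addr: int             # buffer cache address when mapped, else 0
    is_it_copy_on_write: bool


# Hypothetical contents: (pid, virtual page number) -> VirtualPageInfo
_bcp = {
    (1, 0): VirtualPageInfo(7, 0, True, 0x8000, False),
    (1, 1): VirtualPageInfo(7, 4096, False, 0, False),
}


def bcpmap(pid, address):
    # Return the virtual page information, or 0 if the target is not found.
    return _bcp.get((pid, address // PAGE_SIZE), 0)
```

When a process faults on an unmapped page, the file and offset fields say which data to load from disk.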

With this addition, the steps necessary for an eviction now become:

  1. Write changed data to disk (Optional)
  2. Mark the memory space as free/reusable.
  3. Modify fmap() to mark file data as unmapped.
  4. Modify bcpmap() to mark file data as unmapped.
  5. Modify pagetable() to mark virtual addresses as unmapped.
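The five steps can be sketched as one function operating on plain dictionaries. All of the data structures here are hypothetical stand-ins for the kernel's real ones, chosen only to make each step visible.

```python
def evict(frame, cache, disk, fmap_table, bcp_table, page_table, free_frames):
    """Evict the page held in `frame`, following the five steps above."""
    page = cache.pop(frame)
    key = (page["file"], page["offset"])
    if page["dirty"]:                        # 1. write changed data (optional)
        disk[key] = page["data"]
    free_frames.append(frame)                # 2. mark the memory as reusable
    fmap_table.pop(key, None)                # 3. fmap() will now return 0
    for info in bcp_table.values():          # 4. mark bcpmap() entries unmapped
        if info["mapped_addr"] == frame:
            info["mapped"], info["mapped_addr"] = False, 0
    stale = [v for v, f in page_table.items() if f == frame]
    for vpage in stale:                      # 5. later accesses will fault
        del page_table[vpage]
```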

How to Evict

When the cache is full and a program needs a page that is not in it, an eviction is necessary, and a decision is required on what to evict. Deciding what to evict is a scheduling problem. As with other scheduling problems, two policies can be implemented:

  • First In First Out (First page in, first page out)
  • Shortest First (Least Recently Used or page with fewest accesses)

First In First Out

Treating the disk pages as numbers, it is possible that some files make use of several pages:

From this it is only necessary to worry about accessing the cache. For now consider a cache that has room for only three pages and uses the FIFO algorithm with the page sequence 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5. The following result is obtained with this algorithm (the sequence is shown in the top row; the remaining rows show the cache contents):

The cache is initially empty. If a page is not in the cache, a disk access is required. Each circle represents a disk access in which a page is loaded from disk because it was not in the cache. These disk loads (the circles) are called swaps. Notice that the cache is full after loading page 3, so loading page 4 requires an eviction. Page 1 is evicted because it was the first page loaded into the cache. At some point no swaps (circles) occur because the required pages are already in the cache. However, page 3 is required after this. Looking back, page 1 is again the page that was loaded earliest (it was reloaded in the 5th column of the sequence) and is therefore the one that gets swapped out for page 3. The total number of swaps in this example is 9.

Next consider what happens when 4 pages are used in the cache instead of 3.

Although one would intuitively expect that adding more memory to the cache would improve performance, in this case the number of loads increases from 9 to 10! This phenomenon, in which increasing the number of pages stored in the cache increases the number of cache misses under a FIFO algorithm, is known as Belady's Anomaly.
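The FIFO counts can be checked with a short simulation. This sketch assumes the reference sequence 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5, which reproduces the 9 loads quoted above for three pages; fifo_loads() is a hypothetical helper name.

```python
from collections import deque


def fifo_loads(sequence, frames):
    """Count disk loads for FIFO replacement with `frames` cache slots."""
    cache = deque()                  # oldest page at the left
    loads = 0
    for page in sequence:
        if page in cache:
            continue                 # hit: FIFO order is unchanged
        if len(cache) == frames:
            cache.popleft()          # evict the page that was loaded earliest
        cache.append(page)
        loads += 1
    return loads


SEQUENCE = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_loads(SEQUENCE, 3))  # 9
print(fifo_loads(SEQUENCE, 4))  # 10 (Belady's anomaly)
```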

Least Recently Used (LRU)

Now load the pages using LRU, where one evicts pages that were accessed furthest in the past.

With 3 pages there are 10 loads, which suggests that the algorithm may be worse than FIFO, but with 4 pages the result is:

The total is now 8 loads, which is better than LRU with 3 pages and better than FIFO with either cache size. In addition, this algorithm does not suffer from Belady’s anomaly.

An optimal algorithm would require knowledge of future page requests: it would evict the page that will be accessed furthest in the future. This is not possible in general, but there are algorithms that can somewhat predict the future. For example, it was mentioned earlier that a file may use two or more pages, so a file may use pages 1 and 2, or 3 and 4. If page 1 is accessed, then it is most likely that page 2 will be required soon, since it is part of the same file. The same is true if the file uses pages 3 and 4.

The purpose of making better algorithms is to reduce the number of loads. A load requires a disk access, which carries additional costs such as inter-request costs. Avoiding a load, because the page is already in memory, therefore speeds things up.
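A sketch of LRU and of the optimal (furthest-in-the-future) policy, assuming the reference sequence 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5, which matches the LRU counts of 10 and 8 quoted above. The function names are hypothetical, and the optimal policy is shown only as a yardstick, since it needs the whole future sequence.

```python
from collections import OrderedDict


def lru_loads(sequence, frames):
    """Count disk loads when the least recently used page is evicted."""
    cache = OrderedDict()            # least recently used page first
    loads = 0
    for page in sequence:
        if page in cache:
            cache.move_to_end(page)  # hit: now the most recently used
            continue
        if len(cache) == frames:
            cache.popitem(last=False)
        cache[page] = True
        loads += 1
    return loads


def optimal_loads(sequence, frames):
    """Count disk loads when the page needed furthest in the future is evicted."""
    cache, loads = set(), 0
    for i, page in enumerate(sequence):
        if page in cache:
            continue
        if len(cache) == frames:
            future = sequence[i + 1:]

            def next_use(p):
                # Pages never used again sort as "infinitely far away".
                return future.index(p) if p in future else len(future)

            cache.remove(max(cache, key=next_use))
        cache.add(page)
        loads += 1
    return loads


SEQUENCE = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(lru_loads(SEQUENCE, 3), lru_loads(SEQUENCE, 4))          # 10 8
print(optimal_loads(SEQUENCE, 3), optimal_loads(SEQUENCE, 4))  # 7 6
```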

Blocks vs. Pages

The relationship between blocks and pages can be viewed using a simple analogy:

Block : File System :: Page : Physical Memory

A block avoids external fragmentation of disk data by implementing fixed-size allocations of disk data. Similarly, a page avoids external fragmentation of primary memory by implementing fixed-size allocations. A page is usually 4096 bytes, so it makes sense to have a block size of 4096 bytes as well.

Improving Performance of fork()

When using fork(), some of the contents in the copied process may be shared (e.g. kernel, code) between the processes in physical memory, while other information (e.g. global data, heap, stack) may be copied to physical memory.

To simplify things, only consider the code, stack, and kernel:

To increase performance, the child’s stack can initially point to the same location in physical memory as the parent’s stack, with the stack marked read-only. Only when a write to the stack is attempted is an exception generated, which lets the OS take over and create separate copies of the stack for the child and the parent. A major reason copy-on-write fork is efficient is that a fork() is often shortly followed by an exec(), which wipes out all of the process's memory anyway. Copying the stack during the initial fork operation would therefore be pointless.

However, to perform this operation the OS needs to record that the page is copy-on-write, so that when a write to it faults the OS knows whether the fault is a genuine segmentation fault or whether a copy is necessary.
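A toy model of copy-on-write fork, under simplifying assumptions: the Memory and Entry classes are hypothetical, each page table entry carries a copy_on_write flag so the fault handler can tell the two cases apart, and no reference counts are kept, so the last process still sharing a frame also copies it on its first write.

```python
from dataclasses import dataclass, replace


class SegmentationFault(Exception):
    """Stands in for a genuine protection fault."""


@dataclass
class Entry:
    frame: int
    writable: bool
    copy_on_write: bool = False


class Memory:
    def __init__(self):
        self.frames = []             # frame number -> bytearray
        self.tables = {}             # pid -> {virtual page: Entry}

    def alloc(self, data):
        self.frames.append(bytearray(data))
        return len(self.frames) - 1

    def fork(self, parent, child):
        self.tables[child] = {}
        for vpage, entry in self.tables[parent].items():
            if entry.writable:
                # Share the frame, but make both mappings read-only.
                entry.writable, entry.copy_on_write = False, True
            self.tables[child][vpage] = replace(entry)  # same frame, no copy

    def write(self, pid, vpage, pos, value):
        entry = self.tables[pid][vpage]
        if not entry.writable:
            if not entry.copy_on_write:
                raise SegmentationFault("write to a read-only page")
            # Copy-on-write fault: give this process a private copy.
            entry.frame = self.alloc(self.frames[entry.frame])
            entry.writable, entry.copy_on_write = True, False
        self.frames[entry.frame][pos] = value
```

In this model fork() copies only page table entries; a stack page is duplicated the first time either process writes to it.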

Regarding Fork Bombs

Fork bombs still pose a problem in this implementation, because bcpmap(), the pid, and the process descriptor are all examples of information that is not shared between forked processes. Fork bombs make use of this fact to try to fill the buffer cache. In the buffer cache, the WORKING SET is the set of pages that are currently being actively accessed and used. If the working set is much larger than the available physical memory, many evictions are required to fulfill all requests. Evictions lead to large numbers of page loads from disk, each of which can take millions of cycles to complete. Eventually many (or even all) memory accesses turn into disk accesses, which is what a fork bomb strives to achieve, and the system becomes unusably slow. This situation is known as THRASHING.
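Thrashing can be illustrated with a small simulation. The sketch assumes LRU replacement and a process that loops over a five-page working set; the numbers and the lru_loads() helper are illustrative only.

```python
from collections import OrderedDict


def lru_loads(sequence, frames):
    """Count disk loads under LRU replacement with `frames` cache slots."""
    cache, loads = OrderedDict(), 0
    for page in sequence:
        if page in cache:
            cache.move_to_end(page)      # hit: refresh recency
            continue
        if len(cache) == frames:
            cache.popitem(last=False)    # evict the least recently used page
        cache[page] = True
        loads += 1
    return loads


working_set = [0, 1, 2, 3, 4]            # five actively used pages
accesses = working_set * 100             # the process loops over them

print(lru_loads(accesses, 5))  # 5: the working set fits in memory
print(lru_loads(accesses, 4))  # 500: every access goes to disk
```

Removing a single page of memory turns 5 loads into 500, because each page is evicted just before it is needed again.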

Distributed Systems and Security (Defensive Programming)

A distributed system is one in which a process talks to another process on another system, essentially over a network. Distributed systems are difficult to implement. As an example, suppose there are multiple computers communicating over a network. To abstract such a system, a Remote Procedure Call (RPC) is used. An RPC allows communication to take place between two systems by allowing a program to execute a procedure on another computer. For more information on RPC see

Remote Procedure Call

The RPC looks like a function call. An RPC implementation contacts another computer first, and then the other computer executes the function code.

Some functions are not meant to be RPCs, but others make sense to implement this way, such as download_web_page(const char *url). An RPC consists of the following steps:

  1. Marshal the arguments into a sequence of bytes.
  2. Transmit the bytes to the remote computer.
  3. Unmarshal the bytes on the remote computer.
  4. Calculate the result.
  5. Marshal the return values and send the response.
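The five steps above can be sketched in a single process. This is an illustration only: JSON is an assumed wire format, the "add" procedure and every function name are hypothetical, and the network is replaced by a direct call to the server's handler.

```python
import json


def marshal(obj):
    return json.dumps(obj).encode("utf-8")


def unmarshal(data):
    return json.loads(data.decode("utf-8"))


# Procedures that the hypothetical remote computer exposes.
PROCEDURES = {"add": lambda a, b: a + b}


def server_handle(request_bytes):
    request = unmarshal(request_bytes)                       # step 3
    result = PROCEDURES[request["name"]](*request["args"])   # step 4
    return marshal({"result": result})                       # step 5


def rpc_call(name, *args, transmit=server_handle):
    request_bytes = marshal({"name": name, "args": list(args)})  # step 1
    response_bytes = transmit(request_bytes)                     # step 2
    return unmarshal(response_bytes)["result"]


print(rpc_call("add", 2, 3))  # 5
```

To the caller, rpc_call() looks like an ordinary function call; in a real system the transmit step would send the bytes over the network and might never return.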

Networks add an unusual aspect to RPCs. A regular function is expected to return. Over a network, however, a call may never return because of unreliability, intrusion, or disconnection, and the application and its functions must be built to handle these possibilities. The openness of the transmission protocol also has many implications for security. Data must be transmitted securely when necessary, since any computer can eavesdrop or misrepresent itself as the target. Worse still, a hostile computer could pretend to be someone else and transmit data that you will receive and act on. Step 3 above is where many vulnerabilities lie: incoming data must be checked carefully for authenticity and safety unless you want a compromised system.
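A defensive version of the unmarshalling step might look like the sketch below. It assumes a JSON request format; the allow-list, the size limit, and the names checked_unmarshal and BadRequest are hypothetical. It checks only the shape of the request, not its authenticity, which would need a separate mechanism.

```python
import json

ALLOWED = {"download_web_page": (str,)}   # procedure name -> argument types
MAX_REQUEST_BYTES = 4096                  # assumed upper bound on a request


class BadRequest(Exception):
    """Raised when incoming bytes fail validation."""


def checked_unmarshal(data):
    """Unmarshal a request, rejecting anything that is not exactly as expected."""
    if len(data) > MAX_REQUEST_BYTES:
        raise BadRequest("request too large")
    try:
        request = json.loads(data.decode("utf-8"))
    except ValueError as exc:             # covers bad UTF-8 and bad JSON
        raise BadRequest("not valid JSON") from exc
    if not isinstance(request, dict):
        raise BadRequest("request must be an object")
    name, args = request.get("name"), request.get("args")
    if not isinstance(name, str) or name not in ALLOWED:
        raise BadRequest("unknown procedure")
    types = ALLOWED[name]
    if not isinstance(args, list) or len(args) != len(types):
        raise BadRequest("wrong number of arguments")
    if not all(isinstance(a, t) for a, t in zip(args, types)):
        raise BadRequest("wrong argument type")
    return name, args
```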

notes/lec15.txt · Last modified: 2011/03/16 17:54 by alanq