====== Lecture 2 Scribe Notes ======
//By Darren Yeung, Zhu Yuyuan, and Eric Williams//
====== Good Interfaces: An Interface Tour of an OS ======
===== Qualities of an Interface =====
In order to allow two systems to communicate and communicate well, one of the most important things is to create a quality interface. The main features that are associated with a quality OS interface are:
* Robustness
* Generality
* Simplicity
* Utilization
* Performance
The premise of this lecture is that someone has asked the professor to create a program that counts the number of words in a given file. The program will run on an x86-compatible computer, read its file from an IDE disk, and print the results to a CGA-compatible console. The professor decides to write an entire operating system whose only purpose is to count words in a file (why not?). As we analyze the resulting program and improve its robustness, generality, simplicity, utilization, and performance, we'll discover many of the general concepts and OS details that shape modern operating system designs, ending up with something a lot like a modern OS.
Let’s look at the basic ideas behind the Word Count program. This program should:
* Count the number of words in a file
* Print that number to the screen
* Run on an x86-compatible machine
* Use an IDE disk
* Use a console screen (CGA)
**Operating System** – an operating system, or OS, is defined by Wikipedia as "a set of computer programs that manage the hardware and software resources of a computer. An operating system processes raw system and user input and responds by allocating and managing tasks and internal system resources as a service to users and programs of the system. At the foundation of all system software, an operating system performs basic tasks such as controlling and allocating memory, prioritizing system requests, controlling input and output devices, facilitating networking, and managing file systems. Most operating systems have a command line interpreter as a basic user interface, but they may also provide a graphical user interface (GUI) for ease of operation. The operating system forms a platform for other system software and for application software."
**Kernel** – a kernel is defined by Wikipedia as "the central component of most computer operating systems. Its responsibilities include managing the system's resources and the communication between hardware and software components. As a basic component of an operating system, a kernel provides the lowest-level abstraction layer for the resources (especially memory, processors and I/O devices) that applications must control to perform their function. It typically makes these facilities available to application processes through inter-process communication mechanisms and system calls."
{{200px-kernel.png|}}
Image Copyright(C) Wikipedia 2007
===== Memory Partitioning =====
Where do programs live? Programs live in system memory, so at some point we must decide where in memory our Word Count program will reside.
//"640kb of memory ought to be enough for anybody”// – often attributed to Bill Gates ([[http://tickletux.wordpress.com/2007/02/20/did-bill-gates-say-the-640k-line/|but he didn't exactly say it]])
{{system_memory.jpg|}}
The way that memory is partitioned is based on a standard for x86 machines. The lower 1MB of memory is used for special purposes, such as hardware interfaces, whereas the top 1GB of the address range is used for the OS kernel. In our example system, the range 0x80000000 – 0xC0000000 will be used to hold our Word Count program in memory.
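The layout just described can be written down as a few constants. The following is a minimal sketch rather than part of the lecture code: the names ''classify_addr'' and ''enum region'' are hypothetical, and the range between 1MB and 2GB is labeled "other" only because the notes do not say what lives there.

```c
#include <assert.h>
#include <stdint.h>

/* Region boundaries taken from the memory layout in these notes. */
#define LOW_MEM_END   0x00100000u  /* end of the special-purpose first 1MB */
#define APP_START     0x80000000u  /* 2GB marker: Word Count program */
#define KERNEL_START  0xC0000000u  /* 3GB marker: OS kernel */

static_assert(LOW_MEM_END < APP_START && APP_START < KERNEL_START,
              "regions must be in ascending order");

enum region { REGION_LOW, REGION_OTHER, REGION_APP, REGION_KERNEL };

/* Hypothetical helper: say which region a 32-bit address falls in. */
enum region classify_addr(uint32_t addr)
{
    if (addr < LOW_MEM_END)
        return REGION_LOW;      /* hardware interfaces, boot sector, ... */
    if (addr < APP_START)
        return REGION_OTHER;    /* not described in these notes */
    if (addr < KERNEL_START)
        return REGION_APP;
    return REGION_KERNEL;
}
```

Both the boot sector's load address (0x7C00) and the screen address used later (0xB8014) fall in the low 1MB under this classification.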
The process by which a computer starts up and loads and runs its initial set of software is called //**bootstrapping**//. This process includes the following steps:
==== Bootstrapping Steps ====
- The PC firmware looks for disks, and upon finding one, reads the first sector of the disk. (Note: a sector is 512 bytes). This sector is then written into memory location 0x7C00.
- The computer then jumps to this address in memory, at which point the OS of the system takes over.
- Setup now occurs, which is required to be a short and simple process because the initial bootstrap step loaded only 512 bytes from disk into memory. The purpose of the setup code is to load additional, larger portions of code into memory, which will eventually load the OS completely.
- Jump to the kernel’s addresses in memory (here 0xC0000000, or the 3GB marker)
- Load the Word Count program into memory using code similar to the setup code below
- Jump to the Word Count program’s code in memory (here 0x80000000, or the 2GB marker)
==== Bootstrapping Setup Code Example ====
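  /* Setup code: load 20 sectors of kernel code, starting at disk sector 1 */
  for (i = 1; i <= 20; i++)
  {
      read_sector (i, 0xC0000000 + (i - 1) * 512);
  }
  /* Read one 512-byte sector from the disk into memory at addr */
  void read_sector (int secno, uint32_t addr)
  {
      /* Specialized code for IDE disks */
      /* Wait until the disk is ready and not busy */
      while ((inb (0x1F7) & 0xC0) != 0x40)
          /* Do Nothing */;
      outb (0x1F2, 1);                              /* Number of sectors to read */
      outb (0x1F3, secno & 255);                    /* Sector number, bits 0-7 */
      outb (0x1F4, (secno >> 8) & 255);             /* Sector number, bits 8-15 */
      outb (0x1F5, (secno >> 16) & 255);            /* Sector number, bits 16-23 */
      outb (0x1F6, ((secno >> 24) & 0x0F) | 0xE0);  /* Bits 24-27, plus LBA mode flags */
      outb (0x1F7, 0x20);                           /* Command 0x20: READ SECTORS */
      /* Wait for the data to become available */
      while ((inb (0x1F7) & 0xC0) != 0x40)
          /* Do Nothing */;
      insl (0x1F0, (void *) addr, 128);             /* Copy 128 4-byte words = 512 bytes */
  }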
**Note**: 0x1F0 – 0x1F7 are I/O port addresses belonging to the IDE disk controller
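To make the port writes in //read_sector//() easier to follow, the sketch below computes the four bytes that carry the sector number, without touching any hardware. It assumes 28-bit LBA addressing with the first (master) drive selected; the names ''struct lba_regs'' and ''split_lba'' are hypothetical and do not appear in the lecture code.

```c
#include <assert.h>
#include <stdint.h>

/* The values written to the four IDE address ports for one read. */
struct lba_regs {
    uint8_t low;     /* port 0x1F3: sector number bits 0-7 */
    uint8_t mid;     /* port 0x1F4: sector number bits 8-15 */
    uint8_t high;    /* port 0x1F5: sector number bits 16-23 */
    uint8_t device;  /* port 0x1F6: bits 24-27 plus mode flags */
};

/* Split a sector number into the bytes expected by the controller. */
struct lba_regs split_lba(uint32_t secno)
{
    struct lba_regs r;
    assert(secno < (1u << 28));               /* LBA28 holds 28 bits */
    r.low    = secno & 0xFF;
    r.mid    = (secno >> 8) & 0xFF;
    r.high   = (secno >> 16) & 0xFF;
    /* 0xE0 sets the LBA-mode bit and selects the master drive. */
    r.device = ((secno >> 24) & 0x0F) | 0xE0;
    return r;
}
```

For sector 1, the first sector loaded by the setup loop, this yields the bytes 0x01, 0x00, 0x00, and 0xE0.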
{{disk_image.jpg|}}
===== Base Application Code =====
So what exactly does our program do? Below is one possible version of the application code for the tasks we require. We need code both to count the words and to display the result.
==== Word Count Portion ====
  int nwords = 0;           /* Number of words */
  char buf[512];            /* Buffer holding the contents of one sector */
  int s = 31;               /* File starting sector number */
  while (s < 3000)
  {
      read_sector (s, (uint32_t) &buf[0]);
      int j = 0;
      while (j < 512) {
          if (buf[j] == ' ')    /* Treat each space as the end of a word */
              nwords++;
          j++;                  /* Move on to the next character */
      }
      s++;
  }
==== Printing Portion ====
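  uint16_t *screen = (uint16_t *) 0xB8014;  /* Screen memory address */
  while (nwords) {
      /* ASCII math to convert digits to printable values;
         0x0700 in the high byte means light gray on black */
      *screen = 0x0700 | ((nwords % 10) + '0');
      nwords /= 10;
      screen--;                             /* Digits are written right to left */
  }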
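The printing loop can be tried out on an ordinary machine by pointing it at an array instead of CGA memory. The sketch below is an illustration under stated assumptions, not the lecture's code: ''print_number'' is a hypothetical helper, each 16-bit cell holds a character in its low byte and a color attribute in its high byte, and a do-while loop is used so that a count of zero still prints one digit.

```c
#include <assert.h>
#include <stdint.h>

#define CGA_ATTR_GRAY 0x0700  /* attribute byte: light gray on black */

/* Write n in decimal so that its last digit lands in cells[last],
 * filling cells from right to left. Returns the number of digits. */
int print_number(uint16_t *cells, int last, unsigned n)
{
    int ndigits = 0;
    do {
        assert(last - ndigits >= 0);   /* never write before the array */
        cells[last - ndigits] = CGA_ATTR_GRAY | ('0' + n % 10);
        n /= 10;
        ndigits++;
    } while (n != 0);
    return ndigits;
}
```

With an 11-cell array and ''last'' set to 10, the index matches the 0xB8014 starting point, which is 10 cells past the start of screen memory at 0xB8000.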
===== Critiques and Solutions =====
==== Initial Critiques ====
The initial design above is open to several critiques. In terms of **//robustness//**, it fails in a couple of ways. First, given a file with an extremely large number of words, the program could overwrite memory it shouldn't. Second, and more generally, the program is able to overwrite any location in memory at any time, because the application has full access to the hardware through //outb//() and //inb//(). The design also fails in terms of **//generality//**: the program can only read from IDE disks, and it only works on files formatted in one particular way.
==== Robustness ====
Let's first consider the problem of robustness. As a solution to the second robustness problem, we can replace the call to //read_sector//() with a call to //sys_read_sector//(). This forces the application to make a system call to read a sector, while at the same time adding simplicity and lessening the number of possible errors. A change like this alters the modularity of the system.
**Modularity** – breaks a system into subsystems or modules that interact at their interfaces\\
**Hard Modularity** – modules are prevented from violating interface boundaries\\
**Soft Modularity** – modules do not violate interface boundaries by convention\\
Note that the original program design above used only soft modularity. As mentioned before, //outb//() was left accessible to the Word Count program. In addition, it is still possible for the program to bypass or modify //sys_read_sector//() and issue command 0x30 in place of 0x20. This would cause problems: the IDE command 0x20 means READ SECTORS, but 0x30 means WRITE SECTORS, so the word count program could scribble all over the disk!
Further solutions to improve our OS and program revolve around hardware support and process isolation. Process isolation protects the kernel and I/O from applications by using hard modularity.
The goals that we set for ourselves are as follows:
* We would like to make //inb//() and //outb//() inaccessible to the application entirely
* We want to ensure that applications can’t change the kernel memory or I/O memory
* We want to give the application only limited access to the disk
* This can be done via checks in the //read_sector//() method
* We would like to limit system calls so that they can’t jump into memory randomly
* //Protected control transfer// means that applications transfer control into the kernel in limited, specific ways: random jumps into kernel memory that would otherwise force the kernel to perform illegal operations are prevented
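The goals above can be illustrated with a small simulation of a kernel entry point. This is a sketch under stated assumptions, not a real kernel: the hardware mechanism that actually prevents applications from jumping elsewhere is not shown, ''syscall_entry'' and ''syscall_table'' are hypothetical names, the sector limits are borrowed from the Word Count code (31 up to 3000), and the "disk read" merely increments a counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define APP_START          0x80000000u  /* application memory range */
#define APP_END            0xC0000000u
#define FILE_FIRST_SECTOR  31u          /* sectors the application may read */
#define FILE_END_SECTOR    3000u
#define SECTOR_SIZE        512u

enum { SYS_READ_SECTOR = 0, NSYSCALLS };

static int disk_reads;  /* counts simulated disk accesses */

/* Kernel side: check every argument before touching the disk. */
static int sys_read_sector(uint32_t secno, uint32_t addr)
{
    if (secno < FILE_FIRST_SECTOR || secno >= FILE_END_SECTOR)
        return -1;      /* limited access to the disk */
    if (addr < APP_START || addr > APP_END - SECTOR_SIZE)
        return -1;      /* destination must lie in application memory */
    disk_reads++;       /* a real kernel would issue the IDE read here */
    return 0;
}

typedef int (*syscall_fn)(uint32_t, uint32_t);
static const syscall_fn syscall_table[NSYSCALLS] = { sys_read_sector };

/* The single entry point: control enters the kernel only through
 * this function, and only for numbers listed in the table. */
int syscall_entry(unsigned num, uint32_t arg0, uint32_t arg1)
{
    if (num >= NSYSCALLS)
        return -1;      /* no jumping to arbitrary kernel code */
    assert(syscall_table[num] != NULL);
    return syscall_table[num](arg0, arg1);
}
```

A request for sector 0 (the boot sector), a destination inside kernel memory, or an unknown system call number is refused without any disk access.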
At this point, issues of robustness have been greatly improved from our original design. The next issue to focus on is the issue of generality.
==== Generality ====
As mentioned before, one of the most obvious limitations of our original design is that it can only use IDE disks. The first solution that comes to mind is to create additional system calls that work for other sources, such as SCSI, CD-ROM, and others. Those calls would look something like:
sys_read_scsi_sector (int secno, uint32_t addr);
sys_read_cdrom_sector (int secno, uint32_t addr);
sys_read_*_sector (int secno, uint32_t addr);
The problem with such a solution is that it then forces programmers to use specific methods each time data needs to be accessed in their application. In addition, they would need to update their software when data sources change. A possible next step then would be to chain logical statements in the application that would ensure the program always used the correct methods.
  if (IDE)
      sys_read_ide_sector (secno, addr);
  else if (SCSI)
      sys_read_scsi_sector (secno, addr);
  else if (CDROM)
      sys_read_cdrom_sector (secno, addr);
  ...
A problem with the above code, though, is that it would need to reside in each and every application. As a starting point for a more logical solution, consider the following piece of pseudo code, which would be called by various applications but reside in the OS kernel:
sys_read_disk_sector (int sectorno, uint32_t addr)
{
if (IDE) read IDE;
if (SCSI) read SCSI;
if (CDROM) read CDROM;
...
}
This code represents a concept known as **//virtualization//**.
**Virtualization** – a module that simulates an environment or another module, providing roughly the same behavior as the original.
There is still a problem though with this method, which is that we are limited to reading in sizes of only 512 bytes, or the size of one sector. Furthermore, the current virtualized method still assumes that there is only one disk that we would ever want to read from or write to.
The solution to the latter problem is simple and straightforward. To support multiple data sources, all we need to do is add a parameter to the read and write methods denoting which source we would like to use. For example:
//sys_read_disk_sector// (int sectorno, uint32_t addr, int diskno);
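One way to organize such a virtualized call is a table of driver functions indexed by disk type. The sketch below is only an illustration: the drivers are stand-ins that fill the buffer with a tag byte instead of talking to hardware, the ''disk_types'' configuration is hypothetical, and the destination is a ''void *'' rather than a ''uint32_t'' so that the example runs on an ordinary machine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SECTOR_SIZE 512

enum disk_type { DISK_IDE, DISK_SCSI, DISK_CDROM, NDISKTYPES };

typedef int (*read_fn)(int sectorno, void *buf);

/* Stand-in drivers: each fills the sector with a recognizable byte. */
static int ide_read(int sectorno, void *buf)
{ (void) sectorno; memset(buf, 'I', SECTOR_SIZE); return 0; }
static int scsi_read(int sectorno, void *buf)
{ (void) sectorno; memset(buf, 'S', SECTOR_SIZE); return 0; }
static int cdrom_read(int sectorno, void *buf)
{ (void) sectorno; memset(buf, 'C', SECTOR_SIZE); return 0; }

static const read_fn drivers[NDISKTYPES] = { ide_read, scsi_read, cdrom_read };

/* Hypothetical configuration: the type of each attached disk. */
static const enum disk_type disk_types[] = { DISK_IDE, DISK_SCSI, DISK_CDROM };
#define NDISKS ((int) (sizeof disk_types / sizeof disk_types[0]))

/* One call for every kind of disk; the kernel picks the driver. */
int sys_read_disk_sector(int sectorno, void *buf, int diskno)
{
    assert(buf != NULL);
    if (sectorno < 0 || diskno < 0 || diskno >= NDISKS)
        return -1;
    return drivers[disk_types[diskno]](sectorno, buf);
}
```

Supporting a new kind of disk then means adding one driver and one table entry inside the kernel, with no change to any application.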
There are still other cases to consider, such as getting a file from a network. Extending the disk-virtualization concept, then, we can virtualize our system around files rather than sectors.
This brings up an interesting point related to UNIX systems. In UNIX, the big concept is that everything in the system is a file. This includes every little thing too, even all the way down to the position of a mouse being maintained as a file.
**File** –
* A file is a source and/or sink of data (meaning that it is readable or writable)
* A file is opened via its name (usually)
* Only opened files can be read or written to
* Reads and writes can transfer an arbitrary number of bytes
There are a number of important system calls related to files, but here are perhaps the most important:
int open (const char *name, int rw);
ssize_t read (int fd, char *buf, size_t size);
/* fd = file descriptor */
ssize_t write ( int fd, char *buf, size_t size);
int close (int fd);
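With these calls, the Word Count program no longer needs to know about sectors or disk types. The following sketch uses the POSIX versions of //open//(), //read//(), and //close//(), whose //open//() takes flags such as ''O_RDONLY'' rather than the simplified ''rw'' argument shown above. The function name ''count_words'' is hypothetical, and the loop counts transitions into a word rather than spaces, a small departure from the original counting loop.

```c
#include <assert.h>
#include <ctype.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Count whitespace-separated words in the named file.
 * Returns the count, or -1 if the file cannot be opened or read. */
long count_words(const char *name)
{
    char buf[512];
    long nwords = 0;
    int in_word = 0;           /* are we currently inside a word? */
    ssize_t n;
    int fd;

    assert(name != NULL);
    fd = open(name, O_RDONLY);
    if (fd < 0)
        return -1;
    /* read() returns however many bytes are available, up to 512. */
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        for (ssize_t i = 0; i < n; i++) {
            if (isspace((unsigned char) buf[i])) {
                in_word = 0;
            } else if (!in_word) {
                in_word = 1;   /* first character of a new word */
                nwords++;
            }
        }
    }
    close(fd);
    return n < 0 ? -1 : nwords;
}
```

Because ''in_word'' persists across calls to //read//(), a word that straddles two buffers is still counted once.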
**Abstraction** – a virtualization with a significantly different interface from the real thing; it is usually more general
At this point generality has been greatly improved. Continuing on, the next set of issues to overcome are issues of performance.
==== Performance ====
One of the performance issues is related to having extra copies of read sectors in memory. Extra copies have a few indirect consequences, such as leaving less memory for other, more useful data. A more dire problem occurs when we are reading only a single character at a time. In an overly simplistic design, we will be forced to read the same disk sector repeatedly in order to gain access to various parts of the sector at different times. Consider the following code:
sys_read(int fd, char *buf, size_t size) {
struct file *f = MAGIC(fd); /* f contains the diskno, disktype, filelocation(sector), current file position */
}
This code will then perform the following steps:
- Find the sector no
- char sectorbuf[512];
- sys_disk_read_sector(sectorno, &sectorbuf);
- memcpy(buf …, sectorbuf …, size);
Written out in full, the function looks like this:
  ssize_t sys_read(int fd, char *buf, size_t size) {
      struct file *f = MAGIC(fd); /* f contains the diskno, disktype, filelocation(sector), current file position */
      ssize_t nread = 0;                    /* Total number of bytes copied so far */
      while (size > 0) {
          int sectorno = f->offset / 512;   /* Find sector number: 512 bytes per sector */
          int sectoroff = f->offset % 512;  /* Find offset within sector */
          int nbytes;                       /* Number of bytes to read from cur sector */
          char sectorbuf[512];              /* Create a buffer to hold the read sector */
          sys_disk_read_sector(sectorno, &sectorbuf);  /* Fill the buffer with the read data */
          nbytes = (size < 512 - sectoroff ? size : 512 - sectoroff);
          memcpy(buf, &sectorbuf[sectoroff], nbytes);  /* Copy the data over to the user's buffer */
          buf += nbytes;                    /* Advance pointers and file offset */
          size -= nbytes;
          f->offset += nbytes;
          nread += nbytes;
      }
      return nread;
  }
With this design, we read an entire sector to obtain a single character, only to throw the sector away right after. If we read nearby characters one at a time, the same sector is read into memory 512 times, once for each character in it. This is absolutely horrible for performance! The less severe problem of duplicate copies also appears here, in the fourth step, where an extra copy of the read data exists. The process as a whole is also slow: since it is highly likely we will need to access the disk multiple times just to bring the same data into memory, it would be wise to find a way around the relatively slow disk access time.
=== Quick Performance Solutions Overview ===
Solutions to these problems include **//batching//**, **//prefetching//**, and **//caching//**.
**Batch processing** – for those of us in CS111, batch processing (or "batching") is defined as performing multiple requests as one combined request to reduce request overhead. //Wikipedia gives a longer variation on this which states that batch processing is "the execution of a series of programs ("jobs") on a computer without human interaction, when possible. Batch jobs are set up so they can be run to completion without human interaction, so all input data is preselected through scripts or commandline parameters. This is in contrast to interactive programs which prompt the user for such input."//
**Speculation** - speculation was defined by Professor Kohler in class in two manners based on the same concept. The first is that speculation is assuming something will happen before it happens so that data will be ready. The second is that speculation is performing potential requests in advance so that results are immediately available when asked for.
**Prefetching** – given the definition of speculation above, Professor Kohler defined prefetching then as speculation as it pertains to reads. //Wikipedia also has this to say on the issue: "modern microprocessors are much faster than the memory where the program is kept, meaning that the program's instructions cannot be read fast enough to keep the microprocessor busy. Prefetching is the processor action of getting an instruction from the memory well before it will need it. In this way, the processor will not need to wait for the memory to answer its request."//
**Caching** – making a temporary local copy of request results to avoid future, repetitious requests
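Batching in the CS111 sense can be sketched as a small buffered reader: the application asks for one byte at a time, but the reader turns those requests into one large request per 512 bytes. Everything here is a hypothetical simulation: ''fake_read'' stands in for a read system call on an in-memory "file", and ''nsyscalls'' counts how many requests were actually made.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BATCH_SIZE 512

static const char *file_data;        /* simulated file contents */
static size_t file_size, file_pos;
static int nsyscalls;                /* number of simulated read requests */

/* Point the simulated file at a block of memory and reset the counter. */
void fake_open(const char *data, size_t size)
{
    file_data = data; file_size = size; file_pos = 0; nsyscalls = 0;
}

/* Stand-in for the read system call: copies up to size bytes. */
static size_t fake_read(char *buf, size_t size)
{
    size_t left = file_size - file_pos;
    size_t n = size < left ? size : left;
    memcpy(buf, file_data + file_pos, n);
    file_pos += n;
    nsyscalls++;
    return n;
}

struct reader { char buf[BATCH_SIZE]; size_t len, pos; };

/* Return the next byte, or -1 at end of file. Many one-byte
 * requests are served by a single BATCH_SIZE-byte read. */
int reader_getc(struct reader *r)
{
    assert(r != NULL);
    if (r->pos == r->len) {          /* buffer used up: refill it */
        r->len = fake_read(r->buf, sizeof r->buf);
        r->pos = 0;
        if (r->len == 0)
            return -1;
    }
    return (unsigned char) r->buf[r->pos++];
}
```

Reading a 1024-byte file one byte at a time through ''reader_getc'' costs three requests (two full batches plus one that detects end of file) instead of more than a thousand.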
Adding caching to the 4-step design above can be done fairly easily. In order to add caching, the code can be converted into the following pseudo code:
//Steps 2 and 3 above become//:
if (sectorno not in cache)
{
allocate space in cache for that sectorno;
sys_disk_read_sector(sectorno, cache[sectorno]);
}
//Step 4 becomes//:
memcpy(buf …, cache[sectorno], size);
By applying these changes, the performance of single-character reads can be improved by a factor of 512 on paper, and even more in practice. The idea is that by keeping the read sector around a little longer, the next time we need a single character we need only refer to the cached copy of the sector, rather than accessing the disk again for the very same sector.
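The factor of 512 can be checked with a small simulation that counts disk accesses. This is a sketch under simplifying assumptions: the "disk" is an array in memory, the file starts at sector 0, and the cache has one slot per sector with no eviction, whereas a real cache would be bounded. All of the names are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define SECTOR_SIZE 512
#define NSECTORS    8                      /* size of the simulated disk */

static char disk[NSECTORS][SECTOR_SIZE];   /* simulated disk contents */
static int disk_reads;                     /* number of slow disk accesses */

static char cache[NSECTORS][SECTOR_SIZE];  /* one cache slot per sector */
static int cached[NSECTORS];               /* nonzero if the slot is filled */

/* Stand-in for the slow disk read. */
static void disk_read_sector(int sectorno, char *dst)
{
    memcpy(dst, disk[sectorno], SECTOR_SIZE);
    disk_reads++;
}

/* Read one byte: fetches the whole sector every time. */
char read_byte_uncached(int offset)
{
    char sectorbuf[SECTOR_SIZE];
    assert(offset >= 0 && offset < NSECTORS * SECTOR_SIZE);
    disk_read_sector(offset / SECTOR_SIZE, sectorbuf);
    return sectorbuf[offset % SECTOR_SIZE];
}

/* Read one byte: goes to the disk only on a cache miss. */
char read_byte_cached(int offset)
{
    int sectorno;
    assert(offset >= 0 && offset < NSECTORS * SECTOR_SIZE);
    sectorno = offset / SECTOR_SIZE;
    if (!cached[sectorno]) {
        disk_read_sector(sectorno, cache[sectorno]);
        cached[sectorno] = 1;
    }
    return cache[sectorno][offset % SECTOR_SIZE];
}
```

Reading the first 1024 bytes one at a time takes 1024 disk accesses with ''read_byte_uncached'' and 2 with ''read_byte_cached'', which is the 512-fold reduction described above.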
===== Further Critiques =====
There are still many critiques that can be leveled at the base design. These critiques will be covered in future lectures.