You are expected to understand this. CS 111 Operating Systems Principles, Fall 2007

Lecture 2 Scribe Notes

--- Yan Mayhar-Fang 2007/10/08 23:45

Writing an operating system to count words?

Ursula is looking for someone in our class to write a program for her, and she is making a very nice offer: $200,000. However, she wants complete control over all the source code involved.

Here are Ursula's initial requirements:

  1. A Dell PC computer with an Intel x86-like processor.
  2. An ATA (ATA = Advanced Technology Attachment) hard disk drive.
  3. She will store a text file and our program on this hard disk drive.
  4. When she turns her Dell on, it should print the number of words in the text file to the screen, leaving the file unchanged.

What are the possible questions we may raise about her requirements?

  1. How is the file placed on the disk?
  2. Is it an ASCII text file?
  3. Is the file terminated by a zero character '\0'?
  4. How much memory does this Dell PC have?
  5. How is the program stored on disk?
  6. How is the word defined?
  7. What are the future requirements?

What are the answers to those questions?

  1. We decide how the file is placed on the disk.
  2. It is an ASCII text file.
  3. The file is terminated by a zero character '\0'.
  4. This Dell PC has 1GB memory.
  5. We decide how the program is stored on disk.
  6. A word is a sequence of letters; words are separated by non-letters.
  7. Future requirements include:
  • Supporting different types of disks.
  • Running other programs safely, which means no program may overwrite another.

What do we need to know about the hardware?

  • the processor architecture --- X86 manual
  • ATA protocol manual
  • What happens when the machine turns on? --- BOOTSTRAPPING
  • How can the processor fetch information from the disk?
  • How can the processor cause information to appear on the screen? --- I/O DEVICE INTERACTION

--- Nahal Farhi 2007/10/08 19:51

Bootstrapping

The process by which a computer starts up, then loads and runs its initial set of software, is called bootstrapping [Spring 2007 lecture 2]. Software must be loaded into memory before the machine can run it, so let's look at memory and addressing first.
Computer memory is arranged as an array of bytes. Each byte holds a value between 0 and 255. An address is an array index into memory. On x86, addresses are limited to 32 bits, so we can address at most 2^32 bytes = 4GB of memory. Ursula's machine has 1GB of memory, which means the highest meaningful address is 2^30 - 1. This address space is not all RAM: one part of it holds the BIOS instructions, which are built into the hardware. When the computer is turned on, the instruction pointer points into the BIOS. Now let's see what the BIOS does to find software on the machine.
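
The address arithmetic above can be checked with a few lines of C. This is only a sketch; the function names address_space_bytes and highest_address are made up for illustration and do not come from the lecture.

```c
#include <stdint.h>

/* Number of distinct byte addresses reachable with `bits` address bits.
   In this sketch `bits` must be between 1 and 63. */
uint64_t address_space_bytes(unsigned bits) {
    return (uint64_t) 1 << bits;
}

/* Highest valid byte address in a memory of size_bytes bytes (size_bytes > 0).
   Addresses start at 0, so the last one is size - 1. */
uint64_t highest_address(uint64_t size_bytes) {
    return size_bytes - 1;
}
```

With 32 address bits this gives 4,294,967,296 bytes (4GB), and for Ursula's 1GB machine the highest address is 0x3FFFFFFF, which is 2^30 - 1.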

Bootstrapping Steps

--- Nahal Farhi 2007/10/07 18:51

  1. The BIOS looks at the disks plugged into the system and checks whether each one contains bootable code.
  2. The ATA hard disk is considered bootable iff the bytes at offsets 510 and 511 of its first sector (the last two bytes of that sector) equal 0x55 and 0xAA, respectively.
  3. On finding a bootable disk, the BIOS reads the first sector of the disk into primary memory at address 0x7C00. This sector is called the boot sector. (Note: a sector is the unit of disk storage; here it is 512 bytes.)
  4. The BIOS jumps to address 0x7C00 in memory. From this point on, the BIOS is gone and software is in charge.

The boot sector contains a boot loader. A boot loader is a program that loads the main operating system into memory.
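
The bootable-disk test in step 2 can be sketched in C as a check on a 512-byte buffer holding the first sector. The name is_bootable and the SECTOR_SIZE constant are assumptions made for this illustration; a real BIOS is firmware and does not expose such a function.

```c
#include <stdint.h>

#define SECTOR_SIZE 512

/* Returns 1 if the sector carries the boot signature, that is,
   0x55 at byte offset 510 and 0xAA at byte offset 511; otherwise 0. */
int is_bootable(const uint8_t sector[SECTOR_SIZE]) {
    return sector[510] == 0x55 && sector[511] == 0xAA;
}
```

Note that the order matters: a sector ending in 0xAA 0x55 does not count as bootable.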

To decide on how to implement the boot loader, we have to address 3 questions:

  • Where is the operating system on the disk?

The OS cannot be in sector zero because that is the boot sector. A possible location for the OS is sector one. If the OS is a little under 10,000 bytes (at most 19 × 512 = 9,728 bytes), it starts at sector 1 and ends at sector 19 (disk bytes 512 through 10,239).
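
The sector arithmetic here is easy to get wrong by one, so a small sketch may help. The helper names sectors_needed and sector_start_byte are hypothetical and chosen only for this example.

```c
#include <stdint.h>

#define SECTOR_SIZE 512u

/* Number of whole sectors needed to hold nbytes bytes (rounds up). */
uint32_t sectors_needed(uint32_t nbytes) {
    return (nbytes + SECTOR_SIZE - 1) / SECTOR_SIZE;
}

/* Disk byte offset of the first byte of sector secno. */
uint32_t sector_start_byte(uint32_t secno) {
    return secno * SECTOR_SIZE;
}
```

Sectors 1 through 19 hold 19 × 512 = 9,728 bytes and end just before byte 10,240, where sector 20 begins. An image of exactly 10,000 bytes would need 20 sectors, so the 19-sector layout assumes the OS fits in 9,728 bytes.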

  • Where should we load the OS in memory?

We cannot load the OS where the BIOS and the boot loader reside. A possible location for the OS in memory is address 0x100000, which lies above the boot loader's address (0x7C00).

[Figure: disk layout]

  • How can we communicate with the disk?

For each type of disk, we should use the appropriate instructions for reading from and writing to the disk, as documented in its manual. Here is a sample boot loader that works with an IDE hard drive:

 
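/* This loop loads sectors 1 to 19 of the disk (the operating system) into memory. */
/* Sector 1 is loaded at address 0x100000, sector 2 at address 0x100000 + 512, and so on. */
/* After loading the OS into memory, control jumps to the beginning of the OS at address 0x100000. */
void read_ide_sector(int secno, uint32_t addr);	/* defined below */

void bootmain(void) {	/* the function name is arbitrary; the BIOS simply jumps to the boot sector */
    int i;
    for (i = 1; i < 20; i++)
        read_ide_sector(i, 0x100000 + (i - 1) * 512);
    /* "goto 0x100000": C cannot jump to a raw address, so call it through a function pointer */
    ((void (*)(void)) 0x100000)();
}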
 
/* read_ide_sector reads sector number secno from the disk and stores it in memory starting at address addr */
void read_ide_sector(int secno, uint32_t addr) {
    /* PROGRAMMED I/O: special instructions send commands to the disk controller's registers */
 
    /* unsigned char inb(int port) reads a byte from an 8-bit I/O port */
    /* port 0x1F7 is the status register: bit 7 = busy, bit 6 = ready */
    while ((inb(0x1F7) & 0xC0) != 0x40) /* wait until the disk is ready and not busy */
        /* do nothing */;
 
    /* void outb(int port, uchar_t data) writes a byte to an 8-bit I/O port */
    outb(0x1F2, 1);	/* read 1 sector */
    /* secno is an integer (4 bytes) but with outb you can only write one byte at a time, */
    /* therefore we write secno byte by byte to ports 0x1F3 to 0x1F6 */
    outb(0x1F3, (secno & 255));
    outb(0x1F4, (secno >> 8) & 255);
    outb(0x1F5, (secno >> 16) & 255);
    /* port 0x1F6 takes only bits 24-27 of the sector number; 0xE0 selects LBA addressing on the master drive */
    outb(0x1F6, ((secno >> 24) & 0x0F) | 0xE0);
    outb(0x1F7, 0x20); /* 0x20 == READ SECTORS command */
 
    while ((inb(0x1F7) & 0xC0) != 0x40) /* wait until the disk is ready and not busy */
        /* do nothing */;
 
    /* read 512 bytes into addr. 128 is the number of 32-bit words the disk should transfer. 128 == 512/4 */
    insl(0x1F0, (void *) addr, 128);
}
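
The byte-by-byte split of the sector number can be tried out on an ordinary machine, without touching any I/O ports. The sketch below uses a hypothetical helper, lba_bytes, that computes the four values destined for ports 0x1F3 to 0x1F6. One assumption beyond the notes: the value for port 0x1F6 keeps only bits 24 to 27 of the sector number and ORs in 0xE0, which in the ATA register layout selects LBA addressing on the master drive.

```c
#include <stdint.h>

/* Split a 28-bit sector number into the bytes written to
   I/O ports 0x1F3, 0x1F4, 0x1F5 and 0x1F6, in that order. */
void lba_bytes(uint32_t secno, uint8_t out[4]) {
    out[0] = secno & 0xFF;                   /* bits 0-7   -> port 0x1F3 */
    out[1] = (secno >> 8) & 0xFF;            /* bits 8-15  -> port 0x1F4 */
    out[2] = (secno >> 16) & 0xFF;           /* bits 16-23 -> port 0x1F5 */
    out[3] = ((secno >> 24) & 0x0F) | 0xE0;  /* bits 24-27 plus mode flags -> port 0x1F6 */
}
```

For sector 20, the first sector of Ursula's file, the four bytes are 0x14, 0x00, 0x00 and 0xE0.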

This mode of interaction between the processor and an I/O device, in which explicit commands are sent using special instructions, is called programmed I/O.

[Figure: layout of memory after the operating system is loaded]

Now let's look at the operating system code. As mentioned before, this OS should load Ursula's file from the disk into memory and count the number of words in it. But where is Ursula's file on disk? It can be stored anywhere after sector 19; in this case it starts at sector 20. We do not need to know the length of the file in advance because it is terminated with '\0'.

 
void main(void) {
    int nwords = 0;	/* Number of words */
    int in_word = 0;	/* Non-zero when the previous character was alphabetic, otherwise 0 */
    char buf[512];	/* Buffer holding the contents of a sector */
    int s = 20;	/* File's beginning sector */
    while (1) {
        // read sector s into buf
        read_ide_sector(s, (uint32_t) buf);
        for (int j = 0; j < 512; j++) {
            // examine the buffer for end of file '\0'
            if (buf[j] == '\0')
                goto eof;
            // int isalpha(int character) determines whether character is an alphabetic character
            int this_alpha = isalpha((unsigned char) buf[j]);
            /* if the current character is not alphabetic but the previous one was, a word has just ended */
            if (!this_alpha && in_word)
                nwords++;
            in_word = this_alpha;
        } /* for */
        s++;
    } /* while */
eof:
    if (in_word) /* if the last byte before '\0' is alphabetic, we have not counted the last word */
        nwords++;
    // code to write the number of words on CGA console
}
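
The counting logic does not depend on the disk, so it can be tested on an ordinary hosted system. The sketch below applies the same state machine to a '\0'-terminated string in memory. The function name count_words is an assumption for this example; the OS code above keeps the logic inline in main.

```c
#include <ctype.h>

/* Count the words in a NUL-terminated ASCII string, where a word is
   a maximal run of letters. A word is counted when it ends. */
int count_words(const char *text) {
    int nwords = 0;
    int in_word = 0;    /* non-zero if the previous character was a letter */
    for (; *text != '\0'; text++) {
        int this_alpha = isalpha((unsigned char) *text) != 0;
        if (!this_alpha && in_word)
            nwords++;   /* a letter followed by a non-letter ends a word */
        in_word = this_alpha;
    }
    if (in_word)        /* the text ended in the middle of a word */
        nwords++;
    return nwords;
}
```

Under this definition "hello world" has two words, and so does "don't", because the apostrophe is a non-letter that splits it into "don" and "t".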
 

--- Nima Nikzad 2007/10/09 00:34

Memory-Mapped I/O

Memory-mapped I/O provides a method for processor-I/O device interaction involving special locations in shared memory. This is often used by display devices and network adapters.

  • How does it work?

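The CPU writes data to a region of memory addresses that the device also reads (and vice versa), so the CPU and the device communicate indirectly through ordinary loads and stores rather than special I/O instructions.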

  • Example: CGA Console

The CGA console reads from memory starting at address 0xB8000. This span of memory defines an 80-column by 25-row grid of characters on the display. We can use it in our OS to display the number of words in the file. Starting at 0xB8000, each character is defined by a pair of bytes: the low-order byte (at the lower address) designates the character, while the high-order byte designates the color of the character on the display.

  • Example: Memory location defining the first character on the display
    • 0xB8000 - First character
    • 0xB8001 - First character's color
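
Given two bytes per character and 80 characters per row, the address of any cell follows directly. The sketch below computes it; the names cga_char_addr, CGA_BASE and CGA_COLS are assumptions made for this illustration.

```c
#include <stdint.h>

#define CGA_BASE 0xB8000u   /* address of the first character on the display */
#define CGA_COLS 80u        /* characters per row */

/* Address of the character byte for the cell at (row, col), both counted
   from 0. The color byte for the same cell is at the next address. */
uint32_t cga_char_addr(uint32_t row, uint32_t col) {
    return CGA_BASE + 2u * (row * CGA_COLS + col);
}
```

For example, row 0, column 10 is at 0xB8000 + 20 = 0xB8014, and the bottom-right cell (row 24, column 79) is at 0xB8F9E.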

The following code writes the number of counted words to the screen using CGA memory-mapped I/O:

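 uint8_t *screen = (uint8_t *) 0xB8014; /* Screen memory address of the last digit (row 0, column 10) */
 do {                                 /* do/while so that a count of 0 still prints "0" */
     screen[0] = (nwords % 10) + '0'; /* ASCII math to convert digits
                                         to printable values */
     screen[1] = 0x07;                /* color: grey on black */
     nwords /= 10;
     screen -= 2;                     /* move one character cell to the left */
 } while (nwords != 0);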
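
The digit loop can be exercised without real video memory by pointing it at an ordinary buffer. The sketch below is a hosted simulation, not the OS code itself: render_number and its parameters are hypothetical, the buffer stands in for the memory at 0xB8000, and the loop is written as do/while so that a count of 0 is shown as "0".

```c
#include <stdint.h>

/* Write the decimal digits of n into a CGA-style buffer (two bytes per
   cell), right to left, with the last digit in cell last_col of row 0.
   Digits that would fall to the left of column 0 are dropped. */
void render_number(uint8_t *screen, int last_col, unsigned n) {
    int col = last_col;
    do {
        screen[2 * col] = (uint8_t) ('0' + n % 10);  /* character byte */
        screen[2 * col + 1] = 0x07;                  /* color: grey on black */
        n /= 10;
        col--;
    } while (n != 0 && col >= 0);
}
```

Rendering 407 with last_col set to 10 places '4', '0' and '7' in cells 8, 9 and 10, which correspond to byte offsets 16, 18 and 20 from the start of the buffer.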

[Figure: an example of how a network adapter uses memory-mapped I/O to communicate with the rest of the system. Source: http://www.karbosguide.com/]

Modularity

Modularity is the practice of breaking a system into subsystems, or MODULES, that communicate through their interfaces. Good modularity can help programmers overcome challenges. The keys to good modularity are interface design and understanding the system goals and implementation requirements. A good example to consider is a 'read_sector' function that generalizes our earlier 'read_ide_sector' function.

  • Example: The 'read_sector' function

What is the best way to approach this function? A first attempt might look like this:

void read_sector(int sectorno, uint32_t addr)
{
     switch(disk_type){
          case IDE:
               read_ide_sector(sectorno, addr);
               break;
          case FLASH:
               read_flash_sector(sectorno, addr);
               break;
          /* ...etc... */
     }
}

What are the problems with this approach? For one, we do not know how many and what types of disks the user has!

Disk#   Type     Sector size
0       IDE      512
1       FLASH    512
2       Other?   128?

What if there is some type of disk we did not anticipate? What if the user wants to use a disk that is not the primary disk? We would greatly benefit from a much more versatile interface!

void read_disk(int diskno, uint32_t addr, int offset, int nbytes);

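This allows us to use the same function for any of the user's disks, selected by diskno. Because the caller specifies a byte offset and a byte count rather than a sector number, it also allows us to work properly with different sector sizes without a whole new function.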
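
One possible way to implement such an interface is a table of disk descriptors, each carrying its own sector size and its own sector-reading function. The sketch below is a hosted simulation under stated assumptions, not the lecture's implementation: struct disk, the disks table, ramdisk_read_sector and MAX_SECTOR_SIZE are all hypothetical, the "disks" are byte arrays in RAM, and the destination is a pointer instead of the uint32_t address in the prototype above so that the code also runs on a 64-bit host.

```c
#include <stdint.h>
#include <string.h>

#define MAX_DISKS 4
#define MAX_SECTOR_SIZE 512   /* largest sector size this sketch supports */

/* Hypothetical per-disk descriptor. */
struct disk {
    uint32_t sector_size;     /* bytes per sector on this disk */
    void (*read_sector)(const struct disk *d, uint32_t secno, uint8_t *buf);
    const uint8_t *backing;   /* simulation only: disk contents held in RAM */
};

struct disk disks[MAX_DISKS];

/* Simulated driver: copy one whole sector out of the in-memory image. */
void ramdisk_read_sector(const struct disk *d, uint32_t secno, uint8_t *buf) {
    memcpy(buf, d->backing + (size_t) secno * d->sector_size, d->sector_size);
}

/* Read nbytes bytes starting at byte `offset` of disk diskno into dst.
   The range is split into per-sector pieces, so callers never need to
   know the disk's sector size. */
void read_disk(int diskno, void *dst, uint32_t offset, uint32_t nbytes) {
    const struct disk *d = &disks[diskno];
    uint8_t sector[MAX_SECTOR_SIZE];
    uint8_t *out = dst;
    while (nbytes > 0) {
        uint32_t secno = offset / d->sector_size;
        uint32_t skip = offset % d->sector_size;    /* bytes to skip in this sector */
        uint32_t chunk = d->sector_size - skip;     /* usable bytes in this sector */
        if (chunk > nbytes)
            chunk = nbytes;
        d->read_sector(d, secno, sector);
        memcpy(out, sector + skip, chunk);
        out += chunk;
        offset += chunk;
        nbytes -= chunk;
    }
}
```

Supporting a new kind of disk then means adding one table entry with its sector size and reader function; read_disk and its callers stay unchanged.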

  • More on this next lecture!
 
notes/lec2.txt · Last modified: 2007/10/09 02:40 by nnikzad
 