You are expected to understand this. CS 111 Operating Systems Principles, Spring 2007

Lecture 13 Scribe Notes

Compiled by: Gerald Chang, Yiwei Guo, Eric Marcin-Cuddy

Last class: We talked about file systems and briefly began discussing characteristics of the FAT file system.

FAT File System

The FAT file system uses 1-kilobyte blocks to store data, and the blocks of a file are linked together through the file allocation table (FAT). The FAT contains one entry per block on the disk, so it must be large enough to hold an entry for every block address. Because a file's blocks need not be contiguous, we avoid external fragmentation. The sample FAT file system below is 2 MB in size and uses 4-byte addresses, so it has 2 MB / 1 KB = 2048 blocks and its FAT needs 2048 x 4 B = 8 KB, or 8 blocks. (For comparison, the standard FAT32 also has 32-bit FAT entries.)

Block No.   | 0                         | 1                       | 2 to 9                | 10 to 2047
Name        | Boot sector               | Superblock              | File allocation table | Data blocks
Information | Contains file system type | FAT implementation data | 1 entry per block     | Actual data

Each FAT entry holds the address of the next block in the same file, or zero if that block is the last block of the file. Zero is also used for blocks that may not be written as file data, as is the case for the boot sector, superblock, and FAT blocks. An entry of -1 means the block is free for writing. For example, if we had a 4-kilobyte file that occupied blocks 101 to 104 and a 2-kilobyte file that occupied blocks 100 and 105, the FAT would look something like this:

Block No. | 0-9   | 10-99  | 100 | 101 | 102 | 103 | 104 | 105 | 106-2047
Address   | all 0 | all -1 | 105 | 102 | 103 | 104 | 0   | 0   | all -1

The first block available for file data is block #10. A file's size need not be a multiple of the block size, so its final block may be only partly used. When the file grows, it first fills up that final block and then continues onto a newly linked free block. The advantage of this scheme is that it is simple and few special data structures are stored on the disk. Furthermore, the FAT is unambiguous about which blocks are free and about how to get from one block to the next. However, there is not as much locality of reference, since a file's blocks do not have to be contiguous, which means that optimizations like prefetching do not work as well. The file system is similar to a linked list, but the links are stored in the FAT rather than in the data blocks.
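The linked-list view can be made concrete with a minimal sketch that rebuilds the example FAT above and walks a file's chain. The names `build_example_fat` and `file_blocks` are illustrative only and do not come from the lecture.

```python
NUM_BLOCKS = 2048   # 2 MB disk / 1 KB blocks
FREE = -1           # FAT entry of a free block
END = 0             # FAT entry of a file's last block (or of a reserved block)

def build_example_fat():
    """Rebuild the FAT from the example: two files in blocks 100-105."""
    fat = [FREE] * NUM_BLOCKS
    for b in range(10):              # boot sector, superblock, FAT blocks
        fat[b] = END
    # 4 KB file in blocks 101 to 104
    fat[101], fat[102], fat[103], fat[104] = 102, 103, 104, END
    # 2 KB file in blocks 100 and 105
    fat[100], fat[105] = 105, END
    return fat

def file_blocks(fat, first_block):
    """Return a file's block numbers in order by following FAT links."""
    blocks = []
    block = first_block
    while True:
        blocks.append(block)
        nxt = fat[block]
        if nxt == END:
            return blocks
        # A free block or an over-long chain (a cycle) means corruption.
        if nxt == FREE or len(blocks) > len(fat):
            raise ValueError("corrupt chain")
        block = nxt

fat = build_example_fat()
print(file_blocks(fat, 101))   # [101, 102, 103, 104]
print(file_blocks(fat, 100))   # [100, 105]
```

Reaching the k-th block of a file takes k FAT lookups, which is the O(n) cost that a tree of pointers later avoids.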

Block Address Characteristics:

  • -1: Free Space
  • 0 : No succeeding block (last block of the file)

Advantages: Few data structures are stored on disk; the FAT tells us which blocks hold data and where each file's data is stored
Disadvantages: Poor locality of reference; prefetching is harder

Invariants

An invariant is a statement that must always be true for a data structure to remain consistent. Here are some invariants for FAT:

  • Every FAT entry e must satisfy -1 ≤ e ≤ 2047 (here, 2047 is the highest block number on the disk)
  • No chain of pointers can contain -1
  • All file pointer chains must be disjoint (except for 0)
  • The FAT entries for the boot sector, superblock, and FAT blocks must be 0
  • No file's pointer chain can include the boot sector, superblock, or FAT addresses
  • Every pointer chain terminates at 0
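As a rough illustration, the FAT invariants above can be checked mechanically. The sketch below assumes the 2048-block layout from the example and takes the first block of each file as a hypothetical `heads` argument; in a real file system those would be read from directory entries. The function name `check_fat` is made up for this sketch.

```python
FREE, END = -1, 0
RESERVED = range(0, 10)   # boot sector, superblock, FAT blocks

def check_fat(fat, heads):
    """Return a list of invariant violations (empty if none were found)."""
    errors = []
    n = len(fat)
    # Every entry is between -1 and the highest block number.
    for b, e in enumerate(fat):
        if not (-1 <= e <= n - 1):
            errors.append(f"entry {b} out of range: {e}")
    # Reserved blocks must have entry 0.
    for b in RESERVED:
        if fat[b] != END:
            errors.append(f"reserved block {b} has entry {fat[b]}")
    owner = {}            # block -> head of the chain that uses it
    for head in heads:
        block = head
        while True:
            if not (0 <= block < n):
                errors.append(f"chain {head} leaves the disk at {block}")
                break
            if block in RESERVED:
                errors.append(f"chain {head} includes reserved block {block}")
                break
            if fat[block] == FREE:
                errors.append(f"chain {head} contains free block {block}")
                break
            if block in owner:
                # Two chains share a block, or this chain loops on itself.
                errors.append(f"chain {head} reuses block {block}")
                break
            owner[block] = head
            if fat[block] == END:
                break     # chain terminates at 0, as required
            block = fat[block]
    return errors

# The example FAT: files at 101-104 and at 100, 105.
fat = [END] * 10 + [FREE] * 2038
fat[101], fat[102], fat[103], fat[104] = 102, 103, 104, END
fat[100], fat[105] = 105, END
print(check_fat(fat, [101, 100]))        # []

bad = list(fat)
bad[104] = 100                           # first file now runs into the second
print(check_fat(bad, [101, 100]))        # reports a reused block
```

The checker stops following a chain at its first violation, so it reports at least one problem per broken chain rather than every problem.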

For Directories:

  • Subdirectory chains are linear
  • If a block is above the FAT (block 10 or higher) and is not reachable from the root directory, then its FAT entry must be -1
  • Every directory entry has a different name
  • No two directory entries point to the same block
  • Subdirectories are valid
  • A directory's block pointers must refer to blocks above the FAT blocks
  • File size ≤ 1024B * (length of pointer chain)

Directories

Directories map names to files. They are stored like a normal file in the data blocks, and can span multiple blocks, if necessary, just like a regular file. Directories contain entries, which have the following format:

Field                                    | Notes
Name                                     | Unique within the directory
Pointer to the first block               | Cannot be boot sector, superblock, or FAT block
File size                                |
File type                                | Regular/Directory
Additional information (FS-specific)     | Permissions, user ID, other flags

Ex: How to find "/home/kohler/grades"?

  1. Find the root directory (its location is given by the superblock)
  2. Search root directory for "home"
  3. Search "home" for "kohler"
  4. Search "kohler" for "grades"
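The lookup steps above can be sketched with an in-memory model. Everything here is hypothetical: the block numbers, the `directories` table standing in for directory contents on disk, and the choice of block 10 as the root directory recorded by the superblock.

```python
# Hypothetical model: each directory, keyed by its first block, maps a
# name to (first_block, is_directory).
ROOT_BLOCK = 10   # assumed to be what the superblock records

directories = {
    10: {"home": (20, True)},
    20: {"kohler": (30, True)},
    30: {"grades": (40, False)},
}

def lookup(path):
    """Return the first block of the file named by an absolute path."""
    block, is_dir = ROOT_BLOCK, True
    for name in [part for part in path.split("/") if part]:
        if not is_dir:
            # A regular file appeared in the middle of the path.
            raise NotADirectoryError(name)
        entries = directories[block]
        if name not in entries:
            raise FileNotFoundError(path)
        block, is_dir = entries[name]
    return block

print(lookup("/home/kohler/grades"))   # 40
```

Each path component costs one directory search, so a path with k components needs k searches starting from the root.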

Directory Hierarchy

1970: Unix File System

The Unix file system introduced two major ideas:

  1. uses a tree of pointers instead of a linked list, which gives O(log n) seek time instead of O(n)
  2. allows multiple directory entries to point to the same file

For the example below, which uses the same 2MB disk as above, the free block bitmap stores one bit per block. With 2048 blocks that is 2048 bits, which takes up only 256 bytes of space. For other disk sizes the bitmap scales with the number of blocks.

Block No.   | 0                         | 1                               | 2 (256B used)              | 3 upward
Name        | Boot sector               | Superblock                      | Free block bitmap          | Data blocks
Information | Contains file system type | File system implementation data | Says whether block is free | Actual data
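A free block bitmap for this layout can be sketched as follows. The convention that a 1 bit means "free", the bit ordering within a byte, and the helper names are assumptions made for illustration; the notes only say that the bitmap records whether each block is free.

```python
NUM_BLOCKS = 2048
bitmap = bytearray(NUM_BLOCKS // 8)    # 256 bytes, initially all zero

# Assumed convention: bit = 1 means the block is free, 0 means in use.
def set_free(block, free):
    mask = 1 << (block % 8)
    if free:
        bitmap[block // 8] |= mask
    else:
        bitmap[block // 8] &= ~mask & 0xFF

def is_free(block):
    return bool((bitmap[block // 8] >> (block % 8)) & 1)

def allocate():
    """Claim the lowest-numbered free block and return its number."""
    for block in range(NUM_BLOCKS):
        if is_free(block):
            set_free(block, False)
            return block
    raise RuntimeError("disk full")

# Blocks 0-2 (boot sector, superblock, bitmap) are never free.
for block in range(3, NUM_BLOCKS):
    set_free(block, True)

first = allocate()
print(len(bitmap), first)   # 256 3
```

The linear scan in `allocate` is the simplest possible policy; it is used here only to keep the sketch short.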

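Maximum File Size (the figures imply 10 direct pointers plus one indirect and one doubly indirect block; a 1KB block holds 256 4-byte pointers):
direct: 10 x 1KB = 10KB
indirect: 10KB + 256 x 1KB = 266KB
indirect2: 266KB + 256 x 256 x 1KB = 266KB + 65536KB = 65802KB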
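The same arithmetic can be written out as a short sketch. It assumes 10 direct pointers and 4-byte block pointers, which is what the figures above imply; the constant and function names are illustrative.

```python
BLOCK_KB = 1
PTRS_PER_BLOCK = 1024 // 4    # 256 four-byte pointers fit in a 1 KB block
NUM_DIRECT = 10               # assumed number of direct pointers

max_direct = NUM_DIRECT * BLOCK_KB
max_indirect = max_direct + PTRS_PER_BLOCK * BLOCK_KB
max_indirect2 = max_indirect + PTRS_PER_BLOCK * PTRS_PER_BLOCK * BLOCK_KB
print(max_direct, max_indirect, max_indirect2)   # 10 266 65802

def pointer_level(i):
    """Which kind of pointer reaches the i-th block of a file (0-based)?"""
    if i < NUM_DIRECT:
        return "direct"
    if i < NUM_DIRECT + PTRS_PER_BLOCK:
        return "indirect"
    if i < NUM_DIRECT + PTRS_PER_BLOCK + PTRS_PER_BLOCK ** 2:
        return "indirect2"
    raise ValueError("block index beyond maximum file size")

print(pointer_level(9), pointer_level(10), pointer_level(266))
# direct indirect indirect2
```

Any block is reached in at most three pointer hops, compared with walking a FAT chain one link at a time.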

nlink: The number of directory entries that refer to this file.

Renaming a File Robustly

  • The file system stays consistent (it follows all invariants)
  • At least 1 version of the file exists at all times

2 Ways to Move the File (here, hello.txt from the root directory into a subdirectory):
Remove first:

  1. Write block 10 (root directory): remove the hello.txt entry
  2. Write block 11 (sub): add the hello.txt entry

Add first:

  1. Write block 11 (sub): add the hello.txt entry
  2. Write block 10 (root directory): remove the hello.txt entry

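*Invariant broken: 2 directory entries point to the same block, and more invariants will be broken later.

Neither order works. If power is lost between steps 1 and 2, an invariant is broken: with remove first, no directory entry refers to the file; with add first, two entries refer to the same blocks. The first order is better only because just the file is lost, whereas the second order puts the rest of the disk at risk.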
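A small simulation makes the two failure modes visible. It models each directory as a dictionary and a power failure as stopping after a given number of writes; the block number 50 for hello.txt and the function names are hypothetical.

```python
def run(order, crash_after):
    """Apply the two directory writes in `order`, stopping after
    `crash_after` of them to model a power failure."""
    # Hypothetical state: hello.txt starts at block 50 and initially
    # appears only in the root directory.
    disk = {"root": {"hello.txt": 50}, "sub": {}}
    steps = {
        "remove": lambda d: d["root"].pop("hello.txt"),
        "add": lambda d: d["sub"].update({"hello.txt": 50}),
    }
    for name in order[:crash_after]:
        steps[name](disk)
    return disk

def references(disk, block=50):
    """Count directory entries that point at the given block."""
    return sum(1 for entries in disk.values()
               for b in entries.values() if b == block)

for order in (("remove", "add"), ("add", "remove")):
    for crash_after in (0, 1, 2):
        print(order, crash_after, references(run(order, crash_after)))
# ('remove', 'add') with a crash after 1 write leaves 0 references (file lost)
# ('add', 'remove') with a crash after 1 write leaves 2 references (shared block)
```

In both orders the count is 1 before the first write and after the second, so only the window between the two writes is unsafe.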

Invariants

  1. Every block is used for exactly 1 purpose.
  2. All referenced blocks are initialized.
  3. No referenced blocks are free.
  4. All un-referenced blocks are free.

*Breaking the 4th invariant can sometimes be tolerated, because only disk space is lost, not data.
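The difference between invariants 3 and 4 can be sketched as a check over sets of block numbers. This is an illustrative model rather than a real consistency checker: `referenced` is assumed to include reserved blocks such as the boot sector and superblock, and the 16-block disk is made up.

```python
def classify(referenced, free, num_blocks):
    """Return (dangerous, leaked) sets of block numbers.

    dangerous -- referenced but marked free (breaks invariant 3); the
                 block could be handed to another file and overwritten
    leaked    -- neither referenced nor free (breaks invariant 4); only
                 disk space is lost
    """
    all_blocks = set(range(num_blocks))
    dangerous = referenced & free
    leaked = all_blocks - referenced - free
    return dangerous, leaked

# Hypothetical 16-block disk: blocks 0-4 are referenced, block 4 is
# wrongly marked free, and block 5 is neither referenced nor free.
referenced = {0, 1, 2, 3, 4}
free = {4} | set(range(6, 16))
dangerous, leaked = classify(referenced, free, 16)
print(sorted(dangerous), sorted(leaked))   # [4] [5]
```

A leaked block can be reclaimed later by marking it free, while a dangerous block can already have been reallocated and overwritten.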

 
2007spring/notes/lec13.txt · Last modified: 2007/09/28 00:28 (external edit)
 