====== Lecture 13 Scribe Notes ====== //Compiled by: Gerald Chang, Yiwei Guo, Eric Marcin-Cuddy// Last class: We talked about file systems and briefly began discussing characteristics of the FAT file system. ===== FAT File System ===== The FAT file system uses 1-kilobyte blocks to store data, and blocks are linked together in the file allocation table (FAT). The FAT contains one entry per block of data on the disk, and therefore must be large enough to store an entry each address on the disk. By not requiring entries to be contiguous, we avoid external fragmentation. A sample FAT file system below is 2MB large and has 4-byte addresses. (For comparison, the standard FAT32 has 32-bit addresses.) ^ Block No. | 0 | 1 | 2 to 9 | 10 to 2047 | ^ Name | Boot sector | Superblock | File allocation table | Data blocks | ^ Information | Contains file system type | FAT implementation data | 1 entry per block | Actual data | The FAT itself contains addresses to successive blocks in a file, or the number zero if that particular block is the terminal block of the file. The number zero can also mean that the block is not permitted to be written, as is the case for the boot sector, superblock, and FAT blocks. It will contain -1 if the block is free for writing. If for example we had a 4-kilobyte file that occupied blocks 101 to 104 and a 2-kilobyte file that occupied blocks 100 and 105, the FAT would look something like this: ^ Block No. | 0-9 | 10-99 | 100 | 101 | 102 | 103 | 104 | 105 | 106-2047 | ^ Address | all 0 | all -1 | 105 | 102 | 103 | 104 | 0 | 0 | all -1 | The first block for actual file system data is block #10. A file may not necessarily occupy an entire block. If this is the case, it will fill up its final block until it is full and then continue onto the next free block. The advantage of this process is that it is simple and few special data structures are stored on the disk. Furthermore, the FAT is unambiguous as to which block is free and also how to get to the next block. However, there is not as much locality of reference since file blocks do not necessarily have to be contiguous, which means that special procedures like prefetching do not work as well. The file system is similar to a linked list, but the links are stored in FAT. **Block Address Characteristics:** * **-1:** Free Space * **0 :** No succeeding block (last block of the file) **Advantages:** There are few data structures stored on disk; tells us which and where data is stored\\ **Disadvantages:** Locality of reference; complex prefetching ==== Invariants ==== An **//invariant//** is a statement that must always be true for a data structure to remain consistent. Here are some invariants for FAT: * FAT entries must be -1 ≤ FAT ≤ 2047 (here, the size of the disk) * No chain of pointers can contain -1 * All file pointer chains must be disjoint (except for 0) * Boot sector, superblock, and FAT pointers must be 0 * No file's pointer chain can include the boot sector, superblock, or FAT addresses * Every pointer chain terminates at 0 **For Directories:** * Subdirectory chains are linear * If a block is above the FAT and is not accessible from the root directory, then its FAT must be -1 * Every directory entry has a different name * No two directory entries point to the same block * Subdirectories are valid * Directories' block pointers must be above the FAT block * Size ≤ 1024B * (length of pointer chain) ==== Directories ==== **//Directories//** map names to files. They are stored like a normal file in the data blocks, and can span multiple blocks, if necessary, just like a regular file. Directories contain entries, which are of the following format ^ Name ^ Pointer to the first block ^ File size ^ File type ^ Additional information (FS-specific) | | unique | Cannot be boot sector, superblock, or FAT block | | Regular/Directory | Permissions, user ID, other flags | **Ex: How to find "/home/kohler/grades"?** - Find root directory (located in superblock) - Search root directory for "home" - Search "home" for "kohler" - Search "kohler" for "grades" === Directory Hierarchy === ===== 1970: Unix File System ===== The Unix file system introduced two major ideas: - uses a tree of pointers instead of a linked list, which gives O(log n) seek time instead of O(n) - allows multiple directory entries to point to the same file For the example below, which is the same 2MB disk as above, the free block bitmap will store one bit of data per block, which will only take up 256 bytes of space. Your actual mileage may vary. ^ Block No. | 0 | 1 | 2 (256B used) | 3 upward | ^ Name | Boot sector | Superblock | Free block bitmap | Data blocks | ^ Information | Contains file system type | FAT implementation data | Says whether block is free | Actual data | **Maximum File Size:**\\ direct: 10 x 1KB = 10KB\\ indirect: 10KB + 1KB x 256 = 266KB\\ indirect2: 266KB + 1KB x 256 x 256 = 266 + 65536 = 65802KB\\ nlink: How many directory entries refer to this file? ===== Renaming a File Robustly ===== * Consistent FileSystem (one that follows all invariants) * At least 1 version of the file exists **2 Ways to Remove the File:**\\ Remove first:\\ - Write 10 (root directory) ~ kill hello.txt - Write 11 (sub) ~ add hello.txt Add first:\\ - Write 11 (sub) ~ add hello.txt - Write 10 (root) ~ kill hello.txt //*Invariant broken: 2 directories point to the same block; more invariants will be broken...// //**Both ways DON'T work. If power is pulled between steps 1 and 2, invariants are broken. The first order is better only because the file is lost and not the entire disk.**// ==== Invariants ==== - Every block used for exactly 1 purpose - All referenced blocks are initialized - No referenced blocks are free. - All un-referenced blocks are free. //*The 4th invariant can sometimes be broken, because only disk space is lost.//