You are expected to understand this. CS 111 Operating Systems Principles, Spring 2006
You are here: CS111: [[2006spring:lab3]]
 
 
 

Lab 3

Quick link: Getting kernel source

Overview

In Lab 2, we used a RAM disk block device to teach you about the Linux kernel setting and about synchronization. In Lab 3, you'll learn about file systems by writing your own file system driver for Linux.

Operating systems generally support many different file systems, from old-school FAT (common in DOS machines) to new formats such as Linux's ext3. Each file system type is written to fit a common interface, called the VFS layer. (File systems are one of the object oriented aspects of the OS we discussed in class.) You will see how a real VFS layer works by filling out a simple file system implementation. You will load that file system module into the kernel and mount the file system, allowing you to access your file system with usual Unix commands like cat, dd, emacs, and even firefox. The file system driver stores its data in a built-in RAM disk, kind of like your RAM disk from Lab 2.

Like Lab 2, this lab requires that you use a Linux machine; additionally, you must have the necessary permission to load kernel modules on that machine. The Linux lab machines in 4405 Boelter have been set up for this, but you can also use your own machine.

Lab materials

The skeleton code you will get for this lab consists of the following files:

ospfs.h Commented header file defining file system structures and constants.
ospfsmod.c The file system module's source code. All exercises in the lab involve changing this code.
truncate.c Utility: Truncates a file to a specified length.
base/ The contents of this directory are copied into your file system's initial RAM disk. You can change the contents of base/, then type make clean; make, to add files to or remove files from your RAM disk.
runospfs.sh This script automates the procedure of loading your module code into the kernel.
Makefile
ospfsformat.c Utility: Creates a RAM disk image from the base/ directory.
fsimgtoc.c Utility: Links the disk image into your module.

When you unpack lab3.tar.gz (with the command tar xvzf lab3.tar.gz) you will get a directory lab3 which contains these files. Do your work in this directory.

Handin procedure

When you are finished, edit the LAB file and follow the instructions to fill in your name(s), student ID(s), email address(es), short descriptions of any challenge problems you did, and any other information you'd like us to have. Then run "make tarball" which will generate a file lab3-yourusername.tar.gz inside the lab3 directory. Upload this file to CourseWeb using a web browser to turn in the project. (Please do not upload .zip files!)

Background information

File system preliminaries

The file system you will work with, OSPFS, is much simpler than most "real" file systems, but it is powerful enough to read and write files organized in a hierarchical directory structure. OSPFS currently does not support symbolic links, timestamps, permissions, or special device files like most UNIX file systems do. Additionally, your module is only capable of using an in-memory file system, meaning that the data "on disk" is actually stored entirely in main memory. This means that changes made to the file system are not saved across reboots (or across module loads and unloads).

File system structure

Most UNIX file systems divide available disk space into two main types of regions: inode regions and data regions. UNIX file systems assign one inode to each file in the file system; a file's inode holds critical metadata about the file such as its attributes and pointers to its data blocks. The data regions are divided into large data blocks (typically 4KB or more), within which the file system stores file data and directory metadata. Directory entries contain file names and pointers to inodes; a file is said to be hard-linked if multiple directory entries in the file system refer to that file's inode. OSPFS is designed in much the same way.

Both files and directories logically consist of a series of data blocks, which may be scattered throughout the disk much like the pages of a process's virtual address space can be scattered throughout physical memory.

Sectors and blocks

Modern disks perform reads and writes in units of sectors, which today are almost universally 512 bytes each. File systems actually allocate and use disk storage in units of blocks. Be wary of the distinction between the two terms: sector size is a property of the disk hardware, whereas block size is an aspect of the operating system using the disk. A file system's block size must be at least the sector size of the underlying disk, but could be greater (often it is a power of two multiple of the sector size).

The original UNIX file system used a block size of 512 bytes, the same as the sector size of the underlying disk. Most modern file systems use a larger block size, however, because storage space has gotten much cheaper and it is more efficient to manage storage at larger granularities. Our file system will use a block size of 4096 bytes.

 An OSPFS file system with N blocks and M inodes.

Superblocks

File systems typically reserve certain disk blocks, at "easy-to-find" locations on the disk (such as the very start or the very end), to hold meta-data describing properties of the file system as a whole: the block size, disk size, any meta-data required to find the root directory, the time the file system was last mounted, the time the file system was last checked for errors, and so on. These special blocks are called superblocks.

Our file system will have exactly one superblock, which will always be at block 1 on the disk. Its layout is defined by struct ospfs_super in ospfs.h. Block 0 is typically reserved to hold boot loaders and partition tables, so file systems generally never use the very first disk block. Most "real" file systems maintain multiple superblocks, replicated throughout several widely-spaced regions of the disk, so that if one of them is corrupted or the disk develops a media error in that region, the other superblocks can still be found and used to access the file system. OSPFS omits this feature for simplicity.

The block bitmap: managing free disk blocks

Just as a kernel must manage the system's physical memory to ensure that each physical page is used for only one purpose at a time, a file system must manage the blocks of storage on a disk to ensure that each block is used for only one purpose at a time. In file systems it is common to keep track of free disk blocks using a bitmap rather than a linked list, because a bitmap is more storage-efficient than a linked list and easier to keep consistent on the disk. ("FAT" file systems use a linked list.) Searching for a free block in a bitmap can take more CPU time than simply removing the first element of a linked list, but for file systems this isn't a problem because the I/O cost of actually accessing the free block after we find it dominates performance.

To set up a free block bitmap, we reserve a contiguous region of space on the disk large enough to hold one bit for each disk block. For example, since our file system uses 4096-byte blocks, each bitmap block contains 4096*8 = 32768 bits, or enough bits to describe 32768 disk blocks. In other words, for every 32768 disk blocks the file system uses, we must reserve one disk block for the block bitmap. A given bit in the bitmap is set (value 1) if the corresponding block is free, and clear (value 0) if the corresponding block is in use. The block bitmap in our file system always starts at disk block 2, immediately after the superblock. We'll reserve enough bitmap blocks to hold one bit for each block in the entire disk, including the blocks containing the boot sector, the superblock, and the bitmap itself. We will simply make sure that the bitmap bits corresponding to these special, "reserved" areas of the disk are always clear (marked in-use).

 An OSPFS inode with associated directory entry and data blocks, including indirect blocks.

Inodes

Each file and directory on the system corresponds to exactly one inode. The inode stores metadata information about the file, such as its size, its type (file or directory), and the locations of its data blocks. Directory information, such as the file's name and directory location, is not stored in the inode; instead, directory entries refer to inodes by number. This allows the file system to safely move files from one directory to another, and also allows for "hard links". Not all file systems have inodes, but OSPFS does.

The layout of OSPFS's inodes is described by struct ospfs_inode in ospfs.h. The oi_direct array in struct ospfs_inode contains space to store the block numbers of the first 10 (OSPFS_NDIRECT) blocks of the file, which we call the file's direct blocks. For small files up to 10*4096 = 40KB in size, this means that the block numbers of all of the file's blocks will fit directly within the ospfs_inode structure itself. For larger files, however, we need a place to hold the rest of the file's block numbers. For any file greater than 40KB in size, therefore, we allocate an additional disk block, called the file's indirect block, to hold up to 4096/4 = 1024 additional block numbers. The 10th block number is the 0th slot in the indirect block. Our file system therefore allows files to be up to 1034 blocks, or a bit more than four megabytes, in size. To support larger files, "real" file systems typically support double- and triple-indirect blocks as well. See the picture at right.

Inodes are relatively small structures; OSPFS's take up 64 bytes each. They are generally stored in a contiguous table on disk. The size of this table is specified when the file system is created, and in most file systems it cannot later be changed. Just like blocks, an inode can be either in use or free, and just like blocks, a file system can run out of inodes. It is actually possible to run out of inodes before running out of blocks, and in this case no more files can be created even though there is space available to store them. For this reason, and because inodes themselves do not take up much space, the inode table is often created with far more available inodes than the file system is expected to need. (Remember, each file uses an inode, so the maximum number of files and directories the file system can store at once equals the number of inodes.)

In OSPFS, the inode table is allocated immediately after the free block bitmap, and its blocks are also marked as in use in the free block bitmap. The number of inodes is stored in the superblock, and for convenience the block number of the first block of the inode table is also stored in the superblock. (Why is this only for convenience? How could we determine this value based only on the size of the disk?)

Even though your kernel module will not support creating new hard links (unless you do that challenge problem), each inode in OSPFS has a "link count," which is a number indicating how many directory entries refer to it (since the ospfsformat utility knows how to create hard links when it creates the initial OSPFS file system for your module to use). If there are no hard links to a file, then the link count will be 1 (since only one directory entry refers to the inode). For each hard link created in a "real" UNIX file system, the link count increases by 1, and for each hard link removed, the link count decreases by 1. Technically speaking, even the file's original name is called a "hard link" -- it is no different than later directory entries created for the inode except that it was created first. If the link count reaches 0, it means that the inode no longer has any hard links, and its storage space (both the blocks that make up the file and the inode itself) can be marked as free and reused. You will need to support this behavior when you delete files from OSPFS.

You may find the -i and -l options to the ls program useful. The -i option causes ls to display the inode number of each file. The -l option causes ls to display a "long listing" which includes, among other information, the link count and size of each file. These options can also be combined.

Here's an OSPFS file system corresponding to the base/ directory we handed out. (Note that the hello.txt data block is also linked to world.txt -- there is a file with two hard links!)

 An OSPFS file system corresponding to the base/ directory we handed out.

Directories versus regular files

An ospfs_inode structure in our file system can represent either a regular file or a directory; these two types of "files" are distinguished by the type field in the ospfs_inode structure. The file system manages regular files and directory-files in exactly the same way, except that it does not interpret the contents of the data blocks associated with regular files at all, whereas the file system interprets the contents of a directory-file as a series of ospfs_direntry structures describing the files and subdirectories within the directory. Each ospfs_direntry structure just contains a filename and an inode number. All other information about the file -- including size, type, and pointers to data blocks -- is stored in the inode.

Inode number 1 in our file system is special: it is the inode for the root directory of the file system. (Inode number 0 is reserved and must never be used.)

What you have to do

Your assignment will be to implement the routines that handle free block management, changing file sizes, read/write to the file, reading a directory file, and deleting files.

You will need to implement exercises in the following functions for this lab. One plausible order is this:

  • Support for reading files: ospfs_read
  • Support for reading directories: ospfs_dir_readdir
  • Free block bitmap bookkeeping: allocate_block, free_block
  • Support for changing file sizes: change_size
  • Support for writing files: ospfs_write
  • Support for deleting files: ospfs_unlink

All of these functions have comments explaining exactly what you must do. Also read the comments in ospfs.h and towards the top of ospfsmod.c to get oriented.

Challenge problem suggestions

  • Add support for creating files in a directory. This can be done by implementing an additional function in the inode_operations structure (see the function call for create). You may want to look into the d_instantiate() function.
  • Add support for symbolic links. (Symbolic links are created by using the -s option of the ln utility.)
  • Add support for file renaming. (Note that in Unix, renaming a file can also move it to a new directory!)
  • Add support for creating hard links via the link operation. You may want to look into the d_instantiate() function.
  • Use a real block device, such as your RAM disk from Lab 2, instead of the built-in image.
  • Implement UNIX-style permissions. This will require adding new fields to the on-disk inode structure, as well as adding code to read and write the permissions.
  • Add synchronization operations as required to protect the file system against simultaneous modification. The current code is quite unsafe, since it has no locks.

Other information

The Makefile builds an initial OSPFS image using the contents of the base directory. If you want to add files to your OSPFS image, simply add files and directories to base, then run make clean and make to rebuild the image. If make reports an error that the resulting file system is too big, raise the number 512 in the Makefile to something larger.

By default, the image created by ospfsformat contains two directory entries which refer to the same file (that is, they are hard links): hello.txt and world.txt. You can add more files like this by adding hard links to the base directory using the ln command. Read its manual page by typing man ln at a shell prompt. (Note: the link counts on directories are always larger than one. Why might this be?)

The resulting module knows how to provide access to the linked-in OSPFS image, but just loading the module doesn't make the file system available. The script runospfs.sh will perform the necessary steps to make the file system available. After cleaning up any existing copies of the module, it runs make, and then mounts the file system on an existing directory. This hides the directory's existing contents, turning the directory into a "gateway" into the new file system. After loading OSPFS, the runospfs.sh script mounts that file system onto the test directory. When your file system is ready, ls base and ls test should return the same results (except for things like ownership, permissions, and inode numbers).

It is worth discussing how the functionality of the script is implemented. The following list shows the commands used:

  • Installing a module in the kernel: insmod ospfs.ko
  • Checking to see if the module is installed: cat /proc/modules or lsmod
  • Mounting the filesystem onto the test directory: /bin/mount -t ospfs none test
  • Unmounting the filesystem from test: umount test
  • Removing a module from the kernel: rmmod ospfs

If you are not using your own machine, you will need to use the following command instead of running runospfs.sh directly:

sudo /root/runospfs.sh (This has to be done in your lab3 directory)

Testing

After implementing these functions, you should be able to do the following in your filesystem: read and write to files, change the size of files, append to existing files, and read the contents of directories. Note: even after the code is implemented, you will not be able to create files, unless you have completed the challenge portions of the project.

Here is a list of the test cases we will run on your code. Feel free to write a testing script that checks these things automatically for you! You will probably find the dd utility very useful to perform many of the tests. You can read its documentation (man dd) for details of how to use it: check out its if=, of=, bs=, seek=, skip=, and conv=notrunc options. Also the truncate utility we included with the skeleton code will help you test some of the cases. (Suggestion: compare the results of performing operations on your OSPFS file system to doing the same operations on the normal Linux file system.)

  • Directory reading tests: listing files in directories
    • ls of the root directory
    • ls of the root directory including file types (ls -F)
    • ls of subdirectory
    • ls of subdirectory using more than one data block
    • ls of subdirectory using indirect data blocks
  • Reading tests: reading various parts of a file
    • read the first byte of a file
    • read the first block of a file
    • read half of the first block of a file
    • read starting partway through the first block and into part of the next
    • read more than one block
    • try to read past the end of a file
    • try to read into an invalid buffer pointer
  • Overwrite tests: all these tests overwrite part of a file without changing the rest of the file
    • overwrite the first few bytes of a file
    • overwrite the first two blocks of a file
    • overwrite the second block of a file
    • overwrite the middle part of the first block of a file
    • overwrite the second half of the first block of a file
    • overwrite the second half of the first block and the first half of the second block of a file
    • try to write from an invalid buffer pointer
  • Truncation tests: truncating files to various lengths
    • truncate a single data block file to 0 length
    • truncate a file with only direct data blocks to 0 length
    • truncate a file with indirect data blocks to 0 length
    • truncate a single data block file to nonzero but smaller length
    • truncate a file with only direct data blocks to a smaller length that uses the same number of data blocks
    • truncate a file with only direct data blocks to a smaller length which uses fewer data blocks
    • truncate a file with indirect data blocks to a smaller length that still uses indirect data blocks
    • truncate a file with indirect data blocks to a smaller length which only uses direct data blocks
  • Appending tests: adding data to the end of a file (not necessarily using O_APPEND)
    • append data to a file with a single data block without requiring a new block to be allocated
    • append more than one data block of data to a file
    • append several data blocks to a file that already has indirect data blocks
    • append data to a file with no indirect data blocks so that the appended data uses indirect data blocks
    • try to append data to a file so that there would be more than the maximum allowed data blocks
  • Block management tests: checking to make sure the block bitmap is managed correctly
    • try to allocate a block when there are no free blocks
    • free a block and check that that block can be reallocated
    • try to allocate 3 blocks when only 2 are free
    • truncate a file so that it no longer has any indirect data blocks and ensure that the indirect pointer block is freed
  • Deletion tests: checking that deleting files works correctly
    • delete a file from a directory
    • delete a file with more than one hard link from a directory
    • delete all hard links to an inode

Good luck!

 
2006spring/lab3.txt · Last modified: 2006/09/26 11:38 (external edit)
 
Recent changes RSS feed Driven by DokuWiki