====== Midterm Preparation ====== ===== Definitions ===== **System**. A set of interconnected components that has a specified behavior observed at the interface with its environment. **Emergent properties**. The commonly observed system characteristic that some properties of a system are not present in the components, but show up only in the aggregate. **Propagation of effects**. The commonly observed system characteristic that small changes propagate to affect the whole system's behavior. **Incommensurate scaling**. The commonly observed system characteristic that not all parts of a system follow the same scaling rules, so as a system increases in size or speed, things go wrong. **Tradeoffs**. The commonly observed system characteristic that not all of a system's goals can be simultaneously achieved, since some conflict with one another. **Modularity**. Breaking systems into subsystems, or //modules//, that interact only at their interfaces. **Hard modularity**. A type of modularity where modules are prevented from violating interface boundaries. **Soft modularity**. A type of modularity where modules do not violate interface boundaries //by convention//. **Performance**. A desired systems property where a system handles requests efficiently, without significant slowdown. **Robustness**. A desired systems property where errors in a module have only limited effects. **Generality** or **Versatility**. A desired systems property where a system's interface effectively supports many different types of application. **Simplicity**. A desired systems property where a system's interface is easy to understand and work with. **Utilization**. A desired systems property where most of a system's capacity is spent doing useful work. Related to performance. Utilization is also a metric between 0 and 1, measured as the fraction of a module or service's capacity that is used for useful work. **Isolation**. A property of modularity where modules cannot modify, or (sometimes) even detect, other modules' internal state except through the modules' interfaces. Aids robustness. **Virtualization**. A type of module interface where a module simulates another module or environment very closely, providing roughly the same behavior as the real instance. Example: virtual memory. **Abstraction**. A type of module interface where a module provides a significantly different interface than the module or environment that it represents. The provided interface is usually more general. Example: file descriptors. **Busy waiting** or **spinning**. A method of waiting for a condition that involves repeatedly checking for the condition, to the exclusion of other processing. **Polling**. A method of waiting for a condition that involves repeatedly checking for the condition, but yielding after each check to allow other processing to continue. **Blocking**. Releasing control of the processor for a long period. The blocked code will only resume when explicitly woken up by other code. Also, a method of waiting for a condition that involves blocking when the condition is false, and waking up when the condition may become true. **Process**. A virtual computer; a program in execution in an isolated domain. The main form of hard modularity in modern operating systems. **Kernel**. The part of a modern operating system that runs with full privilege and direct access to hardware devices. Contrast processes. **Process isolation**. The type of hard modularity, or isolation, implemented by modern operating systems for user-level applications. **System call**. A kernel request made by a process. **Protected control transfer**. The mechanism, often involving //software interrupts// or //traps//, by which a process makes a request of the kernel. System calls are implemented via protected control transfer. **Current privilege level** or **CPL**. A special register in modern higher-end processors whose value distinguishes the kernel from processes. **Dangerous instruction**. An instruction that can only be executed by the kernel (lest process isolation be violated). **Address space**. The portion of primary memory addressible by a particular process. **Virtual memory**. The mechanism by which an operating system and a processor architecture can isolate process memory. Virtual memory is used to implement address spaces. **Thread**. A virtual processor; a virtual execution context. Each process contains at least one thread. **File descriptor**. A virtual I/O device on a Unix-like operating system. The file descriptor interface is significantly abstracted from real I/O device interfaces. **Process descriptor**. The in-kernel data structure that represents a process. **Process state**. The portion of a process descriptor that says whether a process is runnable or blocked. **File abstraction**. A virtual I/O device that represents a linear array of bytes of known, possibly finite size, allowing random access. **Stream abstraction**. A virtual I/O device that represents a linear array of bytes of unknown, possibly infinite size, allowing only sequential access. **Pipe**. The Unix implementation of a stream abstraction among a process, or multiple processes, on the same machine. **Bounded buffer**. A particular implementation strategy for pipes or other stream abstractions involving a circular array of bytes. **Signal**. A virtual interrupt on a Unix-like operating system, where a process is interrupted due to some condition, then possibly executes code in a special //signal handler// before returning to normal processing. **Scheduling**. The technique of sharing a single resource among multiple requests, by serving those requests in some order. **Process scheduling**. A variant of scheduling used by an operating system kernel to share a processor by running processes in some order. **Cooperative scheduling**. A type of scheduling where once a request gets control of the resource, it can only give up control voluntarily. **Preemptive scheduling**. A type of scheduling where requests may be //preempted//, meaning a running request loses control of its resource without voluntarily giving it up. **Arrival time**. In request scheduling, the time when a request arrives. **Amount of work** (tau). In request scheduling, the amount of time a request will take to complete. **Waiting time** (W). In request scheduling, the interval between a request's arrival time and when the request first gets control of the resource. **Turnaround time** (TT). In request scheduling, the interval between a request's arrival time and when the request completes. **Starvation**. A property of a scheduling algorithm where a request's turnaround time may be infinite, even though the request represents only a finite amount of work. **Priority**. A number indicating the importance of a request, where numerically smaller priority values are generally more important. **Strict priority scheduling**. A type of request scheduling in which a request with priority P will not run until all more important requests have completed. **Real-time scheduling**. A type of request scheduling in which requests obtain guarantees about when they will run. **Synchronization**. Algorithms and methods used to ensure that multiple requests, or processes, can robustly run in parallel. **Sequential consistency**. Two or more processes are //sequentially consistent// if, for any execution, the same results as that execution could be obtained by interleaving the operations of the processes in some strictly sequential order that obeys each process's own order. **Race condition**. A programming error that might cause sequential consistency to be violated. **Isolation atomicity** or **indivisibility**. Two sequences of operations are //atomically isolated// if their effect, from the point of view of the invokers, is the same as if one sequence completed before the other sequence began. **Critical section**. A set of instructions where sequential consistency is preserved as long as //at most one// processor's instruction pointer is in the set at any instant. **Minimal critical section**. The smallest possible critical section. **Mutual exclusion**. A synchronization property that prevents more than one process from performing some operation simultaneously. **Spinlock** or **polling mutex**. A synchronization object that implements mutual exclusion using spinning. Its operations are ''acquire'' and ''release''. **[Blocking] mutex**. A synchronization object that implements mutual exclusion using blocking. Like a spinlock, but blocking. **Sleep/wakeup race**. A type of race condition common to simplistic implementations of blocking mutexes where a process can block forever, waiting to acquire the mutex, even though the mutex is unlocked. **Condition variable**. A synchronization object that implements waiting for a condition. Its operations are ''wait'' and ''notify''. **System call synchronization**. The synchronization behavior defined by the operating system for each system call. Most system calls guarantee isolation atomicity. **Deadlock**. A synchronization error where one or more threads blocks forever, making it impossible for them to make further progress, because of some incorrect use of synchronization primitives such as mutex locks. ===== Topic outline ===== * The systems perspective * **System**: A set of interconnected components that has a specified behavior observed at the interface with its environment * System characteristics * Emergent properties * Propagation of effects * Incommensurate scaling * Trade-offs * Coping with complexity: **Modularity** * Modularity goals * Performance * Utilization * Robustness/fault-tolerance * Versatility * Simplicity * An OS is software that provides programs with access to a virtual computer/abstract machine * Protection and sharing * Interface design * Counting words: from direct hardware access, through process isolation, to file descriptors * Layering to improve interface usability while keeping performance * Virtual computer * Von Neumann computer characteristics * Virtual I/O, virtual processor, virtual memory * Hard modularity through process isolation * The process abstraction * Memory layout * Isolation properties * System calls & implementation * Protection levels & protected control transfer * File descriptors * Unix process and file system calls * Bounded buffers and pipes * Process hierarchy & zombie processes * Context switches * Threads * Process vs. thread * Threads, stacks, and shared state (How does creating a new thread differ from fork and why?) * Interrupts & signals * Unix signal system calls * Scheduling * Requests * Arrival time and amount of work * Waiting time * Turnaround time * Utilization * First-come, first-serve (FIFO) * Shortest job first * Round robin: preemptive, cooperative * Priority scheduling * Priority inversion * Aging * Real-time scheduling * Deadline scheduling, earliest deadline first * Starvation * Synchronization * Indivisibility & atomicity * Critical section * Minimal critical section * Bounded buffer * One-writer principle * Enforcing critical sections * Atomic instructions * Mutual exclusion locks * Coarse-grained vs. fine-grained locking * Deadlock * Waits-for graph * Lock ordering * Read/write locks * Blocking wait * Sleep-wakeup race * Semaphores * Condition variables * Utilization ==== Other ideas (not covered in detail this quarter) ==== **Caching**. A performance enhancement strategy where a program makes a temporary local copy of request results to avoid future requests. **Speculation**. A performance enhancement strategy where a program performs multiple requests in advance so that their results are immediately available when needed. **Prefetching**. Speculation on read requests. Requires a cache. **Locality of reference**. A common property of programs that requests clustered //in time// tend to refer to objects and/or data that are clustered //in space//. ("Space" might be address space for memory, or sector number or file name for disk.) For example, if some application accesses the first byte in a file, then its future accesses will likely refer to other nearby bytes in the file. Locality of reference increases the effectiveness of caches, prefetching, and other performance enhancement strategies. **Cache coherence**. A cache is //coherent// if the cache always contains the most up-to-date copy of the results it is storing. **Batching**. A performance enhancement strategy where a program combines several requests into one to reduce the per-request overhead. **Yielding**. Temporarily releasing control of the processor, allowing other code to run. **Lock contention**.