====== Lecture 6 Scribe Notes ======

//Lecture date: 04/18/2007//

//Notes by: Angel Darquea, Dominic Reinhard & Andrew Chao//

==== Process Interaction II and Scheduling ====

=== System calls: wait(), exit() and waitpid() ===

== wait() ==

The [[http://homepages.cwi.nl/~aeb/linux/man2html/man2/wait.2.html|wait()]] system call can be seen as the abstraction of an I/O device (//virtual I/O device//). A parent process uses wait() when it wants to wait until a child process terminates and to determine the child process's exit status:

  pid_t wait(int *status);

wait() returns the process ID of a terminated child process and frees that child's process descriptor so that it can be reused.

== exit() ==

The [[http://homepages.cwi.nl/~aeb/linux/man2html/man2/_exit.2.html|exit()]] system call simply terminates the calling process.

  void exit(int status);

A process that has exited must continue to exist, in some reduced form, so that the parent can eventually retrieve its exit status. These dead, but living, processes are called zombie processes. For details on [[http://www.youtube.com/watch?v=HKh2CI6T_c0|how to kill a zombie]] process, keep reading.

If every process had a unique process ID, there would be no ambiguity about whether a process had exited or not. (Allocate process IDs sequentially starting from 1. Then if you want to check whether process ID X is running, check as follows. If X is bigger than the largest process ID ever created, then X never ran. Otherwise, if X is running, then it's running. Otherwise, X was created but has exited.) However, the process ID space would grow without bound, eventually requiring very long process IDs.

Current systems were designed to use 32-bit process IDs. But on modern machines, this ID space is pretty small, so we have to worry about process ID reuse. Consider the following:

  while (1)
      if (fork() == 0)
          exit(0);

This loop created about 220,000 processes per second, which would wrap a 32-bit ID space in under 6 hours.

The solution is zombie processes.
When a process has exited, we retain its process descriptor and exit status until the exit status is collected with ''wait'' or ''waitpid''. Then we free the process descriptor, allowing the process ID to be reused. In this model, a process's exit status may only be collected //once//, since after that point, the ID might be reused by a different process.

== waitpid() ==

The [[http://homepages.cwi.nl/~aeb/linux/man2html/man2/waitpid.2.html|waitpid()]] system call is very similar to wait(), except that it lets the parent specify which child process to wait for:

  pid_t waitpid(pid_t pid, int *status, int flags);

The ''pid'' parameter asks whether the process with ID ''pid'' has exited yet. If it has, its exit status is stored in ''*status''. The ''flags'' argument is 0 to block until the child exits, or WNOHANG to poll. waitpid() returns ''pid'' if the specified child process has exited. With WNOHANG it returns 0 if the child has not exited yet, and it returns -1 on error (for example, if ''pid'' is not a child of the caller). A typical use is as follows:

  pid_t p = fork();
  if (p == 0) {
      /* child: ... */
      execvp(/* ... */);
  } else if (p > 0) {
      int status;
      waitpid(p, &status, 0);   /* parent: block until child p exits */
  }

----

**Note 1:** When exit() is called to terminate a process, the process's exit status must eventually be collected; until that happens, the process is a zombie process. Zombie processes are killed once their status is collected by wait() or waitpid().

**Note 2:** In Unix, only the parent can collect the exit status of a child process.

----

**Process Hierarchy:**

{{processes_hierarchy.jpg|}}

It is important to remember that the system calls wait() and waitpid() will collect a process's exit status at most once: when they are used, they report that the child has exited and free its process descriptor so that the process ID can be reused, basically killing that zombie. Here lies a disadvantage of blocking: once the parent blocks in wait() or waitpid(), it is no longer runnable, so it can do nothing else until the child exits. Polling with WNOHANG avoids this.
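The polling behavior and the collect-once rule described above can be sketched in C. This is a minimal illustration, not code from the lecture: the function names ''run_child_and_reap'' and ''second_wait_fails'' are hypothetical, and the busy polling loop stands in for whatever useful work a real parent would do between polls.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code`, poll for it
 * with WNOHANG, and return the exit code collected (or -1 on failure). */
int run_child_and_reap(int code)
{
    pid_t p = fork();
    if (p < 0)
        return -1;                 /* fork failed */
    if (p == 0)
        _exit(code);               /* child: terminate immediately */

    int status;
    long polls = 0;
    pid_t r;
    /* WNOHANG makes waitpid return 0 while the child is still running,
     * so the parent stays runnable instead of blocking. */
    while ((r = waitpid(p, &status, WNOHANG)) == 0)
        polls++;                   /* a real parent would do other work here */
    if (r != p || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);    /* the zombie is now gone */
}

/* Hypothetical helper: returns 1 if a second waitpid on an already
 * collected child fails with ECHILD, 0 if it does not, -1 on setup failure. */
int second_wait_fails(void)
{
    pid_t p = fork();
    if (p < 0)
        return -1;
    if (p == 0)
        _exit(0);

    int status;
    if (waitpid(p, &status, 0) != p)   /* first collection: blocks, succeeds */
        return -1;
    errno = 0;
    return waitpid(p, &status, 0) == -1 && errno == ECHILD;
}
```

Calling ''run_child_and_reap(7)'' should hand back 7, the status the child passed to ''_exit''. ''second_wait_fails()'' shows why a status can be collected only once: after the first waitpid() the process descriptor is freed, so the second call has no child left to report on.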
Consider the following:

  p = fork();
  if (p == 0)
      printf("Hi\n");
  else
      exit(0);

Here the parent exits while its child may still be running. In Unix, if a process dies, its children become orphans and are re-parented to the init process (process ID 1), which collects their exit statuses so that orphan zombies are eventually cleaned up.

=== Signals ===

A signal is like a “virtual interrupt” for processes. What is useful about an interrupt? An interrupt helps utilization by allowing a processor to do other work while waiting for hardware. Hardware interrupts have hardware handlers, and likewise, signals (software interrupts) have their own handlers. Here's some sample code for a signal handler:

  typedef void (*sighandler_t)(int signo);  // pointer to a function that returns void & takes a single int argument
  sighandler_t signal(int signo, sighandler_t handler);
  
  signal(5, &f);  // add f to the signal handler table in the process descriptor, as the handler for signal 5
                  // Function f will now be executed when signal 5 occurs.

In hardware, if a program executes a non-existent instruction, divides by 0, accesses non-existent memory, etc., the processor will take an interrupt. Signals give processes a chance to print a nicer error message before the process dies. There are about 32 signals supported by UNIX, which are divided into several subcategories:

Processor fault interrupts:
  * SIGILL - illegal instruction (dangerous instructions)
  * SIGFPE - floating point error (e.g., divide by 0)
  * SIGSEGV - segmentation violation (process tries to access kernel memory, non-existent memory, etc.)

Timer interrupt:
  * SIGALRM - alarm signal, delivered ''sec'' seconds after a call to alarm(sec)

I/O errors:
  * SIGPIPE - process is trying to write to a closed pipe

User signals:
  * SIGKILL - kills process (this signal cannot be caught or ignored)
  * SIGUSR1 - user-defined signal
  * SIGUSR2 - user-defined signal

Every signal has a default action, which is normally to kill the process it interrupted. However, signal() can be used to install a new handler.
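Installing a handler as described above can be sketched with the standard signal() and raise() calls. This is only an illustration under stated assumptions: the names ''on_signal'', ''got_signal'' and ''catch_sigusr1'' are hypothetical, and the process sends SIGUSR1 to itself so that the example has no external dependencies.

```c
#include <signal.h>

/* Set by the handler; sig_atomic_t is the type the C standard
 * guarantees can be written safely from a signal handler. */
static volatile sig_atomic_t got_signal = 0;

/* Hypothetical handler: just record which signal arrived. */
static void on_signal(int signo)
{
    got_signal = signo;
}

/* Install on_signal for SIGUSR1, send that signal to ourselves, and
 * return the signal number the handler saw (0 if it never ran,
 * -1 if the handler could not be installed). */
int catch_sigusr1(void)
{
    got_signal = 0;
    if (signal(SIGUSR1, on_signal) == SIG_ERR)
        return -1;
    raise(SIGUSR1);        /* returns only after the handler has run */
    return got_signal;
}
```

Without the call to signal(), the default action for SIGUSR1 would terminate the process; with the handler installed, ''catch_sigusr1()'' returns normally and reports SIGUSR1.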
To restore the default handler, simply call:

  signal(signo, SIG_DFL);  // SIG_DFL - default

Here's an example of when a signal is used:

  // say we're in a process with CPL (current privilege level) of 3
  ...
  lidt 0x2000   // the process calls lidt and tries to change the current interrupt descriptor table.
  TRAP!         // oops! only the kernel (CPL = 0) can change the interrupt descriptor table.
                // control of the processor is taken by the OS

The operating system now:
  - Determines the current process.
  - Finds its SIGILL signal handler.
  - Calls that handler (at CPL = 3).

From the process's point of view, the handler runs and control then comes back to the same spot in the code that caused the trap.

Even further virtualization of this interrupt concept allows processes to raise interrupts on other processes, that is, to send signals to other processes. Example:

  kill(pid_t p, int signo);  // sends signo to process p.

This would be useful in the case where a process ran into an infinite loop and needed to be closed. This concept 'violates' process isolation, but makes the system much easier to manage (signals are actually part of the process interface that Linux defines, so we aren't technically violating the OS's rules).

Another example of 'violating' process isolation for manageability is a debugger. A debugger is a separate process that has full access to another process's memory and can change its registers. This also violates process isolation but makes the system easier to manage.

=== Threads ===

Previous scribe notes about threads: [[http://www.read.cs.ucla.edu/111/2006fall/notes/lec5|fall 2006]] [[http://www.read.cs.ucla.edu/111/2006spring/notes/lec5|spring 2006]]

**What is a thread?**

//“A thread is a virtual processor running in the context of a process.” – Kohler//

In other words, a thread has very similar functionality to a process. Its purpose is to provide the program with another way to fork itself that can help improve utilization.
Much like a process, it has a thread descriptor that holds its registers, thread ID, state (wait queue), and stack.

**If they are so similar, why not just use processes? What makes a thread unique?**

Threads share the same memory; they break down the process isolation that separates processes from each other. To get a better grasp on this concept, we go back to the quote above. Threads that are created within a process share the process descriptor, address space, and file descriptor table. This allows for faster communication and faster processing. For example, if one thread is blocked while trying to read from disk, another thread can immediately pick up and keep the processor running.

Another variation on the above example could be a server receiving bad requests. If a server receives an infinite-size request that drains the server of its memory, or a non-existent request that blocks the server, the problem can be worked around by creating a new thread. A new thread allows the server to bypass the bad request.

**While we have taken a look at the advantages of threads, what are the disadvantages? What are some concerns?**

Due to the lack of process isolation, a poorly implemented thread can easily overwrite the data of another thread. Because of this, it is much more difficult to synchronize the threads and coordinate the flow of data.

New threads can't be created with a fork()-style call that copies the caller's state: because threads share one address space, the new thread's registers and stack pointer would continue to point at the parent's stack. The way to deal with this problem is to start each new thread with an empty stack of its own.

Let's take a look at a standard implementation of threads.

**POSIX (Portable Operating System Interface) Threads**

POSIX is an IEEE standard.

  int pthread_create(pthread_t *threadid_out, const pthread_attr_t *attr, void *(*start_function)(void *), void *start_function_arg);

For example:
a possible thread function to pass as ''void *(*start_function)(void *)'':

  void *f(void *arg) {
      char *x = arg;
      return x + 5;
  }

  int pthread_join(pthread_t tid, void **exit_status);
  void pthread_exit(void *exit_status);

pthread_join() waits for thread ''tid'' to finish and collects its exit status, much as waitpid() does for processes; pthread_exit() terminates the calling thread.

The following server code has problems with slow or malicious connections, because it only handles one connection at a time. For example, a malicious Internet user can cause the server to block forever, simply by never completing their request!

  int main(…) {
      int fd = socket(1…);
      bind(fd, PORT 80);
      listen(fd);
      while (1) {
          int conn_fd = accept(fd, ...);
          char *buf = malloc(1024);
          int bufsiz = 1024;
          int len = 0;
          while (request is not complete && bufsiz < 2000000) {
              ssize_t s = read(conn_fd, buf + len, bufsiz - len);
              if (s <= 0)
                  break;           // read error or connection closed
              len += s;
              if (len >= bufsiz)
                  grow buffer;     // pseudocode: enlarge buf and update bufsiz
          }
          handle_request(conn_fd, buf, len);
          free(buf);
          close(conn_fd);
      }
  }

This variant of the server code doesn't suffer from this problem, since it serves every request in a separate thread.

  void *read_and_handle_request(void *arg) {
      int conn_fd = (int) (intptr_t) arg;   // intptr_t (from <stdint.h>) makes the pointer/int round trip safe
      char *buf = malloc(1024);
      int bufsiz = 1024;
      int len = 0;
      while (request is not complete && bufsiz < 2000000) {
          ssize_t s = read(conn_fd, buf + len, bufsiz - len);
          if (s <= 0)
              break;               // read error or connection closed
          len += s;
          if (len >= bufsiz)
              grow buffer;         // pseudocode: enlarge buf and update bufsiz
      }
      handle_request(conn_fd, buf, len);
      free(buf);
      close(conn_fd);
      pthread_exit(0);
  }

  int main(..) {
      // fd is set up with socket(), bind() and listen() as in the version above
      while (1) {
          pthread_t ptid;
          int conn_fd = accept(fd, …);
          pthread_create(&ptid, NULL, &read_and_handle_request, (void *) (intptr_t) conn_fd);
          pthread_detach(ptid);    // nobody joins this thread, so let it clean up after itself
      }
  }

=== Scheduling ===

How can we multiplex the physical computer's resources among virtual computers (processes)?

**ABSTRACT SCHEDULING MODEL**

The OS has a set of REQUESTS, each with an ARRIVAL TIME and an AMOUNT OF WORK (tau). Our goal is to run requests until they're all complete.

The simplest implementation of this is the first come, first served (FCFS) model, also known as the FIFO (first in, first out) model.

First Come First Served – Run requests, in order of arrival time, to completion.
Example:

^ Process ^ Arrival time ^ Amount of work ^
| A | -4 | 5 |
| B | -3 | 2 |
| C | -2 | 9 |
| D | -1 | 4 |

| A . . . . . | B . . | C . . . . . . . . . | D . . . . |

A starts at t = 0 \\
B starts at t = 5+x \\
C starts at t = 7+2x \\
D starts at t = 16+3x \\
The total execution time = 20+3x \\
where x = context switch time

The waiting times are 4 for A, 8+x for B, 9+2x for C, and 17+3x for D, so:

Utilization = 20/(20+3x), Wavg = (38+6x)/4 = 9.5+1.5x

  * Utilization: the amount of work to be done, divided by how long it takes to do that work.
  * Waiting time (W): time from arrival to first run.
  * Turnaround time (TT): time from arrival to completion.
  * Context switch time (x): the time it takes to switch between two processes.

Please see the lecture 7 notes for more on scheduling.
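The FCFS numbers in the example above can be checked with a short C sketch. The names ''struct request'', ''struct fcfs_stats'' and ''fcfs'' are hypothetical, chosen only for this illustration. The function assumes that the requests are already sorted by arrival time and that all of them have arrived by t = 0, as in the table, and it charges one context switch of length ''x'' between consecutive requests.

```c
#include <stddef.h>

/* One scheduling request: when it arrived and how much work it needs. */
struct request {
    double arrival;
    double work;
};

/* Summary metrics for one FCFS run. */
struct fcfs_stats {
    double total_time;      /* time at which the last request completes */
    double utilization;     /* useful work divided by total_time */
    double avg_wait;        /* mean time from arrival to first run */
    double avg_turnaround;  /* mean time from arrival to completion */
};

/* Run n requests first come, first served, starting at t = 0.
 * Assumes reqs is sorted by arrival time and every arrival is <= 0. */
struct fcfs_stats fcfs(const struct request *reqs, size_t n, double x)
{
    struct fcfs_stats s = {0.0, 0.0, 0.0, 0.0};
    double t = 0.0, work = 0.0, wait = 0.0, turnaround = 0.0;

    for (size_t i = 0; i < n; i++) {
        if (i > 0)
            t += x;                       /* context switch to next request */
        wait += t - reqs[i].arrival;      /* arrival until first run */
        t += reqs[i].work;                /* run to completion */
        turnaround += t - reqs[i].arrival;
        work += reqs[i].work;
    }
    if (n > 0 && t > 0.0) {
        s.total_time = t;
        s.utilization = work / t;
        s.avg_wait = wait / (double)n;
        s.avg_turnaround = turnaround / (double)n;
    }
    return s;
}
```

With the four requests from the table and x = 0, the averages come out to 9.5 for waiting time and 14.5 for turnaround time. With x = 2 the total time is 26 and the average wait is 12.5, matching 20+3x and 9.5+1.5x.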