====== Process ======

Q:- What is a process?

A:- A program in execution on a von Neumann computer.

Von Neumann machine: contains the following hardware:
  * ALU
  * Control unit
  * Primary memory
  * I/O devices

**Note:** A process is a program running at a particular time.

Main user abstraction in a modern OS:

  Kernel <--------------> Process
  OS designer (MS, Linux, etc.) <------------> Application designer
  One kernel <-------------> Many processes

**Some Definitions:**
  * //Abstract machine interface// is the interface between a process and the kernel.
  * //System call// is a call through which a process makes a request to the kernel.
  * //Classic process// is a process running on a von Neumann computer.

**Note:** The abstract machine is a single-processor machine.

===== Process Descriptor (Process Control Block): =====

**Accounting:**
  * Process ID (pid)
  * Owner (user name)
  * Statistics (e.g., how long the process has been running)
  * Priority (which process is more important than another)

**Abstract Machine:**
  * Address space (for program text and data)
  * Registers [eip (instruction pointer), esp (stack pointer)]
  * Process state (blocked or running)
  * Wait queues and ready queues (which kernel list the process is on)

**Kernel Resources & Kernel State:**
  * File descriptors
  * File structures
  * Signal handlers
  * Kernel status and kernel stack
Von Neumann Computer ---------- Process ---------- C Program Interface

^ Turing Machine ^ Process State ^
| Tape | Read-write data (global variables) |
| Transition function | Instructions (program text), which are read-only |
| Current Turing machine state | Instruction pointer (IP) |
| Head position | Registers |

===== Address Space =====

**Linux process address space (x86):**

{{.:fig3-5.jpg|.:fig3-5.jpg}}

Let's see how we got the addresses above:

  #include <stdio.h>
  #include <stdlib.h>
  
  int x;                       // a global variable
  
  int main (int c, char **v) {
      void *p = malloc (1);    // dynamically allocate one byte
      // Print the addresses of the program text, data, heap, and stack
      // as they appear on your computer.
      printf ("Prog %p\n Data %p\n Heap %p\n Stack %p\n",
              (void *) &main, (void *) &x, p, (void *) &c);
      return 0;
  }
  // Prog:  where main begins                          -> 0x8000000
  // Data:  where the global variable is stored
  // Heap:  where dynamically allocated memory starts  -> 0x8660000
  // Stack: where the first local variable is pushed   -> 0xc0000000

===== Stack (Last In, First Out) =====

The stack is an area of memory reserved for function arguments and local variables. The stack is allocated a function at a time: when we call a function, we push more space onto the stack to hold that function's local variables, and when the function returns, we pop its local variable space off the stack. The architecture has a special register, called the stack pointer (%esp on x86), that points to the current stack location. Stacks generally grow from the top down, so when a function is called, the stack pointer is set to a smaller value.

We need to keep the local variables that a running program uses somewhere in memory. Where can we store them? Placing local variables where global variables are kept is a bad idea. Here is why:
  * //Thought experiment:// Let's treat local variables like global variables.
That is, every function will reserve some space in the global variable area (read-write data) for its local variables.
  * Why doesn't this work? //Recursion:// functions can call themselves! When a function calls itself (see below for an example), the function is effectively running twice, and each copy needs its own local variables. Thus, each call needs its local-variable space allocated in a different location. All modern machines do this with a stack.
  * The machine has a limited number of hardware registers; the stack stores the extra //"registers"// a program needs.

Let's look at a simple recursive function: //"Factorial"//

  int factorial(int x) {
      if (x == 0)
          return 1;
      else
          return x * factorial(x - 1);
  }

Pseudo-assembly code:

  // Example: 'factorial(5)'
  0x02: pushl $5
  0x05: call _factorial
  0x06: popl %ecx              // pop the argument we pushed
                               // (into %ecx, so the result in %eax survives)
  0x09: ... continue ...
  
  _factorial:                  // The 'factorial' function definition
  0x10: cmpl $0, 4(%esp)       // Is x == 0?
  0x12: jne 0x1D               // If not, jump ahead
  0x14: movl $1, %eax          // Return 1
  0x17: ret
  0x1D: movl 4(%esp), %eax     // %eax = x;
  0x20: subl $1, %eax          // %eax--;
  0x22: pushl %eax             // push 'x - 1' as argument
  0x24: call _factorial        // call 'factorial' recursively
                               // result is returned in '%eax'
  0x26: popl %ecx              // pop the 'x - 1' argument we pushed
  0x27: imull 4(%esp), %eax    // %eax *= x;
                               // now %eax == x * factorial(x - 1)
  0x2A: ret                    // return!

**When the function is called:**
  - Any function arguments are pushed on the stack. (In some architectures, the first couple of arguments are stored in registers; but on x86, everything goes on the stack.) See the pushl instruction at 0x02.
  - The return address (the address of the instruction that follows the call) is pushed on the stack.
  - The processor jumps to the start of the function code. On x86, steps 2 and 3 are combined into one instruction, call (see instructions 0x05 and 0x24). The call instruction pushes the return address and jumps to the function atomically.
  - The function pushes space for its local variables onto the stack.
(In our example, factorial has no local variables, so it doesn't need to do this.)
  - Within the function code, arguments and local variables are referred to using stack-indirect addressing. For example, in factorial, the address "4(%esp)" refers to the function's argument x. Why? The last thing pushed onto the stack was the return address (Step 2), so the stack pointer points there. Four bytes above that address -- at 4(%esp) -- is the first argument.
  - When the function is done, it pops its local variable space off the stack.
  - Then it pops the return address from the stack and jumps to that address. On x86, the ret instruction does this, and the function's return value is stored in the %eax register.
  - Finally, the caller pops off any arguments it pushed.

These steps are repeated for each function call.

{{.:fig3-4.gif|.:fig3-4.gif}}

The stack grows downward because it is much more natural and convenient to refer to arguments and local variables with positive offsets, such as 4(%esp). The operating system needs to know only how to grow the stack (user processes can arrange the stack in other ways).

The stack assumes that a function never returns more than once, a property that almost every programming language has. The reason is that when a function returns, all of its local variables and the state the function was in are deleted. That state and those variables no longer exist, so there is nothing left to return from a second time.

===== Heap (Dynamic Memory Allocation) =====

Memory layout (4 GB)

Another section of memory is the heap. The heap stores dynamically allocated memory. Dynamic allocation lets a process adjust its resource usage at run time, based on availability. In addition, we also have to store the register state and the program counter. The main reason is that we must be able to run more than one process at a time.
We use the following functions to allocate and free memory:
  * ''malloc()'' -----> memory allocation
  * ''free()'' -----> free memory

The Process Control Block is located in kernel memory and has the following structure:

**Kernel Memory (Process Control Block)**

{{.:fig3-6.jpg|.:fig3-6.jpg}}

===== Blocking (System Calls) =====

A process blocks when it needs a resource that is not available right now. A blocking system call doesn't return until it has finished, which forces the process to wait until the operation is done. The kernel puts the process to sleep until the resource is ready, which avoids busy waiting and useless work. A blocking system call might put a process on a wait queue if the operation can't complete right away.

----

{{.:fig04-01.jpg|.:fig04-01.jpg}}

**How to create a process?**
  * Each process is started from another process; in other words, each process has one parent. (Processes are arranged in a hierarchy.)
  * The first process (pid = 1) has no parent and is called "init". (The kernel starts this process.)
  * The ''fork()'' function (no parameters needed) creates a new process. The new process, the child process, is an exact copy of the calling process, the parent process.
    - If successful, ''fork()'' returns 0 in the child process, and returns the process ID of the child process to the parent process.
    - Even though the child process inherits some attributes from the parent process, the child process also has its own attributes, such as a unique process ID and a different parent process ID (i.e., the process ID of the process that called ''fork()'').

e.g.:

  int x;
  pid_t p = fork();
  x = p;
  printf ("%d\n", x);

Output: 2, 0 (the parent prints the child's pid, here 2; the child prints 0)

  * ''int execve(const char *program_filename, char *const argv[], char *const envp[])''
    - Replaces the current program with a fresh instance of the program in the program file.
    - After ''execve()'' succeeds, the program that called it is no longer loaded.
**Some other functions that handle processes:**
  * ''void exit(int status)'' -- terminates the process that calls this function and passes the exit status value to its parent.
  * ''pid_t waitpid(pid_t pid, int *statusp, int options)'' -- waits for process termination. The function suspends execution of the current process until the child specified by the pid argument has exited.
  * ''ps'' -- lists all the processes that you own.
  * ''ps aux'' -- lists all the processes on the machine.
  * ''kill()'' -- sends a signal to a process; the ''kill'' command uses it to terminate a process. We can find a process's pid (by using ''ps''), and then type ''kill [pid]''.

//**Note:**// A zombie is a process that has exited but whose parent hasn't called ''waitpid()'' on it yet.