You are expected to understand this. CS 111 Operating Systems Principles, Fall 2006

Lecture 9 notes

By Grace Shih, Alex Wu, Josephine Chen

Observability

Observability is a measure of the global state of a system. A system is said to be observable if its current state can be determined using globally-visible outputs. Observable actions change the global state.
Thus, on Unix, ways to change and observe the global state are:

  • global variables visible across threads
  • file output
  • file descriptor table
  • other system calls

Note that these are global with respect to a single process.

A Synchronization Failure Example

   T1:                        T2:
   0  x = 0;    /* global variable, this assignment happens before the threads run */
   ----------------------------------------------------------
   1  x += 5;                 A  x += 5;
   2  printf("%d\n", x);      B  printf("%d\n", x);

Assume that each line of code above executes atomically. What are the possible outputs for these threads running in parallel? It turns out there are 6 possible orders: 12AB, 1A2B, 1AB2, AB12, A1B2, and A12B. These orders give rise to two possible outputs.

5
10

This is the output for orders 12AB and AB12.

10
10

This is the output for orders 1A2B, 1AB2, A1B2, and A12B.
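The enumeration above can be checked mechanically. The following is a minimal sketch, assuming (as the text does) that each of the four lines is atomic; the helper names run_order and enumerate are made up for this illustration.

```c
#include <stdio.h>

/* Run one interleaving, given as a string over "12AB", and store the two
   printed values in out[0] and out[1]. Each step is treated as atomic. */
void run_order(const char *order, int out[2]) {
    int x = 0;   /* the shared global, set before the threads run */
    int n = 0;   /* number of values printed so far */
    for (const char *p = order; *p != '\0'; p++) {
        if (*p == '1' || *p == 'A')
            x += 5;          /* lines 1 and A */
        else
            out[n++] = x;    /* lines 2 and B: what printf would print */
    }
}

/* Build every order that keeps 1 before 2 and A before B.
   i and j count how many steps of T1 and T2 have been placed.
   counts[0] tallies the output "5 10", counts[1] tallies "10 10".
   Returns the number of orders generated. */
int enumerate(char *buf, int len, int i, int j, int counts[2]) {
    static const char t1[] = "12", t2[] = "AB";
    if (i == 2 && j == 2) {
        int out[2];
        buf[len] = '\0';
        run_order(buf, out);
        printf("%s -> %d %d\n", buf, out[0], out[1]);
        counts[out[0] == 5 ? 0 : 1]++;
        return 1;
    }
    int total = 0;
    if (i < 2) {
        buf[len] = t1[i];
        total += enumerate(buf, len + 1, i + 1, j, counts);
    }
    if (j < 2) {
        buf[len] = t2[j];
        total += enumerate(buf, len + 1, i, j + 1, counts);
    }
    return total;
}
```

Calling enumerate(buf, 0, 0, 0, counts) with a 5-byte buf and zeroed counts walks all six orders and tallies two of them under "5 10" and four under "10 10".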

If we make the example a bit more realistic, and consider how printf() is implemented, we can add some possible outputs. The single printf("%d\n", x); function call breaks into a number of sub-steps:

2.1  movl x, %eax
2.2  pushl %eax
2.3  pushl "%d\n"   # address of string
2.4  call printf
2.5  (in printf:) create string buffer from arguments (many steps)
2.6  (in printf:) call write() system call
2.7  (in printf:) return

Assume these sub-steps are executed atomically, but line 2 itself might not be. Then we achieve one more output:

10
5

   T1         T2         x     Output
   1                     5
   2.1-2.3               5
              A          10
              B.1-B.7    10    10
   2.4-2.7               10    5

But no other output sequence is possible. That means that any other output is a failure. For example,

5
5

would represent a synchronization failure. We've seen such a failure before: it is due to the fact that x += 5; is implemented in multiple instructions.
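To see how x can end up at 5, here is a small deterministic replay. It is only a sketch: it assumes x += 5 compiles to a separate load, add, and store (the notes say only "multiple instructions"), and the struct thread and helper functions are invented for the illustration.

```c
/* Each thread has its own private register; x is the shared global. */
struct thread { int reg; };

int x;

void load(struct thread *t)  { t->reg = x; }    /* like: movl x, %eax  */
void add5(struct thread *t)  { t->reg += 5; }   /* like: addl $5, %eax */
void store(struct thread *t) { x = t->reg; }    /* like: movl %eax, x  */

/* Both threads load x before either one stores: one update is lost. */
int lost_update(void) {
    struct thread t1, t2;
    x = 0;
    load(&t1);
    load(&t2);      /* T2 also sees x == 0 */
    add5(&t1);
    add5(&t2);
    store(&t1);     /* x == 5 */
    store(&t2);     /* x == 5 again, not 10 */
    return x;
}

/* T1 finishes its whole update before T2 starts. */
int serial_update(void) {
    struct thread t1, t2;
    x = 0;
    load(&t1); add5(&t1); store(&t1);
    load(&t2); add5(&t2); store(&t2);
    return x;
}
```

After lost_update(), both threads would print 5, which is the failing output shown above.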

Another example using open() and close()

The system calls open() and close() might be implemented using code like the following:

   int open__find_available_fd(proc_t *p) {
1:     int i;
2:     for (i = 0; i < MAXFD; i++) {
3:         acquire(&p->fdtable_mutex);
4:         if (p->fdtable[i] == 0) {   /* this slot in the fdtable is not used */
5:             p->fdtable[i] = 1;
6:             release(&p->fdtable_mutex);
7:             return i;
8:         }
9:         release(&p->fdtable_mutex);
10:    }
11:    return -1;
   }

   void close__fd(proc_t *p, int fd) {
1:     acquire(&p->fdtable_mutex);
2:     p->fdtable[fd] = 0;
3:     release(&p->fdtable_mutex);
   }

Last lecture, the professor argued that the open() implementation was observably different from what was desired, because the mutex in open() is acquired and released around each individual file descriptor slot rather than around the whole scan. This meant that another thread in the same process could call close(), making a file descriptor available, but open() would not notice. This was his claim; let's evaluate it!

Let T1 and T2 be:

T1:                               T2:
  int t = 0;  //local variable        
  if (open(file) < 0)             close(0);
      t += 2;

Suppose open(file) returns -1 with the error code EMFILE, which means the process has no more file descriptors available. But say that close(0) returns, in absolute time, before open() does. Is the behavior of this open() implementation observable? Not for this program, which uses only local variables -- that is, non-global state. So T2's close could happen before or after the open, and no one could tell.

Now let T1 and T2 be:

T1:                               T2:
0  x = 0;  //global variable        
-------------------------------------------------------
1  x = open(file);                A  if (x == 0) {
2  printf("%d\n", x);             B     close(0);
                                  C     printf("c\n");
                                  D  }

Assume that every line of code executes indivisibly. Then the bad output here is "c, -1": that is, a file descriptor got closed, and yet T1 got an error when it tried to open. Such an output is not possible if every line executes indivisibly. Let's look at some cases to see what is possible.

If the execution order is:

   T1    T2    Output
   0
         A
         B
         C     c
         D
   1
   2           0

open() was successful so the expected output would be:
c
0

If the execution order is:

   T1    T2    Output
   0
   1
         A
         D
   2           -1

T2 never executes B and C because x is already -1 when line A runs.
Here, open() failed so the expected output would be:
-1

But again, assuming every line of code is executed indivisibly, we cannot see c, -1. The job of our system call implementation is to provide the illusion of indivisibility. That is, no matter what interleaving of instructions we choose inside the system call implementations, we should see either c 0 or -1. Let's look at what the implementation actually does. Can we get a different output? Yes, if we execute lines like this:

   T1          T2    Output   Notes
   0
   1.1-1.10                   Thread 1 executes the for () loop in the kernel's open implementation, but stops before returning
               A
               B
               C     c
               D
   1.11                       open() returns -1, which is assigned to x
   2                 -1

We've observed an unacceptable output.
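The trace above can be replayed deterministically in a single thread. This is a sketch only: it models the per-slot locking by splitting the scan into two calls, with T2's close() running in between. MAXFD = 4 and the names scan_slots, close_slot, and replay_race are assumptions for the illustration, not part of the lecture's code.

```c
#define MAXFD 4   /* small table for illustration; the real value is not given */

int fdtable[MAXFD];   /* 0 = free, 1 = in use */

/* Examine slots [from, to) one at a time, as the loop in
   open__find_available_fd does, and claim the first free one. */
int scan_slots(int from, int to) {
    for (int i = from; i < to; i++) {
        if (fdtable[i] == 0) {
            fdtable[i] = 1;
            return i;
        }
    }
    return -1;
}

void close_slot(int fd) {
    fdtable[fd] = 0;
}

/* T1 looks at slot 0, T2 closes fd 0, then T1 scans the remaining slots. */
int replay_race(void) {
    for (int i = 0; i < MAXFD; i++)
        fdtable[i] = 1;                /* the table starts out full */
    int fd = scan_slots(0, 1);         /* T1: slot 0 is in use */
    close_slot(0);                     /* T2: line B frees slot 0 */
    if (fd < 0)
        fd = scan_slots(1, MAXFD);     /* T1: never looks back at slot 0 */
    return fd;
}
```

replay_race() returns -1 even though slot 0 is free by the time the scan finishes, which is the "c, -1" situation.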

But is this difference important? If we again consider how user code is actually executed, then NO, it is not observably different! Line 1, the statement x = open();, will not execute indivisibly on a real computer, even if the system call appears to execute indivisibly. This is because system calls do not return their values atomically to global variables, like x. In fact, the statement x = open(); will be executed something like this, in two distinct steps:

       ... set up arguments for open ...
1      int $48        # system call; return value will be stored in %eax register
1X     movl %eax, x   # moves return value to memory

Line 1, which changes only a thread-local value (a register), does not change global state. Thus, a failed call to open() changes no global state -- it leaves the fdtable untouched and returns its value in a thread-local variable -- and cannot possibly be observable. (A student mentioned errno, a global variable that says what error occurred in the system call, as potentially observable global state, but in fact errno is thread-local too.) So with this expansion, one can achieve c -1 output even with atomic system calls:

   T1    T2    Output   Notes
   0
   1
         A
         B
         C     c
         D
   1X                   Assigns the return value, -1, to x
   2           -1

Thus, our implementation is not observably different from an atomic open, given these properties.
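For comparison, an open that is atomic with respect to close() could hold the mutex across the whole scan. The sketch below is one way to write that in user space; it substitutes a pthread mutex for the lecture's acquire()/release(), and the proc_t layout, MAXFD value, and function names are assumptions.

```c
#include <pthread.h>

#define MAXFD 4   /* small table for illustration; the real value is not given */

typedef struct proc {
    pthread_mutex_t fdtable_mutex;
    int fdtable[MAXFD];   /* 0 = free, 1 = in use */
} proc_t;

/* Hold the mutex for the entire scan, so that no close() can slip in
   between the checks of two different slots. */
int open_find_available_fd_atomic(proc_t *p) {
    int fd = -1;
    pthread_mutex_lock(&p->fdtable_mutex);
    for (int i = 0; i < MAXFD; i++) {
        if (p->fdtable[i] == 0) {
            p->fdtable[i] = 1;
            fd = i;
            break;
        }
    }
    pthread_mutex_unlock(&p->fdtable_mutex);
    return fd;
}

void close_fd(proc_t *p, int fd) {
    pthread_mutex_lock(&p->fdtable_mutex);
    p->fdtable[fd] = 0;
    pthread_mutex_unlock(&p->fdtable_mutex);
}
```

The trade-off is a longer critical section: every close() must wait for a scan that is in progress.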

We can actually make our implementation observably different by adding new types of observation. For example, add a system call uint32_t nsyscalls(void) that returns the number of system calls that have completed for this process. A system call counts as completed once it has effectively returned, even if the return value hasn't been given back to the process yet; calls to nsyscalls itself are not counted. This "number of system calls" becomes a piece of global state that can be observed.

Let T1 and T2 be:

T1:                               T2:
0  x = nsyscalls();
------------------------------------------------------------------------
1  int t = open(file);            A  if (x == nsyscalls()) {
2  printf("%d\n", t);             B     close(0);
                                  C     printf("c %d\n", nsyscalls());
                                  D  }

If system calls execute atomically/indivisibly, we would expect to see one of the following types of output. Each output is paired with the sequence of steps that can generate it. Assume that line 0 sets x to 0 (because before that line, the process has executed no system calls), and assume that printf() doesn't count as a system call.

   Output     Steps                                         Notes
   -1         1, 2, A, D
   c 1, 0     A, B, C, D, 1, 2                              The close() system call increments the nsyscalls() counter, so T2 prints "c 1".
   0, c 1     A, B, C (calls nsyscalls()), 1, 2, C (prints), D   Line C reads the counter before open() runs but prints afterwards.
   c 2, 0     A, B, 1, C, D, 2                              Both open() and close() increment the nsyscalls() counter.
   0, c 2     A, B, 1, 2, C, D
   -1, c 2    A, 1, 2, B, C, D                              The open() returns before the close() can occur, but the nsyscalls() counter must see both.
   c 2, -1    A, 1, B, C, D, 2

But there is one type of output that cannot happen:

c 1, -1

If c 1 is printed, then only one system call has returned by the time of line C. That single system call must have been close(), since it precedes the printf(). Thus, open() has not happened yet. We are assuming that open() happens atomically, so it must happen after close(): it will see the empty file descriptor and return 0.

But c 1, -1 is possible for our implementation. The following execution order would print exactly that output.

   T1          T2    Output   Notes
   0
   1.1-1.10
               A
               B
               C     c 1      The nsyscalls() counter returns 1, since open() has not yet returned!
               D
   1.11
   2                 -1

Since we introduced an event count, the nsyscalls() counter, an open that fails has now become an observable change to global state.
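The "c 1, -1" trace can be replayed the same way once the counter is modeled. In this sketch the counter is bumped when a call completes, and open() is split into a scan phase and a finish phase so that close() can run in between. The names completed, sys_close, open_scan, open_finish, and replay, and the value MAXFD = 4, are assumptions for the illustration.

```c
#include <stdint.h>

#define MAXFD 4   /* small table for illustration */

int fdtable[MAXFD];    /* 0 = free, 1 = in use */
uint32_t completed;    /* system calls that have completed so far */

uint32_t nsyscalls(void) { return completed; }   /* does not count itself */

void sys_close(int fd) {
    fdtable[fd] = 0;
    completed++;
}

/* Steps 1.1-1.10: scan the table; the call has not completed yet. */
int open_scan(void) {
    for (int i = 0; i < MAXFD; i++) {
        if (fdtable[i] == 0) {
            fdtable[i] = 1;
            return i;
        }
    }
    return -1;
}

/* Step 1.11: the call completes and hands back its result. */
int open_finish(int result) {
    completed++;
    return result;
}

/* Replay the failing order. Stores the count T2 would print in
   *printed_by_c and returns the value T1 would print. */
int replay(uint32_t *printed_by_c) {
    for (int i = 0; i < MAXFD; i++)
        fdtable[i] = 1;                /* table starts out full */
    completed = 0;
    uint32_t x = nsyscalls();          /* line 0 */
    int t = open_scan();               /* 1.1-1.10: finds nothing */
    if (x == nsyscalls()) {            /* A: no call has completed yet */
        sys_close(0);                  /* B */
        *printed_by_c = nsyscalls();   /* C: only close() has completed */
    }
    return open_finish(t);             /* 1.11, then line 2 prints it */
}
```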

Deadlock

Deadlock is the condition where one or more threads can never make progress because each is waiting for a resource held exclusively by another thread.

Here are the four well-known necessary and sufficient conditions for deadlock.

  • Circular wait (see below)
  • Mutual exclusion
    • Resources, such as locks, are held exclusively: threads cannot share a resource.
  • Hold and wait
    • When a thread is waiting for a resource to be released, it continues to hold its own resources.
  • No preemption
    • Threads release locks voluntarily; locks cannot be taken from a thread.

Wait-for Graphs

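Wait-for graphs are used to detect circular wait. A wait-for graph has two kinds of nodes: locks/resources and threads.

To indicate that a thread is waiting to acquire a lock, draw an edge from the thread to the lock.

To indicate that a lock is held by a thread, draw an edge from the lock to the thread.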

When there is a loop in the wait-for graph, we have detected circular wait. This means that there is a chain of threads and resources T1, R1, T2, R2, ..., Tn, Rn where T1 wants to acquire R1, which is held by T2, which wants to acquire R2, which ... is held by Tn, which wants to acquire Rn, which is held by T1. For example, with n = 2: T1 wants to acquire R1, which is held by T2, which wants to acquire R2, which is held by T1.
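Detecting such a loop is straightforward when each thread waits for at most one lock and each lock has at most one holder, because every node then has at most one outgoing edge. The sketch below follows that chain; the array representation and the names waits_for, held_by, and in_circular_wait are assumptions made for the illustration.

```c
#define NTHREADS 3
#define NLOCKS   3
#define NONE     (-1)

int waits_for[NTHREADS];   /* lock that thread t is waiting for, or NONE */
int held_by[NLOCKS];       /* thread that holds lock l, or NONE */

/* Follow thread -> lock -> thread edges from `start`. Returns 1 if the
   chain leads back to `start`, meaning that thread is part of a circular
   wait, and 0 otherwise. */
int in_circular_wait(int start) {
    int t = start;
    for (int steps = 0; steps < NTHREADS; steps++) {
        int l = waits_for[t];
        if (l == NONE)
            return 0;      /* this thread can run: the chain ends here */
        t = held_by[l];
        if (t == NONE)
            return 0;      /* the lock is free: the chain ends here */
        if (t == start)
            return 1;      /* back where we began: a loop */
    }
    return 0;   /* the chain entered a loop that does not include start */
}
```

A thread that waits on a lock inside someone else's loop is still stuck, but this function reports only threads that lie on the loop itself.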

Lock Ordering

Lock ordering is a technique used to avoid deadlock when threads use multiple locks.
To create lock ordering:

  • arrange all locks in the system in a total ordering
  • only allow threads to acquire locks in order
    • A thread can acquire lock 2 while holding lock 1 only if order(lock2) > order(lock1).

This breaks circular wait, since there's no sequence of locks lock1...lockn where order(lock1) > order(lock2) > ... > order(lockn) > order(lock1).
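One common way to get a total order without maintaining a table is to use each lock's address as its order. The sketch below does this for a pair of pthread mutexes; using addresses as order(lock) is an assumption of this example, not something the lecture prescribes, and the function names are made up.

```c
#include <pthread.h>
#include <stdint.h>

/* Return whichever lock comes first in the total order.
   Assumption: order(lock) is the lock's address. */
pthread_mutex_t *first_in_order(pthread_mutex_t *a, pthread_mutex_t *b) {
    return ((uintptr_t)a < (uintptr_t)b) ? a : b;
}

/* Acquire two distinct locks, always taking the lower-ordered one first,
   no matter which order the caller names them in. */
void acquire_pair(pthread_mutex_t *a, pthread_mutex_t *b) {
    pthread_mutex_t *first = first_in_order(a, b);
    pthread_mutex_t *second = (first == a) ? b : a;
    pthread_mutex_lock(first);
    pthread_mutex_lock(second);
}

/* Release order does not matter for deadlock avoidance. */
void release_pair(pthread_mutex_t *a, pthread_mutex_t *b) {
    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}
```

With this helper, one thread calling acquire_pair(&m1, &m2) and another calling acquire_pair(&m2, &m1) both lock the same mutex first, so neither can hold one lock while waiting for the other in the opposite order.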

Blocking Wait

So far we have been creating mutexes that are waited upon by polling. Polling wastes CPU time, so a blocking mechanism is desired.
Specifications for a blocking mutex:

  • acquire
    • block if the lock is held
  • release
    • unblock any blocked thread

Blocking Mutex

typedef struct bmutex {
    mutex_t l;
    int locked;
    proc_t blocked_list[];
} bmutex_t;

   void acquire(bmutex_t *l) {
1:     while (1) {
2:         acquire(&l->l);  /* normal, polling acquire! (spinlock) */
3:         current process state is blocked;
4:         add current process to l->blocked_list[];
5:         if (!l->locked) {
6:             l->locked = 1;
7:             set current state to runnable;
8:             remove current process from l->blocked_list[];
9:             release(&l->l);
10:            return;
11:        } else {
12:            release(&l->l);
13:            schedule();   /* will block this process, UNLESS its state was set to runnable already */
14:        }
15:    }
   }

   void release(bmutex_t *l) {
       acquire(&l->l);   /* polling acquire of the inner spinlock */
       set all l->blocked_list[] processes to runnable;
       l->locked = 0;
       release(&l->l);
   }

This code avoids the sleep/wakeup race by marking the process as blocked, and adding it to blocked_list[], while still holding the l->l spinlock that protects the list. Thus, even if another thread calls release() between the acquiring thread's lines 12 and 13, the acquiring thread is already on blocked_list[], so release() sets it back to runnable and the schedule() call on line 13 does not block. (Some other line orders can also work, but the thread must be set to blocked and placed on blocked_list[] before it releases the l->l spinlock.)
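A user-space analogue of the same idea can be sketched with POSIX threads. This is not the kernel code above: here a pthread mutex stands in for the inner spinlock and a condition variable stands in for blocked_list[] and schedule(). The race is avoided because pthread_cond_wait() releases the guard and blocks as one atomic step. The ubmutex names are hypothetical.

```c
#include <pthread.h>

typedef struct {
    pthread_mutex_t guard;    /* plays the role of the inner l->l mutex */
    pthread_cond_t  wakeup;   /* plays the role of blocked_list[] */
    int locked;
} ubmutex_t;

void ubmutex_init(ubmutex_t *m) {
    pthread_mutex_init(&m->guard, NULL);
    pthread_cond_init(&m->wakeup, NULL);
    m->locked = 0;
}

void ubmutex_acquire(ubmutex_t *m) {
    pthread_mutex_lock(&m->guard);
    while (m->locked)
        /* Atomically drops guard and blocks; reacquires guard on wakeup. */
        pthread_cond_wait(&m->wakeup, &m->guard);
    m->locked = 1;
    pthread_mutex_unlock(&m->guard);
}

/* Non-blocking variant: returns 1 if the lock was taken, 0 if it was held. */
int ubmutex_try_acquire(ubmutex_t *m) {
    pthread_mutex_lock(&m->guard);
    int ok = !m->locked;
    if (ok)
        m->locked = 1;
    pthread_mutex_unlock(&m->guard);
    return ok;
}

void ubmutex_release(ubmutex_t *m) {
    pthread_mutex_lock(&m->guard);
    m->locked = 0;
    pthread_cond_broadcast(&m->wakeup);   /* wake every waiter, as release() above does */
    pthread_mutex_unlock(&m->guard);
}
```

Waking every waiter mirrors "set all blocked_list[] processes to runnable"; each woken thread rechecks locked in its while loop, and all but one go back to waiting.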

Semaphores

Semaphores are locking/synchronization mechanisms which can be used to derive all other mechanisms. The semaphore was created by Edsger Dijkstra and is considered the original blocking-wait locking mechanism.

  • A semaphore is an integer.
typedef struct {
    int s;
} semaphore_t;
  • It supports two atomic functions: P and V.
  • P is from the Dutch word pakken, which means "to grab", and V is from the word verhogen, which means "to increase".
  • How these functions achieve atomicity is unspecified.
// to acquire lock
void P(semaphore_t *s) {
    while (s->s == 0)
        block;
    s->s--;
}

// to release lock
void V(semaphore_t *s) {
    s->s++;
}

// implementing a blocking mutex with a semaphore
typedef struct {
    semaphore_t s;  /* initially 1 */
} bmutex_t;

void acquire(bmutex_t *m) {
    P(&m->s);   /* P takes a pointer to the semaphore */
}

void release(bmutex_t *m) {
    V(&m->s);
}

The semaphore is locked when it equals 0 and unlocked when it is greater than 0. This lets a semaphore act as a more permissive mutex, one whose lock can be held by more than one acquirer at a time. Start the semaphore out at N if you want the lock to be available for acquisition N times concurrently.
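Since the notes leave unspecified how P and V achieve atomicity, here is one possible user-space sketch that uses a pthread mutex and condition variable for that purpose. The csem names are hypothetical, and csem_try_P is an extra non-blocking helper added only so that the behavior at 0 can be checked without blocking.

```c
#include <pthread.h>

typedef struct {
    pthread_mutex_t guard;     /* makes P and V atomic */
    pthread_cond_t  nonzero;   /* waited on while s == 0 */
    int s;
} csem_t;

void csem_init(csem_t *sem, int n) {
    pthread_mutex_init(&sem->guard, NULL);
    pthread_cond_init(&sem->nonzero, NULL);
    sem->s = n;   /* N concurrent acquisitions allowed */
}

/* P: block while the count is 0, then take one unit. */
void csem_P(csem_t *sem) {
    pthread_mutex_lock(&sem->guard);
    while (sem->s == 0)
        pthread_cond_wait(&sem->nonzero, &sem->guard);
    sem->s--;
    pthread_mutex_unlock(&sem->guard);
}

/* Non-blocking P: returns 1 if a unit was taken, 0 if the count was 0. */
int csem_try_P(csem_t *sem) {
    pthread_mutex_lock(&sem->guard);
    int ok = sem->s > 0;
    if (ok)
        sem->s--;
    pthread_mutex_unlock(&sem->guard);
    return ok;
}

/* V: give one unit back and wake one waiter, if any. */
void csem_V(csem_t *sem) {
    pthread_mutex_lock(&sem->guard);
    sem->s++;
    pthread_cond_signal(&sem->nonzero);
    pthread_mutex_unlock(&sem->guard);
}
```

Initialized with csem_init(&sem, 2), two callers get through csem_P() at once and a third waits until someone calls csem_V().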

 
2006fall/notes/lec9.txt · Last modified: 2007/09/28 00:25 (external edit)
 