
Lecture 18 notes

Review of Remote Procedure Calls (RPC)

The goal of RPC is to simplify programming a distributed system by making communication with another computer look just like a function call. Recall the example from the last lecture, a remote desktop: a program on computer A displayed a picture on computer B.

.:lec18_1.gif

We called our example RPC “draw_pixel”:

draw_pixel (Display * display, //monitor
            int x, int y,      //coordinates
            int color )

There are two applications running here: the program on computer A that draws a picture (“draw_circle”), and the application on computer B, whose job is to listen for messages coming from computer A and then actually draw the picture based on each request (“displayer”). Here, A is the client and B is the server. To the client, draw_pixel(), which we are implementing, looks just like a function call, and A’s application is unaware of the network connection.

In draw_pixel(), the file descriptor is hidden inside the ‘display’ object. The client has a file descriptor that represents a connection to the server (for example, a TCP connection). What do we write to that file descriptor? Deciding this is the marshaling step. Marshaling is the process of taking objects (here the coordinates x and y and the color) and turning them into messages that go over the network:

void draw_pixel(Display* display, int x, int y, int color) {
    //Marshal message
    //file descriptor is display -> fd
    write(display->fd, &x, sizeof(x));
    write(display->fd, &y, sizeof(y));
    write(display->fd, &color, sizeof(color));
}

(Note: We assume that TCP delivers bytes to the receiver in the order they were sent.)

What is missing here? A real server supports many different kinds of requests, so we multiplex different types of messages over a single connection. Here, we didn’t specify which operation we want to perform. So, we need to add one more write() call:

void draw_pixel (Display * display, int x, int y, int color ){
    //Marshal message
    //file descriptor is display -> fd
    write(display->fd, "draw_pixel", 11);    //operation name: 10 characters plus the terminating NUL
    write(display->fd, &x, sizeof(x));
    write(display->fd, &y, sizeof(y));
    write(display->fd, &color, sizeof(color));
}

In general, a typical RPC message looks like:

.:lec18_3.gif

Big/Little Endian

Example: How can we represent integer 256 on a 32-bit machine? Two common ways are:

.:lec18_2.gif

When numbers are sent over a network, we need to worry about their representation. If the client uses big endian and the server uses little endian, then without a common representation they would write and read different values. Almost all network protocols transmit numbers in big endian format, which is called “network byte order”.

Other objects (for example, strings) also need a common format. (Note: we transmit only objects, not pointers to them, since clients and servers don’t share an address space.)

In response to the client’s request, the server may return ‘success’ or ‘error’ (if, for example, the client requested a pixel out of bounds). Therefore, draw_pixel() has to wait for the result, which may be an error. Remember, this interface makes draw_pixel() look like a function, and functions return results. So we need to design a format for B’s response and read this response from the connection before returning. In draw_pixel() we therefore need a read() system call that reads the response into a buffer, after which we unmarshal the result and return it:
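A minimal sketch of marshaling a 32-bit integer into network byte order, written with shifts so that it gives the same bytes on any host. The helper names marshal_u32 and unmarshal_u32 are hypothetical and not part of the lecture's code.

```c
#include <stdint.h>

/* Hypothetical helper: store v into out[0..3] in big-endian (network) order. */
void marshal_u32(uint32_t v, unsigned char out[4]) {
    out[0] = (unsigned char)(v >> 24);  /* most significant byte first */
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;          /* least significant byte last */
}

/* Hypothetical helper: rebuild the integer from four big-endian bytes. */
uint32_t unmarshal_u32(const unsigned char in[4]) {
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}
```

For the integer 256 the marshaled bytes are 00 00 01 00 regardless of the sender's own byte order. On POSIX systems the standard htonl() and ntohl() functions from <arpa/inet.h> perform the same conversion on whole 32-bit values.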

int draw_pixel (Display * display, int x, int y, int color ){
    char buf[16];    //the server's response is read into this buffer
    int result;
    //Marshal message
    //file descriptor is display -> fd
    write(display->fd, "draw_pixel", 11);    //operation name: 10 characters plus the terminating NUL
    write(display->fd, &x, sizeof(x));
    write(display->fd, &y, sizeof(y));
    write(display->fd, &color, sizeof(color));
    read(display->fd, &buf[0], 16);
    //unmarshal result
    result = unmarshal_int(&buf[12]);
    return result;
}
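On the server side, the “displayer” has to produce the success or error result that draw_pixel() waits for. The following is only a sketch of that check: the handler name, the 640x480 screen size, the in-memory framebuffer, and the 0 / -1 result codes are all assumptions made for illustration.

```c
#define SCREEN_W 640   /* assumed screen width  */
#define SCREEN_H 480   /* assumed screen height */

/* Stand-in for the real display memory. */
static int framebuffer[SCREEN_H][SCREEN_W];

/* Hypothetical server-side handler, called after the request has been
   unmarshaled. Returns 0 on success, -1 if the pixel is out of bounds. */
int handle_draw_pixel(int x, int y, int color) {
    if (x < 0 || x >= SCREEN_W || y < 0 || y >= SCREEN_H)
        return -1;                /* error result sent back to the client */
    framebuffer[y][x] = color;
    return 0;                     /* success result sent back to the client */
}
```

The server would marshal the returned value into its response message, and the client's read() in draw_pixel() would pick it up.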

Performance

Let’s say we want to draw a circle that is 100 pixels wide and 100 pixels high, so that we have roughly 7500 pixels to draw. How fast can we draw that many pixels? This RPC waits for the response each time it asks to draw a single pixel, so drawing 7500 pixels costs 7500 network RTTs (Round Trip Times), and it may take several minutes to draw the circle. The reason is that this RPC is SYNCHRONOUS, which means that the result arrives before the function returns (so the function WAITS for the result).

.:lec18_4.gif
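A rough worked estimate of the cost of one round trip per call. The function name and the 50 ms round-trip time are assumptions chosen for illustration; the lecture does not give a specific RTT.

```c
/* Total time in seconds for n_calls synchronous calls,
   each of which waits for one full round trip of rtt_ms milliseconds. */
double sync_rpc_seconds(long n_calls, double rtt_ms) {
    return n_calls * rtt_ms / 1000.0;
}
```

With 7500 calls and an assumed 50 ms RTT, sync_rpc_seconds(7500, 50.0) gives 375 seconds, a little over six minutes, which matches the "several minutes" estimate.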

SYNCHRONOUS RPC

For synchronous RPC, the result arrives before the function returns. We see that synchronous RPC has a significant effect on the program’s performance. A solution to the performance problem is to change the protocol, in either a specific or a general way:

(1) Add an application-specific function to reduce the number of round trips (RTTs are what we need to optimize, since they usually take the longest time). For example, draw an array of pixels at a time instead of a single pixel:

draw_pixels(int x[], int y[], int color[], int n)

.:lec18_5.gif
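A sketch of how a batched request could be marshaled into one buffer, so that a single write() and a single round trip carry all n pixels. The wire layout (operation name with its NUL, then a 32-bit count, then x, y, color triples in network byte order) and the helper names are assumptions, not the lecture's actual format.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Append v to buf at offset off in big-endian order; return the new offset. */
static size_t put_u32(unsigned char *buf, size_t off, uint32_t v) {
    buf[off]     = (unsigned char)(v >> 24);
    buf[off + 1] = (unsigned char)(v >> 16);
    buf[off + 2] = (unsigned char)(v >> 8);
    buf[off + 3] = (unsigned char)v;
    return off + 4;
}

/* Marshal n pixels into buf (capacity cap bytes).
   Returns the message length, or 0 if the message does not fit. */
size_t marshal_draw_pixels(unsigned char *buf, size_t cap,
                           const int x[], const int y[],
                           const int color[], int n) {
    static const char op[] = "draw_pixels";   /* sizeof includes the NUL */
    if (n < 0) return 0;
    size_t need = sizeof(op) + 4 + (size_t)n * 12;
    if (need > cap) return 0;

    memcpy(buf, op, sizeof(op));               /* which operation */
    size_t off = sizeof(op);
    off = put_u32(buf, off, (uint32_t)n);      /* how many pixels follow */
    for (int i = 0; i < n; i++) {
        off = put_u32(buf, off, (uint32_t)x[i]);
        off = put_u32(buf, off, (uint32_t)y[i]);
        off = put_u32(buf, off, (uint32_t)color[i]);
    }
    return off;
}
```

The caller would then issue one write(display->fd, buf, len) and one read() for the response, paying one RTT for the whole batch instead of one per pixel.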

(2) General solution:

ASYNCHRONOUS RPC

For asynchronous RPC, the result arrives later, after the function returns. Asynchronous RPCs do not block, so we can send the next request before the result of the previous request arrives.

.:lec18_6.gif

The problem with asynchronous RPC is that we may not have the result when we need it. The result of a call might be an error, but we don’t learn this at once, since the function returns immediately. It is hard to write programs in which errors come up later. Solving this problem requires a software design that copes with delayed errors. One example is a callback function supplied by the user, which gets called if there is an error.

In general, synchronous RPCs block while asynchronous RPCs do not.

Asynchronous RPCs let us build quick, responsive network applications such as Google’s Gmail.
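One way the callback design could look on the client is sketched here. Everything in it is hypothetical (the names rpc_register, rpc_complete, error_cb, the fixed-size table, and the convention that status 0 means success); the network code that sends requests and receives replies is left out.

```c
#include <stddef.h>

#define MAX_PENDING 16

/* Hypothetical callback type: invoked later if request `id` failed. */
typedef void (*error_cb)(int id, int status, void *arg);

struct pending { int in_use; int id; error_cb cb; void *arg; };
static struct pending pending_tab[MAX_PENDING];

/* Remember a request that has been sent but not yet answered.
   Returns 0 on success, -1 if too many requests are outstanding. */
int rpc_register(int id, error_cb cb, void *arg) {
    for (size_t i = 0; i < MAX_PENDING; i++) {
        if (!pending_tab[i].in_use) {
            pending_tab[i] = (struct pending){1, id, cb, arg};
            return 0;
        }
    }
    return -1;
}

/* Called when the reply for `id` arrives, possibly much later.
   A nonzero status is reported through the user's callback. */
void rpc_complete(int id, int status) {
    for (size_t i = 0; i < MAX_PENDING; i++) {
        if (pending_tab[i].in_use && pending_tab[i].id == id) {
            pending_tab[i].in_use = 0;
            if (status != 0 && pending_tab[i].cb)
                pending_tab[i].cb(id, status, pending_tab[i].arg);
            return;
        }
    }
}

/* Example user callback: counts how many requests failed. */
void count_error(int id, int status, void *arg) {
    (void)id; (void)status;
    (*(int *)arg)++;
}
```

The caller registers each request as it is sent and keeps going; only when a reply with an error status arrives does count_error run, which is the delayed error the program has to be designed around.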

Gmail Asynchronous RPC

.:lec18_7.gif

To make Gmail faster, right after login the client application sends asynchronous requests in the background and prefetches the recent mail. The mail may not be needed, but if the user decides to check email, the messages have already been downloaded. It is important to use asynchronous RPC for prefetching, so that the requests do not block while running in the background and do not slow the whole application down.

Note: Even though asynchronous RPCs improve performance, some requests, namely the ones that return a result (rather than performing an action), must be synchronous. For example, the function get_mouse(), which returns the mouse coordinates, must be synchronous.

Network File Systems

A Network File System (NFS) is a distributed system that allows a user of one computer to access files located on other computers using regular file system calls (open(), read(), write(), close(), etc.). The reason for NFS is centralization of file storage, for convenience and safety.

.:lec18_8.gif

A client application accessing a file stored on the NFS server uses the system calls open(), read(), write(), etc. in user space. The kernel translates these system calls into RPC messages in the Virtual File System (VFS) layer. On the NFS server side, we don’t need special kernel support to translate the RPC messages back into system calls; this can be done at the application level. The NFS server then makes system calls to access files on its local file system, marshals the responses, and sends them back over the network.

Since these RPCs, like the system calls they stand in for, must wait for the returned results, they must be synchronous, and therefore performance is slow.

.:lec18_9.gif

Performance Optimization

To improve performance, we can use prefetching and caching: the client reads pages of file data asynchronously and caches the results in the buffer cache.

.:lec18_10.gif

We have a problem here: if we prefetch a file and store it in the cache, but someone later changes it on the server, a system call will return wrong (old) data to the client application (since the data is read from the cache, and not from the server’s file system).

.:lec18_11.gif

Moreover, we can have this problem even without caching – due to network delays!

.:lec18_12.gif

CONSISTENCY MODELS

Local file systems have write-to-read consistency, which means that the result of a read equals the most recent write at the time of the read. This consistency model implies a total order of read and write operations, such that every read returns the data of the most recent write. If we wanted to enforce write-to-read consistency on the Network File System, we would have to lock the file for the entire time of the read/write operation, and this would impair the performance. Here we have a fundamental conflict between performance and consistency: the strategies that allow us to achieve better performance also get us worse consistency.

Network file systems implement close-to-open consistency, which allows maximum flexibility for caching. Close-to-open consistency means that the result of a read equals the most recent write that had been completed by a close at the time the reader opened the file (or possibly a later version).

Close-to-open consistency is much looser than write-to-read consistency: it allows returning old data, which is key to allowing caching.

.:fig_1.jpg

With write-to-read consistency, the read must return B, but with close-to-open consistency, either A or B may be returned. (B was not there when Reader opened the file.) This allows us to prefetch data at open time. Another question is when we have to write data. Close-to-open consistency lets us cache written data until the file is closed.

.:fig_2.jpg

Further performance optimization: instead of prefetching the file every time it is opened, Reader can check whether the file has changed since the last time it was opened. Close-to-open consistency is therefore what lets network file systems perform much faster.
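A sketch of the open-time check, under stated assumptions: the server keeps a version stamp per file (for example a modification time), the client stores the stamp it saw when it last fetched the data, and the structures below stand in for state that a real client would reach by RPC. All names here are hypothetical.

```c
#include <string.h>

/* Hypothetical client-side cache entry for one file. */
struct cached_file {
    int  valid;      /* nonzero once data has been fetched */
    long version;    /* server's version stamp when data was fetched */
    char data[64];
};

/* Hypothetical server-side file; a real client would query it by RPC. */
struct server_file {
    long version;    /* changes whenever a writer closes the file */
    char data[64];
};

/* Called at open time.
   Returns 1 if the file had to be refetched, 0 if the cache was reused. */
int open_with_validation(struct cached_file *c, const struct server_file *s) {
    if (c->valid && c->version == s->version)
        return 0;                               /* unchanged since last open */
    memcpy(c->data, s->data, sizeof c->data);   /* stands in for the fetch */
    c->version = s->version;
    c->valid = 1;
    return 1;
}
```

Only the small version check has to cross the network on a typical open; reads between open and close are served from the cached copy, which close-to-open consistency permits.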

Security

A secure computer system has an ACCESS CONTROL POLICY that defines which users can perform which actions.

Example of an access control policy:

User     Can log in?
Eddie    yes
Stacy    no
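The example policy can be written down as a lookup table. This is only a sketch: the struct and function names are hypothetical, and denying users who are not listed is an assumption the notes do not state.

```c
#include <stddef.h>
#include <string.h>

struct policy_entry { const char *user; int can_log_in; };

/* The example access control policy: Eddie may log in, Stacy may not. */
static const struct policy_entry policy[] = {
    { "Eddie", 1 },
    { "Stacy", 0 },
};

/* Returns 1 if the policy allows `user` to log in, 0 otherwise.
   Users not listed in the policy are denied (an assumption). */
int can_log_in(const char *user) {
    for (size_t i = 0; i < sizeof policy / sizeof policy[0]; i++)
        if (strcmp(policy[i].user, user) == 0)
            return policy[i].can_log_in;
    return 0;
}
```

Here can_log_in("Eddie") returns 1, while can_log_in("Stacy") and any unlisted name return 0.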

To implement access control policy, we need two parts:

  1. Authorized access is allowed. (Positive goal)
  2. Unauthorized access is not allowed. (Negative goal)

Denial-of-service attacks prevent authorized users from using computers (breaking part 1). Viruses break part 2.

To implement a positive goal, all we need is one successful method to access the computer. To implement a negative goal, we may need an infinite number of mechanisms, since an infinite number of things must be checked.

Tasks/Steps needed to implement the access control policy:

  1. Authentication
  2. Integrity
  3. Authorization
  4. Correctness

Let’s say we are implementing a login function. A message received is: “Eddie wants to log in.” The login function must decide if Eddie can log in.

Task/Step       Meaning                                         Method
Authentication  Which user made request? Is the user Eddie?     Passwords
Integrity       Did the authenticated user make THIS request?   Encryption
Authorization   Is this user allowed to make this request?      Access control policy
Correctness     Does the code correctly implement the request?

All security is based on the “trusted computing base”. We need correctness for security, and therefore we need to trust, for example, the CPU to execute programs correctly. The trusted computing base includes:

  1. processor
  2. kernel
  3. compiler

If there is a bug in the trusted computing base, there is no security, even if the program itself is correct.

We want the trusted computing base to be small, since then there is less chance of a bug. If an attacker has physical control over the machine, they have broken an important part of the trusted computing base: the hardware.