You are expected to understand this. CS 111 Operating Systems Principles, Fall 2006

Lecture 17 notes on Distributed Systems II

By Subhash Arja, Matthew Ho, Grant Jenks, Samuel Kwok

Topics Covered in Lecture

  • RPC - Remote Procedure Calls
  • NFS - Network File System

Remote Procedure Calls

Remote Procedure Calls (RPCs) allow a client program to communicate with a server program. Unlike local procedure calls, RPCs cannot assume a shared address space, so all data must be sent between the programs. Data on the client's side is dereferenced and packed into a series of bytes, which is then unpacked and interpreted on the other side. The process of dereferencing data and converting it to and from this serial form is known as marshalling.

RPCs can be synchronous or asynchronous, depending on whether the client waits for a response before sending the next request. When multiple RPCs must be made, asynchronous calls can significantly reduce the total time spent waiting, because the network latency is overlapped across the calls rather than paid once per call.

Marshalling

Marshalling is defined as the process of converting an object into a series of bytes in order to make an RPC. This process is also known as pickling or serialization, since it works to preserve the original data in the program. Marshalling is done on both ends of the communication pipe, so that just as the object is broken down into a transmittable form on one end, it can be built back up into an object on the other. Marshalling is accomplished with stubs.

Stubs are functions that are implemented both on the client and server sides in order to marshall requests or responses, respectively. In some cases stubs are generated automatically depending on the specific protocol specification and programming language support.
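As a minimal sketch of what a pair of stubs does, the following C code marshals a small record into a byte buffer and unmarshals it again. The `greeting` type, the function names, and the wire layout (a 4-byte id, a 4-byte length, then the name bytes, with integers in network byte order) are assumptions made for illustration and do not come from any particular RPC system.

```c
#include <arpa/inet.h>   /* htonl, ntohl */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical request type: the name lives behind a pointer, so it must be
   dereferenced and copied into the message rather than sent as an address. */
struct greeting {
    uint32_t id;
    const char *name;
};

/* Marshal g into buf as: id (4 bytes), name length (4 bytes), name bytes.
   Integers are stored in network byte order. Returns the number of bytes
   used, or 0 if buf is too small. */
size_t marshal_greeting(const struct greeting *g, uint8_t *buf, size_t cap)
{
    assert(g != NULL && g->name != NULL && buf != NULL);
    size_t len = strlen(g->name);
    if (cap < 8 || len > cap - 8)
        return 0;
    uint32_t id_be = htonl(g->id);
    uint32_t len_be = htonl((uint32_t) len);
    memcpy(buf, &id_be, 4);
    memcpy(buf + 4, &len_be, 4);
    memcpy(buf + 8, g->name, len);
    return 8 + len;
}

/* Unmarshal a message into *id and name_out (NUL-terminated). Returns 0 on
   success, -1 if the message is truncated or the name does not fit. */
int unmarshal_greeting(const uint8_t *buf, size_t n,
                       uint32_t *id, char *name_out, size_t name_cap)
{
    uint32_t id_be, len_be;
    assert(buf != NULL && id != NULL && name_out != NULL);
    if (n < 8)
        return -1;
    memcpy(&id_be, buf, 4);
    memcpy(&len_be, buf + 4, 4);
    size_t len = ntohl(len_be);
    if (len > n - 8 || len + 1 > name_cap)
        return -1;
    memcpy(name_out, buf + 8, len);
    name_out[len] = '\0';
    *id = ntohl(id_be);
    return 0;
}
```

Sending an explicit length before the name plays the same role as a length attribute in a text protocol: the receiver knows how many bytes belong to the message.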

The following examples demonstrate implementations of client/server communications with RPCs.

Example 1: HTTP Requests

An RPC occurs when a browser tries to access a web page.

HTTP (Hypertext Transfer Protocol) request for URL (Uniform Resource Locator) http://www.cs.ucla.edu/~kohler

Take the URL and other information to send to the server (request)

    GET /~kohler HTTP/1.1\n
    Host: www.cs.ucla.edu:80\n
    User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US …..)\n

The first line names the function being called: this is a GET request. Other request methods are POST, HEAD, PUT, DELETE, TRACE, OPTIONS, and CONNECT. The request asks for the resource /~kohler and uses version 1.1 of HTTP. The second line specifies the host and port, and the third describes attributes of the browser. This information may be used to send specially formatted responses to different browsers.

Reply:

    HTTP/1.1 200 OK\n              (200 = status code)
    Connection: close\n
    Content-Type: text/html\n
    Content-Length: 1062\n
    \n
    <HTML>\n
    <HEAD>\n

The first line specifies the protocol version and status: here, version 1.1 with status 200 (OK). The second, third, and fourth lines describe attributes of the response: the connection will be closed after this response, the content type is text/html (an HTML document), and the body of the response is 1,062 bytes long. This is very characteristic of marshalling. Were attributes like these not specified, the receiver might be unable to determine where the message ends.
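To illustrate how a receiver uses such an attribute to find the end of the message, here is a small sketch in C that extracts the Content-Length value from a block of reply headers. The function name `content_length` is hypothetical, and the sketch simplifies real HTTP parsing: it matches the header name with this exact capitalisation only and does not check that the match starts a line.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return the value of the Content-Length header found in a NUL-terminated
   header block, or -1 if the header is absent or has no valid number.
   Simplification: the header name is matched case-sensitively. */
long content_length(const char *headers)
{
    static const char key[] = "Content-Length:";
    assert(headers != NULL);
    const char *p = strstr(headers, key);
    if (p == NULL)
        return -1;
    p += sizeof key - 1;              /* skip past the header name */
    char *end;
    long n = strtol(p, &end, 10);     /* strtol skips leading spaces */
    if (end == p || n < 0)
        return -1;
    return n;
}
```

Once the receiver knows this number, it can read exactly that many bytes after the blank line that ends the headers and treat them as the complete body.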

We've seen now in this example how both the client and server send marshalled messages to each other.

Example 2: Remote Display

Suppose we had the following code on the client side. The client sends a request to the server to draw one pixel on the server machine's screen and, in turn, reads the reply from the server.

int socket;                                    // connected to the server (setup not shown)

int draw_pixel (int x, int y, int color)
{
      int reply;                               // the server replies with one int
      write (socket, &x, sizeof(x));           // part of the Client Stub
      write (socket, &y, sizeof(y));           // part of the Client Stub
      write (socket, &color, sizeof(color));   // part of the Client Stub
      read (socket, &reply, sizeof(reply));    // wait for the server's result
      return reply;
}

This is the code on the server side:

int x, y, color;
while (1) {
      read (socket, &x, sizeof(x));            // part of the Server Stub
      read (socket, &y, sizeof(y));            // part of the Server Stub
      read (socket, &color, sizeof(color));    // part of the Server Stub
      int r = draw_real_pixel(x, y, color);    // the actual procedure being called
      write (socket, &r, sizeof(r));           // send the result back to the client
}

Therefore, when the client intends to draw a pixel, it must specify the coordinates (x,y) and the color when making the request to the server. The lines marked as "part of the Stub" on both the client and server sides are responsible for marshalling the request and response, respectively.
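One detail the stub code above glosses over is that read() on a socket or pipe may return fewer bytes than requested. A more careful stub would loop until the whole value has arrived. The helper below is a sketch of that idea; the name `read_full` is an assumption and is not part of the lecture's code.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Read exactly n bytes from fd into buf, retrying after short reads and
   after interruption by a signal. Returns n on success, a smaller count if
   the stream ended early, or -1 on any other error. */
ssize_t read_full(int fd, void *buf, size_t n)
{
    assert(buf != NULL);
    char *p = buf;
    size_t got = 0;
    while (got < n) {
        ssize_t r = read(fd, p + got, n - got);
        if (r == 0)
            break;                /* end of stream: peer closed its end */
        if (r < 0) {
            if (errno == EINTR)
                continue;         /* interrupted before any data: retry */
            return -1;
        }
        got += (size_t) r;
    }
    return (ssize_t) got;
}
```

With this helper, a stub would call read_full(socket, &x, sizeof(x)) and treat any return value other than sizeof(x) as a failed request.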

Performance

While this implementation works, there are a few performance problems. First consider an imaginary system in which communication between different systems takes no time. The stubs that marshall the communications must still take more time than local procedure calls. This latency is unavoidable and is a main reason developers must make RPCs wisely. Another performance problem is seen in the timing diagram below:

5a.jpg

The communication is synchronous because the client waits for the response to the first message before sending a second request. When the number of requests is large, as in the case of drawing a shape with many pixels, the latency is huge.

In order to fix this performance bottleneck, the calls can be made asynchronous. Since the client really doesn't have to wait for the first pixel to be drawn in order to send the second request, the client can send many requests at once and receive all the responses later. The following is the timing diagram for asynchronous calls:

5b.jpg
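A rough sketch of the asynchronous pattern in C is shown below: the client sends every request first and only then collects the replies. The `pixel` struct and the function name are hypothetical. Like the lecture's code, the sketch assumes both machines use the same integer representation, and it also assumes the batch is small enough that socket buffers do not fill up while nobody is reading replies.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

struct pixel { int x, y, color; };

/* Send all n requests back to back, then collect the n replies.
   Returns 0 on success, -1 on an I/O error or an early close. */
int draw_pixels_pipelined(int fd, const struct pixel *px, size_t n, int *replies)
{
    assert(px != NULL && replies != NULL);
    for (size_t i = 0; i < n; i++)
        if (send(fd, &px[i], sizeof px[i], 0) != (ssize_t) sizeof px[i])
            return -1;
    for (size_t i = 0; i < n; i++)
        /* MSG_WAITALL asks recv to block until the whole reply has arrived. */
        if (recv(fd, &replies[i], sizeof replies[i], MSG_WAITALL)
                != (ssize_t) sizeof replies[i])
            return -1;
    return 0;
}
```

The client learns about a failed request only in the second loop, after every request has already been sent.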

One possible problem of using only asynchronous calls shows up in error reporting from the server side. For example, if multiple requests are sent to the server but there is an error in the communication, the client will not receive the error until after multiple requests are packaged and sent. The resources and time used to package those requests would be wasted. This problem could be fixed by designing the client program to handle delayed errors.

Another case in which asynchronous calls should not be used involves user interaction. For example, in the program above, asynchronous calls should not be used for a get_mouse_position request, since its result will probably determine what future RPCs are made. This extends to any real-time data that affects future RPCs.

Example 3: Google Maps

Google uses caching and prefetching to improve the performance of its Maps service. Specifically, when a user "drags" a map in any direction, more of the map must be displayed within the viewing window. In order to perform this action seamlessly, there is a buffer area around the viewing area that is loaded along with the displayed map. The map content in the buffered area is prefetched and cached to improve performance. The technique used to accomplish this is called AJAX (Asynchronous JavaScript and XML).


Network File System

The Network File System (NFS) is a protocol that allows the client to access data over a network as if the data were on local disks. This is accomplished through a system call on the client side, which triggers the client NFS stub to send an RPC across the network to an NFS server. The NFS server, in turn, obtains the desired data through its Virtual File System (VFS) and returns it as a reply. The communication layout is shown in the diagram below:

1b.jpg

Let's suppose that the user application makes a "write()" system call whose data must travel over the network, as was the case in the pixel drawing example above. The system call causes a context switch into the client's kernel. Next, the Virtual File System (VFS) passes the information on to the NFS stub on the client side, where the request is marshalled and sent over the network as an RPC. On the network, an NFS server receives the request and uses its own stub to unmarshall it. A context switch then occurs within the server, after which the server's VFS returns the file data needed. The server stub marshalls the data and sends the reply back over the network to the client. The client is expected to read the reply and send more requests if needed.

    Local API    NFS API
    ---------    -------
    Open         Lookup
    Read         Read
    Write        Write
    Close        (none)

NFS is implemented with a stateless design so that a client may drop the connection without causing any failure. This is made possible by file handles. A file handle refers to a file independently of user-specified attributes such as its name: even if the file is renamed or its contents are modified, the file handle does not change. The table above shows that NFS does not really have open or close operations; file handles are what make this stateless design possible. Because there is no open or close, a client cannot open a file and then fail in such a way that the file is never closed. Furthermore, file handles persist across reboots of the NFS server itself, which ensures that clients can continue communicating with a server even after a server failure. The file handle is returned by the lookup request.
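The stateless idea can be sketched in C as a read request that carries everything the server needs: which file, where to start, and how much to read. The struct layout, the toy `files` table, and the function name below are assumptions for illustration and are not the real NFS wire format.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified request: everything the server needs is in the
   message itself, so the server keeps no per-client "open file" state. */
struct read_req {
    uint32_t handle;   /* stands in for a real NFS file handle */
    uint32_t offset;   /* where to start reading */
    uint32_t count;    /* maximum number of bytes to read */
};

/* Toy server storage: one file's contents per handle value. */
#define NFILES 2
static const char *files[NFILES] = { "hello, world", "foo" };

/* Serve one read. Returns the number of bytes copied into out, or -1 for
   an unknown handle. out must have room for req->count bytes. */
int serve_read(const struct read_req *req, char *out)
{
    assert(req != NULL && out != NULL);
    if (req->handle >= NFILES)
        return -1;
    size_t len = strlen(files[req->handle]);
    if (req->offset >= len)
        return 0;                          /* read at or past end of file */
    size_t n = len - req->offset;
    if (n > req->count)
        n = req->count;
    memcpy(out, files[req->handle] + req->offset, n);
    return (int) n;
}
```

Because the request names an explicit offset instead of relying on a server-side file position, sending the same request twice returns the same bytes. A client that is unsure whether a request arrived can simply resend it.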

The following is the timing diagram for basic NFS communication. The client always waits for a response from the server before sending another request. The 10 ms value for the time elapsed between sending the request and receiving the response is arbitrarily chosen for this example.

2b.jpg

Performance

The diagram above shows synchronous NFS RPCs. The performance of this design suffers heavily from the latency of the network. Two methods are used to improve the performance of NFS: batching and local caching. By batching requests and using asynchronous RPCs, the latency of the network is amortized over the number of requests in the batch. As seen above, asynchronous calls will generally provide better throughput and efficiency. Local caching means the latency of the network can be avoided altogether by writing data to local resources. The cached data can then be sent back to the NFS server later. Specifically, repeated writes to the same file incur no network latency, because the cache is stored on the client.

5.jpg

The above diagram shows the asynchronous RPC design used with NFS.
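As a back-of-the-envelope model of why batching helps, the C functions below compare total waiting time for synchronous and batched requests. The model is an assumption made for illustration: it charges one full round trip per synchronous request or per batch and ignores transmission time and server processing time. The function names are hypothetical.

```c
#include <assert.h>

/* Total wait when each request is sent only after the previous reply. */
long sync_wait_ms(long nrequests, long round_trip_ms)
{
    assert(nrequests >= 0 && round_trip_ms >= 0);
    return nrequests * round_trip_ms;
}

/* Total wait when requests are sent in batches of batch_size and each
   batch costs one round trip. */
long batched_wait_ms(long nrequests, long round_trip_ms, long batch_size)
{
    assert(nrequests >= 0 && round_trip_ms >= 0 && batch_size > 0);
    long batches = (nrequests + batch_size - 1) / batch_size;  /* round up */
    return batches * round_trip_ms;
}
```

With the 10 ms round trip used in the example above, 100 synchronous requests wait 100 × 10 = 1000 ms in this model, while the same requests sent in batches of 8 need 13 round trips, or 130 ms.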

Synchronization

Let's suppose that we use a buffer cache in order to improve the performance of NFS server communication. Whenever we receive data from the server, we update the buffer cache. Therefore, when we make a system call such as "read", the buffer cache is checked first. If the data already exists in the cache, we don't even have to communicate with the server. However, the biggest challenge with using this buffer cache, and caches in general, is synchronization. The following is a simple timing diagram in which two clients, A and B, communicate with one NFS server:

3b.jpg

So client A makes the system call "read" when there is nothing in the buffer cache, and shortly thereafter, the server responds with the data "foo". Client A updates his buffer cache to contain the data "foo". Client B then makes the "write" call and changes the data to "bar". When client A makes the call to "read" again, the buffer cache is the first thing that is checked and, since the data requested is already there, "foo" is returned, without ever making the call to the NFS server over the network. Client A was just a victim of inconsistent data! This problem is not limited to cases with caching. Take this timing diagram below:

6b.jpg

In the picture above, client A sent his "read" request before client B sent his "write" request. However, the "write" request reached the server before the "read" request, so the file was already changed by B's "write" when the server handled the "read". Because of the network delay, the data client A receives does not correspond to the order in which the two requests were issued.
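The cached-read scenario with clients A and B can be reproduced with a toy model in C. All types and function names here are hypothetical, and plain function calls stand in for RPCs. The point is only that a read served from the cache never consults the server.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

struct server { char data[8]; };

struct client {
    struct server *srv;
    bool cached;          /* true once cache holds a copy of the data */
    char cache[8];
};

/* Read through the cache: contact the server only on a miss. */
const char *client_read(struct client *c)
{
    assert(c != NULL && c->srv != NULL);
    if (!c->cached) {
        strcpy(c->cache, c->srv->data);   /* stands in for a read RPC */
        c->cached = true;
    }
    return c->cache;
}

/* Write straight to the server; other clients' caches are not told. */
void client_write(struct client *c, const char *s)
{
    assert(c != NULL && c->srv != NULL && strlen(s) < sizeof c->cache);
    strcpy(c->srv->data, s);              /* stands in for a write RPC */
    strcpy(c->cache, s);
    c->cached = true;
}
```

If A reads "foo", B then writes "bar", and A reads again, A's second read still returns "foo" even though the server now holds "bar".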

Data Consistency

Local file systems exhibit write-to-read consistency, which is what we're used to seeing: we expect that a write to a file, regardless of which process issues it, will affect subsequent reads of that same file. This is not necessarily the case with network file systems, because they exhibit close-to-open consistency. On a network file system, two processes that have the same file open will not necessarily see each other's changes until the writer closes the file and the reader reopens it. This is a direct result of the cases seen above in synchronization. The overall solution to the problem is simply to allow it to happen: local caches are used to increase performance, and for them to be effective the client cannot constantly check the original file on the server. A diagram of the effects of this type of consistency is seen below.

4b.jpg

Synchronization can still be achieved at a much finer granularity by enforcing locks or by checking for changes before writing the cache contents, but this level of synchronization is not supported natively within NFS. Producing these effects is similar to how security is performed in NFS 3, which is described further below.

As we've seen before, a number of policies can apply to a cache. The one that must hold for close-to-open consistency is that all writes are sent back to the NFS server when the file is closed. Beyond this, the cache may use a least-recently-used or least-recently-fetched policy for eviction and may also use dirty bits so that only modified pages are sent back to the server.
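The flush-on-close policy can be sketched with a toy client in C. The types and the `c_open`, `c_read`, `c_write`, and `c_close` names are hypothetical. As a simplification, open always refetches the whole file, where a real client would more likely check whether its cached copy is still current.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

struct server { char data[8]; };

struct client {
    struct server *srv;
    char cache[8];
    bool dirty;          /* cache holds writes not yet sent to the server */
};

/* open: discard anything cached earlier and fetch the server's copy. */
void c_open(struct client *c)
{
    assert(c != NULL && c->srv != NULL);
    strcpy(c->cache, c->srv->data);
    c->dirty = false;
}

/* read and write touch only the local cache while the file is open. */
const char *c_read(const struct client *c)
{
    return c->cache;
}

void c_write(struct client *c, const char *s)
{
    assert(strlen(s) < sizeof c->cache);
    strcpy(c->cache, s);
    c->dirty = true;
}

/* close: send any modified data back to the server. */
void c_close(struct client *c)
{
    if (c->dirty) {
        strcpy(c->srv->data, c->cache);
        c->dirty = false;
    }
}
```

In this model a reader that already has the file open keeps seeing its old copy even after the writer closes. Only a close followed by a fresh open picks up the new data.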

It may seem odd to sacrifice so much consistency for some performance, but the network should not be underestimated. Even without the performance measures in place, the network guarantees little about when requests from clients will actually reach the server. For this reason, many of the issues caused by caches are bound to happen anyway, and so caches may as well be used.

Vnodes

design-vnode.jpg

When a user program uses file system calls, it should not have to worry about dealing with files from different file system types. We can use layering to create a vnode interface and leave that job to the interface. A system call will operate on a vnode, and the vnode interface will make sure that the correct file system implementation handles the call. For example, consider the open operation. If the file is on the local file system, the vnode interface will simply return the local file. If it is on an NFS, however, the vnode interface will be responsible for carrying out the open operation over the network. The two cases should look no different to the file system call layer.
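In C, this kind of layering is commonly expressed as a table of function pointers attached to each vnode. The sketch below is illustrative only: the struct names, the single `read` operation, and the toy `backing` string are assumptions, not the interface of any real kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct vnode;

/* Operations every file system type must supply (only read is shown). */
struct vnode_ops {
    size_t (*read)(const struct vnode *vn, char *buf, size_t cap);
};

struct vnode {
    const struct vnode_ops *ops;   /* chosen when the vnode is created */
    const char *backing;           /* toy stand-in for the file's contents */
};

static int local_calls, rpc_calls; /* counters that make the dispatch visible */

/* Copy up to cap - 1 bytes of src into buf and NUL-terminate it. */
static size_t copy_out(char *buf, size_t cap, const char *src)
{
    size_t n = strlen(src);
    if (n > cap - 1)
        n = cap - 1;
    memcpy(buf, src, n);
    buf[n] = '\0';
    return n;
}

static size_t local_read(const struct vnode *vn, char *buf, size_t cap)
{
    local_calls++;                 /* a real version would read the disk */
    return copy_out(buf, cap, vn->backing);
}

static size_t nfs_read(const struct vnode *vn, char *buf, size_t cap)
{
    rpc_calls++;                   /* a real version would send a read RPC */
    return copy_out(buf, cap, vn->backing);
}

const struct vnode_ops local_ops = { local_read };
const struct vnode_ops nfs_ops = { nfs_read };

/* The system call layer: identical for every file system type. */
size_t vfs_read(const struct vnode *vn, char *buf, size_t cap)
{
    assert(vn != NULL && vn->ops != NULL && buf != NULL && cap > 0);
    return vn->ops->read(vn, buf, cap);
}
```

The caller of vfs_read never tests which kind of file it has. That choice was made once, when the vnode's ops pointer was set.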

File Handles

One way to make sure that the data requested by the client is correct is to implement the file handle system on the NFS server. A file handle is a handle that is given to the client for a particular file at a particular time. This is illustrated by the diagram below:

7.jpg

Suppose that the client does a lookup on the file "/foo/bar" at a certain time. The server will give client A handle 1 for the file "/foo/bar". Let's say that the file is then renamed on the server from "/foo/bar" to "/foo/baz", and that a new file named "/foo/bar" is created, so the server now contains both "/foo/baz" and a new "/foo/bar". Now the client does another lookup on "/foo/bar". In return, the server gives the client another file handle, this time for the new "/foo/bar". Let's call this new file handle "handle 2". Then the client decides he wants to do a "read" on file handle 1. He will correctly receive the data from "/foo/baz", since that is the file he originally looked up, even though the file name was changed! The same process happens if the client now asks to read data from file handle 2: the server will properly return the data from the new file "/foo/bar".

To the client side, the file handle is simply a representation of a file that lives on the server. To the server side, the file handle is a structure that consists of:

  • file system identifier: identifies which of the server's file systems holds the file.
  • inode number: identifies the file within that file system (as demonstrated in the previous diagram, we don't use pathnames to identify a file).
  • generation number: distinguishes successive files that reuse the same inode number. It is incremented when an inode is freed and reused, so a handle that refers to a deleted file can be recognized as stale instead of silently referring to the new file.

File handles provide a great way to maintain consistency between the client and server. The client doesn't have to be aware of all filename changes on the server side, since all it sees is the file handle. Using handles instead of filenames provides a great deal of flexibility and isolation between the client and server.
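The rename scenario described earlier can be modelled with a toy server in C. Everything here is a hypothetical simplification: the inode table, the fixed file system identifier of 1, and the `srv_` function names are assumptions, and the generation number is only compared for equality.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct file_handle { uint32_t fsid, inode, generation; };

/* Toy server-side inode table; the array index is the inode number. */
struct inode { char path[16]; char data[16]; uint32_t generation; int used; };
#define NINODES 4
static struct inode table[NINODES];

/* Create a file in the first free inode; returns its number or -1. */
int srv_create(const char *path, const char *data)
{
    assert(strlen(path) < 16 && strlen(data) < 16);
    for (int i = 0; i < NINODES; i++)
        if (!table[i].used) {
            strcpy(table[i].path, path);
            strcpy(table[i].data, data);
            table[i].used = 1;
            return i;
        }
    return -1;
}

/* lookup: the only operation that takes a path name. Returns 0 on success. */
int srv_lookup(const char *path, struct file_handle *fh)
{
    for (int i = 0; i < NINODES; i++)
        if (table[i].used && strcmp(table[i].path, path) == 0) {
            *fh = (struct file_handle){ 1, (uint32_t) i, table[i].generation };
            return 0;
        }
    return -1;
}

void srv_rename(uint32_t inode, const char *new_path)
{
    assert(inode < NINODES && strlen(new_path) < 16);
    strcpy(table[inode].path, new_path);
}

/* read: finds the file by inode number and never consults the path. */
const char *srv_read(const struct file_handle *fh)
{
    if (fh->inode >= NINODES || !table[fh->inode].used
        || table[fh->inode].generation != fh->generation)
        return NULL;                      /* handle no longer valid */
    return table[fh->inode].data;
}
```

A handle obtained for "/foo/bar" keeps reading the same file after that file is renamed to "/foo/baz", and a later lookup of "/foo/bar" yields a different handle for the newly created file.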

Security

The security layer in NFS 3 exists on the client side, so someone who can bypass the client's VFS gains less restricted access to the files on the server. This problem is fixed in NFS 4, which puts the security layer on the server. Unfortunately, this requires the server to remember some state, which violates its stateless design. The sacrifice guards against untrusted clients on the network and is considered much more worthwhile in practical implementations. Despite the improved security features of NFS 4, NFS 3 is still widely used.

A Detailed Specification of the Latest NFS version (NFS 4): http://tools.ietf.org/html/rfc3530

 
2006fall/notes/lec17.txt · Last modified: 2007/09/28 00:25 (external edit)
 