You are expected to understand this. CS 111 Operating Systems Principles, Spring 2006

Lab 1B: CS 111, Spring 06
Due Friday, April 28th at 12:00 PM
Assigned Saturday, April 15th

Overview

In this second part of Lab 1 you will build upon your command line parser to make a complete shell which can actually execute the parsed commands. You'll implement support for all commands that can be parsed, which may use features such as I/O redirection, pipes, and conditional, sequential, and background execution. You'll also implement the two internal commands cd and exit, which change the working directory and exit the shell, respectively. (Puzzle: why do these two commands have to be built into the shell, when all other commands are executed by running the corresponding program?)

What is a shell?

A shell, in Unix terms, is a program whose main purpose is to run other programs. The shell parses commands into lists of arguments, then executes those commands using fork() and execvp(). Here's a simple shell command:

$ echo foo

(The initial $ is not part of the command. That is shorthand for the prompt that the shell prints before reading a command.) The shell parses this command into two arguments, echo and foo. The echo argument names the binary that should be executed. So the shell forks a child process, which calls execvp() to execute echo with those two arguments. The echo program has a very simple job: it just prints its arguments to the console. (echo foo will just print foo.) Meanwhile, the parent waits for the child to finish; when it does, the parent returns to read another command.
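The fork, exec, and wait sequence described above can be sketched in C roughly as follows. This is only an illustration, not the lab's skeleton code: the helper name run_and_wait is made up for this example, and the choice of 127 as the "could not execute" status is a common convention rather than a requirement of the lab.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of the lab handout): run the program
 * named by argv[0] with arguments argv, wait for it to finish, and
 * return its exit status, or -1 if it could not be started or waited for. */
int run_and_wait(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: replace this process image with the requested program. */
        execvp(argv[0], argv);
        /* execvp() returns only if it failed. */
        perror(argv[0]);
        _exit(127);
    }

    /* Parent: block until the child has finished. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For the command above, argv would be the NULL-terminated array { "echo", "foo", NULL }.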

Each program has standard input, standard output, and standard error file descriptors, whose numbers are 0, 1, and 2, respectively. (You know them in C++ as cin, cout and cerr; in C the files are called stdin, stdout, and stderr.) The echo program writes its output to the standard output file descriptor. Normally this is the same as the shell's standard output, which is the terminal (your screen). But the shell lets you redirect these file descriptors to point instead to other files. For example:

$ echo foo > output.txt

This command doesn't print anything to the screen. But let's use the cat program, which reads a file and prints its contents to standard output, to see what's in output.txt:

$ cat output.txt
foo

The > filename operator redirects standard output, < filename redirects standard input, and 2> filename redirects standard error. (The syntax varies from shell to shell; we generally follow the syntax of the Bourne shell or bash.)
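One plausible way to implement > filename is for the child to open the file and use dup2() to make file descriptor 1 refer to it before calling execvp(). The sketch below assumes that approach; the helper name run_with_stdout_to and the 0666 creation mode are choices made for this example, not something the lab specifies.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv with standard output redirected to
 * path, roughly what "cmd > path" asks for. Returns the command's exit
 * status, or -1 on failure to fork or wait. */
int run_with_stdout_to(char *const argv[], const char *path)
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: set up the redirection, then exec. The parent's own
         * standard output is not affected. */
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
        if (fd < 0) {
            perror(path);
            _exit(1);
        }
        if (fd != STDOUT_FILENO) {
            if (dup2(fd, STDOUT_FILENO) < 0) {
                perror("dup2");
                _exit(1);
            }
            close(fd); /* the duplicate on descriptor 1 stays open */
        }
        execvp(argv[0], argv);
        perror(argv[0]);
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Input and error redirection would follow the same pattern, with O_RDONLY and STDIN_FILENO for < filename, and STDERR_FILENO for 2> filename.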

Shells offer many ways to chain commands together. For example, the ; operator says "do one command, then do another". This shell command prints two lines:

$ echo foo ; echo bar
foo
bar

The && and || operators chain commands together based on their exit status. If a command accomplishes its function successfully, that command generally exits with status 0, by calling exit(0). (This is also what happens when the program runs off the end of its main function.) But if there's an error, most commands will exit with status 1. For example, the cat command will exit with status 0 if it reads its files successfully, and 1 otherwise:

$ cat output.txt
foo                                                 [[exit status 0]]
$ cat doesnotexist.txt
cat: doesnotexist.txt: No such file or directory    [[exit status 1]]

Now, && says "execute the command on the right only if the command on the left exited with status 0". And || says "execute the command on the right only if the command on the left exited with status NOT equal to 0". For example:

$ cat output.txt && echo "output.txt exists!"
foo
output.txt exists!
$ cat doesnotexist.txt && echo "doesnotexist.txt exists!"
cat: doesnotexist.txt: No such file or directory    [[Note: does not run echo!]]
$ cat output.txt || echo "output.txt does not exist."
foo
$ cat doesnotexist.txt || echo "doesnotexist.txt does not exist."
cat: doesnotexist.txt: No such file or directory
doesnotexist.txt does not exist.
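The rule for ;, &&, and || boils down to a small decision on the previous exit status. Here is a minimal sketch; the chain_op type and its constants are invented for this example and are not the names used in cmdline.h.

```c
/* Hypothetical operator tags; the lab's own cmdline.h may represent
 * these operators differently. */
typedef enum { OP_SEMI, OP_AND, OP_OR } chain_op;

/* Decide whether the command to the right of op should run, given the
 * exit status of the command on its left. Returns 1 to run, 0 to skip. */
int should_run_next(chain_op op, int left_status)
{
    switch (op) {
    case OP_AND:
        return left_status == 0;  /* && : run only after success */
    case OP_OR:
        return left_status != 0;  /* || : run only after failure */
    default:
        return 1;                 /* ;  : always run */
    }
}
```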

Parentheses ( ) allow you to run a set of commands in a subshell. This, in turn, lets you redirect the standard input, output, and error of a group of commands at once. For example:

$ ( echo foo ; echo bar ) > output.txt
$ cat output.txt
foo
bar

The exit status of a subshell is the same as the exit status of the last command executed.

Finally, you can also execute a command in the background with the & operator. Normally, the shell will not read a new command until the previous command has exited. But the & operator tells the shell not to wait for the command.

$ echo foo &
$ foo

(foo is printed on top of the next shell prompt.)
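In terms of system calls, the difference between foreground and background execution is mostly whether the parent waits. The sketch below assumes that view; start_background and reap_finished are hypothetical names, and reaping finished children with WNOHANG is one possible way to avoid leaving zombie processes behind, not something the lab mandates.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start argv without waiting for it.
 * Returns the child's pid, or -1 if fork() failed. */
pid_t start_background(char *const argv[])
{
    pid_t pid = fork();
    if (pid == 0) {
        execvp(argv[0], argv);
        perror(argv[0]);
        _exit(127);
    }
    return pid; /* the parent returns immediately, without waitpid() */
}

/* Collect any children that have already exited, without blocking.
 * Returns the number of children reaped. */
int reap_finished(void)
{
    int reaped = 0;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped++;
    return reaped;
}
```

A shell could call something like reap_finished() each time it is about to print a prompt.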

You may find the following commands particularly useful for testing your shell. Find out what they do by reading their manual pages. Be creative with how you combine these! (Also see the Lab 1B tester, below.)

  • cat (print one or more files to standard output)
  • echo (print arguments to standard output)
  • true (exit with status 0)
  • false (exit with status 1)
  • sleep (wait for N seconds then exit)
  • sort (sort lines)

In addition to running programs from the file system, shells have builtin commands that provide functionality that could not be obtained otherwise. Our shell will implement two such builtin commands, to change directories (cd) and to exit the shell (exit).

The cd command changes the shell's current directory, which is the default directory that the shell uses for files. So cd dir changes the current directory to dir. (You can think of the current directory as the directory that shows up by default when you use an "Open" or "Save" dialog in a GUI program.) Of course, files can also be manipulated using absolute pathnames, which do not depend on the current directory; /home/cs111/cmdline.c is an example.
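The system call behind cd is chdir(). A minimal sketch of a cd handler follows; the name builtin_cd and the convention of returning 0 on success and 1 on failure are assumptions for this example, and the EXERCISE comments in ospsh.c remain the authority on what your shell must do.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical cd handler: change the calling process's current
 * directory to dir. Returns 0 on success, 1 on failure (after printing
 * a message), mirroring ordinary command exit statuses. */
int builtin_cd(const char *dir)
{
    if (chdir(dir) < 0) {
        perror(dir);
        return 1;
    }
    return 0;
}
```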

There may also come a time when you would like to leave the wonderful world of your shell; the exit command instructs the shell to exit with a given status. (exit alone exits with status 0.)

(Why are cd and exit part of the shell instead of standalone programs?)

Lab materials

The skeleton code you will get for this second part of the lab consists of an updated Makefile and three additional files beyond what Lab 1A provided: ospsh.c, ospsh.h, and main-b.c. They are contained in the file lab1b.tar.gz which you can extract just like you did in Lab 1A. You will need to copy your version of cmdline.c from your lab1a directory (with the cp command). Most of the instructions for Lab 1B are included as comments in ospsh.c, but there is one exercise to complete in main-b.c. Additionally, there is one feature you will need to add to ospsh yourself, described below.

You will likely find it helpful to learn more about the following functions (their manual pages are fairly descriptive): fork(), WEXITSTATUS(), execvp(), waitpid(), open(), close(), dup2(), pipe(), and chdir(). Remember that you can read manual pages by using the man program.
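To show how several of these calls fit together, here is a sketch of a two-command pipeline, equivalent in spirit to left | right. It is an illustration under simplifying assumptions (exactly two commands, no other redirections), and the name run_pipe is made up for this example.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child-side helper: make fd `from` become fd `to`, close both pipe
 * ends, and exec argv. Never returns. */
static void exec_with_fd(int from, int to, int fds[2], char *const argv[])
{
    if (dup2(from, to) < 0) {
        perror("dup2");
        _exit(1);
    }
    close(fds[0]);
    close(fds[1]);
    execvp(argv[0], argv);
    perror(argv[0]);
    _exit(127);
}

/* Hypothetical helper: run "left | right" and return right's exit
 * status, or -1 on failure to set up the pipeline. */
int run_pipe(char *const left[], char *const right[])
{
    int fds[2]; /* fds[0] is the read end, fds[1] is the write end */
    if (pipe(fds) < 0) {
        perror("pipe");
        return -1;
    }

    pid_t lpid = fork();
    if (lpid == 0)
        exec_with_fd(fds[1], STDOUT_FILENO, fds, left);

    pid_t rpid = (lpid > 0) ? fork() : -1;
    if (rpid == 0)
        exec_with_fd(fds[0], STDIN_FILENO, fds, right);

    /* Parent: close both ends, or the reader never sees end-of-file. */
    close(fds[0]);
    close(fds[1]);

    int status = 0;
    if (lpid > 0)
        waitpid(lpid, NULL, 0);
    if (rpid < 0 || waitpid(rpid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Forgetting to close unused pipe ends is a classic source of pipelines that hang.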

Our solution to the labeled lab exercises takes 109 lines of code above & beyond the lab handout (not counting cmdline.c and cmdline.h from Lab 1A).

Running

The updated Makefile's default target is now the program ospsh, rather than cmdline. (You can still compile just the parsing parts of your program by running "make cmdline" to tell make to build that target.) To run your shell, you can type "./ospsh" at the command prompt. It will display its prompt, just like cmdline did. Now you can type commands to it, and after each command you type, it will display the parser output and then attempt to execute the command. Initially, it won't execute anything, but as you complete the lab, more and more kinds of commands will work. To quit your shell, you can press control-C to kill it, and after you implement the exit command, you will be able to type "exit" to terminate your shell.

Open-Ended Problem

Most of the exercises in the lab are specified in the source code as EXERCISE comments. However, there is one exercise which is left entirely to you: serial background execution.

The background execution operator & ordinarily runs programs in parallel in the background. For instance, this foreground command:

echo foo ; sleep 5 ; echo bar ;

will print foo (the echo command prints its arguments), wait five seconds, print bar, and finally return to the shell prompt. But in this background command:

echo foo & sleep 5 & echo bar &

all three commands start running at the same time, and all in the background. foo and bar are printed immediately, and the shell prompt appears immediately as well.

The exercise is to implement a new feature: serial background execution using the , operator. (You should add this operator to your cmdline.c and cmdline.h.) The analogous command line using ,:

echo foo , sleep 5 , echo bar ,

will execute the three commands in the background relative to other processes; for example, the shell prompt will appear immediately. However, a serial background process starts only after the previous serial background process has exited. Thus, foo will appear, then, five seconds later, bar will appear; but in the meantime you can enter new commands at the command prompt.
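To make the intended behavior concrete for a single command line, here is a deliberately incomplete sketch: a helper process runs the commands one after another while the caller returns at once. It is not one of the course's solutions, the name start_serial_background is hypothetical, and it does nothing about queueing behind serial background commands from earlier command lines, which the full exercise requires.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical, incomplete sketch: run cmds[0..ncmds-1] one after
 * another inside a helper process. The caller gets the helper's pid
 * back immediately (or -1 if fork() failed) and does not wait. */
pid_t start_serial_background(char *const *cmds[], int ncmds)
{
    pid_t helper = fork();
    if (helper != 0)
        return helper; /* parent, or fork failure: return right away */

    /* Helper process: start each command only after the previous one
     * has exited. */
    for (int i = 0; i < ncmds; i++) {
        pid_t pid = fork();
        if (pid == 0) {
            execvp(cmds[i][0], cmds[i]);
            perror(cmds[i][0]);
            _exit(127);
        }
        if (pid > 0)
            waitpid(pid, NULL, 0);
    }
    _exit(0);
}
```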

New serial background processes are "queued up" behind old serial background processes, including serial background processes from previous command lines. Thus, the following three command lines,

echo foo ,
sleep 5 ,
echo bar ,

should have the same effect as the single command line echo foo , sleep 5 , echo bar ,. Note that you are not required to support serial background processes within parentheses.

You can even run a command in the foreground while commands are running in the background. For example, this command line starts the serial background commands, waits two seconds in the foreground, and then runs uptime:

echo foo , sleep 5 , echo bar , sleep 2 ; uptime

We've implemented three solution approaches. One approach executes serial background processes in a separate thread, using the POSIX threads library (-lpthread, man pthread_create). One approach executes them in another process, connected by a single pipe. And one approach uses a pipeline to control when serial background processes start and stop. Any of these approaches will work -- but make sure to avoid race conditions as much as you can! (If you can't figure out how to avoid a race condition, at least tell us what that race condition is.) Feel free to ask us questions.

Each of our solution approaches takes about 50-100 lines of code.

Testing

We have provided a lab1b-tester.pl script that will help you to test your shell by giving it a set of sample inputs designed to test many of the features of your shell. You may also compare your shell's behavior to that of the shell bash, which might even be your default system shell. If you're not sure, you can always run it by typing bash at your prompt. With the exception of the challenge problem !> redirection operator below and the , serial background operator, all the syntax your shell is required to handle is also accepted by bash.

We will be using scripts very similar to the tester scripts included with Lab 1B to actually grade your labs. To run the Lab 1A tester, you may need to run "make cmdline" first in order to compile the Lab 1A portions of the lab separately. These tester scripts expect to be run on a Linux machine; although you can do Lab 1 on other Unix machines or even Windows with Cygwin installed, the tester scripts will incorrectly report errors for some test cases when they are run on platforms other than Linux. It is probably easiest to just use a Linux machine, since later labs will require it anyway.

The Lab 1B tester does not test serial background execution at all. You should design your own tests for that functionality.

Challenge problem ideas

Environment variables

In most shells, you can type a command such as export VAR=value, which assigns the value value to the "environment" variable named VAR. (Take a look at the getenv() and setenv() functions.) The environment, like all other parts of a process, is copied on a fork() call, but unlike many parts of a process, it is preserved during an exec() call. Child processes can therefore access these variables. In addition, the shell itself should use them so that in subsequent commands, the token $VAR expands to the variable's current value. For instance, if you execute the command "STR=blah ; echo $STR $STR" you would get the output "blah blah" as a result. Implement this.
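The inheritance described above can be seen with a small experiment: set a variable with setenv() in the parent, then fork and exec a child that reads it. This is a sketch only; the helper name export_then_run is hypothetical, and a real implementation would also need to handle $VAR expansion in the parser, which is not shown here.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: set name=value in this process's environment,
 * then run argv in a child, which inherits that environment across
 * fork() and execvp(). Returns the child's exit status, or -1. */
int export_then_run(const char *name, const char *value, char *const argv[])
{
    if (setenv(name, value, 1) < 0) { /* 1 = overwrite an existing value */
        perror("setenv");
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        execvp(argv[0], argv);
        perror(argv[0]);
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

After the call, getenv(name) in the parent returns the new value as well, since setenv() modified the parent's own environment.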

Customizable prompt

Implement a user-customizable prompt, so that instead of the bland cs111_spring06$ you can execute a command to set the prompt to something different. However, just a different fixed string is not really all that exciting, so instead, make your shell able to parse "escape sequences" in the prompt string that will be expanded to various information every time the prompt is displayed. For instance, support the escape sequences \u, \h, \w, and \t, to display the user name, host name, current working directory, and current time, respectively. Thus setting the prompt to "\t \u@\h:\w$ " would display "1:45 PM joeuser@server:/home/joeuser$ " as your prompt.
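One possible shape for the escape expansion is a loop that copies the prompt string and substitutes each recognized escape. The sketch below is an assumption-laden illustration: expand_prompt is a hypothetical name, unknown escapes are kept verbatim by choice, and the time format used for \t prints a leading zero (01:45 PM), unlike the example above.

```c
#include <pwd.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Append src to the NUL-terminated string in dst (capacity cap),
 * truncating if it does not fit. */
static void append(char *dst, size_t cap, const char *src)
{
    size_t used = strlen(dst);
    if (used + 1 < cap)
        snprintf(dst + used, cap - used, "%s", src);
}

/* Hypothetical helper: expand \u, \h, \w, and \t in fmt into out. */
void expand_prompt(const char *fmt, char *out, size_t cap)
{
    char piece[256];
    if (cap == 0)
        return;
    out[0] = '\0';

    for (const char *p = fmt; *p != '\0'; p++) {
        piece[0] = '\0';
        if (*p != '\\' || p[1] == '\0') {
            /* Ordinary character (or a trailing backslash): copy it. */
            piece[0] = *p;
            piece[1] = '\0';
        } else if (*++p == 'u') {
            struct passwd *pw = getpwuid(getuid());
            snprintf(piece, sizeof piece, "%s", pw ? pw->pw_name : "?");
        } else if (*p == 'h') {
            if (gethostname(piece, sizeof piece) != 0)
                strcpy(piece, "?");
            piece[sizeof piece - 1] = '\0';
        } else if (*p == 'w') {
            if (getcwd(piece, sizeof piece) == NULL)
                strcpy(piece, "?");
        } else if (*p == 't') {
            time_t now = time(NULL);
            struct tm *tm = localtime(&now);
            if (tm == NULL || strftime(piece, sizeof piece, "%I:%M %p", tm) == 0)
                strcpy(piece, "?");
        } else {
            /* Unknown escape: keep it verbatim. */
            snprintf(piece, sizeof piece, "\\%c", *p);
        }
        append(out, cap, piece);
    }
}
```

Expanding the string each time the prompt is displayed, rather than once when it is set, keeps \w and \t current.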

Tab completion and history with readline

You may have noticed that in the default shell on the Linux machines in the lab, you can use the arrow keys to edit the current command or access the command history. Additionally, you can use the tab key to automatically complete partially typed commands, if the shell can figure out what you were trying to type (by searching for programs or files starting with what you have typed so far). Add support for these features by using the GNU readline library. (Documentation for it is available online.) Your tab completion should handle both program names in the default path (use the PATH environment variable) and files in the current directory, depending on whether the shell should expect a command or an argument next on the command line.

More redirections

You are only required to support basic input, output, and error redirection with the <, >, and 2> symbols. There are several other options you could add for redirection, including:

  • >> filename: output redirection that will append instead of overwrite.
  • !> filename: output redirection that will not overwrite an existing file.
  • i>&j (where i and j are numbers): redirects the output of file descriptor i to file descriptor j.
  • >& filename: redirects both stdout and stderr to filename.
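Most of these variants differ only in the flags passed to open(). The sketch below maps the first two onto O_APPEND and O_EXCL; the redir_mode type and both function names are invented for this example. The i>&j form needs no open() at all, since it amounts to dup2(j, i), and >& filename can be built by redirecting both descriptors 1 and 2 to the same file.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical tags for the output redirection variants. */
typedef enum {
    REDIR_TRUNC,     /* >  filename : create or overwrite      */
    REDIR_APPEND,    /* >> filename : create or append         */
    REDIR_NOCLOBBER  /* !> filename : fail if the file exists  */
} redir_mode;

/* Open path for output according to mode. Returns a file descriptor,
 * or -1 with errno set (EEXIST for REDIR_NOCLOBBER on an existing file). */
int open_redirect_target(const char *path, redir_mode mode)
{
    int flags = O_WRONLY | O_CREAT;
    switch (mode) {
    case REDIR_APPEND:
        flags |= O_APPEND;
        break;
    case REDIR_NOCLOBBER:
        flags |= O_EXCL;
        break;
    default:
        flags |= O_TRUNC;
        break;
    }
    return open(path, flags, 0666);
}

/* Make target_fd (for example STDOUT_FILENO) refer to path.
 * Returns 0 on success, -1 on failure. */
int redirect_fd(int target_fd, const char *path, redir_mode mode)
{
    int fd = open_redirect_target(path, mode);
    if (fd < 0)
        return -1;
    if (fd != target_fd) {
        if (dup2(fd, target_fd) < 0) {
            close(fd);
            return -1;
        }
        close(fd);
    }
    return 0;
}
```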

Hand-in

When you are finished, edit the file named LAB and follow the instructions at the top of the file to fill in your name(s), student ID(s), email address(es), short descriptions of any extra credit or challenge problems you did, any known limitations of your code (including known bugs), and any other information you'd like us to have.

Then run "make tarball" which will generate a file lab1b-yourusername.tar.gz inside the lab1b directory. Upload this file to CourseWeb using a web browser to turn in the project. Remember to upload it only once if you are working in a team - the LAB file will allow us to give both team members credit.

Good luck!

 
2006spring/lab1b.txt · Last modified: 2007/10/03 10:11 by kohler
 