You are expected to understand this. CS 111 Operating Systems Principles, Winter 2011
You are here: CS111: [[lab1b]]
 
 
 

Lab 1b: CS 111, Winter 2011

Due Monday, January 24

Skeleton code available on SEASnet

Overview

In this second part of Lab 1 you will build upon your command line parser to make a complete shell which can actually execute the parsed commands. You'll implement support for all commands that can be parsed, which may use features such as I/O redirection, pipes, and conditional, sequential, and background execution. You'll also implement the two internal commands cd and exit, which change the working directory and exit the shell, respectively. (Puzzle: why do these two commands have to be built into the shell, when all other commands are executed by running the corresponding program?)

We've provided extensive skeleton code for these parts of the lab. But in addition, you must complete an open-ended problem for which we have not provided skeleton code, namely command queues.

What is a shell?

A shell is a program whose main purpose is to run other programs. The shell parses commands into lists of arguments, then executes those commands using fork() and execvp(). Here's a simple shell command:

$ echo foo

(The initial $ is not part of the command. That is shorthand for the prompt that the shell prints before reading a command.) The shell parses this command into two arguments, echo and foo. The echo argument names the binary that should be executed. So the shell forks a child process to execute echo with those two arguments. The echo program has a very simple job: it just prints its arguments to the console. (echo foo will just print foo.) Meanwhile, the parent waits for the child to finish; when it does, the parent returns to read another command.

You may be interested in a reasonable tutorial for Unix shells. You can find others by searching for, e.g., ''shell tutorial'' on Google. Let us know if you find one you really like.

Each program has standard input, standard output, and standard error file descriptors, whose numbers are 0, 1, and 2, respectively. (You know them in C++ as cin, cout and cerr; in C the files are called stdin, stdout, and stderr.) The echo program writes its output to the standard output file descriptor. Normally this is the same as the shell's standard output, which is the terminal (your screen). But the shell lets you redirect these file descriptors to point instead to other files. For example:

$ echo foo > output.txt

This command doesn't print anything to the screen. But let's use the cat program, which reads a file and prints its contents to standard output, to see what's in output.txt:

$ cat output.txt
foo

The > filename operator redirects standard output, < filename redirects standard input, and 2> filename redirects standard error. (The syntax varies from shell to shell; we generally follow the syntax of the Bourne shell or bash.)

Shells offer many ways to chain commands together. For example, the ; operator says "do one command, then do another". This shell command prints two lines:

$ echo foo ; echo bar
foo
bar

The && and || operators chain commands together based on their exit status. If a command accomplishes its function successfully, that command generally exits with status 0, by calling exit(0). (This is also what happens when the program runs off the end of its main function.) But if there's an error, most commands will exit with status 1. For example, the cat command will exit with status 0 if it reads its files successfully, and 1 otherwise:

$ cat output.txt
foo                                                 [[exit status 0]]
$ cat doesnotexist.txt
cat: doesnotexist.txt: No such file or directory    [[exit status 1]]

Now, && says "execute the command on the right only if the command on the left exited with status 0". And || says "execute the command on the right only if the command on the left exited with status NOT equal to 0". For example:

$ cat output.txt && echo "output.txt exists!"
foo
Output.txt exists!
$ cat doesnotexist.txt && echo "doesnotexist.txt exists!"
cat: doesnotexist.txt: No such file or directory    [[Note: does not run echo!]]
$ cat output.txt || echo "output.txt does not exist."
foo
$ cat doesnotexist.txt || echo "doesnotexist.txt does not exist."
cat: doesnotexist.txt: No such file or directory
doesnotexist.txt does not exist.

Parentheses ( ) allow you to run a set of commands in a subshell. This, in turn, lets you redirect the standard input, output, and error of a group of commands at once. For example:

$ ( echo foo ; echo bar ) > output.txt
$ cat output.txt
foo
bar

The exit status of a subshell is the same as the exit status of the last command executed.

You can also execute a command in the background with the & operator. Normally, the shell will not read a new command until the previous command has exited. But the & operator tells the shell not to wait for the command.

$ echo foo &
$ foo

(foo is printed on top of the next shell prompt.)

Finally, the pipe operator | sends the output of one command to the input of another. For example:

$ echo foo | rev
oof

You may find the following commands particularly useful for testing your shell. Find out what they do by reading their manual pages. Be creative with how you combine these! (Also see the Lab 1b tester, below.)

  • cat (print one or more files to standard output)
  • echo (print arguments to standard output)
  • true (exit with status 0)
  • false (exit with status 1)
  • sleep (wait for N seconds then exit)
  • sort (sort lines)

In addition to running programs from the file system, shells have builtin commands that provide functionality that could not be obtained otherwise. Our shell will implement two such builtin commands, to change directories (cd) and to exit the shell (exit).

The cd command changes the shell's current directory, which is the default directory that the shell uses for files. So cd dir changes the current directory to dir. (You can think of the current directory as the directory that shows up by default when you use an "Open" or "Save" dialog in a GUI program.) Of course, files can also be manipulated using absolute pathnames, which do not depend on the current directory; /home/cs111/cmdline.c is an example.

There may also come a time when you would like to leave the wonderful world of your shell; the exit command instructs the shell to exit with a given status. (exit alone exits with status 0.)

(Why are cd and exit part of the shell instead of standalone programs?)

Lab materials

The skeleton code you will get for this second part of the lab consists of an updated Makefile and three additional files beyond what Lab 1a provided: ospsh.c, ospsh.h, and main-b.c. They are contained in the file lab1b.tar.gz which you can extract just like you did in Lab 1a. You will need to copy your version of cmdline.c from your lab1a directory (with the cp command). Most of the instructions for Lab 1b are included as comments in ospsh.c, but there is one exercise to complete in main-b.c. Additionally, you will need to add simple job control features, as described below.

You will likely find it helpful to learn more about the following functions (their manual pages are fairly descriptive): fork(), WEXITSTATUS(), execvp(), waitpid(), open(), close(), dup2(), pipe(), and chdir(). Remember that you can read manual pages by using the man program.

Our solution to the labeled lab exercises takes 109 lines of code above & beyond the lab handout (not counting cmdline.c and cmdline.h from Lab 1a).

Running

The updated Makefile's default target is now the program ospsh, rather than cmdline. (You can still compile just the parsing parts of your program by running "make cmdline" to tell make to build that target.) To run your shell, you can type "./ospsh" at the command prompt. It will display its prompt, just like cmdline did. Now you can type commands to it, and after each command you type, it will display the parser output and then attempt to execute the command. Initially, it won't execute anything, but as you complete the lab, more and more kinds of commands will work. To quit your shell, you can press control-C to kill it, and after you implement the exit command, you will be able to type "exit" to terminate your shell.

Open-Ended Problem

Most of the exercises in the lab are specified in the source code as EXERCISE comments. However, there is one exercise which is left up to to you: command queues. Like background commands, command queues let a shell user run commands in parallel. But command queues limit the number of commands that execute simultaneously. For instance, a queue with parallelism 2 will run at most 2 commands at a time. Additional commands are saved in a queue; they execute, in order, as running commands complete, with at most 2 running commands active at any time. Command queues let a user run a bunch of commands without taking up lots of machine resources. It also is an interesting way to prevent certain limited fork bombs.

Implement command queues using three special shell commands.

  1. The makeq qname size command creates a command queue named qname with parallelism size. You can also use makeq to reset the parallelism of an existing queue.
  2. The q qname command... command adds a command to the command queue named qname. This command will be run when there is room in the queue, possibly immediately. Much like a background command, the q command returns immediately with status 0. (This is true even if command must wait to run.)
  3. The waitq qname command blocks until qname is empty, then returns with status 0.

There is no command to delete a command queue.

Some examples may help you understand. This simple example shows a command queue with parallelism 1. Such a queue runs one command at a time. Here, we enter the queue commands at the cs111_winter11$ prompt:

 cs111_winter11$ makeq myq 1
 [3 args "makeq" "myq" "1"] .
 cs111_winter11$ q myq echo "This is printed immediately."
 [4 args "q" "myq" "echo" "This is printed immediately."] .
 cs111_winter11$ This is printed immediately.
 q myq echo "That emptied the queue, so this is printed immediately too."
 [4 args "q" "myq" "echo" "That emptied the queue, so this is printed immediately too."] .
 cs111_winter11$ That emptied the queue, so this is printed immediately too.
 q myq sleep 1 ; q myq echo "This should run after 1 second." ; echo "This should be printed first." ; sleep 2
 [4 args "q" "myq" "sleep" "1"] ;
 [4 args "q" "myq" "echo" "This should run after 1 second."] ;
 [2 args "echo" "This should be printed first."] ;
 [2 args "sleep" "2"] .
 This should be printed first.
 This should run after 1 second.                 ## (it appeared after 1 second)
 cs111_winter11$

This example shows a queue with parallelism 2. This time, we give the commands all at once, in a command file. When run as ./ospsh < commandfile, ospsh will print messages with the described delays. (Figure out why.)

 makeq myq2 2
 q myq2 sleep 1
 q myq2 sleep 2
 q myq2 echo This is printed 1 second after the shell began.
 q myq2 sleep 3
 q myq2 echo This is printed 2 seconds after the shell began.
 waitq myq2
 echo This is printed 4 seconds after the shell began.

If a command queue command is called with too few arguments, or bad arguments, it should write q: Syntax error (or makeq: Syntax error, or waitq: Syntax error) to standard error. For example:

 cs111_winter11$ q X
 q: Syntax error
 cs111_winter11$ q
 q: Syntax error

Note that queue commands can have redirections just like normal commands.

 cs111_winter11$ q X 2> /tmp/x
 cs111_winter11$ cat /tmp/x
 q: Syntax error

Queued commands should "wake up" immediately after the relevant job ID completes, even if the shell is waiting for another process at the time. For example:

 makeq myq3 1
 q myq3 sleep 50
 q myq3 echo This should appear after 50 seconds
 sleep 100 ; echo This should appear after 100 seconds
   =>
 ... waits about 50 seconds, then prints: ...
 This should appear after 50 seconds
 ... waits another 50 seconds or so, then prints: ...
 This should appear after 100 seconds

Beware of errors! Unless you are careful, problematic commands might mess up your queue.

 makeq myq4 1
 q myq4 sleep 10
 q myq4 echo This should appear after 10 seconds
 q myq4                                                   ## Should cause a syntax error
 q myq4 echo This should appear immediately thereafter    ## ... but a common error might cause it to NEVER appear

Subshell note: Command queues in subshells (parentheses) don't need to be supported. The queueing commands do not mean anything special inside a subshell.

Maximum queue size note: A command queue must be able to support up to 1024 enqueued processes, but it doesn't need to support more than 1024 enqueued processes.

Current directory note: We will not test interactions between the cd command and queued commands. In the following, it is OK for the queued ls command to execute in either / or /tmp.

 makeq myq5 1
 q myq5 sleep 10
 cd /
 q myq5 ls
 cd /tmp

Prompt note: You do not need to run queued commands while ospsh is blocked at a cs111_winter11$ prompt. (You do need to run queued commands while ospsh is blocked waiting for another process, as described above.) If you make this work, though, tell us in answers.txt.

Implementing the q command will be a bit tricky. Questions to ponder: How does the shell know when a background process completes? How can the shell tell a waiting q command to start running?

Implementing the waitq command is also tricky. If you design your q implementation well, you should be able to reuse most of it for waitq! But note the difference: a q command will run whenever there is room in the queue, but waitq must not complete until the queue has no active processes.

It is OK if your command queue implementation creates a process for a queued command as soon as it is entered. However, those processes must block until it is their turn to run according to the queue.

Command queueing is, as far as we know, a feature unique to ospsh, but features like it are common in real Unix shells. In general features like q are given the name job control. See this reference for a good introduction to job control.

Testing

We have provided a lab1b-tester.pl script that will help you to test your shell by giving it a set of sample inputs. You can also run it using make grade, which will run both the Lab 1a tests and the Lab 1b tests. However, lab1b-tester.pl does not test all of the behavior the lab requires. You should also design your own test cases to run either by hand or automatically.

In many cases, you may compare your shell's behavior to that of the shell bash. (This might be your default system shell, but you can always run it by typing bash at your prompt.) All the syntax your shell is required to handle is also accepted by bash. In a few isolated cases, such as > out with no command, your shell should produce an error on syntax that is OK for bash; the lab1a tester caught those cases. Our code also produces different results from bash on commands involving multiple && and ||s in the same command line, and for some pipelines (bash waits for all processes in a pipeline to exit, but ospsh only waits for the right-hand side). It will be easier to follow our code, but you may also change the code to produce results like bash; let us know in answers.txt if you do this.

We will be using scripts very similar to the tester scripts included with Lab 1b to actually grade your labs. These tester scripts expect to be run on a Linux machine; although you can do Lab 1 on other Unix machines or even Windows with Cygwin installed, the tester scripts will incorrectly report errors for some test cases when they are run on platforms other than Linux. It is probably easiest to just use a Linux machine, since later labs will require it anyway.

The Lab 1b tester does not test command queueing at all. You should design your own tests for that functionality.

Design problem ideas

If you have been assigned a design problem for Lab 1, you may implement one of the following problems, or design your own, as long as it's similar in difficulty and the course staff approves. (If you want to design your own, do it early and get approval from us early.)

How to do design problems.

Control structures

Design and implement analogues of the if, for, and while control structures common to many programming languages. For example, your if structure should execute several commands in order, but only if some condition is true -- for example, only if a command exits with status 0. You may investigate current shells, such as bash, to see how they implement these structures, but your design must differ from bash/tcsh in at least one way. Point out that difference and describe why you think your version is better.

Shell functions

Design and implement a way for shell users to write their own "functions". Once a function f is defined, typing f at the command line will execute the function, rather than a binary program named f. For example, the user might write a function echo_twice that printed its arguments twice, by running the echo command twice. Discuss how other command line arguments will be passed to the shell function. As above, you may investigate current shells, such as bash, to see how they implement these structures, but your design must differ from bash/tcsh in at least one way. Point out that difference and describe why you think your version is better.

Complex redirections

Design and implement a new shell syntax that allows more complex forms of redirection. In particular, your syntax should support arbitrary redirection graphs, including:

  • Run five commands, a, b, c, d, and e, where:
    • a stdout ==> b stdin (this is a conventional pipe).
    • a stderr ==> c stdin.
    • b file descriptor 3 ==> d file descriptor 4.
    • d stdout ==> e stdin.
  • Run two commands, a and b, where:
    • a stdout ==> b stdin.
    • b stdout ==> a stdin. (This is a "bidirectional" pipe.)

You may investigate bash's implementation for some ideas (look for >&), but bash does not completely support this feature.

Tab completion

You may have noticed that in the default shell on the Linux machines in the lab, you can use the tab key to automatically complete partially typed commands, if the shell can figure out what you were trying to type (by searching for programs or files starting with what you have typed so far). Add support for this tab completion feature. Your tab completion should handle both program names in the default path (use the PATH environment variable) and special commands, such as cd and exit. You may want to investigate the GNU readline library, which can help.

Hand-in

When you are finished, edit the file named answers.txt and follow the instructions at the top of the file to fill in your name(s), student ID(s), email address(es), short descriptions of any extra credit or challenge problems you did, any known limitations of your code (including known bugs), and any other information you'd like us to have.

Then run "make tarball" which will generate a file lab1b-yourusername.tar.gz inside the lab1b directory. Upload this file to CourseWeb using a web browser to turn in the project. Remember to upload it only once if you are working in a team - the answers.txt file will allow us to give both team members credit.

Good luck!

 
lab1b.txt · Last modified: 2011/01/23 20:10 by kohler
 
Recent changes RSS feed Driven by DokuWiki