Writing Shell Scripts

What's a script?

Shell scripting, at its most basic, means taking a series of commands that you might type at a command line and putting them into a file, so that you can reproduce them later or run them repeatedly without retyping them.

You can use scripts to automate repeated tasks and to handle complex tasks that are difficult to get right without repeated tries, redoing some of the coding, or both. Such scripts include:

Available shells

The shell that you'll likely use for scripting under Neutrino is ksh, a public-domain implementation of the Korn shell. The sh command is usually a symbolic link to ksh. For more information about this shell, see:

Neutrino also supplies or uses some other scripting environments:

In general, a shell script is most useful and powerful for running programs and manipulating files within the filesystem, whereas sed, gawk, and perl are primarily for working with the contents of files. For more information, see:

Running a shell script

You can execute a shell script in these ways:
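As a rough sketch, three common ways to run a script from a POSIX-style shell such as ksh are shown below. The function name run_three_ways and the file name my_script are hypothetical, and the example assumes that mktemp -d is available to create a scratch directory.

```shell
# run_three_ways: hypothetical demonstration of three ways to run a script.
run_three_ways() {
    dir=$(mktemp -d) || return 1
    # Create a one-line script to experiment with.
    printf '%s\n' 'echo "hello from my_script"' > "$dir/my_script"
    (
        cd "$dir" || exit 1
        # 1. Start a new shell and give it the script as an argument.
        sh my_script
        # 2. Read the script into the current shell (as a "dot file").
        . ./my_script
        # 3. Make the script executable, then invoke it by its path.
        chmod +x my_script
        ./my_script
    )
    rm -rf "$dir"
}
```

Each of the three invocations prints the same line. The second form differs in that the commands run in the current shell, so any variables the script sets remain in effect afterward.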

The first line

The first line of many — if not most — shell scripts is in this form:

#! interpreter [arg]

For example, a Korn shell script likely starts with:

#! /bin/sh

The line starts with a #, which indicates a comment, so the line is ignored by the shell processing this script. The initial two characters, #!, aren't important to the shell, but the loader code in procnto recognizes them as an instruction to load the specified interpreter and pass it:

  1. the path to the interpreter
  2. the optional argument specified on the first line of the script
  3. the path to the script
  4. any arguments you pass to the script

For example, if your script is called my_script, and you invoke it as:

./my_script my_arg1 my_arg2 ...

then procnto loads:

interpreter [arg] ./my_script my_arg1 my_arg2 ...

Note:
  • The interpreter can't be another #! script.
  • The kernel ignores any setuid and setgid permissions on the script; the child still has the same user and group IDs as its parent. (For more information, see Setuid and setgid in the Working with Files chapter of this guide.)
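One way to see the argument list that the loader builds is to name a program that simply prints its arguments as the interpreter. The sketch below assumes that echo is available as /bin/echo and that mktemp -d exists; show_loader_args and demo_arg are hypothetical names used only for illustration.

```shell
# show_loader_args: hypothetical demonstration of how a #! line is expanded.
show_loader_args() {
    dir=$(mktemp -d) || return 1
    # The "interpreter" is echo, so it just prints the arguments it receives:
    # the optional argument, the path to the script, then the script's arguments.
    printf '%s\n' '#! /bin/echo demo_arg' > "$dir/my_script"
    chmod +x "$dir/my_script"
    ( cd "$dir" && ./my_script my_arg1 my_arg2 )
    rm -rf "$dir"
}
```

The path to the interpreter itself is passed as argument 0, which echo doesn't print, so the output starts with the optional argument from the first line.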

Some interpreters adjust the list of arguments:

For example, let's look at some simple scripts that echo their own arguments.

Arguments to a ksh script

Suppose we have a script called ksh_script that looks like this:

#! /bin/sh
# Print the name the script was invoked as, then each argument on its own line.
echo "$0"
for arg in "$@" ; do
  echo "$arg"
done

If you invoke it as ./ksh_script one two three, the loader invokes it as /bin/sh ./ksh_script one two three, and then ksh removes itself from the argument list. The output looks like this:

./ksh_script
one
two
three
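The double quotes around $@ in the loop matter when an argument contains spaces. As a small sketch (the function names are hypothetical), compare how many arguments arrive when $@ is forwarded with and without quotes:

```shell
# count_args: print the number of arguments received.
count_args() {
    echo "$#"
}

# Forward the arguments exactly as they were passed.
forward_quoted() {
    count_args "$@"
}

# Forward the arguments unquoted; the shell splits them again at whitespace.
forward_unquoted() {
    count_args $@
}
```

With the default field separators, forward_quoted "one two" three reports 2 arguments, while forward_unquoted "one two" three reports 3.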

Arguments to a gawk script

Next, let's consider the gawk version, gawk_script, which looks like this:

#!/usr/bin/gawk -f
BEGIN {
        for (i = 0; i < ARGC; i++)
                print ARGV[i]
}

The -f argument is important; it tells gawk to read its script from the given file. Without -f, this script wouldn't work as expected.

If you run this script as ./gawk_script one two three, the loader invokes it as /usr/bin/gawk -f ./gawk_script one two three, and then gawk changes its full path to gawk. The output looks like this:

gawk
one
two
three

Arguments to a perl script

The perl version of the script, perl_script, looks like this:

#! /usr/bin/perl
for ($i = 0; $i <= $#ARGV; $i++) {
    print "$ARGV[$i]\n";
}

If you invoke it as ./perl_script one two three, the loader invokes it as /usr/bin/perl ./perl_script one two three, and then perl removes itself and the name of the script from the argument list. The output looks like this:

one
two
three

Example of a Korn shell script

As a quick tutorial in the Korn shell, let's look at a script that searches C source and header files in the current directory tree for a string passed on the command line:

#!/bin/sh
#
# tfind:
# script to look for strings in various files and dump to less

case $# in
1)
    find . -name '*.[ch]' | xargs grep $1 | less
    exit 0   # good status
esac

echo "Use tfind stuff_to_find                               "
echo "      where : stuff_to_find = search string           "
echo "                                                      "
echo "e.g. tfind console_state looks through all files in   "
echo "     the current directory and below and displays all "
echo "     instances of console_state."
exit 1    # bad status

As described above, the first line identifies the program, /bin/sh, to run to interpret the script. The next few lines are comments that describe what the script does. Then we see:

case $# in
1)
  ...
esac

The case ... in construct is built into the shell. It's one of the branching structures provided by the Korn shell, and is similar to the C switch statement.

The $# is a shell variable. When you refer to a variable in a shell, put a $ before its name to tell the shell that it's a variable rather than a literal string. The shell variable, $#, is a special variable that represents the number of command-line arguments to the script.

The 1) is a possible value for the case, the equivalent of a case label in a C switch statement. This code checks to see if you've passed exactly one parameter to the script.

The esac line completes and ends the case statement. Both the if and case commands use the command's name reversed to represent the end of the branching structure.
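As a further sketch of the case structure, the hypothetical function below branches on the number of arguments, using several values, an alternative pattern (2|3), and a catch-all pattern (*) that plays the role of default in C:

```shell
# describe_count: hypothetical example that branches on the argument count.
describe_count() {
    case $# in
    0)
        echo "no arguments" ;;
    1)
        echo "one argument: $1" ;;
    2|3)
        echo "a few arguments" ;;
    *)
        echo "many arguments" ;;
    esac
}
```

Each branch ends with ;; which, unlike C, means that execution never falls through into the next branch.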

Inside the case we find:

find . -name '*.[ch]' | xargs grep $1 | less

This line does the bulk of the work, and breaks down into these pieces:

which are joined by the | or pipe character. A pipe is one of the most powerful things in the shell; it takes the output of the program on the left, and makes it the input of the program to its right. The pipe lets you build complex operations from simpler building blocks. For more information, see Redirecting input and output in Using the Command Line.
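As a small illustration of building an operation from simpler blocks, the hypothetical function below pipes three words through sort and then through tr, which converts them to uppercase:

```shell
# sorted_shout: hypothetical example of a three-stage pipeline.
sorted_shout() {
    # printf writes one word per line; sort orders the lines;
    # tr converts lowercase letters to uppercase.
    printf '%s\n' pear apple fig | sort | tr '[:lower:]' '[:upper:]'
}
```

None of the three programs knows about the others; each simply reads its standard input and writes its standard output.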

The first piece, find . -name '*.[ch]', uses another powerful and commonly used command. Most filesystems are organized as a hierarchy of directories, and find is a utility that descends through that hierarchy recursively. In this case, it searches for files whose names end in either .c or .h (that is, C source or header files) and prints out their names.

The filename wildcards are wrapped in single quotes (') because they're special characters to the shell. Without the quotes, the shell would expand the wildcards in the current directory, but we want find to evaluate them, so we prevent the shell from evaluating them by quoting them. For more information, see Quoting special characters in Using the Command Line.
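The difference is easy to see with echo. This sketch assumes mktemp -d is available; the function name and the file names a.c and b.h are hypothetical.

```shell
# show_glob_quoting: hypothetical comparison of quoted and unquoted wildcards.
show_glob_quoting() {
    dir=$(mktemp -d) || return 1
    (
        cd "$dir" || exit 1
        : > a.c
        : > b.h
        # Unquoted: the shell expands the pattern before echo runs.
        echo *.[ch]
        # Quoted: echo receives the pattern itself.
        echo '*.[ch]'
    )
    rm -rf "$dir"
}
```

The first echo prints the matching filenames (a.c b.h), and the second prints the literal pattern *.[ch], which is what find needs to receive.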

The next piece, xargs grep $1, does a couple of things:

The final piece, less, is an output pager. The entire command may generate a lot of output that might scroll off the terminal, so less presents this to you a page at a time, with the ability to move backwards and forwards through the data.
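Because $1 isn't quoted in the grep piece, a search string that contains spaces is split into several arguments, and filenames that contain spaces can confuse xargs. The following is a sketch of a more defensive variant, not a change to the tfind script itself. The function name tfind_quoted is hypothetical, the pager is left out so that the output can be captured, and the sketch relies on the POSIX -exec ... {} + form of find.

```shell
# tfind_quoted: hypothetical, more defensive variant of the search in tfind.
tfind_quoted() {
    # "$1" keeps a search string with spaces as a single argument, and "--"
    # stops grep from treating a string that starts with "-" as an option.
    # /dev/null guarantees that grep is given more than one file, so it
    # always prefixes each match with the filename.
    find . -name '*.[ch]' -exec grep -- "$1" /dev/null {} +
}
```

For example, tfind_quoted "console state" searches for the two-word string rather than searching for console in a file named state.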

The case statement also includes the following after the find command:

exit 0   # good status

This returns a value of 0 from this script. In shell programming, zero means true or success, and anything nonzero means false or failure. (This is the opposite of the meanings in the C language.)
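The shell's if command tests this status directly, and the special variable $? holds the status of the most recent command. As a brief sketch (check_status is a hypothetical name):

```shell
# check_status: run the given command and report whether it succeeded.
check_status() {
    if "$@" >/dev/null 2>&1; then
        echo "success"
    else
        # In the else branch, $? still holds the failed command's status.
        echo "failure ($?)"
    fi
}
```

For example, check_status true prints success, and check_status false prints failure (1).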

The final block:

echo "Use tfind stuff_to_find                               "
echo "      where : stuff_to_find = search string           "
echo "                                                      "
echo "e.g. tfind console_state looks through all files in   "
echo "     the current directory and below and displays all "
echo "     instances of console_state."
exit 1    # bad status

is just a bit of help; if you pass incorrect arguments to the script, it prints a description of how to use it, and then returns a failure code.
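The same argument check can also be written with if instead of case. The sketch below is one possible alternative, not the form that tfind uses; check_args is a hypothetical name, and sending the message to standard error (>&2) is a common convention rather than something the original script does.

```shell
# check_args: hypothetical if-based version of the argument check in tfind.
check_args() {
    if [ "$#" -ne 1 ]; then
        # Print the usage message on standard error and report failure.
        echo "Use tfind stuff_to_find" >&2
        return 1
    fi
    return 0
}
```

A script could call check_args "$@" || exit 1 before doing any work, so that the usage message and the failure status are handled in one place.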

Efficiency

In general, a script isn't as efficient as a custom-written C or C++ program, because it:

However, developing a script can take less time than writing a program, especially if you use pipes and existing utilities as building blocks in your script.

Caveat scriptor

Here are some things to keep in mind when writing scripts: