Why is it necessary to write something about command
line arguments? The concept is very easy and clear: if you
enter the following command
$ ls -l *.txt
the command "ls" is executed with the
command line flag "-l" and all files
in the current directory ending with ".txt"
as arguments.
Still, many shell scripts do not accept command line
arguments the way we are used to (and have come to like) from
other standard commands. Some shell programmers do not
even bother to implement command line argument parsing, and
annoy the script's users with strange calling
conventions instead.
For examples of how to name command line flags
consistently with existing UNIX commands, see the table
Frequent option names.
Here are some examples of bad coding practices.
-
Setting environment variables for script input that
could be specified on the command line.
One example:
:
# AUTORUN must be specified by the user
if [ "$AUTORUN" != yes ]
then
    echo "Do you really want to run this script?"
    echo "Enter ^D to quit:"
    if read answer
    then
        echo "o.k, starting up memhog daemon"
    else
        echo "terminating"
        exit 0
    fi
fi
# start of script...
Consider the script's user, who might wonder:
"What was the name of this variable? FORCERUN?
AUTOSTART? AUTO_RUN? or AUTORUN?"
Don't get me wrong: environment variables do have their place
and can make life easier for the user.
But a much better way to implement the autorun option is
a command line flag, e.g.
"-f" for "force non-interactive
execution".
-
Positional parameters.
Example:
:
# process - process input file
ConfigFile="$1"
InputFile="$2"
OutputFile="$3"
# Read config file
get_defaults "$ConfigFile"
# Do the processing
process_input < "$InputFile" > "$OutputFile"
|
This script expects exactly three parameters in exactly
this order:
the name of a configuration file with default settings,
the name of an input file, and the name of an output file.
The script could be called with the following parameters:
$ process defaults.cf important.dat output.dat
It then reads the configuration file
"defaults.cf", processes the input file
"important.dat" and then writes (possibly
overwriting) the output file "output.dat".
Now see what happens if you call it like this:
$ process output.dat defaults.cf important.dat
Now the script tries to read the output file
"output.dat" as its configuration file.
If the user is lucky, the script terminates at this
point, before it overwrites the data file
"important.dat", which it would use as the output file!
This script would have been better with the following
usage:
$ process -c default.cf -o output.dat file.dat
The command line option "-c" precedes
the defaults file, the output file is specified with the
"-o" option, and every other argument is
taken to be an input file name.
Our goal is shell scripts that use "standard" command
line flags and options. We will develop a shell script
code fragment that handles command line options well.
You may then use this template in your shell scripts and
modify it to fit your needs.
Consider the following command line:
$ fgrep -v -i -f excludes.list *.c *.h
|
This command line consists of a command ("fgrep") with
three flags "-v", "-i" and "-f".
The "-f" flag takes an argument
("excludes.list"). After the command line flags,
multiple file names ("*.c", "*.h") may follow. At this point we do not
know how many file names there will be; the shell
expands the file name patterns (or "wildcards")
to a list of actual file names before calling the command "fgrep".
The command itself does not have to deal with
wildcards.
What happens if there is no file matching the pattern
"*.c" in the current directory? In this case the shell
will pass the parameter unchanged to the program.
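To see the difference between a matching and a non-matching pattern, the following sketch counts the arguments a command actually receives. The helper count_args and the scratch directory are made up for this illustration, and it assumes a Bourne-type shell with default settings (shell options such as bash's nullglob change this behaviour).

```shell
# count_args: hypothetical helper, prints the number of arguments it received
count_args() {
    echo "$#"
}

# Work in an empty scratch directory (mktemp -d is assumed to be available)
scratch=`mktemp -d` || exit 1
cd "$scratch" || exit 1
touch a.h b.h

count_args *.h      # the pattern matches two files: prints 2
count_args *.c      # no match, the literal string "*.c" is passed: prints 1
count_args "*.h"    # quoted, so no expansion takes place: prints 1

cd / && rm -r "$scratch"
```

A script that receives an argument like "*.c" should therefore be prepared for a file name that does not exist.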
If we want to handle command lines like the one above,
we must be prepared to handle
- command line flags (e.g. "-v", "-i")
- command line flags with arguments (e.g. "-f file")
- multiple file names following the flags
The shell sets some special parameters according
to the command line arguments specified:
$0 | The name the script was invoked with. This may be a basename without directory component, or a path name. This parameter is not changed by subsequent shift commands.
$1, $2, $3, ... | The first, second, third, ... command line argument, respectively. An argument may contain whitespace if it was quoted, e.g. "two words".
$# | The number of command line arguments, not counting the invocation name $0.
$@ | Written within double quotes, "$@" is replaced with all command line arguments, each one as a separate word, e.g. "one", "two three", "four". Whitespace within an argument is preserved.
$* | $* is replaced with all command line arguments. Written without quotes, whitespace is not preserved, e.g. "one", "two three", "four" would be changed to "one", "two", "three", "four". Written as "$*", all arguments are joined into one single word. This parameter is not used very often; "$@" is the normal case, because it leaves the arguments unchanged.
|
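The difference between "$@" and $* is easiest to see by counting what arrives at the other end. This is a small sketch; all function names in it are invented for the example, and it assumes the default value of IFS.

```shell
# count_args: hypothetical helper, prints the number of arguments it received
count_args() { echo "$#"; }

# Pass the arguments on in three different ways
forward_quoted_at()   { count_args "$@"; }   # every argument is kept intact
forward_unquoted()    { count_args $*; }     # arguments are split again at whitespace
forward_quoted_star() { count_args "$*"; }   # everything is joined into one word

forward_quoted_at   one "two three" four     # prints 3
forward_unquoted    one "two three" four     # prints 4
forward_quoted_star one "two three" four     # prints 1
```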
The following code segment loops through all command line
arguments, and prints them:
:
# cmdtest - print command line arguments
while [ $# -gt 0 ]
do
    echo "$1"
    shift
done
|
The special parameter $# is automatically set
to the number of command line arguments. If the script is
called with the following command line:
$ cmdtest one "two three" four
$# has the value "3" for the arguments "one",
"two three", and "four". "two three" counts as one argument,
because it is enclosed in quotes.
The shift command
"shifts" all command line arguments one position to the left. The leftmost
argument is lost. The following table lists the values of $#
and the command line arguments during the iterations of the
while loop:
$# | remaining arguments | comments
3 | $1 = "one" $2 = "two three" $3 = "four" | start of the command
2 | $1 = "two three" $2 = "four" | after the first shift
1 | $1 = "four" | after the second shift
0 | | end of the while loop
Now that we can loop through the argument list,
we can set script variables depending on command
line flags:
vflag=off
while [ $# -gt 0 ]
do
    case "$1" in
    -v) vflag=on;;
    esac
    shift
done
The command line option -v now
sets the variable vflag to
"on". We can then use this variable throughout
the script.
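One way to use such a flag variable later on is a small helper that prints progress messages only in verbose mode. The function name verbose is an assumption made for this sketch, it is not part of the fragment above.

```shell
vflag=off    # normally set by the option loop

# verbose: hypothetical helper, prints its arguments to standard error,
# but only if the -v flag was given
verbose() {
    if [ "$vflag" = on ]
    then
        echo >&2 "$*"
    fi
}

verbose "this message is suppressed"
vflag=on
verbose "this message is printed"
```

Writing the messages to standard error keeps them out of the script's regular output.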
Now let's improve this code fragment to handle
file names. It would be nice if the script handled
all command line flags, but left the file names alone. This
way we could use the shell parameter "$@" with
the remaining command line arguments later
on, e.g.
# ...
grep "$searchstring" "$@"
and be sure that it only contains file names.
But how do we tell file names from command line switches?
That's easy: file names do not start with a dash "-"
(at least not yet...):
vflag=off
while [ $# -gt 0 ]
do
    case "$1" in
    -v) vflag=on;;
    -*)
        echo >&2 "usage: $0 [-v] [file ...]"
        exit 1;;
    *)  break;;    # terminate while loop
    esac
    shift
done
This example prints a short usage message and terminates
if an unknown command line flag starting with a dash
was specified.
If the current argument does not start with a dash (and
therefore probably is a file name),
the while loop is terminated with the break
statement, leaving the file name in the variable "$1".
Now we just need a switch for command line flags with arguments,
e.g. "-f filename". This is also pretty straightforward:
vflag=off
filename=
while [ $# -gt 0 ]
do
    case "$1" in
    -v) vflag=on;;
    -f) filename="$2"; shift;;
    -*) echo >&2 \
            "usage: $0 [-v] [-f file] [file ...]"
        exit 1;;
    *)  break;;    # terminate while loop
    esac
    shift
done
If the argument $1 is "-f",
the next argument ($2) should be the file
name. We have now handled two arguments ("-f" and the file name),
but the shift after the case construct will
only "consume" one argument. This is why we
execute an additional shift after saving
the file name in the variable filename.
This shift removes the "-f" flag, while the
second (after the case construct) removes
the filename argument.
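One case the fragment above does not cover is "-f" given as the very last argument: $2 is then empty, and the second shift has nothing left to remove, which some shells treat as an error. A possible guard is sketched below. The wrapper function parse_args, the variable nfiles and the wording of the error message are assumptions made for this example.

```shell
# parse_args: hypothetical wrapper around the option loop
parse_args() {
    vflag=off
    filename=
    while [ $# -gt 0 ]
    do
        case "$1" in
        -v) vflag=on;;
        -f) if [ $# -lt 2 ]
            then
                echo >&2 "$0: option -f requires an argument"
                return 1
            fi
            filename="$2"; shift;;
        -*) echo >&2 "usage: $0 [-v] [-f file] [file ...]"
            return 1;;
        *)  break;;    # terminate while loop
        esac
        shift
    done
    nfiles=$#          # number of remaining (file name) arguments
}

parse_args -v -f excludes.list a.c b.c
echo "vflag=$vflag filename=$filename nfiles=$nfiles"

parse_args -f || echo "parse_args failed as expected"
```

Inside a function, $1, $2, ... and $# refer to the function's own arguments, which is why the loop can be tried out this way without touching the script's command line.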
We still have a problem handling file names starting
with a dash ("-"), but that's a problem every standard
UNIX command interpreting command line switches has.
It is commonly solved with a special command line
option named "--" meaning "end of the option list".
If, for example, you had a file named "-f", it could not
be removed using the command "rm -f", because "-f" is a valid
command line option. Instead you can use "rm -- -f". The double
dash "--" means "end of command line flags", and the following
"-f" is then interpreted as a file name.
- Note:
- You can also remove a file named "-f" using the command
"rm ./-f"
The following (recommended)
command line handling
code is a good way to solve this problem:
vflag=off
filename=
while [ $# -gt 0 ]
do
    case "$1" in
    -v) vflag=on;;
    -f) filename="$2"; shift;;
    --) shift; break;;
    -*)
        echo >&2 \
            "usage: $0 [-v] [-f file] [file ...]"
        exit 1;;
    *)  break;;    # terminate while loop
    esac
    shift
done
# all command line switches are processed,
# "$@" contains all file names
The drawback of this command line handling is that
it needs whitespace between the option character and
an argument, ("-f file" works, but "-ffile" fails), and
that multiple option characters cannot be written behind
one switch character, ("-v -l" works, but "-vl" does not).
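To try out the recommended loop, including the "--" case and the drawback just mentioned, it can be wrapped in a function. The function name list_files is invented for this sketch, and it returns with an error where the script fragment would exit.

```shell
# list_files: hypothetical wrapper that runs the option loop and then
# prints every remaining argument on a line of its own
list_files() {
    vflag=off
    filename=
    while [ $# -gt 0 ]
    do
        case "$1" in
        -v) vflag=on;;
        -f) filename="$2"; shift;;
        --) shift; break;;
        -*) echo >&2 "usage: list_files [-v] [-f file] [file ...]"
            return 1;;
        *)  break;;    # terminate while loop
        esac
        shift
    done
    for file in "$@"
    do
        echo "file: $file"
    done
}

# "-f" after "--" is treated as a file name; whitespace is preserved
list_files -v -- -f "two words.txt"

# combined flags are not recognized by this loop
list_files -vf file one 2>/dev/null || echo "combined flags are rejected"
```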
- Portability:
-
This method works with all shells derived from the Bourne
Shell, e.g.
sh, ksh, ksh93, bash, pdksh, zsh.
Now this script processes its command line arguments
like any standard UNIX command, with one exception: standard
commands allow multiple command line flags to be combined,
e.g. "ls -l -a -i" may be written as "ls -lai". This
is not that easy to handle from inside our shell script,
but fortunately there is a command that does the work for us:
getopt(1).
The following test shows how getopt
rewrites the command line arguments "-vl -ffile one two three":
$ getopt f:vl -vl -ffile one two three
produces the output
-v -l -f file -- one two three
These are the command line flags we would have liked
to get! The flags "-vl" are separated into the two flags
"-v" and "-l", and "-ffile" is separated into the flag "-f" and its
argument "file". The command line options are separated from
the file names by a "--" argument.
How did getopt know that "-f" needed an argument,
but "-v" and "-l" did not?
The first argument to getopt describes which options
are acceptable, and whether they take arguments. An option
character followed by a colon (":") means that the
option expects an argument.
Now we are ready to let getopt rewrite the command
line arguments for us. Since getopt writes the rewritten
arguments to standard output, we use
set -- `getopt f:vl "$@"`
to set the arguments. `getopt ...` means "the output
of the command getopt", and "set -- " sets the command line
arguments to the result of this output. In our example
set -- `getopt f:vl -vl -ffile one two three`
is replaced with
set -- -v -l -f file -- one two three
which results in the command line arguments
-v -l -f file -- one two three
These arguments can easily be processed by the
script we developed above.
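The effect of "set --" can be checked without getopt by passing it the rewritten argument list directly. The function show_set exists only for this demonstration. Note that the "--" directly after set keeps set from taking "-v" as one of its own options.

```shell
# show_set: hypothetical demo function; "set --" inside a function
# replaces only the function's own argument list
show_set() {
    set -- -v -l -f file -- one two three
    echo "number of arguments: $#"
    echo "third argument: $3"
    echo "fifth argument: $5"
}

show_set
```

The eight arguments are the six flags and option words up to and including "--", followed by the three file names.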
Now we include getopt within our script:
vflag=off
filename=
set -- `getopt vf: "$@"`
[ $# -lt 1 ] && exit 1    # getopt failed
while [ $# -gt 0 ]
do
    case "$1" in
    -v) vflag=on;;
    -f) filename="$2"; shift;;
    --) shift; break;;
    -*)
        echo >&2 \
            "usage: $0 [-v] [-f file] file ..."
        exit 1;;
    *)  break;;    # terminate while loop
    esac
    shift
done
# all command line switches are processed,
# "$@" contains all file names
The first version of this document contained the line
set -- `getopt vf: "$@"` || exit 1
This command does not
work with all shells, because the set command
doesn't always return an error code if getopt fails.
The line assumes, that getopt
sets its return value if the command line arguments are wrong (which
is almost certainly the case) and that set
returns an error code if the command substitution (that executes
getopt) fails. This is not always true.
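A commonly used alternative, shown here only as a sketch and not taken from the script above, is to store the output in a variable first: the exit status of a plain variable assignment is the exit status of the command substitution, so the failure can be tested before set is called. The function name reset_args is an assumption, and echo and false merely stand in for a succeeding and a failing getopt call.

```shell
# reset_args: hypothetical helper; runs the given command and, if it
# succeeds, uses its output as the new argument list
reset_args() {
    output=`"$@"` || return 1   # status of the assignment = status of the command
    set -- $output              # unquoted on purpose: split the output into words
    echo "new argument count: $#"
}

reset_args echo one two three        # stand-in for a successful getopt
reset_args false || echo "command failed"
```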
Why didn't we use getopt in the first place? There is
one drawback with the use of getopt: it removes whitespace
within arguments. The command line
one "two three" four
(three command line arguments) is rewritten as
one two three four
(four arguments). Don't use the
getopt command if the arguments may contain
whitespace characters.
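The reason is not hard to reproduce: getopt can only write a flat line of text, and the shell splits the result of the command substitution at whitespace again. The following sketch uses echo in place of getopt to show the same effect; the function name resplit_demo is made up for the example.

```shell
# resplit_demo: hypothetical demo of how arguments are re-split when they
# pass through a command substitution
resplit_demo() {
    set -- one "two three" four
    echo "before: $#"
    set -- `echo "$@"`      # echo stands in for getopt here
    echo "after: $#"
}

resplit_demo
```

The quotes around "two three" are gone by the time echo prints its arguments, so the shell has no way to tell where one original argument ended and the next one began.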
Newer shells (Korn Shell, Bash) have the built-in
getopts command, which does not have this
problem. This command is described in the following
section.
- Portability:
-
The getopt command is part
of almost any UNIX system.
|
On newer shells, the getopts command is built-in.
Do not confuse it with the older
getopt (without the trailing "s") command.
getopts strongly resembles the C library
function getopt(3).
Below is a typical example of how getopts
is used:
vflag=off
filename=
while getopts vf: opt
do
    case "$opt" in
    v)  vflag=on;;
    f)  filename="$OPTARG";;
    \?) # unknown flag
        echo >&2 \
            "usage: $0 [-v] [-f filename] [file ...]"
        exit 1;;
    esac
done
shift `expr $OPTIND - 1`
- Portability:
-
The getopts command is a built-in
command of newer shells. As a rule of thumb, all systems
that have the Korn Shell (ksh) also have other shells (including
the Bourne Shell sh)
that include a built-in getopts command.
|
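In the getopts example, each call to getopts stores the next option letter in the variable named on its command line (opt), the option's argument in OPTARG, and the index of the next argument to be processed in OPTIND; the final shift therefore removes everything that was parsed. To try this out, the loop can be wrapped in a function. The name parse_opts and the summary line it prints are assumptions made for this sketch.

```shell
# parse_opts: hypothetical wrapper around the getopts loop
parse_opts() {
    vflag=off
    filename=
    OPTIND=1    # restart option parsing; needed if the function is called again
    while getopts vf: opt
    do
        case "$opt" in
        v)  vflag=on;;
        f)  filename="$OPTARG";;
        \?) # unknown flag
            echo >&2 "usage: parse_opts [-v] [-f filename] [file ...]"
            return 1;;
        esac
    done
    shift `expr $OPTIND - 1`
    echo "vflag=$vflag filename=$filename files=$#"
}

# combined flags and whitespace within arguments are both handled
parse_opts -vf "my file.txt" one "two three"
# "--" ends the option list, so "-v" is a file name here
parse_opts -- -v
```

The first call reports vflag=on, the file name "my file.txt" and two remaining arguments; the second one leaves vflag off and counts "-v" as the only file.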
The following table should help you find good names
for your command line flags. Look at the second column
(Meaning), and see if you find a rough description
of your command line option there. If, for example, you are looking
for the name of an option to append to a file,
you could use the "-a" flag.
Flag | Meaning | UNIX examples
-a | append, i.e. output to a file; show/process all files, ... | tee -a; ls -a
-c | count something; command string | grep -c; sh -c command
-d | directory; specify a delimiter | cpio -d; cut -ddelimiter
-e | expand something, i.e. tabs to spaces; execute command | pr -e; xterm -e /bin/ksh
-f | read input from a file; force some condition (i.e. no prompts, non-interactive execution); specify field number | fgrep -f file; rm -f; cut -ffieldnumber
-h | print a help message; print a header (Note: -t for title may be more appropriate.) | pr -hheader
-i | ignore the case of characters; Turn on interactive mode; Specify input option | grep -i; rm -i
-l | long output format; list file names; line count; login name | ls -l, ps -l, who -l; grep -l; wc -l; rlogin -lname
-L | | cpio -L, ls -L
-n | non-interactive mode; numeric processing | rsh -n; sort -n
-o | output option, i.e. output file name | cc -o, sort -o
-p | | ps -p pid; mkdir -p
-q | | finger -q, who -q
-r | process directories recursively (Note: the flag -R would be better for this purpose.); process something in the reverse order; specify root directory | rm -r; sort -r, ls -r
-R | process directories recursively | chmod -R; ls -R
-s | be silent about errors (Note: such an option is unnecessary, because the user can make the program silent by redirecting standard output and standard error to /dev/null.) | cat -s; lp -s
-t | | sort -ttabchar
-u | Produce unique output; process data unbuffered | sort -u; cat -u
-v | print verbose output, the opposite of -q; reverse the functionality | cpio -v, tar -v; grep -v
-w | specify width; wide output format; work with words | pr -w, sdiff -w; ps -w; wc -w
-x | |
-y | answer yes to all questions (effectively making the command non-interactive) (Note: The flag -f may be better for this purpose.) | fsck -y, shutdown -y
Now that you know the standard option names, on to
"standard" UNIX commands that do not use them.
- dd - disk dump
-
dd if=infile of=outfile bs=10k
The syntax of this command is probably older
than UNIX itself. One major disadvantage is that
argument names and file names are written together
without whitespace, e.g. if=mydoc*.txt.
The shell takes "if=" as part of the file name pattern,
and therefore cannot expand the wildcard in "mydoc*.txt".
- find - find files
-
find / -name '*.txt' -print
With this command option names have more than
one character. This makes them more memorable
and more readable. If only all commands would
be like this! And if only
-print was a default option!
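The wildcard problem described for dd can be reproduced without running dd at all. The sketch below only prints the arguments a command would receive; the helper show_args and the scratch directory are invented for this illustration, and a Bourne-type shell with default settings is assumed.

```shell
# show_args: hypothetical helper, prints each argument on its own line
show_args() {
    for arg in "$@"
    do
        echo "$arg"
    done
}

# Work in an empty scratch directory (mktemp -d is assumed to be available)
scratch=`mktemp -d` || exit 1
cd "$scratch" || exit 1
touch mydoc1.txt mydoc2.txt

show_args mydoc*.txt       # expanded: prints mydoc1.txt and mydoc2.txt
show_args if=mydoc*.txt    # no file name starts with "if=": prints if=mydoc*.txt

cd / && rm -r "$scratch"
```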
By the way, did you know that the command
line
$ ls -bart -simpson -is -cool
is a valid usage of the Solaris
ls command?