SHELLdorado Newsletter 3/2001 - August 26, 2001

================================================================
The "SHELLdorado Newsletter" covers UNIX shell script related
topics. To subscribe to this newsletter, leave your e-mail
address at the SHELLdorado home page:

	http://www.shelldorado.com/

"Heiner's SHELLdorado" is a place for UNIX shell script
programmers providing

     Many shell script examples,
     shell scripting tips & tricks + more...
================================================================

Focus on CGI programming

Contents

 o  Editorial
 o  Q&A: How can I write CGI programs using shell scripts?
 o  Q&A: How can I encode/decode URL data?
 o  Q&A: Are there further CGI resources at the SHELLdorado?
 o  Amendment: Arrays for Bourne shell

-----------------------------------------------------------------
>> Editorial
-----------------------------------------------------------------

    The World Wide Web is one of the most exciting topics today.
    It provides a wealth of information for people searching for
    it, and a comparatively easy way for programmers to
    present it. No need to guess screen sizes and screen column
    positions: just create HTML output, and the browser will
    render text and images in the best possible way.

    If you never tried to create HTML pages, or do not know what
    CGI is, this issue of the SHELLdorado Newsletter is not
    written for you. Wait for the next one, or browse some of
    the back issues:

    	http://www.shelldorado.com/newsletter/

    If you are writing CGI scripts of your own, or are planning
    to do so, please read on! You will find tools and tips for
    writing CGI scripts using the Bourne (or Korn) shell.

    Heiner Steven, Editor
    <heiner.steven@shelldorado.com>


-----------------------------------------------------------------
>> Q&A: How can I write CGI programs using shell scripts?
-----------------------------------------------------------------

    The Web mostly consists of static HTML pages, connected with
    hyperlinks. But the real power of the web becomes apparent
    with dynamically generated pages: pages that are created
    "on-the-fly" for each request, and therefore are always
    up-to-date.

    One method for creating Web pages dynamically is to use CGI
    ("Common Gateway Interface") programs or scripts. While Perl
    or Java are the languages of choice for many CGI
    programmers, shell scripts offer some advantages over the
    other languages, e.g. the shell is present on any UNIX
    system without the need to install additional language
    interpreters or compilers, and scripts tend to be short and
    easily written.

    The following shell script template may be used to create
    arbitrary CGI scripts. It parses CGI arguments into shell
    script variables, and creates the minimum output required
    for a CGI script.

    The CGI arguments are processed by a helper script "urlgetopt":

    	http://www.shelldorado.com/scripts/cmds/urlgetopt

    The script is executable without any changes, and just
    prints the arguments it was invoked with. Install it as a
    template in the server's /cgi-bin/ directory, and you can
    start using it to create new CGI scripts of your own!

	#! /bin/sh
	# Minimal example of a CGI script. Uses "urlgetopt" to
	# parse CGI arguments.

	# Append directory of "urlgetopt" to command search
	# path:

	PATH=$PATH:/usr/local/bin; export PATH

	# Print minimum CGI header. For debugging "text/plain"
	# could be used instead of "text/html". Note the
	# embedded new line:

	echo "Content-type: text/html
	"

	# Debugging: redirect error messages to standard output
	# where we can see them:

	exec 2>&1

	# The standard variable REQUEST_METHOD tells us whether
	# the CGI arguments are in the variable QUERY_STRING, or
	# whether we have to read them from standard input

	case "$REQUEST_METHOD" in
	    GET)		# Arguments in QUERY_STRING
		;;
	    POST)
		# Form data is on the first line of the standard
		# input:
		read query_string
		QUERY_STRING=$query_string
		;;

	    *)
		echo "ERROR: unsupported request method: $REQUEST_METHOD"
		exit 1
		;;
	esac

	# Parse CGI arguments, and set environment variables for
	# each HTML form variable.  Prefix each variable name
	# with "FORM_", e.g. the contents of the HTML form field
	# "email" become available in the variable "FORM_email".
	# If this script was invoked in the following way:
	#	http://host/cgi-bin/test?email=nn@mail.com&street=none
	# the variables "FORM_email" and "FORM_street" would be
	# set.

	eval "`urlgetopt -l -p FORM_ \"$QUERY_STRING\"`"

	# At this point, the form variables are accessible using
	# FORM_* environment variables. The script should print
	# HTML code to standard output

	echo "DEBUG: CGI arguments:"
	set | grep "^FORM_"

	exit 0
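
    How such a parser can work is sketched below. This is NOT
    the source of "urlgetopt", only a minimal illustration under
    stated assumptions: the function name "parse_query_sketch"
    is hypothetical, field names are restricted to letters,
    digits and "_", and URL decoding of the values is left out.

```shell
#! /bin/sh
# parse_query_sketch PREFIX QUERY  (hypothetical helper, not urlgetopt)
# Prints one shell assignment per name=value pair found in QUERY.
# The function body is a subshell "( ... )", so the changes to IFS
# and to the globbing setting do not leak into the caller.
parse_query_sketch() (
    prefix=$1
    IFS='&'                 # fields are separated by "&"
    set -f                  # no file name expansion for "*" or "?"
    for pair in $2
    do
        name=${pair%%=*}
        case $pair in
            *=*) value=${pair#*=};;
            *)   value=;;
        esac
        # Skip names that cannot be part of a variable name
        case $name in
            ''|*[!A-Za-z0-9_]*) continue;;
        esac
        # Protect embedded single quotes: ' becomes '\''
        quoted=$(printf '%s\n' "$value" | sed "s/'/'\\\\''/g")
        printf "%s%s='%s'\n" "$prefix" "$name" "$quoted"
    done
)

# Usage, e.g. with the query string from the comment above:
eval "$(parse_query_sketch FORM_ 'email=nn@mail.com&street=none')"
echo "email=$FORM_email street=$FORM_street"
```

    Each value is printed inside single quotes, so "eval" does
    not interpret special characters in user-supplied data.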

    A more sophisticated version (written for Korn Shell) is
    available at the SHELLdorado:

    	http://www.shelldorado.com/scripts/cmds/cgitemplate.ksh

    This version additionally shows what a script could look
    like that can be used interactively from the command line
    as well as invoked as a CGI program.
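
    A script that serves both purposes has to find out how it
    was invoked. One way is to test for the CGI variable
    GATEWAY_INTERFACE, which a web server sets for CGI programs.
    The sketch below is an assumption about how this could be
    done, not an excerpt from "cgitemplate.ksh"; the function
    name "get_query_sketch" is hypothetical. For POST requests
    it reads exactly CONTENT_LENGTH bytes, because the form data
    need not end with a newline.

```shell
#! /bin/sh
# get_query_sketch [name=value ...]  (hypothetical helper)
# Prints the query string, whatever the way of invocation:
#   command line: arguments are joined with "&"
#   CGI GET:      value of QUERY_STRING
#   CGI POST:     CONTENT_LENGTH bytes from standard input
get_query_sketch() {
    if [ -z "${GATEWAY_INTERFACE:-}" ]
    then
        # Not started by a web server: "$*" joins the arguments
        # using the first character of IFS
        ( IFS='&'; printf '%s\n' "$*" )
        return 0
    fi

    case ${REQUEST_METHOD:-} in
        GET)
            printf '%s\n' "${QUERY_STRING:-}";;
        POST)
            dd bs=1 count="${CONTENT_LENGTH:-0}" 2>/dev/null
            echo;;
        *)
            echo "ERROR: request method: ${REQUEST_METHOD:-}" >&2
            return 1;;
    esac
}

# Usage:
#   QUERY_STRING=`get_query_sketch "$@"` || exit 1
```

    From the command line the script could then be tested with
    arguments like "email=nn@mail.com street=none".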

    [Further reading:
    	NCSA: The Common Gateway Interface.
	    http://hoohoo.ncsa.uiuc.edu/cgi/overview.html
    ]

-----------------------------------------------------------------
>> Q&A: How can I encode/decode URL data?
-----------------------------------------------------------------

    Within CGI scripts it is sometimes necessary to encode or
    decode HTML form data. The data usually is encoded using the
    MIME type "application/x-www-form-urlencoded", which
    basically represents "unsafe" characters with a percent
    character ('%') followed by their code value printed as a
    two-digit hex code. If, for example, a user entered the
    string "a:*.txt", the encoded string would be "a%3A%2A.txt",
    where %3A is the representation of a colon (':'), and %2A
    represents an asterisk ('*'). A space character may be
    encoded using %20 (using its ASCII code 32 = hex 20) or just
    with a plus sign ('+').
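
    These rules can be sketched directly in shell. The two
    functions below are only an illustration, not the
    SHELLdorado scripts; their names are hypothetical. They
    assume a POSIX shell with a "printf" command and plain ASCII
    input, and they leave letters, digits and the characters
    "-", "_", "." and "~" unencoded (an assumption, other
    programs use slightly different sets).

```shell
#! /bin/sh
# urlencode_sketch STRING  (hypothetical name, ASCII input only)
# The function bodies are subshells, so no variables leak out.
urlencode_sketch() (
    LC_ALL=C
    s=$1 out=
    while [ -n "$s" ]
    do
        rest=${s#?}             # everything but the first character
        c=${s%"$rest"}          # the first character
        s=$rest
        case $c in
            [a-zA-Z0-9._~-]) out=$out$c;;
            # "'$c" makes printf use the character's code value
            *) out=$out$(printf '%%%02X' "'$c");;
        esac
    done
    printf '%s\n' "$out"
)

# urldecode_sketch STRING  (hypothetical name)
# Limitation: an encoded newline (%0A) at the very end is lost.
urldecode_sketch() (
    s=$1 out=
    while [ -n "$s" ]
    do
        case $s in
            %[0-9A-Fa-f][0-9A-Fa-f]*)
                hex=${s#?}              # drop the "%"
                rest=${hex#??}          # text after the hex digits
                hex=${hex%"$rest"}      # the two hex digits
                octal=$(printf '%o' "0x$hex")
                out=$out$(printf '%b' "\\0$octal")
                s=$rest;;
            +*) out="$out "; s=${s#?};;
            *)  rest=${s#?}; out=$out${s%"$rest"}; s=$rest;;
        esac
    done
    printf '%s\n' "$out"
)

urlencode_sketch 'a:*.txt'          # prints a%3A%2A.txt
urldecode_sketch 'a%3A%2A.txt'      # prints a:*.txt
```

    The character-by-character loops are slow for long strings,
    but they need no external commands.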

    The following two scripts handle encoding/decoding of
    "urlencoded" data:

    	http://www.shelldorado.com/scripts/cmds/urlencode
    	http://www.shelldorado.com/scripts/cmds/urldecode

    The following example shows how the scripts could be used
    e.g. from a CGI script:

	# translate - translate English word into German, and
	# vice versa
	#...
	baseurl=http://dict.leo.org/?search=
	word=shell

	encoded=`echo "$word" | urlencode`
	requrl=$baseurl$encoded

	# retrieve $requrl, parse output...

    Note that the "urlgetopt" script mentioned above already
    decodes "urlencoded" data.

    [Further links:
        Berners-Lee, Tim: Uniform Resource Locators (URL).
	    RFC 1738, December 1994.
	    http://www.ietf.org/rfc/rfc1738.txt
    ]


-----------------------------------------------------------------
>> Q&A: Are there further CGI resources at the SHELLdorado?
-----------------------------------------------------------------

    The following scripts may be useful for CGI script
    programmers:

      o	dumphtmltbl - extract ASCII table data from HTML page

        Example:
	    $ wget -O- http://www.table.com | dumphtmltbl

    	http://www.shelldorado.com/scripts/cmds/dumphtmltbl

      o	htmltable - formats ASCII data as HTML table

	Example:
	    $ ls | htmltable

    	http://www.shelldorado.com/scripts/cmds/htmltable

      o	striphtml - removes all HTML tags from a page

    	Example:
	    $ striphtml index.html

	http://www.shelldorado.com/scripts/quickies/striphtml

      o	fmtlinks - create HTML links

    	Example (in a CGI program):
	    echo "<pre>"
	    fmtlinks linklist.txt
	    echo "</pre>"

	http://www.shelldorado.com/scripts/quickies/fmtlinks

      o	extracturl - extract URL list from text file

    	Example:
	    extracturl index.html

	http://www.shelldorado.com/scripts/quickies/extracturl

     -------------------------------------------------------

     The following scripts are examples of how to
     retrieve and process information from the Web:

      o	dailynews - prints daily news message from the Web
	http://www.shelldorado.com/scripts/quickies/dailynews

      o	findhomepage, guesshomepage
	    - see SHELLdorado Newsletter June 2001
	    http://www.shelldorado.com/newsletter/issues/2001-2-Jun.html

      o	translate - ...words between English and German
	http://www.shelldorado.com/scripts/quickies/translate

      o	syn - find synonyms for words
	http://www.shelldorado.com/scripts/quickies/syn


-----------------------------------------------------------------
>> Amendments: Arrays for Bourne shell
-----------------------------------------------------------------

    The last SHELLdorado Newsletter contained an example of how
    to simulate arrays for the Bourne shell using "eval"
    (http://www.shelldorado.com/newsletter/issues/2001-2-Jun.html).

    Daniel E. Singer kindly pointed out that the quoting for the
    "eval" command was unnecessarily complicated. He suggested
    calling "eval" with just one argument and applying proper
    quoting, e.g.  var$n="$value" is rewritten to read:

    	eval "var$n=\"\$value\""

    instead of

    	eval var$n="'""$value""'"

    This makes quoting a little easier, and even works if the
    variable on the right hand side of the assignment contains a
    "single quote" character (').

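    The single-argument form can be wrapped into two small
    helper functions. This is only a sketch: the function names
    are hypothetical, and the array name and index must come
    from the script itself, never from user input, because
    "eval" executes them as shell code.

```shell
#! /bin/sh
# Hypothetical helpers illustrating the single-argument "eval" form

# array_set_sketch NAME INDEX VALUE
array_set_sketch() {
    # For NAME=var, INDEX=3 "eval" gets to see:  var3="$3"
    # The value itself is expanded only by "eval", so its
    # contents are never parsed as shell code.
    eval "$1$2=\"\$3\""
}

# array_get_sketch NAME INDEX
array_get_sketch() {
    # For NAME=var, INDEX=3 "eval" gets to see:
    #   printf '%s\n' "$var3"
    eval "printf '%s\n' \"\$$1$2\""
}

n=3
value="it's a \"quoted\" value"
array_set_sketch var "$n" "$value"
array_get_sketch var "$n"       # prints: it's a "quoted" value
```
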

----------------------------------------------------------------
If you want to comment on the newsletter, have suggestions for
new topics to be covered in one of the next issues, or even want
to submit an article of your own, send an e-mail to

	mailto:heiner.steven@shelldorado.com

================================================================
To unsubscribe send a mail with the body "unsubscribe" to
newsletter@shelldorado.com
================================================================