SHELLdorado Newsletter 3/2001 - August 26, 2001
================================================================

The "SHELLdorado Newsletter" covers UNIX shell script related
topics. To subscribe to this newsletter, leave your e-mail
address at the SHELLdorado home page:

    http://www.shelldorado.com/

"Heiner's SHELLdorado" is a place for UNIX shell script
programmers providing

    Many shell script examples, shell scripting tips & tricks
    + more...

================================================================

                    Focus on CGI programming

Contents

 o  Editorial
 o  Q&A: How can I write CGI programs using shell scripts?
 o  Q&A: How can I encode/decode URL data?
 o  Q&A: Are there further CGI resources at the SHELLdorado?
 o  Amendment: Arrays for Bourne shell

-----------------------------------------------------------------
>> Editorial
-----------------------------------------------------------------

    The World Wide Web is one of the most exciting topics today.
    It provides a wealth of information for people searching for
    it, and a comparatively easy way for programmers to present
    it. No need to guess screen sizes and screen column
    positions: just create HTML output, and the browser will
    render text and images in the best possible way.

    If you have never tried to create HTML pages, or do not know
    what CGI is, this issue of the SHELLdorado Newsletter is not
    written for you. Wait for the next one, or browse some of
    the back issues:

        http://www.shelldorado.com/newsletter/

    If you are writing CGI scripts of your own, or are planning
    to do so, please read on! You will find tools and tips for
    writing CGI scripts using the Bourne (or Korn) shell.

    Heiner Steven, Editor
    <heiner.steven@shelldorado.com>

-----------------------------------------------------------------
>> Q&A: How can I write CGI programs using shell scripts?
-----------------------------------------------------------------

    The Web mostly consists of static HTML pages, connected with
    hyperlinks.
    But the real power of the Web becomes apparent with
    dynamically generated pages: pages that are created
    "on-the-fly" for each request, and therefore are always
    up-to-date.

    One method for creating Web pages dynamically is to use CGI
    ("Common Gateway Interface") programs or scripts. While Perl
    and Java are the languages of choice for many CGI
    programmers, shell scripts offer some advantages over the
    other languages, e.g. the shell is present on any UNIX
    system without the need to install additional language
    interpreters or compilers, and scripts tend to be short and
    easily written.

    The following shell script template may be used to create
    arbitrary CGI scripts. It parses CGI arguments into shell
    script variables, and creates the minimum output required
    for a CGI script. The CGI arguments are processed by a
    helper script "urlgetopt":

        http://www.shelldorado.com/scripts/cmds/urlgetopt

    The script is executable without any changes, and just
    prints the arguments it was invoked with. Install it as a
    template into the server's /cgi-bin/ directory, and you can
    start using it to create new CGI scripts of your own!

        #! /bin/sh
        # Minimal example of a CGI script. Uses "urlgetopt" to
        # parse CGI arguments.

        # Append directory of "urlgetopt" to command search
        # path:
        PATH=$PATH:/usr/local/bin
        export PATH

        # Print minimum CGI header. For debugging "text/plain"
        # could be used instead of "text/html". Note the
        # embedded new line:
        echo "Content-type: text/html
        "

        # Debugging: redirect error messages to standard output
        # where we can see them:
        exec 2>&1

        # The standard variable REQUEST_METHOD tells us whether
        # the CGI arguments are in the variable QUERY_STRING, or
        # whether we have to read them from standard input
        case "$REQUEST_METHOD" in
            GET)    # Arguments in QUERY_STRING
                    ;;
            POST)   # Form data is on the first line of the
                    # standard input:
                    read query_string
                    QUERY_STRING=$query_string
                    ;;
            *)      echo "ERROR: request method: $REQUEST_METHOD"
                    exit 1
                    ;;
        esac

        # Parse CGI arguments, and set environment variables for
        # each HTML form variable. Prefix each variable name
        # with "FORM_", i.e. the contents of the HTML form field
        # "email" become available in the variable "FORM_email".
        # If this script was invoked in the following way:
        #   http://host/cgi-bin/test?email=nn@mail.com&street=none
        # the variables "FORM_email" and "FORM_street" would be
        # set.
        eval "`urlgetopt -l -p FORM_ \"$QUERY_STRING\"`"

        # At this point, the form variables are accessible using
        # FORM_* environment variables. The script should print
        # HTML code to standard output
        echo "DEBUG: CGI arguments:"
        set | grep "^FORM_"

        exit 0

    A more sophisticated version (written for Korn Shell) is
    available at the SHELLdorado:

        http://www.shelldorado.com/scripts/cmds/cgitemplate.ksh

    This version additionally shows what a script could look
    like that may be used interactively from the command line
    or be invoked as a CGI program.

    [Further reading:
        NCSA: The Common Gateway Interface.
        http://hoohoo.ncsa.uiuc.edu/cgi/overview.html
    ]

-----------------------------------------------------------------
>> Q&A: How can I encode/decode URL data?
-----------------------------------------------------------------

    Within CGI scripts it is sometimes necessary to encode or
    decode HTML form data.
    The data is usually encoded using the MIME type
    "application/x-www-form-urlencoded", which basically
    represents "unsafe" characters with a percent character
    ('%') followed by their code value printed as a two-digit
    hex code. If, for example, a user entered the string

        "a:*.txt"

    the encoded string would look like

        "a%3A%2A.txt",

    where %3A is the representation of a colon (':'), and %2A
    represents an asterisk ('*'). A space character may be
    encoded using %20 (using its ASCII code 32 = hex 20) or just
    with a plus sign ('+').

    The following two scripts handle encoding/decoding of
    "urlencoded" data:

        http://www.shelldorado.com/scripts/cmds/urlencode
        http://www.shelldorado.com/scripts/cmds/urldecode

    The following example shows how the scripts could be used,
    e.g. from a CGI script:

        # translate - translate English word into German, and
        # vice versa
        #...
        baseurl=http://dict.leo.org/?search=
        word=shell
        encoded=`echo "$word" | urlencode`
        requrl=$baseurl$encoded
        # retrieve $requrl, parse output...

    Note that the "urlgetopt" script mentioned above already
    decodes "urlencoded" data.

    [Further links:
        Berners-Lee, Tim: Uniform Resource Locators (URL).
        RFC 1738, December 1994.
        http://www.ietf.org/rfc/rfc1738.txt
    ]

-----------------------------------------------------------------
>> Q&A: Are there further CGI resources at the SHELLdorado?
-----------------------------------------------------------------

    The following scripts may be useful for CGI script
    programmers:

     o  dumphtmltbl - extract ASCII table data from HTML page
            Example:
                $ wget -O- http://www.table.com | dumphtmltbl
        http://www.shelldorado.com/scripts/cmds/dumphtmltbl

     o  htmltable - formats ASCII data as HTML table
            Example:
                $ ls | htmltable
        http://www.shelldorado.com/scripts/cmds/htmltable

     o  striphtml - removes all HTML tags from a page
            Example:
                $ striphtml index.html
        http://www.shelldorado.com/scripts/quickies/striphtml

     o  fmtlinks - create HTML links
            Example (in a CGI program):
                echo "<pre>"
                fmtlinks linklist.txt
                echo "</pre>"
        http://www.shelldorado.com/scripts/quickies/fmtlinks

     o  extracturl - extract URL list from text file
            Example:
                extracturl index.html
        http://www.shelldorado.com/scripts/quickies/extracturl

    -------------------------------------------------------

    The following scripts are examples of how to retrieve and
    process information from the Web:

     o  dailynews - prints daily news message from the Web
        http://www.shelldorado.com/scripts/quickies/dailynews

     o  findhomepage, guesshomepage - see SHELLdorado Newsletter
        June 2001
        http://www.shelldorado.com/newsletter/issues/2001-2-Jun.html

     o  translate - ...words between English and German
        http://www.shelldorado.com/scripts/quickies/translate

     o  syn - find synonyms for words
        http://www.shelldorado.com/scripts/quickies/syn

-----------------------------------------------------------------
>> Amendment: Arrays for Bourne shell
-----------------------------------------------------------------

    The last SHELLdorado Newsletter contained an example of how
    to simulate arrays for the Bourne shell using "eval"
    (http://www.shelldorado.com/newsletter/issues/2001-2-Jun.html).

    Daniel E. Singer kindly pointed out that the quoting for the
    "eval" command was unnecessarily complicated. He suggested
    calling "eval" with just one argument and applying proper
    quoting, e.g. the assignment

        var$n="$value"

    is written as

        eval "var$n=\"\$value\""

    instead of

        eval var$n="'""$value""'"

    This makes quoting a little easier, and even works if the
    variable on the right hand side of the assignment contains a
    "single quote" character (').

----------------------------------------------------------------
If you want to comment on the newsletter, have suggestions for
new topics to be covered in one of the next issues, or even want
to submit an article of your own, send an e-mail to

    mailto:heiner.steven@shelldorado.com

================================================================
To unsubscribe send a mail with the body "unsubscribe" to
newsletter@shelldorado.com
================================================================
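
P.S. on the "eval" amendment above: the following is a minimal
sketch, not part of the original scripts, of the one-argument form
applied to a value that contains a single quote. The names "n",
"value" and "result" are made up for illustration.

```shell
#! /bin/sh
# Sketch only: "n", "value" and "result" are illustrative names,
# not taken from the newsletter scripts.

n=3
value="Heiner's shell"

# One single argument for "eval". The backslash keeps "$value"
# unexpanded until "eval" runs, so the single quote inside the
# value is never parsed as shell syntax.
eval "var$n=\"\$value\""

# Read the simulated array element back, again using "eval":
eval "result=\$var$n"
echo "var$n=$result"
```

Because "$value" is only expanded when "eval" runs the
assignment, its contents are never re-parsed as shell syntax,
which is why the embedded single quote causes no trouble.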