developer.com
developerdirect.com
htmlgoodies.com
javagoodies.com
jars.com
intranetjournal.com
javascripts.com
|
All Categories :
CGI & PERL
Appendix D
cgi-lib.ol Reference Guide
CONTENTS
Using cgi-lib.pl
Routines and Variables
Source Code
Steve Brenner's cgi-lib.pl was one
of the first CGI programming libraries available, and it is widely
used. cgi-lib.pl greatly simplifies CGI programming in Perl by
providing parsing libraries and other useful CGI routines. It
is written for Perl 4, although it will work with Perl 5.
The primary function of cgi-lib.pl is to parse form input. It
parses form input and places it in an associative array keyed
by the name of the field. This library has evolved since its first
release and can handle both regular form decoded input (application/x-www-form-urlencoded,
data that is sent as arguments in the URL itself) and the multipart
form decoded input used for the newly proposed HTML file uploading
(multipart/form-data, data
which is sent as standard input like a multipart e-mail attachment).
This appendix presents a very simple example of how to use cgi-lib.pl
and describes each available routine. The complete source code
for cgi-lib.pl appears at the end of this appendix. The library
is also available on the CD-ROM provided with this book.
Note |
I have refrained from discussing Perl 5 in this book for a number of reasons, most of them listed in the Introduction. However, I would highly encourage you to explore Perl 5 and some of its nice improvements over Perl 4. Although Perl 5 is slightly more complex than Perl 4 conceptually and syntactically, the tools you gain make it worth the time you spend learning it.
Lincoln Stein has written a very good class library for Perl 5 called CGI.pm, which includes support for form parsing, HTML form output, and internal debugging. If you know Perl 5 or plan to learn it, I highly recommend you take a look. It is available on the included CD-ROM; more information is at <URL:http://www-genome.wi.mit.edu/ftp/pub/software/WWW/cgi_docs.html>
|
To use cgi-lib.pl, you must place it either in the same directory
as your Perl scripts or in the global directory of Perl libraries
(normally located in /usr/lib/perl
on UNIX machines). On UNIX machines, make sure cgi-lib.pl is world-readable.
Using cgi-lib.pl requires two steps: including the library and
calling the functions. A very minimal CGI program using cgi-lib.pl
is
#!/usr/local/bin/perl
if (&ReadParse(*input)) {
print &PrintHeader,&HtmlTop("Form
Results");
print &PrintVariables,&HtmlBot;
}
else {
print &PrintHeader,&HtmlTop("Entry
Form");
print <<EOM;
<form method=POST>
<p>Name: <input name="name"><br>
Age: <input name="age"></p>
<p><input type=submit></p>
</form>
EOM
print &HtmlBot;
}
This program does the following:
- Checks to see if there is any form input. If there is, parse
the form and print the results.
- If there is no input, print an HTML form.
The main routine is &ReadParse,
which takes each form name/value pair and inserts it into the
associative array %input.
The array is keyed by name, so $input{'name'}
is equal to 'value'.
&PrintHeader, &HtmlTop,
&PrintVariables, and
&HtmlBot are all HTML
output functions described in more detail in the next section.
In this section, I have listed and defined the functions and variables
made available in the cgi-lib.pl library.
&ReadParse parses form
input of MIME types application/x-www-form-urlencoded
and multipart/form-data.
Pass it the variable *varname
and it will place the parsed form data in the associative array
%varname in the form:
$varname{name} = value
If a name has more than one associated value, the values are separated
by a null character. You can use the &SplitParam
function to separate the value of $varname{name}
into its multiple values.
If you want &ReadParse
to save files uploaded using HTML file upload, you must change
the value of $cgi-lib'writefiles
in cgi-lib.pl from 0 to 1.
&PrintHeader returns
the following string:
Content-Type: text/html\n\n
Here is how this function is called:
print &PrintHeader;
&HtmlTop accepts a string
that is used between the <title>
tags and the <h1> tags.
It returns a valid HTML header. For example, the following:
print &HtmlTop("Hello, World!");
prints this:
<html><head>
<title>Hello, World!</title>
</head>
<body>
<h1>Hello, World!</h1>
&HtmlBot is the complement
of &HtmlTop and returns
the HTML footer string.
</body> </html>
&SplitParam splits a
multivalued parameter returned by the associative array from &ReadParse
and returns a list containing each separate element. For example,
if you had the following form:
<form method=POST>
Street 1: <input name="street"><br>
Street 2: <input name="street"><br>
<input type=submit>
</form>
and you parsed it using this:
&ReadParse(*input);
the following is the value of $input{'street'}:
value1\0value2
To split these values, you can do the following:
@streets = &SplitParam($input{'street'});
which would return this list:
(value1, value2)
&MethGet returns 1
if REQUEST_METHOD equals
GET; otherwise, it returns
0.
&MethPost returns 1
if REQUEST_METHOD equals
POST; otherwise, it returns
0.
&MyBaseUrl returns the
URL without the QUERY_STRING
or PATH_INFO. For example,
if the URL were the following:
http://hcs.harvard.edu/cgi-bin/finger?eekim
&MyBaseUrl would return
the following:
http://hcs.harvard.edu:80/cgi-bin/finger
&MyFullUrl returns the
complete URL including any QUERY_STRING
or PATH_INFO values. For
example, if your URL is
http://hcs.harvard.edu/cgi-bin/counter.cgi/~eekim?file.html
&MyFullUrl returns the
following:
http://hcs.harvard.edu:80/cgi-bin/counter.cgi/~eekim?file.html
&CgiError accepts a list
of strings and prints them in the form of an error message. The
first string is inserted between <title>
and <h1> tags; all
subsequent strings are placed between <p>
tags. If no strings are provided, the default headline and title
of the message is
Error: script $name encountered fatal
error
where $name is the value
of &MyFullUrl. For example,
the following:
&CgiError("Error","Cannot
open file","Please report to web admin.");
returns this HTML message:
<html><head>
<title>Error</title>
</head>
<body>
<h1>Error</h1>
<p>Cannot open file</p>
<p>Please report to web admin.</p>
</body> </html>
The same as &CgiError
except it does a die when
finished. die prints the
error message to stderr.
&PrintVariables returns
a definition list (<dl>)
of each name and value pair. For example, given the name and value
pairs (name, eugene)
and (age, 21),
&PrintVariables returns
the following:
<dl compact>
<dt><b>name</b>
<dd><i>eugene</i>:<br>
<dt><b>age</b>
<dd><i>21</i>:<br>
</dl>
&PrintEnv returns a definition
list of all the environment variables.
This section contains a full listing of the cgi-lib.pl library.
Listing D.1. The cgi-lib.pl program.
# Perl Routines to Manipulate CGI input
# S.E.Brenner@bioc.cam.ac.uk
# $Id: cgi-lib.pl,v 2.8 1996/03/30 01:36:33 brenner Rel $
#
# Copyright (c) 1996 Steven E. Brenner
# Unpublished work.
# Permission granted to use and modify this library so long as
the
# copyright above is maintained, modifications are documented,
and
# credit is given for any use of the library.
#
# Thanks are due to many people for reporting bugs and suggestions
# especially Meng Weng Wong, Maki Watanabe, Bo Frese Rasmussen,
# Andrew Dalke, Mark-Jason Dominus, Dave Dittrich, Jason Mathews
# For more information, see:
# http://www.bio.cam.ac.uk/cgi-lib/
($cgi_lib'version = '$Revision: 2.8 $') =~ s/[^.\d]//g;
# Parameters affecting cgi-lib behavior
# User-configurable parameters affecting file upload.
$cgi_lib'maxdata = 131072; #
maximum bytes to accept via POST - 2^17
$cgi_lib'writefiles = 0; #
directory to which to write files, or
#
0 if files should not be written
$cgi_lib'filepre = "cgi-lib";
# Prefix of file names, in directory above
# Do not change the following parameters unless you have special
reasons
$cgi_lib'bufsize = 8192; #
default buffer size when reading multipart
$cgi_lib'maxbound = 100; #
maximum boundary length to be encounterd
$cgi_lib'headerout = 0; #
indicates whether the header has been printed
# ReadParse
# Reads in GET or POST data, converts it to unescaped text, and
puts
# key/value pairs in %in, using "\0" to separate multiple
selections
# Returns >0 if there was input, 0 if there was no input
# undef indicates some failure.
# Now that cgi scripts can be put in the normal file space, it
is useful
# to combine both the form and the script in one place. If no
parameters
# are given (i.e., ReadParse returns FALSE), then a form could
be output.
# If a reference to a hash is given, then the data will be stored
in that
# hash, but the data from $in and @in will become inaccessable.
# If a variable-glob (e.g., *cgi_input) is the first parameter
to ReadParse,
# information is stored there, rather than in $in, @in, and %in.
# Second, third, and fourth parameters fill associative arrays
analagous to
# %in with data relevant to file uploads.
# If no method is given, the script will process both command-line
arguments
# of the form: name=value and any text that is in $ENV{'QUERY_STRING'}
# This is intended to aid debugging and may be changed in future
releases
sub ReadParse {
local (*in) = shift if @_; #
CGI input
local (*incfn, #
Client's filename (may not be provided)
*inct, #
Client's content-type (may not be provided)
*insfn)
= @_; #
Server's filename (for spooled files)
local ($len, $type, $meth, $errflag, $cmdflag, $perlwarn);
# Disable warnings as this code deliberately uses
local and environment
# variables which are preset to undef (i.e., not explicitly
initialized)
$perlwarn = $^W;
$^W = 0;
# Get several useful env variables
$type = $ENV{'CONTENT_TYPE'};
$len = $ENV{'CONTENT_LENGTH'};
$meth = $ENV{'REQUEST_METHOD'};
if ($len > $cgi_lib'maxdata) { #'
&CgiDie("cgi-lib.pl:
Request to receive too much data: $len bytes\n");
}
if (!defined $meth || $meth eq '' || $meth eq 'GET'
||
$type eq 'application/x-www-form-urlencoded')
{
local ($key, $val, $i);
# Read in text
if (!defined $meth || $meth eq '') {
$in = $ENV{'QUERY_STRING'};
$cmdflag = 1; # also use
command-line options
} elsif($meth eq 'GET' || $meth eq 'HEAD')
{
$in = $ENV{'QUERY_STRING'};
} elsif ($meth eq 'POST') {
$errflag = (read(STDIN,
$in, $len) != $len);
} else {
&CgiDie("cgi-lib.pl:
Unknown request method: $meth\n");
}
@in = split(/[&;]/,$in);
push(@in, @ARGV) if $cmdflag; # add command-line
parameters
foreach $i (0 .. $#in) {
# Convert plus to space
$in[$i] =~ s/\+/ /g;
# Split into key and value.
($key, $val) = split(/=/,$in[$i],2);
# splits on the first =.
# Convert %XX from hex numbers
to alphanumeric
$key =~ s/%([A-Fa-f0-9]{2})/pack("c",hex($1))/ge;
$val =~ s/%([A-Fa-f0-9]{2})/pack("c",hex($1))/ge;
# Associate key and value
$in{$key} .= "\0"
if (defined($in{$key})); # \0 is the multiple separator
$in{$key} .= $val;
}
} elsif ($ENV{'CONTENT_TYPE'} =~ m#^multipart/form-data#)
{
# for efficiency, compile multipart code
only if needed
$errflag = !(eval <<'END_MULTIPART');
local ($buf, $boundary, $head, @heads,
$cd, $ct, $fname, $ctype, $blen);
local ($bpos, $lpos, $left, $amt, $fn,
$ser);
local ($bufsize, $maxbound, $writefiles)
=
($cgi_lib'bufsize, $cgi_lib'maxbound,
$cgi_lib'writefiles);
# The following lines exist solely to
eliminate spurious warning messages
$buf = '';
($boundary) = $type =~ /boundary="([^"]+)"/;
#"; # find boundary
($boundary) = $type =~ /boundary=(\S+)/
unless $boundary;
&CgiDie ("Boundary not provided")
unless $boundary;
$boundary = "--" . $boundary;
$blen = length ($boundary);
if ($ENV{'REQUEST_METHOD'} ne 'POST')
{
&CgiDie("Invalid
request method for multipart/form-data: $meth\n");
}
if ($writefiles) {
local($me);
stat ($writefiles);
$writefiles = "/tmp"
unless -d _ && -r _ && -w _;
# ($me) = $0 =~ m#([^/]*)$#;
$writefiles .= "/$cgi_lib'filepre";
}
# read in the data and split into parts:
# put headers in @in and data in %in
# General algorithm:
# There are two dividers:
the border and the '\r\n\r\n' between
# header and body. Iterate between searching
for these
# Retain a buffer of
size(bufsize+maxbound); the latter part is
# to ensure that dividers don't get lost
by wrapping between two bufs
# Look for a divider
in the current batch. If not found, then
# save all of bufsize, move the maxbound
extra buffer to the front of
# the buffer, and read in a new bufsize
bytes. If a divider is found,
# save everything up to the divider.
Then empty the buffer of everything
# up to the end of the divider. Refill
buffer to bufsize+maxbound
# Note slightly odd organization. Code
before BODY: really goes with
# code following HEAD:, but is put first
to 'pre-fill' buffers. BODY:
# is placed before HEAD: because we first
need to discard any 'preface,'
# which would be analagous to a body without
a preceeding head.
$left = $len;
PART: # find each part of the multi-part while
reading data
while (1) {
last PART if $errflag;
$amt = ($left > $bufsize+$maxbound-length($buf)
? $bufsize+$maxbound-length($buf):
$left);
$errflag = (read(STDIN, $buf,
$amt, length($buf)) != $amt);
$left -= $amt;
$in{$name} .= "\0"
if defined $in{$name};
$in{$name} .= $fn if $fn;
$name=~/([-\w]+)/; # This
allows $insfn{$name} to be untainted
if (defined $1) {
$insfn{$1} .=
"\0" if defined $insfn{$1};
$insfn{$1} .=
$fn if $fn;
}
BODY:
while (($bpos = index($buf,
$boundary)) == -1) {
if ($name) {
# if no $name, then it's the prologue -- discard
if
($fn) { print FILE substr($buf, 0, $bufsize); }
else $in{$name}
.= substr($buf, 0, $bufsize); }
}
$buf = substr($buf,
$bufsize);
$amt = ($left
> $bufsize ? $bufsize : $left); #$maxbound==length($buf);
$errflag = (read(STDIN,
$buf, $amt, $maxbound) != $amt);
$left -= $amt;
}
if (defined $name) { # if
no $name, then it's the prologue -- discard
if ($fn) { print
FILE substr($buf, 0, $bpos-2); }
else {
$in {$name} .= substr($buf, 0, $bpos-2); } # kill last \r\n
}
close (FILE);
last PART if substr($buf,
$bpos + $blen, 4) eq "--\r\n";
substr($buf, 0, $bpos+$blen+2)
= '';
$amt = ($left > $bufsize+$maxbound-length($buf)
? $bufsize+$maxbound-length($buf)
: $left);
$errflag = (read(STDIN, $buf,
$amt, length($buf)) != $amt);
$left -= $amt;
undef $head; undef $fn;
HEAD:
while (($lpos = index($buf,
"\r\n\r\n")) == -1) {
$head .= substr($buf,
0, $bufsize);
$buf = substr($buf,
$bufsize);
$amt = ($left
> $bufsize ? $bufsize : $left); #$maxbound==length($buf);
$errflag = (read(STDIN,
$buf, $amt, $maxbound) != $amt);
$left -= $amt;
}
$head .= substr($buf, 0, $lpos+2);
push (@in, $head);
@heads = split("\r\n",
$head);
($cd) = grep (/^\s*Content-Disposition:/i,
@heads);
($ct) = grep (/^\s*Content-Type:/i,
@heads);
($name) = $cd =~ /\bname="([^"]+)"/i;
#";
($name) = $cd =~ /\bname=([^\s:;]+)/i
unless defined $name;
($fname) = $cd =~ /\bfilename="([^"]*)"/i;
#"; # filename can be null-str
($fname) = $cd =~ /\bfilename=([^\s:;]+)/i
unless defined $fname;
$incfn{$name} .= (defined
$in{$name} ? "\0" : "") . $fname;
($ctype) = $ct =~ /^\s*Content-type:\s*"([^"]+)"/i;
#";
($ctype) = $ct =~ /^\s*Content-Type:\s*([^\s:;]+)/i
unless defined $ctype
;
$inct{$name} .= (defined $in{$name}
? "\0" : "") . $ctype;
if ($writefiles &&
defined $fname) {
$ser++;
$fn = $writefiles
. ".$$.$ser";
open (FILE, ">$fn")
|| &CgiDie("Couldn't open $fn\n");
}
substr($buf, 0, $lpos+4) =
'';
undef $fname;
undef $ctype;
}
1;
END_MULTIPART
&CgiDie($@) if $errflag;
} else {
&CgiDie("cgi-lib.pl: Unknown
Content-type: $ENV{'CONTENT_TYPE'}\n");
}
$^W = $perlwarn;
return ($errflag ? undef : scalar(@in));
}
# PrintHeader
# Returns the magic line which tells WWW that we're an HTML document
sub PrintHeader {
return "Content-type: text/html\n\n";
}
# HtmlTop
# Returns the <head> of a document and the beginning of
the body
# with the title and a body <h1> header as specified by
the parameter
sub HtmlTop
{
local ($title) = @_;
return <<END_OF_TEXT;
<html>
<head>
<title>$title</title>
</head>
<body>
<h1>$title</h1>
END_OF_TEXT
}
# HtmlBot
# Returns the </body>, </html> codes for the bottom
of every HTML page
sub HtmlBot
{
return "</body>\n</html>\n";
}
# SplitParam
# Splits a multi-valued parameter into a list of the constituent
parameters
sub SplitParam
{
local ($param) = @_;
local (@params) = split ("\0", $param);
return (wantarray ? @params : $params[0]);
}
# MethGet
# Return true if this cgi call was using the GET request, false
otherwise
sub MethGet {
return (defined $ENV{'REQUEST_METHOD'} &&
$ENV{'REQUEST_METHOD'} eq "GET");
}
# MethPost
# Return true if this cgi call was using the POST request, false
otherwise
sub MethPost {
return (defined $ENV{'REQUEST_METHOD'} &&
$ENV{'REQUEST_METHOD'} eq "POST");
}
# MyBaseUrl
# Returns the base URL to the script (i.e., no extra path or query
string)
sub MyBaseUrl {
local ($ret, $perlwarn);
$perlwarn = $^W; $^W = 0;
$ret = 'http://' . $ENV{'SERVER_NAME'} .
($ENV{'SERVER_PORT'}
!= 80 ? ":$ENV{'SERVER_PORT'}" : '') .
$ENV{'SCRIPT_NAME'};
$^W = $perlwarn;
return $ret;
}
# MyFullUrl
# Returns the full URL to the script (i.e., with extra path or
query string)
sub MyFullUrl {
local ($ret, $perlwarn);
$perlwarn = $^W; $^W = 0;
$ret = 'http://' . $ENV{'SERVER_NAME'} .
($ENV{'SERVER_PORT'}
!= 80 ? ":$ENV{'SERVER_PORT'}" : '') .
$ENV{'SCRIPT_NAME'}
. $ENV{'PATH_INFO'} .
(length
($ENV{'QUERY_STRING'}) ? "?$ENV{'QUERY_STRING'}" : '');
$^W = $perlwarn;
return $ret;
}
# MyURL
# Returns the base URL to the script (i.e., no extra path or query
string)
# This is obsolete and will be removed in later versions
sub MyURL {
return &MyBaseUrl;
}
# CgiError
# Prints out an error message which which containes appropriate
headers,
# markup, etcetera.
# Parameters:
# If no parameters, gives a generic error message
# Otherwise, the first parameter will be the title
and the rest will
# be given as different paragraphs of the body
sub CgiError {
local (@msg) = @_;
local ($i,$name);
if (!@msg) {
$name = &MyFullUrl;
@msg = ("Error: script $name encountered
fatal error\n");
};
if (!$cgi_lib'headerout) { #')
print &PrintHeader;
print "<html>\n<head>\n<title>$msg[0]</title>\n</head>\n<body>\n";
}
print "<h1>$msg[0]</h1>\n";
foreach $i (1 .. $#msg) {
print "<p>$msg[$i]</p>\n";
}
$cgi_lib'headerout++;
}
# CgiDie
# Identical to CgiError, but also quits with the passed error
message.
sub CgiDie {
local (@msg) = @_;
&CgiError (@msg);
die @msg;
}
# PrintVariables
# Nicely formats variables. Three calling options:
# A non-null associative array - prints the items in that array
# A type-glob - prints the items in the associated assoc array
# nothing - defaults to use %in
# Typical use: &PrintVariables()
sub PrintVariables {
local (*in) = @_ if @_ == 1;
local (%in) = @_ if @_ > 1;
local ($out, $key, $output);
$output = "\n<dl compact>\n";
foreach $key (sort keys(%in)) {
foreach (split("\0", $in{$key}))
{
($out = $_) =~ s/\n/<br>\n/g;
$output .= "<dt><b>$key</b>\n
<dd>:<i>$out</i>:<br>\n";
}
}
$output .= "</dl>\n";
return $output;
}
# PrintEnv
# Nicely formats all environment variables and returns HTML string
sub PrintEnv {
&PrintVariables(*ENV);
}
# The following lines exist only to avoid warning messages
$cgi_lib'writefiles = $cgi_lib'writefiles;
$cgi_lib'bufsize = $cgi_lib'bufsize ;
$cgi_lib'maxbound = $cgi_lib'maxbound;
$cgi_lib'version = $cgi_lib'version;
1; #return true

Contact
reference@developer.com with questions or comments.
Copyright 1998
EarthWeb Inc., All rights reserved.
PLEASE READ THE ACCEPTABLE USAGE STATEMENT.
Copyright 1998 Macmillan Computer Publishing. All rights reserved.
|
|