Welcome to Teach Yourself Perl in 21 Days. Today you'll
learn about the following:
Perl is an acronym, short for Practical Extraction
and Report Language. It was designed by Larry Wall
as a tool for writing programs in the UNIX environment and is
continually being updated and maintained by him.
For its many fans, Perl provides the best of several worlds.
For instance:
In short, Perl is as powerful as C but as convenient as awk,
sed, and shell scripts.
Note: This book assumes that you are familiar with the basics of using the UNIX operating system.
As you'll see, Perl is very easy to learn. Indeed, if you are
familiar with other programming languages, learning Perl is a
snap. Even if you have very little programming experience, Perl
can have you writing useful programs in a very short time. By the end
of Day 2, "Basic Operators and Control Flow," you'll
know enough about Perl to be able to solve many problems.
To find out whether Perl already is available on your system,
do the following:
If you do not find Perl in this way, talk to your system
administrator and ask whether she or he has Perl running
somewhere else. If you don't have Perl running in your
environment, don't despair--read on!
One of the reasons Perl is becoming so popular is that it is
available free of charge to anyone who wants it. If you are on
the Internet, you can obtain a copy of Perl using the File Transfer
Protocol (FTP). The following is a sample FTP session that
transfers a copy of the Perl distribution. The items shown in
boldface type are what you would enter during the session.
$ ftp prep.ai.mit.edu
Connected to prep.ai.mit.edu.
220 aeneas FTP server (Version wu-2.4(1) Thu Apr 14 20:21:35 EDT 1994) ready.
Name (prep.ai.mit.edu:dave): anonymous
331 Guest login ok, send your complete e-mail address as password.
Password:
230-Welcome, archive user!
230-
230-If you have problems downloading and are seeing "Access denied" or
230-"Permission denied", please make sure that you started your FTP
230-client in a directory to which you have write permission.
230-
230-If you have any problems with the GNU software or its downloading,
230-please refer your questions to <gnu@PREP.AI.MIT.EDU>. If you have any
230-other unusual problems, please report them to <root@aeneas.MIT.EDU>.
230-
230-If you do have problems, please try using a dash (-) as the first
230-character of your password this will turn off the continuation
230-messages that may be confusing your FTP client.
230-
230 Guest login ok, access restrictions apply.
ftp> cd pub/gnu
250-If you have problems downloading and are seeing "Access denied" or
250-"Permission denied", please make sure that you started your FTP
250-client in a directory to which you have write permission.
250-
250-Please note that all files ending in '.gz' are compressed with
250-'gzip', not with the unix 'compress' program. Get the file README
250- and read it for more information.
250-
250-Please read the file README
250- it was last modified on Thu Feb 1 15:00:50 1996 - 32 days ago
250-Please read the file README-about-.diff-files
250- it was last modified on Fri Feb 2 12:57:14 1996 - 31 days ago
250-Please read the file README-about-.gz-files
250- it was last modified on Wed Jun 14 16:59:43 1995 - 264 days ago
250 CWD command successful.
ftp> binary
200 Type set to I.
ftp> get perl-5.001.tar.gz
200 PORT command successful.
150 Opening ASCII mode data connection for perl-5.001.tar.gz (1130765 bytes).
226 Transfer complete.
1130765 bytes received in 9454 seconds (1.20 Kbytes/s)
ftp> quit
221 Goodbye.
$
The commands entered in this session are explained in the
following steps. If some of these steps are not familiar to you,
ask your system administrator for help.
$ ftp prep.ai.mit.edu
Once you've retrieved the Perl distribution, do the following:
$ gunzip perl-5.001.tar.gz
$ tar xvf - <perl-5.001.tar
You might need your system administrator's help to do this
because you might not have the necessary permissions.
If you cannot access the MIT site from where you are, you can
get Perl from the following sites using anonymous FTP:
North America
genetics.upenn.edu Internet address 128.91.200.37
Directory /perl5
Europe
src.doc.ic.ac.uk Internet address 146.169.17.5
Directory /packages/perl5
Australia
sungear.mame.mu.oz.au Internet address 128.250.209.2
Directory /pub/perl/src/5.0
South America
ftp.inf.utfsm.cl Internet address 146.83.198.3
Directory /pub/gnu
You also can obtain Perl from most sites that store GNU source
code, or from any site that archives the Usenet newsgroup comp.sources.unix.
Now that Perl is available on your system, here is a simple
program that illustrates how easy Perl is to use. Listing 1.1
reads a line of input and writes it back out.
Listing 1.1. A simple Perl program
that reads and writes a line of input.
1: #!/usr/local/bin/perl
2: $inputline = <STDIN>;
3: print( "$inputline" );

$ program1_1
This is my line of input.
This is my line of input.
$
Line 1 is the header comment. Line 2 reads a line of input.
Line 3 writes the line of input back to your screen.
The following sections describe how to create and run this
program, and they describe it in more detail.
To run the program shown in Listing 1.1, do the following:
$ chmod +x program1_1
$ program1_1
When you run program1_1, it waits for you to enter a
line of input. After you enter the line of input, program1_1
prints what you entered, as follows:
$ program1_1
This is my line of input.
This is my line of input.
$
If Listing 1.1 is stored in the file program1_1 and run
according to the preceding steps, the program should run
successfully. If the program doesn't run, one of two things has
likely happened:
If you receive the error message
program1_1 not found
or something similar, your system couldn't find the file program1_1.
To tell the system where program1_1 is located, you can do
one of two things in a UNIX environment:
If you receive the message
/usr/local/bin/perl not found
or something similar, this means that Perl is not installed
properly on your machine. See the section "How Do I Find
Perl?" earlier today, for more details.
If you don't understand these instructions or are still having
trouble running Listing 1.1, talk to your system administrator.
Now that you've run your first Perl program, let's look at
each line of Listing 1.1 and figure out what it does.
Line 1 of this program is a special line that tells the system
that this is a Perl program:
#!/usr/local/bin/perl
Let's break this line down, one part at a time:
If, after reading this, you still don't understand the meaning
of the line #!/usr/local/bin/perl, don't worry. The actual
specifics of what it does are not important for our purposes in
this book. Just remember to include it as the first line of your
program, and Perl will take it from there.
Note: If you are running Perl on a system other than UNIX, you might need to replace the line #!/usr/local/bin/perl with some other line indicating the location of the Perl interpreter on your system. Ask your system administrator for details on what you need to include here.
Once you have found out what the proper first line is in your environment, include that line as the first line of every Perl program you write, and you're all set.
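If Perl can already be started by name on your system (for example, by typing perl followed by a file name), a short program can report where the running interpreter lives, which may help you choose the first line. This is only a sketch: $^X and $] are standard Perl variables, but on some systems $^X holds just the name the interpreter was started as, not a full path. The variable names $interpreter and $version are chosen here for illustration.

```perl
#!/usr/local/bin/perl
# $^X holds the name used to start the running Perl interpreter.
# $] holds the version number of that interpreter.
$interpreter = $^X;
$version = $];
print("Interpreter: ", $interpreter, "\n");
print("Version: ", $version, "\n");
```

If the reported name is a full path, that path is a reasonable candidate for the line that begins with #!.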
As you have just seen, the first character of the line
#!/usr/local/bin/perl
is the comment character, #. When the Perl
interpreter sees the #, it ignores the rest of that line.
Comments can be appended to lines containing code, or they can
be lines of their own:
$inputline = <STDIN>; # this line contains an appended comment
# this entire line is a comment
You can--and should--use comments to make your programs easier
to understand. Listing 1.2 is the simple program you saw earlier,
but it has been modified to include comments explaining what the program
does.
Note: As you work through the lessons in this book and create your own programs--such as the one in Listing 1.2--you can, of course, name them anything you wish. For illustration and discussion purposes, I've adopted the convention of using a name that corresponds to the listing number. For example, the program in Listing 1.2 is called program1_2.
The program name is used in the Input-Output examples such as the one following this listing, as well as in the Analysis section where the listing is discussed in detail. When you follow the Input-Output example, just remember to substitute your program's name for the one shown in the example.
Listing 1.2. A simple Perl program
with comments.
1: #!/usr/local/bin/perl
2: # this program reads a line of input, and writes the line
3: # back out
4: $inputline = <STDIN>; # read a line of input
5: print( $inputline ); # write the line out

$ program1_2
This is a line of input.
This is a line of input.
$
The behavior of the program in Listing 1.2 is identical to
that of Listing 1.1 because the actual code is the same. The only difference
is that Listing 1.2 has comments in it.
Note that in an actual program, comments normally are used
only to explain complicated code or to indicate that the
following lines of code perform a specific task. Because Perl
instructions usually are pretty straightforward, Perl programs
don't need to have a lot of comments.
DO use comments whenever you think that a line of code is not easy to understand.
DON'T clutter up your code with unnecessary comments. The goal is readability. If a comment makes a program easier to read, include it. Otherwise, don't bother.
DON'T put anything else after /usr/local/bin/perl in the first line:
#!/usr/local/bin/perl
This line is a special comment line, and it is not treated like the others.
Now that you've learned what the first line of Listing 1.1
does, let's take a look at line 2:
$inputline = <STDIN>;
This is the first line of code that actually does any work. To
understand what this line does, you need to know what a Perl statement
is and what its components are.
The line of code you have just seen is an example of a Perl statement.
Basically, a statement is one task for the Perl interpreter to
perform. A Perl program can be thought of as a collection of statements
performed one at a time.
When the Perl interpreter sees a statement, it breaks the
statement down into smaller units of information. In this
example, the smaller units of information are $inputline, =, <STDIN>,
and ;. Each of these smaller units of information is
called a token.
Tokens can normally be separated by as many spaces and tabs as
you like. For example, the following statements are identical in
Perl:
$inputline = <STDIN>;
$inputline=<STDIN>;
$inputline      =      <STDIN>;
Your statements can take up as many lines of code as you like.
For example, the following statement is equivalent to the ones above:
$inputline =
    <STDIN>
;
The collection of spaces, tabs, and new lines separating one
token from another is known as white space.
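The effect of white space can be checked with a small program. The sketch below assigns a fixed string instead of reading from <STDIN>, so that it runs without any typing; the variable names $first, $second, and $third are made up for this illustration.

```perl
#!/usr/local/bin/perl
# Three assignments that differ only in their white space.
$first = "a line of input\n";
$second="a line of input\n";
$third
    =
    "a line of input\n"
    ;
# All three variables now hold the same string, so the same
# line is printed three times.
print($first, $second, $third);
```

White space inside the quoted string is part of the data and is not ignored; only the white space between tokens is free-form.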
When programming in Perl, you should use white space to make
your programs more readable. The examples in this book use white
space in the following ways:
As you've seen already, the statement
$inputline = <STDIN>;
consists of four tokens: $inputline, =, <STDIN>,
and ;. The following subsections explain what each of
these tokens does.
The first token in line 2, $inputline (at the left of
the statement), is an example of a scalar variable. In
Perl, a scalar variable can store one piece of information.
The = token, called the assignment operator,
tells the Perl interpreter to store the item specified by the
token to the right of the = in the place specified by the
token to the left of the =. In this example, the item on
the right of the assignment operator is the <STDIN>
token, and the item to the left of the assignment operator is the $inputline
token. Thus, the line of input read by <STDIN> is stored in the scalar variable $inputline.
Scalar variables and assignment operators are covered in more
detail on Day 2, "Basic Operators and Control Flow."
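As a small preview, the following sketch shows the assignment operator storing values in scalar variables. The variable names $greeting and $copy are invented for this example, and quoted strings are used in place of <STDIN> so that the program runs on its own.

```perl
#!/usr/local/bin/perl
$greeting = "hello";      # store a string in $greeting
$copy = $greeting;        # copy the value of $greeting into $copy
$greeting = "goodbye";    # a new assignment replaces the old value
print($copy, "\n");       # $copy still holds the earlier value
print($greeting, "\n");   # $greeting holds the new value
```

Each assignment evaluates the item on the right of the = and stores the result in the variable on the left; the variable on the right is not changed.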
The next token, <STDIN>, represents a line of
input from the standard input file. The standard input
file, or STDIN for short, typically contains everything
you enter when running a program.
For example, when you run program1_1 and enter
This is a line of input.
the line you enter is stored in the standard input file.
The <STDIN> token tells the Perl interpreter to
read one line from the standard input file, where a line
is defined to be a set of characters terminated by a new line. In
this example, when the Perl interpreter sees <STDIN>,
it reads in
This is a line of input.
If the Perl interpreter then sees another <STDIN>
in a different statement, it reads another line of data from the
standard input file. The line of data you read earlier is
destroyed unless it has been copied somewhere else.
Note: If there are more lines of input than there are <STDIN> tokens, the extra lines of input are ignored.
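The behavior of repeated reads can be sketched as follows. To keep the example runnable without typing anything, it reads from a string through a file variable named INPUT instead of from STDIN; this in-memory form of open is an assumption that requires a newer Perl (5.8 or later) than the one described in this book. The names $text, $saved, and INPUT are made up for the illustration.

```perl
#!/usr/local/bin/perl
# Three lines of "input", held in a string that stands in for
# what you would type at the keyboard.
$text = "first line\nsecond line\nthird line\n";
open(INPUT, '<', \$text) or die "cannot open in-memory input: $!";

$inputline = <INPUT>;   # reads "first line\n"
$saved = $inputline;    # copy the line before reading again
$inputline = <INPUT>;   # reads "second line\n", replacing the first
close(INPUT);           # the third line is never read

print($saved, $inputline);
```

With <STDIN> in place of <INPUT>, each read would take the next line you type in the same way.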
Because the <STDIN> token is to the right of the
assignment operator =, the line
This is a line of input.
is assigned to the scalar variable $inputline.
The ; token at the end of the statement is a special
token that tells Perl the statement is complete. You can think of
it as a punctuation mark that is like a period in English.
Now that you understand what statements and tokens are,
consider line 3 of Listing 1.1, which is
print ($inputline);
This statement refers to the library function that is
called print. Library functions, such as print, are provided
as part of the Perl interpreter; each library function performs a
useful task.
The print function's task is to send data to the standard
output file. The standard output file stores data that is to
be written to your screen. The standard output file sometimes
appears in Perl programs under the name STDOUT.
In this example, print sends $inputline to the
standard output file. Because the second line of the Perl program
assigns the line
This is a line of input.
to $inputline, this is what print sends to the
standard output file and what appears on your screen.
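The following sketch shows print writing to the standard output file both implicitly and by name. A fixed string is assigned to $inputline here instead of reading it, so the example runs without input.

```perl
#!/usr/local/bin/perl
$inputline = "This is a line of input.\n";
print($inputline);           # written to the standard output file by default
print STDOUT ($inputline);   # the same, naming the standard output file
```

Both calls send the same line to your screen, so the line appears twice.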
When a reference to print appears in a Perl program,
the Perl interpreter calls, or invokes, the print library
function. This function invocation is similar to a
function invocation in C, a GOSUB statement in BASIC, or a PERFORM
statement in COBOL. When the Perl interpreter sees the print
function invocation, it executes the code contained in print
and returns to the program when print is finished.
Most library functions require information to tell them what
to do. For example, the print function needs to know what
you want to print. In Perl, this information is supplied as a
sequence of comma-separated items located between the parentheses of
the function invocation. For example, the statement you've just
seen
print ($inputline);
supplies one piece of information that is passed to print:
the variable $inputline. This piece of information
commonly is called an argument.
The following call to print supplies two arguments:
print ($inputline, $inputline);
You can supply print with as many arguments as you
like; it prints each argument starting with the first one (the
one on the left). In this case, print writes two copies of $inputline
to the standard output file.
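A short sketch of print with different numbers of arguments follows. The literal string "You typed: " is an addition made for illustration, and $inputline is assigned directly so that the program runs without input.

```perl
#!/usr/local/bin/perl
$inputline = "This is a line of input.\n";
print($inputline);                  # one argument
print($inputline, $inputline);      # two arguments: two copies
print("You typed: ", $inputline);   # a quoted string, then a variable
```

The arguments are written in order from left to right, with nothing added between them.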
You also can tell print to write to any other specified
file. You'll learn more about this on Day 6, "Reading from
and Writing to Files."
If you incorrectly type in a statement when creating a Perl
program, the Perl interpreter will detect the error and tell you
where the error is located.
For example, look at Listing 1.3. This program is identical to
the program you've been seeing all along, except that it contains one
small error. Can you spot it?
Listing 1.3. A program containing an
error.
1: #!/usr/local/bin/perl
2: $inputline = <STDIN>
3: print ($inputline);

$ program1_3
Syntax error in file program1_3 at line 3, next char (
Execution of program1_3 aborted due to compilation errors.
$
When you try to run this program, an error message appears.
The Perl interpreter has detected that line 2 of the program is missing
its closing ; character. The error message reports line 3 because that
is where the interpreter first noticed the problem: the print on line 3
arrived before the statement on line 2 had been ended. When an error
message points at a line that looks correct, check the line before it.
Tip: You should fix errors starting from the beginning of your program and working down.
When the Perl interpreter detects an error, it tries to figure out what you meant to say and carries on from there; this feature is known as error recovery. Error recovery enables the interpreter to detect as many errors as possible at one time, which speeds up the development process.
Sometimes, however, the Perl interpreter can get confused and think you meant to do one thing when you really meant to do another. In this situation, the interpreter might start trying to detect errors that don't really exist. This problem is known as error cascading.
It's usually pretty easy to spot error cascading. If the interpreter is telling you that errors exist on several consecutive lines, it usually means that the interpreter is confused. Fix the first error, and the others might very well go away.
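To see an error message without breaking a real program, the missing-semicolon mistake from Listing 1.3 can be placed in a string and handed to eval, a built-in function that this chapter does not cover. This is only a sketch: the wording of the message differs between Perl versions, and the names $broken, $line, and $message are made up for the illustration.

```perl
#!/usr/local/bin/perl
# The faulty statements are kept in a string so that this
# program itself still compiles.
$broken = q{
$line = "some text"
print($line);
};
eval($broken);     # try to compile and run the text in $broken
$message = $@;     # $@ holds the error message, if there was one
print("Perl reported: ", $message);
```

After adding the missing ; inside the string, $message is empty and the eval prints some text instead.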
As you've seen, running a Perl program is easy. All you need
to do is create the program, mark it as executable, and run it. The
Perl interpreter takes care of the rest. Languages such as Perl
that are processed by an interpreter are known as interpretive
languages.
Some programming languages require more complicated
processing. If a language is a compiled language, the
program you write must be translated into machine-readable code
by a special program known as a compiler. In addition,
library code might need to be added by another special program
known as a linker. Once the compiler and linker have done
their jobs, the result is a program that can be executed on your
machine--assuming, of course, that you have written the program correctly.
If not, you have to compile and link the program all over again.
Interpretive languages and compiled languages both have
advantages and disadvantages, as follows:
As you'll see, Perl is as powerful as a compiled language.
This means that you can do a lot of work quickly and easily.
Today you learned that Perl is a programming language that
provides many of the capabilities of a high-level programming language
such as C. You also learned that Perl is easy to use; basically, you just
write the program and run it.
You saw a very simple Perl program that reads a line of input
from the standard input file and writes the line to the standard output
file. The standard input file stores everything you type from your keyboard,
and the standard output file stores everything your Perl program
sends to your screen.
You learned that Perl programs contain a header comment, which
indicates to the system that your program is written in Perl. Perl
programs also can contain other comments, each of which must be preceded
by a #.
Perl programs consist of a series of statements, which are
executed one at a time. Each statement consists of a collection
of tokens, which can be separated by white space.
Perl programs call library functions to perform certain
predefined tasks. One example of a library function is print,
which writes to the standard output file. Library functions are
passed chunks of information called arguments; these arguments
tell a function what to do.
The Perl interpreter executes the Perl programs you write. If
it detects an error in your program, it displays an error message and
uses the error-recovery process to try to continue processing
your program. If Perl gets confused, error cascading can occur,
and the Perl interpreter might display inappropriate error
messages.
Finally, you learned about the differences between
interpretive languages and compiled languages, and that Perl is
an example of an interpretive language.
The Workshop provides quiz questions to help you solidify your
understanding of the material covered and exercises to give you
experience in using what you've learned. Try to understand the quiz
and exercise answers before continuing to the next day.
#!/usr/local/bin/perl
$inputline = <STDIN>;
print ($inputline)
#!/usr/local/bin/perl $inputline = <STDIN>; # print my line! print($inputline);