Today's lesson describes the built-in system variables that
can be referenced from every Perl program. These system variables
are divided into five groups:
The following sections describe these groups of system
variables, and also describe how to provide English-language equivalents
of their variable names.
The global scalar variables are built-in system variables that
behave just like the scalar variables you create in the main body of
your program. This means that these variables have the following properties:
Other kinds of built-in scalar variables, which you will see
later in this lesson, do not behave in this way.
The following sections describe the global scalar variables
your Perl programs can use.
The most commonly used global scalar variable is the $_
variable. Many Perl functions and operators read or modify $_
if you do not explicitly specify the scalar variable
on which they are to operate.
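The following functions and operators work with the $_
variable by default: the pattern-matching operator, the substitution
operator, the translation operator, the <> operator (when it appears
in a loop condition), and the chop, print, and study functions.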
Normally, the pattern-matching operator examines the value
stored in the variable specified by a corresponding =~ or !~ operator.
For example, the following statement prints hi if the
string abc is contained in the value stored in $val:
print ("hi") if ($val =~ /abc/);
By default, the pattern matching operator examines the value
stored in $_. This means that you can leave out the =~
operator if you are searching $_:
print ("hi") if ($_ =~ /abc/);
print ("hi") if (/abc/); # these two are the same
Note: If you want to use the !~ (true-if-pattern-not-matched) operator, you will always need to specify it explicitly, even if you are examining $_:
print ("hi") if ($_ !~ /abc/);
If the Perl interpreter sees just a pattern enclosed in / characters, it assumes the existence of a =~ operator.
$_ enables you to use pattern-sequence memory to
extract subpatterns from a string and assign them to an array
variable:
$_ = "This string contains the number 25.11.";
@array = /-?(\d+)\.?(\d+)/;
In the second statement shown, each subpattern enclosed in
parentheses becomes an element of the list assigned to @array. As
a consequence, @array is assigned (25,11).
In Perl 5, a statement such as
@array = /-?(\d+)\.?(\d+)/;
also assigns the extracted subpatterns to the pattern-sequence
scalar variables $1, $2, and so on. This means that
the statement assigns 25 to $1 and 11 to $2.
Perl 4 supports assignment of subpatterns to arrays, but does not
assign the subpatterns to the pattern-sequence variables.
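The two defaults described above can be tried together. The following is a minimal sketch, assuming Perl 5; the variable names $found and @parts are illustrative only.

```perl
use strict;
use warnings;

$_ = "This string contains the number 25.11.";

# A bare pattern is matched against $_, exactly as if "$_ =~" were written.
my $found = /number/ ? 1 : 0;

# In list context the match returns the parenthesized subpatterns.
my @parts = /-?(\d+)\.?(\d+)/;

print "found=$found parts=@parts\n";
print "\$1=$1 \$2=$2\n";
```

Because the last successful match was the one assigned to @parts, $1 and $2 hold the same two subpatterns.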
The substitution operator, like the pattern-matching operator,
normally modifies the contents of the variable specified by the =~
or !~ operator. For example, the following statement
searches for abc in the value stored in $val and
replaces it with def:
$val =~ s/abc/def/;
The substitution operator uses the $_ variable if you
do not specify a variable using =~. For example, the
following statement replaces the first occurrence of abc
in $_ with def:
s/abc/def/;
Similarly, the following statement replaces each run of white space
(spaces, tabs, and newline characters) in $_ with a single
space:
s/\s+/ /g;
When you substitute inside $_, the substitution
operator returns the number of substitutions performed:
$subcount = s/abc/def/g;
Here, $subcount contains the number of occurrences of abc
that have been replaced by def. If abc is not
contained in the value stored in $_, $subcount is
assigned 0.
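Here is a short sketch of the return value in action, assuming Perl 5. The sample string and the names $subcount and $runs are made up for illustration.

```perl
use strict;
use warnings;

$_ = "abc  abc\t\tabc";

# s///g on $_ returns the number of replacements made.
my $subcount = s/abc/def/g;

# Collapse each run of white space in $_ to a single space.
my $runs = s/\s+/ /g;

print "$subcount substitutions, $runs runs collapsed: $_\n";
```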
The behavior of the translation operator is similar to that of
the pattern-matching and substitution operators: it normally
operates on the variable specified by =~, and it operates
on $_ if no =~ operator is included. For example,
the following statement translates all lowercase letters in the value stored
in $_ to their uppercase equivalents:
tr/a-z/A-Z/;
Like the substitution operator, if the translation operator is
working with $_, it returns the number of operations
performed. For example:
$conversions = tr/a-z/A-Z/;
Here, $conversions contains the number of lowercase
letters converted to uppercase.
You can use this feature of tr to count the number of
occurrences of particular characters in a file. Listing 17.1 is
an example of a program that performs this operation.
Listing 17.1. A program that counts
using tr.
1: #!/usr/local/bin/perl
2:
3: print ("Specify the nonblank characters you want to count:\n");
4: $countstring = <STDIN>;
5: chop ($countstring);
6: @chars = split (/\s*/, $countstring);
7: while ($input = <>) {
8: $_ = $input;
9: foreach $char (@chars) {
10: eval ("\$count = tr/$char/$char/;");
11: $count{$char} += $count;
12: }
13: }
14: foreach $char (sort (@chars)) {
15: print ("$char appears $count{$char} times\n");
16: }
$ program17_1 file1
Specify the nonblank characters you want to count:
abc
a appears 8 times
b appears 2 times
c appears 3 times
$
This program first asks the user for a line of input
containing the characters to be counted. These characters can be
separated by spaces or jammed into a single word.
Line 5 takes the line of input containing the characters to be
counted and removes the trailing newline character. Line 6 then splits
the line of input into separate characters, each of which is stored
in an element of the array @chars. The pattern /\s*/ matches
zero or more whitespace characters; because it can also match the empty
string, split breaks the line between every pair of adjacent characters
and discards any white space between them.
Line 7 reads a line of input from a file whose name is
specified on the command line. Line 8 takes this line and stores
it in the system variable $_. (In most cases, system
variables can be assigned to, just like other variables.)
Lines 9-12 count the number of occurrences of each character
in the input string read in line 4. Each character, in turn, is stored
in $char, and the value of $char is substituted
into the string in line 10. This string is then passed to eval,
which executes the translate operation contained in the string.
The translate operation doesn't actually do anything because
it is "translating" a character to itself. However, it
returns the number of translations performed, which means that it
returns the number of occurrences of the character. This count is assigned
to $count.
For example, suppose that the variable $char contains
the character e and that $_ contains Hi there!.
In this case, the string in line 10 becomes the following because e
is substituted for $char in the string:
$count = tr/e/e/;
The call to eval executes this statement, which counts
the number of e's in Hi there!. Because there are
two e's in Hi there!, $count is assigned 2.
An associative array, %count, keeps track of the number
of occurrences of each of the characters being counted. Line 11 adds
the count returned by line 10 to the associative array element
whose subscript is the character currently being counted. For
example, if the program is currently counting the number of e's,
this number is added to the element $count{"e"}.
After all input lines have been read and their characters
counted, lines 14-16 print the total number of occurrences of
each character by examining the elements of %count.
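Listing 17.1 needs eval because tr builds its character list when the program is compiled and does not interpolate variables such as $char. The following sketch, which assumes Perl 5, shows one possible alternative: it uses tr for a fixed character and a pattern match with the g option for characters held in variables. The sample lines stand in for an input file and are invented for this example.

```perl
use strict;
use warnings;

my @chars = ('e', 'h');
my @lines = ("Hi there!\n", "The end.\n");   # stand-in for the input file
my %count;

foreach (@lines) {                  # each line is aliased to $_
    # tr with a literal character list counts without changing $_.
    $count{'!'} += tr/!//;
    foreach my $char (@chars) {
        # tr cannot interpolate $char, so count matches of a quoted pattern.
        my $n = () = /\Q$char\E/g;
        $count{$char} += $n;
    }
}

foreach my $char (sort keys %count) {
    print "$char appears $count{$char} times\n";
}
```

Avoiding eval here is a design choice, not a requirement: it keeps user-supplied characters from being executed as part of a Perl statement.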
In Listing 17.1, which you've just seen, the program reads a
line of input into a scalar variable named $input and then
assigns it to $_. There is a quicker way to carry out this
task, however. You can replace
while ($input = <>) {
$_ = $input;
# more stuff here
}
with the following code:
while (<>) {
# more stuff here
}
If the <> operator appears in a conditional
expression that is part of a loop (an expression that is part of
a conditional statement such as while or for) and
it is not to the right of an assignment operator, the Perl
interpreter automatically assigns the resulting input line to the
scalar variable $_.
For example, Listing 17.2 shows a simple way to print the
first character of every input line read from the standard input
file.
Listing 17.2. A simple program that
assigns to $_ using <STDIN>.
1: #!/usr/local/bin/perl
2:
3: while (<STDIN>) {
4: ($first) = split (//, $_);
5: print ("$first\n");
6: }
$ program17_2
This is a test.
T
Here is another line.
H
^D
$
Because <STDIN> is inside a conditional
expression and is not assigned to a scalar variable, the Perl
interpreter assigns the input line to $_. The program then
retrieves the first character by passing $_ to split.
Caution: The <> operator assigns to $_ only if it is contained in a conditional expression in a loop. The statement
<STDIN>;
reads a line of input from the standard input file and throws it away without changing the contents of $_. Similarly, the following statement does not change the value of $_:
if (<>) {
print ("The input files are not all empty.\n");
}
By default, the chop function operates on the value
stored in the $_ variable. For example:
while (<>) {
chop;
# you can do things with $_ here
}
Here, the call to chop removes the last character from
the value stored in $_. Because the conditional expression
in the while statement has just assigned a line of input
to $_, chop gets rid of the newline character that
terminates each input line.
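The following sketch shows a loop condition assigning each line to $_ and a function then working on $_ by default. It assumes Perl 5: it reads from an in-memory file so that it needs no input, and it uses chomp, a Perl 5 function that removes only a trailing newline, where the text above uses chop.

```perl
use strict;
use warnings;

my $text = "This is a test.\nHere is another line.\n";
open(my $in, '<', \$text) or die "can't open in-memory file: $!";

my @first;
while (<$in>) {                 # the line just read is stored in $_
    chomp;                      # like chop, works on $_ by default
    push @first, substr($_, 0, 1);
}
close($in);

print "@first\n";               # first character of each line
```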
The print function also operates on $_ by
default. The following statement writes the contents of $_ to
the standard output file:
print;
Listing 17.3 is an example of a program that simply writes out
its input, which it assumes is stored in $_. This program
is an implementation of the UNIX cat command, which reads
input files and displays their contents.
Listing 17.3. A simple version of the cat
command using $_.
1: #!/usr/local/bin/perl
2:
3: print while (<>);
$ program17_3 file1
This is the only line in file "file1".
$
This program uses the <> operator to read a line
of input at a time and store it in $_. If the line is non-empty,
the print function is called; because no variable is
specified with print, it writes out the contents of $_.
Caution: You can use this default version of print only if you are writing to the default output file (which is usually STDOUT but can be changed using the select function). If you are specifying a file variable when you call print, you also must specify the value you are printing.
For example, to send the contents of $_ to the output file MYFILE, use the following command:
print MYFILE ($_);
If you do not specify a variable when you call study,
this function uses $_ by default:
study;
The study function increases the efficiency of programs
that repeatedly search the same variable. It is described on Day
13, "Process, String, and Mathematical Functions."
The default behavior of the functions listed previously is
useful to remember when you are writing one-line Perl programs
for use with the -e option. For example, the
following command is a quick way to display the contents of the
files file1, file2, and file3:
$ perl -e "print while <>;" file1 file2 file3
Similarly, the following command changes all occurrences of abc
in file1, file2, and file3 to def:
$ perl -pi -e "s/abc/def/g" file1 file2 file3
Tip: Although $_ is useful in cases such as the preceding one, don't overuse it. Many Perl programmers write programs that have references to $_ running like an invisible thread through their programs.
Programs that overuse $_ are hard to read and are easier to break than programs that explicitly reference scalar variables you have named yourself.
The $0 variable contains the name of the program you
are running. For example, if your program is named perl1,
the statement
print ("Now executing $0...\n");
displays the following on your screen:
Now executing perl1...
The $0 variable is useful if you are writing programs
that call other programs. If an error occurs, you can determine
which program detected the error:
die ("$0: can't open input file\n");
Here, including $0 in the string passed to die
enables you to specify the filename in your error message. (Of
course, you can always leave off the trailing newline, which
tells Perl to print the filename and the line number when
printing the error message. However, $0 enables you to
print the filename without the line number, if that's what you
want.)
Note: You can change your program name while it is running by modifying the value stored in $0.
The $< and $> variables contain,
respectively, the real user ID and effective user ID for the program.
The real user ID is the ID under which the user of the program
logged in. The effective user ID is the ID associated with this
particular program (which is not always the same as the real user
ID).
Note: If you are not running your Perl program on the UNIX operating system, the $< and $> variables might have no meaning. Consult your local documentation for more details.
Listing 17.4 uses the real user ID to determine the user name
of the person running the program.
Listing 17.4. A program that uses the $<
variable.
1: #!/usr/local/bin/perl
2:
3: ($username) = getpwuid($<);
4: print ("Hello, $username!\n");
$ program17_4
Hello, dave!
$
The $< variable contains the real user ID, which is
the login ID of the person running this program. Line 3 passes
this user ID to getpwuid, which retrieves the password
file entry corresponding to this user ID. The user name is the
first element of this password file entry, and it is stored in the
scalar variable $username. Line 4 then prints this user
name.
Note: On certain UNIX machines, you can assign $< to $> (set the effective user ID to be the real user ID) or vice versa. If you have superuser privileges, you can set $< or $> to any defined user ID.
The $( and $) variables define the real group ID
and the effective group ID for this program. The real group ID is
the group to which the real user ID (stored in the variable $<)
belongs; the effective group ID is the group to which the
effective user ID (stored in the variable $>) belongs.
If your system enables users to be in more than one group at a
time, $( and $) contain a list of group IDs, with
each pair of group IDs being separated by spaces. You can convert
this into an array by calling split.
Normally, you can only assign $( to $), and vice
versa. If you are the superuser, you can set $( or $) to
any defined group ID.
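As a sketch of the conversion mentioned above, the following splits $( into an array. It assumes Perl 5 on a UNIX-like system; the variable names are illustrative, and the IDs printed depend on the account running the program.

```perl
use strict;
use warnings;

my $group_list = $(;                    # for example "100 100 27"
my @groups     = split(' ', $group_list);
my $real_gid   = $groups[0];            # the first number is the real group ID

print "real group ID $real_gid; all groups: @groups\n";
```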
Note: $( and $) might not have any useful meaning if you are running Perl on a machine running an operating system other than UNIX.
The $] system variable contains the current version
number. You can use this variable to ensure that the Perl on
which you are running this program is the right version of Perl
(or is a version that can run your program).
Normally, $] contains a character string similar to
this:
$RCSfile: perl.c,v $$Revision: 4.0.1.8 $$Date: 1993/02/05 19:39:30 $ Patch level: 36
The useful parts of this string are the revision number and
the patch level. The first part of the revision number indicates
that this is version 4 of Perl. The version number and the patch
level are often combined; in this notation, this is version 4.036
of Perl.
You can use the pattern matching operator to extract the
useful information from $]. Listing 17.5 shows one way to
do it.
Listing 17.5. A program that extracts
information from the $] variable.
1: #!/usr/local/bin/perl
2:
3: $] =~ /Revision: ([0-9.]+)/;
4: $revision = $1;
5: $] =~ /Patch level: ([0-9]+)/;
6: $patchlevel = $1;
7: print ("revision $revision, patch level $patchlevel\n");
$ program17_5
revision 4.0.1.8, patch level 36
$
This program just extracts the revision and patch level from $]
using the pattern matching operator. The built-in system variable $1,
described later today, is defined when a pattern is matched. It contains
the substring that appears in the first subpattern enclosed in
parentheses. In line 3, the first subpattern enclosed in
parentheses is [0-9.]+. This subpattern matches one or more
digits mixed with decimal points, and so it matches 4.0.1.8.
This means that 4.0.1.8 is assigned to $1 by line 3 and
is assigned to $revision by line 4.
Similarly, line 5 assigns 36 to $1 (because the
subpattern [0-9]+, which matches one or more digits, is
the first subpattern enclosed in parentheses). Line 6 then
assigns 36 to $patchlevel.
Caution: On some machines, the value contained in $] might be completely different from the value used in this example. If you are not sure whether $] has a useful value, write a little program that just prints $]. If this program prints something useful, you'll know that you can run programs that compare $] with an expected value.
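Perl 5 is one case where $] differs from the string shown in this example: there, $] holds a number such as 5.036000, so it can be compared directly. The following minimal sketch assumes it is run under Perl 5; the names $version and $is_perl5 are illustrative.

```perl
use strict;
use warnings;

# On Perl 5, $] is a number such as 5.036000.
my $version  = $];
my $is_perl5 = ($version >= 5) ? 1 : 0;

print "running Perl $version\n";
print "Perl 5 or later\n" if $is_perl5;
```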
When the Perl interpreter is told to read a line of input from
a file, it usually reads characters until it reads a newline
character. The newline character can be thought of as an input
line separator; it indicates the end of a particular line.
The system variable $/ contains the current input line
separator. To change the input line separator, change the value
of $/. The $/ variable can be more than one
character long to handle the case in which lines are separated by
more than one character. If you set $/ to the empty
string, the Perl interpreter treats one or more blank lines (two or
more consecutive newline characters) as the input line separator.
Listing 17.6 shows how changing $/ can affect your
program.
Listing 17.6. A program that changes
the value of $/.
1: #!/usr/local/bin/perl
2:
3: $/ = ":";
4: $line = <STDIN>;
5: print ("$line\n");
$ program17_6
Here is some test input: here is the end.
Here is some test input:
$
Line 3 sets the value of $/ to a colon. This means that
when line 4 reads from the standard input file, it reads until it
sees a colon. As a consequence, $line contains the
following character string:
Here is some test input:
Note that the colon is included as part of the input line
(just as, in the normal case, the trailing newline character is
included as part of the line).
Caution: The -0 (zero, not the letter O) option sets the value of $/. If you change the value of $/ in your program, the value specified by -0 will be thrown away.
To temporarily change the value of $/ and then restore it to the value specified by -0, save the current value of $/ in another variable before changing it.
For more information on -0, refer to Day 16, "Command-Line Options."
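Another way to change $/ temporarily is the local function, which restores the previous value automatically when the enclosing block ends. The following sketch assumes Perl 5 and reads from an in-memory file so that it needs no input; the text and variable names are made up for illustration.

```perl
use strict;
use warnings;

my $text = "Here is some test input: here is the end.\n";
open(my $in, '<', \$text) or die "can't open in-memory file: $!";

my $line;
{
    local $/ = ":";     # the old value returns when this block ends
    $line = <$in>;      # reads up to and including the first colon
}
my $rest = <$in>;       # $/ is a newline again here
close($in);

print "$line\n";
print $rest;
```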
The system variable $\ contains the current output line
separator. This is a character or sequence of characters that is automatically
printed after every call to print.
By default, $\ is the empty string, which indicates
that no output line separator is to be printed. Listing 17.7
shows how you can set an output line separator.
Listing 17.7. A program that uses the $\
variable.
1: #!/usr/local/bin/perl
2:
3: $\ = "\n";
4: print ("Here is one line.");
5: print ("Here is another line.");
$ program17_7
Here is one line.
Here is another line.
$
Line 3 sets the output line separator to the newline
character. This means that a list passed to a subsequent print
statement always appears on its own output line. Lines 4 and 5
now no longer need to include a newline character as the last
character in the line.
Caution: The -l option sets the value of $\. If you change $\ in your program without saving it first, the value supplied with -l will be lost. See Day 16 for more information on the -l option.
The $, variable contains the character or sequence of
characters to be printed between elements when print is
called. For example, in the following statement the Perl
interpreter first writes the contents of $a:
print ($a, $b);
It then writes the contents of $, and then finally, the
contents of $b.
Normally, the $, variable is initialized to the empty
string, which means that the elements of a print statement
are printed next to one another. Listing 17.8 is a program that
sets $, before calling print.
Listing 17.8. A program that uses the $,
variable.
1: #!/usr/local/bin/perl
2:
3: $a = "hello";
4: $b = "there";
5: $, = " ";
6: $\ = "\n";
7: print ($a, $b);
$ program17_8
hello there
$
Line 5 sets the value of $, to a space. Consequently,
line 7 prints a space after printing $a and before
printing $b.
Note that $\, the output line separator, is set to
the newline character. This setting ensures that the terminating
newline character immediately follows $b. By contrast, the
following statement prints a space before printing the trailing
newline character:
print ($a, $b, "\n");
Note: Here's another way to print the newline immediately after the final element that doesn't involve setting $\:
print ($a, $b . "\n");
Here, the trailing newline character is part of the second element being printed. Because $b and \n are part of the same element, no space is printed between them.
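The following sketch shows $, and $\ working together. It assumes Perl 5 and, so that the result can be inspected, prints to an in-memory file held in the hypothetical variable $captured rather than to the screen.

```perl
use strict;
use warnings;

my $captured = '';
open(my $out, '>', \$captured) or die "can't open in-memory file: $!";

{
    local $, = " ";     # printed between list elements
    local $\ = "\n";    # printed after the whole list
    print $out "hello", "there";
}
close($out);

print $captured;        # hello there
```

Setting both variables with local inside a block keeps the change from affecting print statements elsewhere in the program.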
Normally, if an array is printed inside a string, the elements
of the array are separated by a single space. For example:
@array = ("This", "is", "a", "list");
print ("@array\n");
Here, the print statement prints
This is a list
A space is printed between each pair of array elements.
The built-in system variable that controls this situation is
the $" variable. By default, $" contains
a space. Listing 17.9 shows how you can control your array output
by changing the value of $".
Listing 17.9. A program that uses the $"
variable.
1: #!/usr/local/bin/perl
2:
3: $" = "::";
4: @array = ("This", "is", "a", "list");
5: print ("@array\n");
$ program17_9
This::is::a::list
$
Line 3 sets the array element separator to :: (two
colons). Array element separators, like other separators you can
define, can be more than one character long.
Line 5 prints the contents of @array. Each pair of
elements is separated by the value stored in $", which
is two colons.
Note: The $" variable affects only entire arrays printed inside strings. If you print two variables together in a string, as in
print ("$a$b\n");
the contents of the two variables are printed with nothing separating them regardless of the value of $".
To change how arrays are printed outside strings, use the $, variable, described earlier today.
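Because $" applies whenever an array is substituted into a string, not only when the string is printed, its effect can be seen by building strings directly. This is a minimal sketch assuming Perl 5; $default and $custom are illustrative names.

```perl
use strict;
use warnings;

my @array = ("This", "is", "a", "list");

my $default = "@array";         # elements joined with a space
my $custom;
{
    local $" = "::";
    $custom = "@array";         # elements joined with ::
}

print "$default\n$custom\n";
```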
By default, when the print function prints a number, it
prints it as a 20-digit floating point number in compact format.
This means that the following statements are identical if the
value stored in $x is a number:
print ($x);
printf ("%.20g", $x);
To change the default format that print uses to print
numbers, change the value of the $# variable. For example,
to specify only 15 digits of precision, use this statement:
$# = "%.15g";
This value must be a floating-point field specifier, as used
in printf and sprintf.
Note: The $# variable does not affect values that are not numbers and has no effect on the printf, write, and sprintf functions.
For more information on the field specifiers you can use as
the default value in $#, see "Formatting Output Using printf"
on Day 11, "Formatting Your Output."
Caution: The $# variable is deprecated in Perl 5. This means that although $# is supported, it is not recommended for use, and might be removed from future versions of Perl.
If a statement executed by the eval function contains
an error, or an error occurs during the execution of the
statement, the error message is stored in the system variable $@.
The program that called eval can decide either to print
the error message or to perform some other action.
For example, the statement
eval ("This is not a perl statement");
assigns the following string to $@:
syntax error in file (eval) at line 1, next 2 tokens "This is"
The $@ variable also holds the error generated by a
call to die inside an eval. For example, the statement
eval ("die (\"nothing happened\")");
assigns the following string to $@:
nothing happened at (eval) line 1.
Note: The $@ variable also returns error messages generated by the require function. See Day 18, "Object-Oriented Programming," for more information on require.
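A program typically checks $@ right after the eval returns. The following sketch assumes Perl 5 and uses the block form of eval, which traps run-time errors in the same way without compiling a string; $ok and $error are illustrative names.

```perl
use strict;
use warnings;

# The block form of eval traps a run-time error without compiling a string.
my $ok = eval {
    die "nothing happened\n";   # trailing newline suppresses "at ... line N"
    1;
};
my $error = $@;

print "eval failed: $error" unless $ok;
```

Copying $@ into your own variable right away is a common precaution, because a later eval overwrites it.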
The $? variable returns the error status generated by
calls to the system function or by commands
enclosed in back quotes, as in the following:
$username = `hostname`;
The error status stored in $? consists of two parts: the exit value
(return code) of the process that was called, and a status field that
indicates how the process terminated.
The value stored in $? is a 16-bit integer. The upper
eight bits are the exit value, and the lower eight bits are the
status field. To retrieve the exit value, use the >>
operator to shift the value eight bits to the right:
$retcode = $? >> 8;
For more information on the status field, refer to the online
manual page for the wait function or to the file /usr/include/sys/wait.h.
For more information on commands in back quotes, refer to Day 20, "Miscellaneous
Features of Perl."
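The two parts of $? can be separated as in the following sketch, which assumes Perl 5 on a system where the system function is available. It runs a second copy of the Perl interpreter (whose path Perl 5 keeps in $^X) that does nothing but exit with the value 3; the child command and variable names are chosen only for this example.

```perl
use strict;
use warnings;

# Run a child Perl that exits with status 3 ($^X is the running interpreter).
system($^X, '-e', 'exit 3');

my $retcode = $? >> 8;      # upper eight bits: the exit value
my $signal  = $? & 127;     # low seven bits: signal number, if any

print "exit value $retcode, signal $signal\n";
```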
Some Perl library functions call system library functions. If
a system library function generates an error, the error code generated
by the function is assigned to the $! variable. The Perl
library functions that call system library functions vary from machine
to machine.
Note: The $! variable in Perl is equivalent to the errno variable in the C programming language.
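In Perl 5, $! yields the error number when used as a number and the corresponding message when used as a string. The following sketch assumes Perl 5; the path in $missing is hypothetical and is expected not to exist.

```perl
use strict;
use warnings;

my $missing = "/no/such/directory/file1";   # hypothetical path
my ($errno, $message) = (0, '');

unless (open(my $fh, '<', $missing)) {
    $errno   = $! + 0;      # numeric context: the errno value
    $message = "$!";        # string context: the matching text
}

print "open failed with error $errno: $message\n";
```

Check $! only immediately after a call has reported failure; its value is not meaningful after a successful call.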
The $. variable contains the line number of the last
line read from an input file. If more than one input file is
being read, $. contains the line number of the last line read
from the file that was read most recently. Listing 17.10 shows how $. works.
Listing 17.10. A program that uses
the $. variable.
1: #!/usr/local/bin/perl
2:
3: open (FILE1, "file1") ||
4: die ("Can't open file1\n");
5: open (FILE2, "file2") ||
6: die ("Can't open file2\n");
7: $input = <FILE1>;
8: $input = <FILE1>;
9: print ("line number is $.\n");
10: $input = <FILE2>;
11: print ("line number is $.\n");
12: $input = <FILE1>;
13: print ("line number is $.\n");
$ program17_10
line number is 2
line number is 1
line number is 3
$
When line 9 is executed, the input file FILE1 has had
two lines read from it. This means that $. contains the
value 2. Line 10 then reads from FILE2. Because it
reads the first line from this file, $. now has the value
1. When line 12 reads a third line from FILE1, $.
is set to the value 3. The Perl interpreter remembers that two
lines have already been read from FILE1.
Note: If the program is reading using <>, which reads from the files listed on the command line, $. treats the input files as if they are one continuous file. The line number is not reset when a new input file is opened.
You can use eof to test whether a particular file has ended, and then reset $. yourself (by assigning zero to it) before reading from the next file.
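The reset described in this note can be sketched as follows, assuming Perl 5. Two scratch files created with the standard File::Temp module stand in for files named on the command line; their contents and the variable names are invented for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create two small scratch files to stand in for command-line arguments.
my @names;
foreach my $text ("one\ntwo\n", "three\n") {
    my ($fh, $name) = tempfile(UNLINK => 1);
    print $fh $text;
    close($fh);
    push @names, $name;
}

my @numbers;
@ARGV = @names;
while (<>) {
    push @numbers, $.;
    $. = 0 if eof;      # end of the current file: restart the count
}

print "@numbers\n";
```

Here eof is written without parentheses so that it tests the file currently being read, not the end of all the input files.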
Normally, the operators that match patterns (the
pattern-matching operator and the substitution operator) assume
that the character string being searched is a single line of
text. If the character string being searched consists of more
than one line of text (in other words, it contains newline characters),
set the system variable $* to 1.
Note: By default, $* is set to 0, which indicates that multiline pattern matches are not required.
Caution: The $* variable is deprecated in Perl 5. If you are running Perl 5, use the m pattern-matching option when matching in a multiple-line string. See Day 7, "Pattern Matching," for more details on this option.
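The effect of the m option can be seen in the following minimal sketch, which assumes Perl 5. The two-line string and the names $plain and $multi are illustrative.

```perl
use strict;
use warnings;

my $text = "first line\nsecond line\n";

# Without /m, ^ matches only at the very start of the string.
my $plain = ($text =~ /^second/)  ? 1 : 0;

# With /m, ^ also matches just after each embedded newline.
my $multi = ($text =~ /^second/m) ? 1 : 0;

print "without m: $plain, with m: $multi\n";
```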
Normally, when a program references the first element of an
array, it does so by specifying the subscript 0. For example:
@myarray = ("Here", "is", "a", "list");
$here = $myarray[0];
The array element $myarray[0] contains the string Here,
which is assigned to $here.
If you are not comfortable with using 0 as the subscript for
the first element of an array, you can change this setting by changing
the value of the $[ variable. This variable indicates
which value is to be used as the subscript for the first array element.
Here is the preceding example, modified to use 1 as the first
array element subscript:
$[ = 1;
@myarray = ("Here", "is", "a", "list");
$here = $myarray[1];
In this case, the subscript 1 now references the first array
element. This means that $here is assigned Here, as
before.
Tip: Don't change the value of $[. It is too easy for a casual reader of your program to forget that the subscript 0 no longer references the first element of the array. Besides, using 0 as the subscript for the first element is standard practice in many programming languages, including C and C++.
Note: $[ is deprecated in Perl 5.
So far, all the arrays you've seen have been one-dimensional
arrays, which are arrays in which each array element is referenced
by only one subscript. For example, the following statement uses the subscript foo
to access an element of the associative array named %array:
$myvar = $array{"foo"};
Perl 4 does not support multidimensional arrays directly, and in Perl 4
the following statement is not legal (Perl 5 accepts it by storing a
reference to a second associative array in the element):
$myvar = $array{"foo"}{"bar"};
However, Perl enables you to simulate a multidimensional
associative array using the built-in system variable $;.
Here is an example of a statement that accesses a (simulated)
multidimensional array:
$myvar = $array{"foo","bar"};
When the Perl interpreter sees this statement, it converts it
to this:
$myvar = $array{"foo" . $; . "bar"};
The system variable $; serves as a subscript separator.
It automatically replaces any comma that is separating two array subscripts.
Here is another example of two equivalent statements:
$myvar = $array{"s1", 4, "hi there"};
$myvar = $array{"s1".$;.4.$;."hi there"};
The second statement shows how the value of the $;
variable is inserted into the array subscript.
By default, the value of $; is \034 (the Ctrl+\
character). You can define $; to be any value you want.
Listing 17.11 is an example of a program that sets $;.
Listing 17.11. A program that uses
the $; variable.
1: #!/usr/local/bin/perl
2:
3: $; = "::";
4: $array{"hello","there"} = 46;
5: $test1 = $array{"hello","there"};
6: $test2 = $array{"hello::there"};
7: print ("$test1 $test2\n");
$ program17_11
46 46
$
Line 3 sets $; to the string ::. As a
consequence, the subscript "hello","there"
in lines 4 and 5 is really hello::there because the Perl
interpreter replaces the comma with the value of $;.
Line 7 shows that both "hello","there"
and hello::there refer to the same element of the
associative array.
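Because the combined subscript is an ordinary string, you can build it yourself with join and take it apart again with split. The following sketch assumes Perl 5 and leaves $; at its default value; the names $same, $sep, and @subscripts are illustrative.

```perl
use strict;
use warnings;

my %array;
$array{"hello", "there"} = 46;          # key is "hello" . $; . "there"

my $same = $array{join($;, "hello", "there")};

# Recover the individual subscripts from each stored key.
my $sep = $;;                           # copy of the subscript separator
my @subscripts;
foreach my $key (keys %array) {
    push @subscripts, split(/\Q$sep\E/, $key);
}

print "$same: @subscripts\n";
```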
Caution: If you set $;, be careful not to set it to a character that you are actually using in a subscript. For example, if you set $; to ::, the following statements reference the same element of the array:
$array{"a::b", "c"} = 1;
$array{"a", "b::c"} = 2;
In each case, the Perl interpreter replaces the comma with ::, producing the subscript a::b::c.
On Day 11 you learned how to format your output using print
formats and the write statement. Each print format
contains one or more value fields that specify how output is to
appear on the page.
If a value field in a print format begins with the ^
character, the Perl interpreter puts a word in the value field
only if there is room enough for the entire word. For example, in
the following program (a duplicate of Listing 11.9):
1: #!/usr/local/bin/perl
2:
3: $string = "Here\nis an unbalanced line of\ntext.\n";
4: $~ = "OUTLINE";
5: write;
6:
7: format OUTLINE =
8: ^<<<<<<<<<<<<<<<<<<<<<<<<<<<
9: $string
10: .
the call to write uses the OUTLINE print format
to write the following to the screen:
Here is an unbalanced line
Note that the word of is not printed because it cannot
fit into the OUTLINE value field.
To determine whether a word can fit in a value field, the Perl
interpreter counts the number of characters between the next character
to be formatted and the next word-break character. A word-break
character is one that denotes either the end of a word or a place
where a word can be split into two parts.
By default, the legal word-break characters in Perl are the
space character, the newline character, and the -
(hyphen) character. The acceptable word-break characters are
stored in the system variable $:.
To change the list of acceptable word-break characters, change
the value of $:. For example, to ensure that all
hyphenated words are in the same line of formatted output, define $:
as shown here:
$: = " \n";
Now only the space and newline characters are legal word-break
characters.
Caution: Normally, the tab character is not a word break character. To allow lines to be broken on tabs, add the tab character to the list specified by the $: variable:
$: = " \t\n-";
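The effect of $: can be observed without a full print format by calling formline directly, which leaves its result in the $^A accumulator (described later in this lesson). The following is a sketch only: the helper name first_chunk and the sample text are hypothetical, and the ten-character field width is an arbitrary choice.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the first line that a ten-character
# ^<<< field produces for $text when $breaks is the word-break set.
sub first_chunk {
    my ($text, $breaks) = @_;
    local $: = $breaks;     # word-break characters for this call only
    local $^A = "";         # accumulator that formline appends to
    formline("^<<<<<<<<<\n", $text);
    my $line = $^A;
    $line =~ s/\s+$//;      # drop padding and the trailing newline
    return $line;
}

my $with_hyphen    = first_chunk("a well-established rule", " \n-");
my $without_hyphen = first_chunk("a well-established rule", " \n");

print "hyphen is a break character:     '$with_hyphen'\n";
print "hyphen is not a break character: '$without_hyphen'\n";
```

When the hyphen is a word-break character, the field can end after "well-"; when it is not, the whole hyphenated word has to move to the next line.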
The $$ system variable contains the process ID for the
Perl interpreter itself. This is also the process ID for your
program.
When you use the <> operator, the Perl
interpreter reads input from each file named on the command line.
For example, suppose that you are executing the program myprog
as shown here:
$ myprog test1 test2 test3
In myprog, the first occurrence of the <>
operator reads from test1. Subsequent occurrences of <> continue
reading from test1 until it is exhausted; at this point, <>
reads from test2. This process continues until all the
input files have been read.
On Day 6, "Reading from and Writing to Files," you
learned that the @ARGV array lists the elements of the
command line and that the first element of @ARGV is
removed when the <> operator opens that file for reading. (@ARGV
also is discussed later today.)
When the <> operator reads from a file for the
first time, it assigns the name of the file to the $ARGV
system variable. This enables you to keep track of what file is
currently being read. Listing 17.12 shows how you can use $ARGV.
Listing 17.12. A simple file-searching program using $ARGV.
1: #!/usr/local/bin/perl
2:
3: print ("Enter the search pattern:\n");
4: $string = <STDIN>;
5: chop ($string);
6: while ($line = <>) {
7:     if ($line =~ /$string/) {
8:         print ("$ARGV:$line");
9:     }
10: }
$ program17_12 file1 file2 file3
Enter the search pattern:
the
file1:This line contains the word "the".
$
This program reads each line of the input files supplied on
the command line. If a line contains the pattern specified by $string, line
8 prints the name of the file and then the line itself. Note that
the pattern in $string can contain special pattern
characters.
Note: If <> is reading from the standard input file (which occurs when you have not specified any input files on the command line), $ARGV contains the string - (a single hyphen).
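The following sketch shows $ARGV changing as <> moves from one file to the next. It builds its own sample files so that it can run on its own; the filenames first.txt and second.txt, their contents, and the use of the File::Temp module are assumptions made for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Create two small sample files in a scratch directory.
my $dir = tempdir(CLEANUP => 1);
my %content = (
    "first.txt"  => "alpha\nthe end\n",
    "second.txt" => "the start\nomega\n",
);
for my $name (sort keys %content) {
    open(my $out, ">", "$dir/$name") or die "cannot create $name: $!";
    print $out $content{$name};
    close($out);
}

# Pretend these files were named on the command line.
@ARGV = map { "$dir/$_" } sort keys %content;

my @hits;
while (my $line = <>) {
    next unless $line =~ /the/;
    (my $short = $ARGV) =~ s{.*/}{};    # strip the directory for display
    push(@hits, "$short:$line");
}
print @hits;
```

Each matching line is tagged with the file that <> was reading at that moment.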
The $^A variable is used by write to store
formatted lines to be printed. The contents of $^A are erased
after the line is printed.
This variable is defined only in Perl 5.
The $^D variable contains the current internal
debugging value. This variable is defined only when the -D
switch has been specified and when your Perl interpreter has been
compiled with debugging included.
See your online Perl documentation for more details on
debugging Perl. (Unless you are using an experimental version of
Perl, you are not likely to need to debug it.)
The $^F variable controls whether files are to be
treated as system files. Its value is the largest UNIX file
descriptor that is treated as a system file.
Normally, only STDIN, STDOUT, and STDERR
are treated as system files, and the value assigned to $^F
is 2. Unless you are on a UNIX machine, are familiar with file
descriptors, and want to do something exotic with them, you are
not likely to need to use the $^F system variable.
The $^I variable is defined by the Perl
interpreter when you specify the -i option (which
edits files as they are read by the <> operator). Its value
is the backup-file extension supplied with -i, which can be the
empty string.
The following statement turns off the editing of files being
read by <>:
undef ($^I);
When $^I is undefined, the next input file is opened
for reading, and the standard output file is no longer changed.
DO open the files for input and output yourself if your program wants to edit some of its input files and not others; this process is easier to follow.
DON'T use $^I if you are reading files using the -n or -p option unless you really know what you are doing, because you are not likely to get the behavior you expect. If -i has modified the default output file, undefining $^I does not automatically set the default output file to STDOUT.
The $^L variable contains the character or characters
written out whenever a print format wants to start a new page.
The default value is \f, the form-feed character.
The $^P variable is used by the Perl debugger. When
this variable is set to zero, debugging is turned off.
You normally won't need to use $^P yourself, unless you
want to specify that a certain chunk of code does not need to be debugged.
The $^T variable contains the time at which your
program began running. This time is in the same format as is
returned by the time function: the number of seconds since
January 1, 1970.
The following statement sets the file access and modification
times of the file test1 to the time stored in $^T:
utime ($^T, $^T, "test1");
For more information on the time and utime
functions, refer to Day 12, "Working with the File System."
Note: The file test operators -A, -C, and -M use $^T as their reference point: each returns the age of a file in days, measured back from the time stored in $^T.
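Because $^T and the time function use the same units, subtracting one from the other gives the number of seconds the program has been running. A minimal sketch (the variable names are hypothetical):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $started = localtime($^T);    # readable form of the start time
my $elapsed = time() - $^T;      # seconds since the program began

print "Started at $started; running for $elapsed second(s).\n";
```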
The $^W system variable controls whether warning
messages are to be displayed. Normally, $^W is set to a
nonzero value only when the -w option is specified.
You can set $^W to zero to turn off warnings inside
your program. This capability is useful if your program contains statements
that generate warnings you want to ignore (because you know that your
statements are correct). For example
$^W = 0; # turn off warning messages
# code that generates warnings goes here
$^W = 1; # turn warning messages back on
Caution: Some warnings are printed before program execution starts (for example, warnings of possible typos). You cannot turn off these warnings by setting $^W to zero.
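One way to avoid forgetting the second assignment is to change $^W with local inside a block, so that the previous value returns when the block ends. This is a sketch of that alternative, not the method shown above; the variable names are hypothetical, and the script deliberately omits "use warnings" because that pragma takes precedence over $^W.

```perl
#!/usr/bin/perl
$^W = 1;                        # same effect as starting with -w

my $undefined;                  # deliberately left without a value
my $total;
{
    local $^W = 0;              # warnings off until the closing brace
    $total = $undefined + 1;    # would otherwise warn about an uninitialized value
}
my $after = $^W;                # previous setting is back in force

print "total is $total, \$^W is $after again\n";
```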
The $^X variable contains the name by which the Perl
interpreter itself was invoked. If you used the perl command to
start this program, $^X contains perl (or whatever path you typed
to reach it). If you started this program by entering its own name,
so that its #! line launched the interpreter, $^X typically
contains the interpreter path named on that line.
The following statement checks to see whether you started this
program with the command perl:
if ($^X ne "perl") {
print ("You did not use the 'perl' command ");
print ("to start this program.\n");
}
The system variables you have seen so far are all defined
throughout your program. The following system variables are defined
only in the current block of statements you are running. (A block
of statements is any group of statements enclosed in the brace
characters { and }.) These pattern system variables
are set by the pattern-matching operator and the other operators
that use patterns (such as, for example, the substitution
operator). Many of these pattern system variables were first introduced
on Day 7.
Tip: Even though the pattern system variables are defined only inside a particular block of statements, your programs should not take advantage of that fact. The safest way to use the pattern-matching variables is to assign any variable that you might need to a scalar variable of your own.
When you specify a pattern for the pattern-matching or
substitution operator, you can enclose parts of the pattern in parentheses.
For example, the following pattern encloses the subpattern \d+ in parentheses.
(The parentheses themselves are not part of the pattern.)
/(\d+)\./
This subpattern matches one or more digits.
After a pattern has been matched, the system variables $1, $2,
and so on contain the text matched by the subpatterns enclosed in
parentheses. For example, suppose that the following pattern is
successfully matched:
/(\d+)([a-z]+)/
In this case, the match found must consist of one or more
digits followed by one or more lowercase letters. After the match has
been found, $1 contains the sequence of one or more digits,
and $2 contains the sequence of one or more lowercase letters.
Listing 17.13 is an example of a program that uses $1, $2,
and $3 to match subpatterns.
Listing 17.13. A program that uses variables containing matched subpatterns.
1: #!/usr/local/bin/perl
2:
3: while (<>) {
4:     while (/(-?\d+)\.(\d+)([eE][+-]?\d+)?/g) {
5:         print ("integer part $1, decimal part $2");
6:         if ($3 ne "") {
7:             print (", exponent $3");
8:         }
9:         print ("\n");
10:     }
11: }
$ program17_13 file1
integer part 26, decimal part 147, exponent e-02
integer part -8, decimal part 997
$
This program reads each input line and searches for
floating-point numbers. Line 4 matches if a floating-point number
is found. (Line 4 is a while statement, not an if,
to enable the program to detect lines containing more than one
floating-point number. The loop starting in line 4 iterates until no
more matches are found on the line.)
When a match is found, the first set of parentheses matches
the digits before the decimal point; these digits are copied into $1. The
second set of parentheses matches the digits after the decimal point; these
matched digits are stored in $2. The third set of
parentheses matches an optional exponent; if the exponent exists,
it is stored in $3.
Line 5 prints the values of $1 and $2 for each
match. If $3 is not the empty string, its value is printed by line 7.
DO use $1, not $0, to retrieve the first matched subpattern. $0 contains the name of the program you are running.
DON'T confuse $1 with \1. \1, \2, and so on are defined only inside a pattern. See Day 7 for more information on \1.
In patterns, parentheses are counted starting from the left.
This rule tells the Perl interpreter how to handle nested
parentheses:
/(\d+(\.)?\d+)/
This pattern matches one or more digits optionally containing
a decimal point. When this pattern is matched, the outer set of parentheses
is considered to be the first set of parentheses; these parentheses
contain the entire matched number, which is stored in $1.
The inner set of parentheses is treated as the second set of
parentheses because it includes the second left parenthesis seen
by the pattern matcher. The variable $2, which contains
the subpattern matched by the second set of parentheses, contains .
(a period) if a decimal point is matched and the empty string if
it is not.
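The left-to-right numbering can be checked with a short sketch. The sample strings are hypothetical. In Perl 5, a set of parentheses that takes no part in the match leaves its variable undefined, which prints as the empty string; the sketch tests for that explicitly.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @results;
for my $text ("3.14", "42") {
    if ($text =~ /(\d+(\.)?\d+)/) {
        # Copy the pattern variables right away.
        my $whole = $1;
        my $point = defined($2) ? $2 : "";   # no decimal point matched
        push(@results, "$text: whole='$whole' point='$point'");
    }
}
print "$_\n" for @results;
```

The outer parentheses supply $1 in both cases; the inner pair supplies $2 only when a decimal point is present.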
When a pattern is matched successfully, the matched text
string is stored in the system variable $&. The pattern
matcher itself returns only a true or false value indicating
whether the match succeeded, so $& is the most direct way to
retrieve the matched text. (You could also enclose the entire
pattern in parentheses and then check the value of $1; however,
$& is easier to use in this case.) Listing 17.14 is a program
that uses $& to count all the digits in a set of input files.
Listing 17.14. A program that uses $&.
1: #!/usr/local/bin/perl
2:
3: while ($line = <>) {
4:     while ($line =~ /\d/g) {
5:         $digitcount[$&]++;
6:     }
7: }
8: print ("Totals for each digit:\n");
9: for ($i = 0; $i <= 9; $i++) {
10:     print ("$i: $digitcount[$i]\n");
11: }
$ program17_14 file1
Totals for each digit:
0: 11
1: 6
2: 3
3: 1
4: 2
5:
6: 1
7:
8:
9: 1
$
This program reads one line at a time from the files specified
on the command line. Line 4 matches each digit in the input line in
turn; the matched digit is stored in $&.
Line 5 takes the value of $& and uses it as the
subscript for the array @digitcount. This array keeps a
count of the number of occurrences of each digit.
When the input files have all been read, lines 9-11 print the
totals for each digit.
Note: If you need the value of $&, be sure to get it before exiting the while loop or other statement block in which the pattern is matched. (A statement block is exited when the Perl interpreter sees a } character.)
For example, the pattern matched in line 4 cannot be accessed outside of lines 4-6 because this copy of $& is defined only in these lines. (This rule also holds true for all the other pattern system variables defined in today's lesson.)
The best rule to follow is to either use or assign a pattern system variable immediately following the statement that matches the pattern.
When a pattern is matched, the text of the match is stored in
the system variable $&. The rest of the string is
stored in two other system variables: $`, which holds the part
of the string that precedes the match, and $', which holds the
part that follows it.
For example, if the Perl interpreter searches for the /\d+/
pattern in the string qwerty1234uiop, it matches 1234,
which is stored in $&. The substring qwerty,
which precedes the match, is stored in $`. The rest of the
string, uiop, is stored in $'.
If the beginning of a text string is matched, $` is set
to the empty string. Similarly, if the last character in the
string is part of the match, $' is set to the empty
string.
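The qwerty1234uiop example can be written as a short sketch that copies all three variables into scalars of its own immediately after the match. The names $before, $match, and $after are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $string = "qwerty1234uiop";
my ($before, $match, $after);

if ($string =~ /\d+/) {
    # Save the pattern system variables before another match replaces them.
    ($before, $match, $after) = ($`, $&, $');
}

print "before='$before' match='$match' after='$after'\n";
```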
The $+ variable contains the text matched by the last subpattern
enclosed in parentheses. For example, when the following pattern is
matched, $+ contains the digits after the decimal point:
/(\d+)\.(\d+)/
This variable is useful when the last part of a pattern is the
only part you really need to look at.
Several system variables are associated with file variables.
One copy of each file system variable is defined for each file
that is referenced in your Perl program. Many of these system
variables were first introduced on Day 11. The variables
mentioned there are redefined here for your convenience.
When the write statement sends formatted output to a
file, it uses the value of the $~ system variable for that
file to determine the print format to use.
When a program starts running, the default value of $~
for each file is the same as the name of the file variable that
represents the file. For example, when you write to the file
represented by the file variable MYFILE, the default value
of $~ is MYFILE. This means that write
normally uses the MYFILE print format. (For the standard
output file, this default print format is named STDOUT.)
If you want to specify a different print format, change the
value of $~ before calling the write function. For
example, to use the print format MYFORMAT when writing to
the standard output file, use the following code:
select (STDOUT); # making sure you are writing to STDOUT
$~ = "MYFORMAT";
write;
This call to write uses MYFORMAT to format its
output.
Caution: Remember that one copy of $~ is defined for each file variable. Therefore, the following code is incorrect:
$~ = "MYFORMAT";
select (MYFILE);
write;
In this example, the assignment to $~ changes the default print format for whatever the current output file happens to be. This assignment does not affect the default print format for MYFILE because MYFILE is selected after $~ is assigned. To change the default print format for MYFILE, select it first:
select (MYFILE);
$~ = "MYFORMAT";
write;
This call to write now uses MYFORMAT to write to MYFILE.
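Because select changes the default output file for everything that follows, one possible approach is to wrap the select-assign-restore sequence in a small subroutine. The sketch below makes several assumptions: the helper name set_format and the variable $item are hypothetical, and MYFILE is opened on an in-memory string (a Perl 5.8 feature) only so that the result can be printed afterward.

```perl
#!/usr/bin/perl
use strict;
use warnings;

our $item = "widget";           # value printed by the format below

format MYFORMAT =
Item: @<<<<<<<<<
$item
.

# Hypothetical helper: set the print format for one file variable
# without leaving that file selected afterward.
sub set_format {
    my ($handle, $format) = @_;
    my $previous = select($handle);   # select returns the old default
    $~ = $format;                     # changes the copy that belongs to $handle
    select($previous);                # put the old default back
}

my $captured = "";
open(MYFILE, ">", \$captured) or die "cannot open in-memory file: $!";
set_format(\*MYFILE, "MYFORMAT");
write(MYFILE);
close(MYFILE);

print $captured;
```

The standard output file remains the default throughout, yet write(MYFILE) still uses MYFORMAT.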
The $= variable defines the page length (number of
lines per page) for a particular output file. $= is normally
initialized to 60, which is the value that the Perl interpreter
assumes is the page length for every output file. This page
length includes the lines left for page headers, and it is the
length that works for most printers.
If you are directing a particular output file to a printer
with a nonstandard page length, change the value of $= for
this file before writing to it:
select ("WEIRDLENGTH");
$= = 72;
This code sets the page length for the WEIRDLENGTH file
to 72.
Caution: $= is set to 60 by default only if a page header format is defined for the page. If no page header is defined, $= is set to 9999999 because Perl assumes that you want your output to be a continuous stream.
If you want paged output without a page header, define an empty page header for the output file.
The $- variable associated with a particular file
variable lists the number of lines left on the current page of
that file. Each call to write subtracts the number of
lines printed from $-. If write is called when $-
is zero, a new page is started. (If $- is greater than
zero, but write is printing more lines than the value of $-, write
starts a new page in the middle of its printing operation.)
When a new page is started, the initial value of $-
is the value stored in $=, which is the number of lines on
the page.
The program in Listing 17.15 displays the value of $-.
Listing 17.15. A program that displays $-.
1: #!/usr/local/bin/perl
2:
3: open (OUTFILE, ">outfile");
4: select ("OUTFILE");
5: write;
6: print STDOUT ("lines to go before write: $-\n");
7: write;
8: print STDOUT ("lines to go after write: $-\n");
9: format OUTFILE =
10: This is a test.
11: .
12: format OUTFILE_TOP =
13: This is a test.
14: .
$ program17_15
lines to go before write: 58
lines to go after write: 57
$
Line 3 opens the output file outfile and associates the
file variable OUTFILE with this file. Line 4 then calls select,
which sets the default output file to OUTFILE.
Line 5 calls write, which starts a new page. Line 6
then sends the value of $- to the standard output file, STDOUT,
by specifying STDOUT in the call to print. Note
that the copy of $- printed is the copy associated
with OUTFILE, not STDOUT, because OUTFILE is
currently the default output file.
Line 7 calls write, which sends a line of output to OUTFILE
and decreases the value of $- by one. Line 8 prints
this new value of $-.
Note: If you want to force your next output to appear at the beginning of a new page, you can set $- to zero yourself before calling write.
When a file is opened, the copy of $- for this file is given the initial value of zero. This technique ensures that the first call to write always starts a page (and generates the header for the page).
When write starts a new page, you can specify the page
header that is to appear on the page. To do this, define a page header
print format for the output file to which the page is to be sent.
The system variable $^ contains the name of the print
format to be used for printing page headers. If this format is
defined, page headers are printed; if it does not exist, no page
headers are printed.
By default, the copy of $^ for a particular file is set
equal to the name of the file variable plus the string _TOP.
For example, for the file represented by the file variable MYFILE, $^
is given an initial value of MYFILE_TOP.
To change the page header print format for a particular file,
set the default output file by calling select, and then
set $^ to the print format you want to use. For example:
select (MYFILE);
$^ = "MYHEADER";
This code changes the default output file to MYFILE and
then changes the page header print format for MYFILE to MYHEADER.
As always, you must remember to select the file before changing $^ because
each file has its own copy of $^.
When you send output to a file using print or write,
the operating system might not write it right away. Some systems
first send the output to a special area of memory known as a
buffer; when the buffer becomes full, it is written all at once.
This process of output buffering is usually a more efficient way
to write data.
In some circumstances, you might want to send output straight
to your output file without using an intervening buffer. (For example,
two processes might be sending output to the standard output file at
the same time.)
The $| system variable indicates whether a particular
file is buffered. By default, the Perl interpreter defines a
buffer for each output file, and $| is set to 0. To
eliminate buffering for a particular file, select the file and
then set the $| variable to a nonzero value. For example,
the following code eliminates buffering for the MYFILE
output file:
select ("MYFILE");
$| = 1;
These statements set MYFILE as the default output file
and then turn off buffering for it.
Caution: If you want to eliminate buffering for a particular file, you must set $| before writing to the file for the first time because the operating system creates the buffer when it performs the first write operation.
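Since $| applies to whichever file is currently selected, a small subroutine can turn off buffering for a file and then restore the previous default output file. This is a sketch only; the helper name unbuffer is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn off buffering for one file variable and
# leave the default output file as it was.
sub unbuffer {
    my ($handle) = @_;
    my $previous = select($handle);   # select returns the old default
    $| = 1;                           # affects only the selected file
    select($previous);
    return;
}

unbuffer(\*STDOUT);
unbuffer(\*STDERR);

# STDOUT is still the default output file, so $| now reports its setting.
print "STDOUT is ", ($| ? "unbuffered" : "buffered"), "\n";
```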
Each output file opened by a Perl program has a copy of the $%
variable associated with it. This variable stores the current page
number. When write starts a new page, it adds one to the
value of $%. Each copy of $% is initialized to 0,
which ensures that $% is set to 1 when the first page is printed. $%
often is displayed by page header print formats.
The system variables you've seen so far have all been scalar
variables. The following sections describe the array variables
that are automatically defined for use in Perl programs. All of
these variables, except for the @_ variable, are global
variables: their value is the same throughout a program.
The @_ variable, which is defined inside each
subroutine, is a list of all the arguments passed to the subroutine.
For example, suppose that the subroutine my_sub is
called as shown here:
&my_sub("hello", 46, $var);
The values hello and 46, plus the value stored
in $var, are combined into a three-element list. Inside my_sub,
this list is stored in @_.
In a subroutine, the @_ array can be referenced or
modified, just as with any other array variable. Most
subroutines, however, assign @_ to locally defined scalar
variables using the local function:
sub my_sub {
local ($arg1, $arg2, $arg3) = @_;
# more stuff goes here
}
Here, the local statement defines three local
variables, $arg1, $arg2, and $arg3. $arg1
is assigned the first element of the list stored in @_, $arg2
is assigned the second, and $arg3 is assigned the third.
For more information on subroutines, refer to Day 9,
"Using Subroutines."
Note: If the shift function is called inside a subroutine with no argument specified, the @_ variable is assumed, and its first element is removed.
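A short sketch combines both ways of reading @_: shift removes the first argument, and the rest of the array is then used directly. The subroutine name describe and its arguments are hypothetical, and my is used in place of local, which is the usual Perl 5 choice for variables private to a subroutine.

```perl
#!/usr/bin/perl
use strict;
use warnings;

sub describe {
    my $label = shift;      # no argument given, so @_ is assumed
    my $count = @_;         # number of arguments still in @_
    return "$label: $count value(s): @_";
}

my $report = describe("sizes", 3, 5, 8);
print "$report\n";
```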
When you run a Perl program, you can specify values that are
to be passed to the program by including them on the command
line. For example, the following command calls the Perl program myprog
and passes it the values hello and 46:
$ myprog "hello" 46
Inside the Perl program, these values are stored in a special
built-in array named @ARGV. In this example, @ARGV contains
the list ("hello", 46).
Here is a simple statement that prints the values passed on
the command line:
print ("@ARGV\n");
The @ARGV array also is associated with the <>
operator. This operator treats the elements in @ARGV as
filenames; each file named in @ARGV is opened and read in
turn. Refer to Day 6 for a description of the <>
operator.
Note: If the shift function is called in the main body of a program (outside a subroutine) and no arguments are passed with it, the Perl interpreter assumes that the @ARGV array is to have its first element removed.
The following loop assigns each element of @ARGV, in turn, to the variable $var:
while ($var = shift) {
# stuff
}
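One point to watch with this loop: it stops as soon as shift returns a value that Perl treats as false, which includes the argument 0 and the empty string as well as the end of the list. A sketch of a variant that tests only for the end of the list follows; the sample arguments are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

@ARGV = ("alpha", "0", "", "omega");    # sample arguments for illustration

my @collected;
# In the main body of a program, shift(@ARGV) is what a bare shift does.
while (defined(my $var = shift(@ARGV))) {
    push(@collected, $var);
}

print scalar(@collected), " argument(s) processed\n";
```

All four arguments are processed, including the 0 and the empty string.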
In Perl, if you specify the -n or -p
option, you can also supply the -a option. This option
tells the Perl interpreter to break each input line into
individual words (throwing away all tabs and spaces). These words
are stored in the built-in array variable @F. After an
input line has been (automatically) read, the @F array
variable behaves like any other array variable.
For more information on the -a, -n, or -p
options, refer to Day 16, "Command-Line Options."
Note: When the -a option is specified and an input line is broken into words, the original input line can still be accessed because it is stored in the $_ system variable.
The @INC array variable contains a list of directories
to be searched for files requested by the require
function. This list consists of the following items, in order
from first to last:
Like any array variable, @INC can be added to or
modified.
For more information on the require function, refer to
Day 18, "Object-Oriented Programming."
The built-in associative array %INC lists the files
requested by the require function that have already been
found.
When require finds a file, the associative array
element $INC{file} is defined, in which file is the name
of the file. The value of this associative array element is the
location of the actual file.
When require requests a file, the Perl interpreter
first looks to see whether an associative array element has
already been created for this file. This action ensures that the
interpreter does not try to include the same code twice.
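The bookkeeping in %INC can be inspected directly. This sketch uses the File::Basename module that ships with Perl 5 purely as an example of something to require; any other file would do.

```perl
#!/usr/bin/perl
use strict;
use warnings;

require File::Basename;             # first request: the file is located and loaded

my $key      = "File/Basename.pm";  # subscript under which require records it
my $location = $INC{$key};          # where the file was actually found

print "$key was loaded from $location\n";

require File::Basename;             # second request: already listed, so not reloaded
```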
The %ENV associative array lists the environment
variables defined for this program and their values. The
environment variables are the array subscripts, and the values of
the variables are the values of the array elements.
For example, the following statement assigns the value of the
environment variable TERM to the scalar variable $term:
$term = $ENV{"TERM"};
In the UNIX environment, processes can send signals to other
processes. These signals can, for example, interrupt a running program,
trigger an alarm in the program, or kill off the program.
You can control how your program responds to signals it
receives. To do this, modify the %SIG associative array.
This array contains one element for each available signal, with
the signal name serving as the subscript for the element. For
example, the INT (interrupt) signal is represented by the $SIG{"INT"}
element.
The value of a particular element of %SIG is the action
that is to be performed when the signal is received. By default,
the value of an array element is DEFAULT, which tells the
program to do what it normally does when it receives this signal.
You can override the default action for some of the signals in
two ways: you can tell the program to ignore the signal, or you can
define your own signal handler. (Some signals, such as KILL, cannot be
overridden.)
To tell the program to ignore a particular type of signal, set
the value of the associative array element for this signal to IGNORE.
For example, the following statement indicates that the program
is to ignore any INT signals it receives:
$SIG{"INT"} = "IGNORE";
If you assign any value other than DEFAULT or IGNORE
to a signal array element, this value is assumed to be the name
of a function that is to be executed when this signal is
received. For example, the following statement tells the program
to jump to the subroutine named interrupt when it receives
an INT signal:
$SIG{"INT"} = "interrupt";
Subroutines that can be jumped to when a signal is received
are called interrupt handlers, because signals interrupt
normal program execution. Listing 17.16 is an example of a
program that defines an interrupt handler.
Listing 17.16. A program containing an interrupt handler.
1: #!/usr/local/bin/perl
2:
3: $SIG{"INT"} = "wakeup";
4: sleep();
5:
6: sub wakeup {
7:     print ("I have woken up!\n");
8:     exit();
9: }
$ program17_16
I have woken up!
$
Line 3 tells the Perl interpreter that the program is to jump
to the wakeup subroutine when it receives the INT
signal. Line 4 tells the program to go to sleep. Because no
argument is passed to sleep, the program will sleep until
a signal wakes it up.
To wake up the process, get the process ID using the ps
command, and then send an INT signal to the process using
the kill command. (See the manual page for kill,
and the related documentation for signal handling, to see how to
perform this task in your environment.)
When the program receives the INT signal, it executes
the wakeup subroutine. This subroutine prints the
following message and then exits:
I have woken up!
If desired, you can use the same subroutine to handle more
than one signal. The signal actually sent is passed as an
argument to the called subroutine, which ensures that your
subroutine can determine which signal triggered it:
sub interrupt {
local ($signal) = @_;
print ("Interrupted by the $signal signal.\n");
}
If a subroutine exits normally, the program returns to where
it was executing when it was interrupted. If a subroutine calls exit or die,
the program execution is terminated.
Note: On some systems, when a program continues executing after being interrupted, the element of %SIG corresponding to the received signal is reset to DEFAULT. To ensure that repeated signals are trapped by your interrupt handler, redefine the appropriate element of %SIG inside the handler.
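The following sketch puts these pieces together: one handler serves two signals, records which signal arrived, and reinstalls itself each time. It assumes a UNIX system, sends the signals to its own process with kill and $$ so that it can run unattended, and assigns the handler as a subroutine reference (a Perl 5 form) instead of a name string. The hash %received is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %received;                       # count of each signal seen

sub interrupt {
    my ($signal) = @_;              # name of the signal, such as INT
    $received{$signal}++;
    $SIG{$signal} = \&interrupt;    # reinstall in case it was reset
}

$SIG{"INT"}  = \&interrupt;
$SIG{"USR1"} = \&interrupt;

# Send signals to this very process.
kill("INT", $$);
kill("USR1", $$);
kill("INT", $$);

for my $name (sort keys %received) {
    print "$name received $received{$name} time(s)\n";
}
```

Because the handler returns normally, the program resumes after each kill call and finishes by printing the counts.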
Perl provides several built-in file variables, most of which
you have previously seen. The only file variables that have not
yet been discussed are DATA and _ (underscore). The
others are briefly described here for the sake of completeness.
The file variable STDIN is, by default, associated with
the standard input file. Using STDIN with the <>
operator, as in <STDIN>, normally reads data from
your keyboard. If your shell has used < or some
equivalent redirection operator to specify input from a file, <STDIN>
reads from that file.
The file variable STDOUT normally writes to the
standard output file, which is usually directed to your screen.
If your shell has used > or the equivalent to redirect
standard output to a file, writing to STDOUT sends output
to that file.
STDERR represents the standard error file, which is
almost always directed to your screen. Writing to STDERR
ensures that you see error messages even when you have redirected
the standard output file.
You can associate STDIN, STDOUT, or STDERR
with some other file using open:
open (STDIN, "myinputfile");
open (STDOUT, ">myoutputfile");
open (STDERR, ">myerrorfile");
Opening a file and associating it with STDIN overrides
the default value of STDIN, which means that you can no
longer read from the standard input file. Similarly, opening a
file and associating it with STDOUT or STDERR means
that writing to that particular file variable no longer sends output to
the screen.
To associate a file variable with the standard input file
after you have redirected STDIN, specify a filename of -:
open (MYSTDIN, "-");
To associate a file variable with the standard output file,
specify a filename of >-:
open (MYSTDOUT, ">-");
You can, of course, specify STDIN with - or STDOUT
with >- to restore the original values of these
file variables.
ARGV is a special file variable that is associated with
the current input file being read by the <> operator.
For example, consider the following statement:
$line = <>;
This statement reads from the current input file. Because ARGV
represents the current input file, the preceding statement is equivalent
to this:
$line = <ARGV>;
You normally will not need to access ARGV yourself
except via the <> operator.
The DATA file variable is used with the __END__
special value, which can be used to indicate the end of a
program. Reading from DATA reads the line after __END__,
which enables you to include a program and its data in the same
file.
Listing 17.17 is an example of a program that reads from DATA.
Listing 17.17. An example of the DATA
file variable.
1: #!/usr/local/bin/perl
2:
3: $line = <DATA>;
4: print ("$line");
5: __END__
6: This is my line of data.
$ program17_17
This is my line of data.
$
The __END__ value in line 5 indicates the end of the
program. When line 3 reads from the DATA file variable,
the first line after __END__ is read in and is assigned to $line.
(Subsequent requests for input from DATA read successive
lines, if any exist.) Line 4 then prints this input line.
Note: For more information on __END__ and methods of indicating the end of the program, refer to Day 20, "Miscellaneous Features of Perl."
The _ (underscore) file variable represents the file
specified by the last call to either the stat function or
a file test operator. For example:
$readable = -r "/u/jqpublic/myfile";
$writeable = -w _;
Here, the _ file variable used in the second statement
refers to /u/jqpublic/myfile because this is the filename
that was passed to -r.
You can use _ anywhere that a file variable can be
used, provided that the file has been opened appropriately:
if (-T $myoutfile) {
print _ ("here is my output\n");
}
Here, the file whose name is stored in $myoutfile is
associated with _ because this name was passed to -T
(which tests whether the file is a text file). The call to print
writes output to this file.
The main benefit of _ is that it saves
time when you are using several file-test operators at once:
if (-r "myfile" || -w _ || -x _) {
print ("I can read, write, or execute myfile.\n");
}
Using _ rather than myfile saves time because
file test operators normally call the UNIX system function stat.
If you specify _, the Perl interpreter is told to use the
results of the preceding call to the UNIX stat function
and to not bother calling it again.
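The following is a minimal sketch of this technique, not a definitive implementation. It assumes a scratch file whose name (underscore_demo.<process ID>.tmp) is invented for this example and is created in the current directory so that the program is self-contained. The first file test examines the file by name, and every later test passes _ to reuse the saved results.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical scratch file, created here only so the example can run anywhere.
my $file = "underscore_demo.$$.tmp";
open (my $out, '>', $file) || die ("cannot create $file: $!");
print {$out} "some text\n";
close ($out);

# The -e test examines the file by name. Each test on _ reuses
# the results saved by that test instead of examining the file again.
my @found;
if (-e $file) {
    push (@found, "exists");
    push (@found, "readable")   if (-r _);
    push (@found, "writable")   if (-w _);
    push (@found, "plain file") if (-f _);
    push (@found, "non-empty")  if (-s _);
}
print ("$file: ", join (", ", @found), "\n");

# Remove the scratch file.
unlink ($file);
```

Because every test after the first one names _ rather than $file, the file is examined only once, however many properties are checked.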
As you have seen, the system variables defined by Perl
normally consist of a $, @ or % followed by a
single non-alphanumeric character. This ensures that you cannot
define a variable whose name is identical to that of a Perl
system variable.
If you find Perl system variable names difficult to remember
or type, Perl 5 provides an alternative for most of them. If you add
the statement
use English;
at the top of your program, Perl defines alternative variable
names that more closely resemble English words. This makes it easier
to understand what your program is doing. Table 17.1 lists these alternative
variable names.
Table 17.1. Alternative names for Perl system variables.

Variable   Alternative name(s)
$_         $ARG
$0         $PROGRAM_NAME
$<         $REAL_USER_ID or $UID
$>         $EFFECTIVE_USER_ID or $EUID
$(         $REAL_GROUP_ID or $GID
$)         $EFFECTIVE_GROUP_ID or $EGID
$]         $PERL_VERSION
$/         $INPUT_RECORD_SEPARATOR or $RS
$\         $OUTPUT_RECORD_SEPARATOR or $ORS
$,         $OUTPUT_FIELD_SEPARATOR or $OFS
$"         $LIST_SEPARATOR
$#         $OFMT
$@         $EVAL_ERROR
$?         $CHILD_ERROR
$!         $OS_ERROR or $ERRNO
$.         $INPUT_LINE_NUMBER or $NR
$*         $MULTILINE_MATCHING
$[         none (deprecated in Perl 5)
$;         $SUBSCRIPT_SEPARATOR or $SUBSEP
$:         $FORMAT_LINE_BREAK_CHARACTERS
$$         $PROCESS_ID or $PID
$^A        $ACCUMULATOR
$^D        $DEBUGGING
$^F        $SYSTEM_FD_MAX
$^I        $INPLACE_EDIT
$^L        $FORMAT_FORMFEED
$^P        $PERLDB
$^T        $BASETIME
$^W        $WARNING
$^X        $EXECUTABLE_NAME
$&         $MATCH
$`         $PREMATCH
$'         $POSTMATCH
$+         $LAST_PAREN_MATCH
$~         $FORMAT_NAME
$=         $FORMAT_LINES_PER_PAGE
$-         $FORMAT_LINES_LEFT
$^         $FORMAT_TOP_NAME
$|         $OUTPUT_AUTOFLUSH
$%         $FORMAT_PAGE_NUMBER
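As a brief illustration of these alternative names, the sketch below uses a handful of them next to their punctuation equivalents. The strings and variable names (@hits, $joined, and so on) are invented for this example. It is one possible demonstration, assuming a Perl 5 interpreter with the standard English module available.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use English;

# $PROCESS_ID and $PID are both alternative names for $$.
my $same_pid = ($PROCESS_ID == $$ && $PID == $$);

# $ARG is an alternative name for $_, so a pattern with no =~
# operator examines the same value that $ARG holds.
my @hits;
foreach ("abc", "xyz", "abcdef") {
    push (@hits, $ARG) if (/abc/);
}

# $PREMATCH, $MATCH and $POSTMATCH correspond to $`, $& and $'.
my ($pre, $match, $post);
if ("system variables" =~ /var/) {
    ($pre, $match, $post) = ($PREMATCH, $MATCH, $POSTMATCH);
}

# $LIST_SEPARATOR ($") is the string placed between array elements
# when an array appears inside a double-quoted string.
my @array = (1, 2, 3);
my $saved = $LIST_SEPARATOR;
$LIST_SEPARATOR = "-";
my $joined = "@array";
$LIST_SEPARATOR = $saved;       # restore the previous separator

print ("both names refer to the process ID\n") if ($same_pid);
print ("matched: @hits\n");
print ("[$pre][$match][$post]\n");
print ("$joined\n");
```

Assigning to an alternative name changes the underlying system variable, which is why the sketch saves and restores $LIST_SEPARATOR around the one string that needs a different separator.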
Today you learned about the built-in system variables
available within every Perl program. These system variables are
divided into five groups:
You also learned how to specify English-language equivalents
for Perl system variables.
print ("@array");
print (@array);
The Workshop provides quiz questions to help you solidify your
understanding of the material covered, and exercises to provide
you with experience in using what you've learned.