Today's lesson teaches you how to manipulate your machine's
file system using some of Perl's built-in library functions. Today,
you'll learn about the following:
Caution: Many of the functions described in today's lesson use features of the UNIX operating system. If you are using Perl on a machine that is not running UNIX, some of these functions might not be defined or might behave differently.
Check the documentation supplied with your version of Perl for details on which functions are supported or emulated on your machine.
The following sections describe the built-in library functions
that read information from files and write information to files. These
library functions perform the following tasks:
Some of the input and output functions supplied by Perl have
been discussed in earlier chapters. These are open, close, print,
printf, write, select, and eof.
The following sections briefly describe these functions again,
along with some features of these functions that have not been discussed
previously.
The open function enables a Perl program to access a
file. It associates a special file variable with each accessed
file. The following is an example:
open (MYVAR, "/u/jqpublic/file");
Here, open requests access to the file /u/jqpublic/file,
and it associates the file variable MYVAR with this file after it
is open. open returns a nonzero value if the open
succeeds, and zero if the open fails.
By default, open opens a file for reading only. To open
a file for writing, put a > character in front of the
filename, as follows:
open (MYVAR, ">/u/jqpublic/file");
To append information to an existing file, put two >
characters in front of the filename, as follows:
open (MYVAR, ">>/u/jqpublic/file");
To treat the open file as a command to which to pipe data, put
a pipe (|) character in front of the filename, as follows:
open (MAIL, "|mail dave");
(For more information, refer to Day 6, "Reading from and
Writing to Files.")
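As a minimal sketch of the read, write, and append modes described above, the following program writes a file, appends to it, and reads it back. The subroutine names write_file, append_file, and read_file are hypothetical, and the use of the File::Temp module to obtain a scratch filename is an assumption made so that the example does not touch any real file. Each call to open is checked for failure.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);   # assumption: used only to get a scratch file

# Overwrite (or create) the file with the given text.
sub write_file {
    my ($name, $text) = @_;
    open (OUT, ">$name") || die ("Can't open $name for writing: $!");
    print OUT ($text);
    close (OUT);
}

# Add the given text to the end of the file.
sub append_file {
    my ($name, $text) = @_;
    open (OUT, ">>$name") || die ("Can't open $name for appending: $!");
    print OUT ($text);
    close (OUT);
}

# Return all lines of the file as a list.
sub read_file {
    my ($name) = @_;
    open (IN, $name) || die ("Can't open $name for reading: $!");
    my @lines = <IN>;
    close (IN);
    return @lines;
}

my ($tmpfh, $file) = tempfile (UNLINK => 1);
close ($tmpfh);
write_file ($file, "first line\n");
append_file ($file, "second line\n");
print (read_file ($file));
```

Calling write_file a second time on the same name would discard both lines, because the > prefix starts the file over; append_file keeps what is already there.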
The open function enables you to open files in several
other ways not previously discussed. For example, to treat the
open file as a command that is piping data to this program, put a |
character after the filename. For example:
open (CAT, "cat file*|");
This call to open executes the command cat file*.
This command joins (concatenates) the contents of all files
whose names start with file and writes the result to its
standard output. That output is treated as an input file
that is accessible using the file variable CAT.
$input = <CAT>;
Listing 12.1 is another example of a program that uses piped
input. This program uses the output from the w command to
list the users who are currently logged on to the machine.
Listing 12.1. A program that receives
input from a piped command.
1: #!/usr/local/bin/perl
2:
3: open (WOUT, "w|");
4: $time = <WOUT>;
5: $time =~ s/^ *//;
6: $time =~ s/ .*//;
7: <WOUT>; # skip headings line
8: @users = <WOUT>;
9: close (WOUT);
10: foreach $user (@users) {
11: $user =~ s/ .*//;
12: }
13: print ("Current time: $time");
14: print ("Users logged on:\n");
15: $prevuser = "";
16: foreach $user (sort @users) {
17: if ($user ne $prevuser) {
18: print ("\t$user");
19: $prevuser = $user;
20: }
21: }
$ program12_1
Current time: 4:25pm
Users logged on:
dave
kilroy
root
zarquon
$
The w command lists the current time, the machine load,
and the users logged onto the machine. It also lists the job time
and the currently executing command for each user.
Here is sample output for the w command:
4:25pm up 1 day, 6:37, 6 users, load average: 0.79, 0.36, 0.28
User tty login@ idle JCPU PCPU what
dave ttyp0 2:26pm 27 3 w
kilroy ttyp1 9:01am 2:27 1:04 11 -csh
kilroy ttyp2 9:02am 43 1:46 27 rn
root ttyp3 4:22pm 2 -csh
zarquon ttyp4 1:26pm 4 43 16 cc myprog.c
kilroy ttyp5 9:03am 2:14 48 /usr/games/hack
This Perl program takes the output from the w command
and massages it to retrieve only the information needed: the
current time and the users who are currently logged on.
Line 3 starts the w command. The call to open
specifies that the output from w is to be treated as input
to this program, and that the file variable WOUT is to be
used to access this input.
Line 4 reads the first line of the input piped from WOUT.
This is the line read:
4:25pm up 1 day, 6:37, 6 users, load average: 0.79, 0.36, 0.28
The following two lines extract the current time from this
line. First, line 5 removes the leading spaces. Then, line 6
removes everything after the first word, except for the trailing
newline character. This leaves the time, 4:25pm, along
with the trailing newline, stored in $time.
Line 7 reads the second line from WOUT. Because this
line contains no useful information, there is no need to assign
it to any scalar variable.
Line 8 reads the rest of the output from w to the array
variable @users. After this output has been read, line 9
closes WOUT, which terminates the process that is running
the w command.
Each element of the list stored in @users contains one
line of user information. Because this program only needs the
first word of each line, lines 10-12 get rid of everything else
(except, again, for the trailing newline character). After this
loop is complete, the array in @users contains a list of
users logged on.
Line 13 prints the current time, as stored in $time.
Note that print does not need to specify a trailing newline
character, because $time contains one.
Lines 16-21 sort the list of users in @users and print
them. Because a user can be logged on more than once, $prevuser stores
the last user name printed. The value stored in $user is
printed only if it differs from the value stored in $prevuser.
Many UNIX shells enable you to direct both the standard output
file and the standard error file to the same output file. For example,
in the Bourne shell sh, the command
$ foo >file1 2>&1
runs the command foo and stores the output from the
standard output file and the standard error file in file1.
Listing 12.2 shows how you can do this in Perl.
Listing 12.2. A program that
redirects the standard output and standard error files.
1: #!/usr/local/bin/perl
2:
3: open (STDOUT, ">file1") || die ("open STDOUT failed");
4: open (STDERR, ">&STDOUT") || die ("open STDERR failed");
5: print STDOUT ("line 1\n");
6: print STDERR ("line 2\n");
7: close (STDOUT);
8: close (STDERR);
This program produces no output.
The following are the contents of the output file file1:
line 2
line 1
As you can see, these lines aren't in the order intended. To
understand what is happening, let's examine this program in more detail.
Line 3 redirects the standard output file. To do this, it
opens the output file file1 and associates it with the
file variable STDOUT; this closes the standard output
file.
Line 4 redirects the standard error file. The argument >&STDOUT
tells the Perl interpreter to use the file already opened and associated
with STDOUT. This means that the file variable STDERR refers
to the same file as STDOUT.
Lines 5 and 6 write to STDOUT and STDERR,
respectively. Because these file variables refer to the same
file, both lines are written to file1. Unfortunately, they
are written in the wrong order. What has happened?
The problem arises because of how UNIX handles the writing of
output. When you use print (or any other function) to
write to a file such as the standard output file, what the UNIX
operating system really does is copy the output to a special
internal storage area called a buffer. (You can think of a
buffer as a giant character string or as an array of characters.)
Subsequent output operations continue writing to the buffer until
it is full; when the buffer is full, the entire buffer is written
out. Copying to a buffer and then writing out the entire buffer
takes much less time than writing individual lines of text. (This
is because, on most machines, input-output operations are slower
than memory-access operations.)
When a program ends, any non-empty buffers are written out.
However, the system maintains separate buffers for STDERR and STDOUT,
and it writes out the buffer for STDERR first. This means
that line 2, which is stored in the STDERR buffer, appears
before line 1, which is stored in the STDOUT
buffer.
To get around this problem, you can tell the Perl interpreter
not to use a buffer for a particular file. To do this, call select
to make that file the current default file, and then assign a
nonzero value to the system variable $|.
The system variable $| indicates whether a particular
file is to be buffered (in other words, whether it should use a
buffer or not). If $| is assigned a nonzero value, no
buffer is used. As with $~ and $^, assigning to $|
affects the current default file, which is the file last
specified in a call to select (or STDOUT, if select
has not been called).
Listing 12.3 shows how you can use $| to ensure that
your output lines appear in the correct order.
Listing 12.3. A program that
redirects the standard output and standard error files and turns off buffering.
1: #!/usr/local/bin/perl
2:
3: open (STDOUT, ">file1") || die ("open STDOUT failed");
4: open (STDERR, ">&STDOUT") || die ("open STDERR failed");
5: $| = 1;
6: select (STDERR);
7: $| = 1;
8: print STDOUT ("line 1\n");
9: print STDERR ("line 2\n");
10: close (STDOUT);
11: close (STDERR);
This program produces no output.
The contents of the output file file1 are now the
following:
line 1
line 2
Line 5 sets $| to 1, which tells the Perl interpreter
that the current default file does not need to be buffered.
Because select has not yet been called, the current
default file is STDOUT, which means that line 5 turns off
buffering for the standard output file (which has been redirected
to file1).
Line 6 sets the current default file to STDERR, and
line 7 once again sets $| to 1. This turns off buffering
for the standard error file (which has also been redirected to file1).
Because buffering has been turned off for both STDERR
and STDOUT, lines 8 and 9 write to file1 right
away. This means that the output lines appear in file1 in
the order in which they are printed.
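One drawback of calling select just to set $| is that the current default file stays changed afterward. The following sketch wraps the two steps in a subroutine that restores the previous default file. The name unbuffer is hypothetical, and the File::Temp module is an assumption used only to provide a scratch file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);   # assumption: used only to get a scratch file

# Turn off buffering for one file variable, then restore the
# previous default file so later print calls are unaffected.
sub unbuffer {
    my ($filevar) = @_;
    my $previous = select ($filevar);
    $| = 1;
    select ($previous);
}

my ($tmpfh, $file) = tempfile (UNLINK => 1);
close ($tmpfh);
open (LOG, ">$file") || die ("Can't open $file: $!");
unbuffer (*LOG);
print LOG ("written at once\n");
# Because LOG is unbuffered, the line is already in the file
# even though LOG has not been closed yet.
my $size_before_close = -s $file;
close (LOG);
print ("Size before close: $size_before_close\n");
```

The value returned by select is the file that was the default before the call, which is what makes it possible to put things back.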
To open a file for both read and write access, specify +>
before the filename, as follows:
open (READWRITE, "+>file1");
This opens the file named file1 for both reading and
writing, which enables you to overwrite portions of the file.
Be aware that +>, like >, empties an existing file when it
opens it.
Opening a file for reading and writing works best in
conjunction with the library functions seek and tell,
which enable you to skip to the middle of a file. (For more
information on seek and tell, refer to the section called
"Skipping and Re-Reading Data," later in today's
lesson.)
Note: You also can use +< as the prefix to specify both reading and writing. Unlike +>, this prefix leaves the existing contents of the file intact. For example:
open (READWRITE, "+<file1");
The prefix <, by itself, specifies that the file is to be opened for reading. This means that the following two statements are identical:
open (READONLY, "<read");
open (READONLY, "read");
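To illustrate overwriting part of a file, here is a small sketch that opens an existing file with the +< prefix, moves to a byte offset with seek, and writes over the bytes found there. The subroutine name overwrite_at is hypothetical, and File::Temp is an assumption used only to create a scratch file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);   # assumption: used only to get a scratch file

# Replace bytes in an existing file, starting at the given offset,
# without disturbing the rest of the file.
sub overwrite_at {
    my ($name, $offset, $text) = @_;
    open (READWRITE, "+<$name") || die ("Can't open $name: $!");
    seek (READWRITE, $offset, 0) || die ("Can't seek in $name: $!");
    print READWRITE ($text);
    close (READWRITE);
}

my ($tmpfh, $file) = tempfile (UNLINK => 1);
print $tmpfh ("Hello, world\n");
close ($tmpfh);

overwrite_at ($file, 7, "Perl!");

open (CHECK, $file) || die ("Can't open $file: $!");
print (<CHECK>);
close (CHECK);
```

The replacement text here has the same length as the text it covers, so the rest of the line is unchanged. Writing a longer string would simply run over the bytes that follow.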
The library function close was discussed on Day 6,
"Reading from and Writing to Files." It closes a file
opened by open, as follows:
close (MYFILE);
Here, MYFILE is the file variable (passed to open)
that is associated with the open file.
Note: If you use close to close a pipe, the program will wait for the piped program to terminate. For example:
open (MYPIPE, "cat file*|");
close (MYPIPE);
When close is called, the program suspends execution until the command cat file* is terminated.
The print, printf, and write functions
have been covered also in previous chapters, but I'll briefly recap
them here.
The print function is the simplest function. It writes
to the file specified, or to the current default file if no file
is specified. For example:
print ("Hello, there!\n");
print OUTFILE ("Hello, there!\n");
The first statement writes to the current default file (which
is STDOUT unless select has been called). The
second statement writes to the file specified by OUTFILE.
The printf function formats a string and sends it to
either the file specified or the current default file. For
example, the statement
printf OUTFILE ("You owe me %8.2f", $owing);
takes the value stored in $owing and substitutes it for %8.2f
in the specified string. %8.2f is an example of a field
specifier and indicates that the value stored in $owing
is to be treated as a floating-point number.
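To see exactly what the %8.2f field specifier produces, the following sketch builds the same string with sprintf, which formats in the same way as printf but returns the result instead of writing it. The subroutine name format_owing and the sample amounts are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Format an amount the same way the printf call above does.
# sprintf returns the formatted string instead of printing it.
sub format_owing {
    my ($owing) = @_;
    return sprintf ("You owe me %8.2f", $owing);
}

# %8.2f pads the number to a width of 8 with 2 decimal places.
printf ("%s\n", format_owing (12.5));
printf ("%s\n", format_owing (1234.567));
```

Both amounts occupy eight characters, so a column of such values lines up on the decimal point.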
The write function uses a print format to send
formatted output to the file that is specified or to the current
default file. For example:
select (OUTFILE);
$~ = "MYFORMAT";
write;
This call to write uses the print format MYFORMAT
to send output to the file OUTFILE.
For more information on printf or write, refer
to Day 11, "Formatting Your Output."
The select function also is covered on Day 11. This
function is passed a file variable, which becomes the new current
default file. For example:
select (MYFILE);
In this case, MYFILE is now the current default file,
which means that calls to print, write, and printf write
to MYFILE unless a file variable is explicitly specified.
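The following sketch shows select changing where print sends its output and then changing it back. The subroutine name print_report is hypothetical, and File::Temp is an assumption used only to provide a scratch file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);   # assumption: used only to get a scratch file

# Send a list of lines to the named file by making it the current
# default file, then restore the previous default file.
sub print_report {
    my ($name, @lines) = @_;
    open (MYFILE, ">$name") || die ("Can't open $name: $!");
    my $previous = select (MYFILE);
    foreach my $line (@lines) {
        print ("$line\n");          # goes to MYFILE
    }
    select ($previous);
    close (MYFILE);
}

my ($tmpfh, $file) = tempfile (UNLINK => 1);
close ($tmpfh);
print_report ($file, "alpha", "beta");
print ("Report written\n");         # goes to STDOUT again
```

Saving the return value of select and passing it back afterward keeps the change local to the subroutine.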
The library function eof checks whether the last input
file read has been exhausted. If all of the input has been read, eof
returns a nonzero value. If there is input remaining, eof
returns zero.
The eof function was first introduced on Day 6. You
might have noticed that, on that day, the examples that use eof
use it without parentheses. This is because the behavior of eof
is a little tricky if you are using it in conjunction with the <>
operator; in this case, eof and eof() behave differently.
Listing 12.4 shows how eof interacts with <>.
It prints the contents of one or more input files whose names are
supplied on the command line. A marker line is printed after
each input file is completed.
To run this program yourself, create two files named file1
and file2. Put the following in file1:
This is a line from the first file.
Here is the last line of the first file.
Then, put the following in file2:
This is a line from the second and last file.
Here is the last line of the last file.
Finally, specify file1 and file2 on the command
line when you run this program. For example, if you have called
this program program12_4, run it as follows:
$ program12_4 file1 file2
This will give you the output shown in the Input-Output
example.
Listing 12.4. A program that uses eof
and <> together.
1: #!/usr/local/bin/perl
2:
3: while ($line = <>) {
4: print ($line);
5: if (eof) {
6: print (" end of current file \n");
7: }
8: }
$ program12_4 file1 file2
This is a line from the first file.
Here is the last line of the first file.
end of current file
This is a line from the second and last file.
Here is the last line of the last file.
end of current file
$
The <> operator in line 3 tells the program to
read the next line of input from the input files supplied on the command
line. Line 4 then prints the line.
Line 5 calls eof without parentheses. This is the form
of eof that you are familiar with. It returns true if the
current input file has been completely read.
Caution: When you test for end-of-file, use either eof or eof() but not both.
Compare the program in Listing 12.4 with Listing 12.5, which
uses eof() instead of eof.
Listing 12.5. A program that uses eof()
and <> together.
1: #!/usr/local/bin/perl
2:
3: while ($line = <>) {
4: print ($line);
5: if (eof()) {
6: print (" end of output \n");
7: }
8: }
$ program12_5 file1 file2
This is a line from the first file.
Here is the last line of the first file.
This is a line from the second and last file.
Here is the last line of the last file.
end of output
$
Line 5 of this program calls eof with parentheses.
Calls to eof with parentheses only return true when all of
the files have been read. If the program is at the end of the
first input file, eof() returns false because there is
still input to be read.
Note: If you like, you can use eof with a particular file. For example:
if (eof(MYFILE)) {
# do end-of-file stuff
}
Here, the conditional expression returns true if all of MYFILE has been read.
Also, note that the distinction between eof and eof() is only meaningful when you are using the <> operator. If you are just reading from a single file, it doesn't matter whether you supply parentheses or not. For example:
while ($line = <STDIN>) {
# stuff goes here
if (eof) { # you can also use eof() here
# more stuff here
}
}
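As a small illustration of testing a particular file with eof, the following sketch marks the last line of a file as it reads it. The subroutine name mark_last_line is hypothetical, and File::Temp is an assumption used only to provide a scratch file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);   # assumption: used only to get a scratch file

# Return the lines of a file, marking the final one. eof(MYFILE)
# becomes true as soon as the last line has been read.
sub mark_last_line {
    my ($name) = @_;
    my @marked;
    open (MYFILE, $name) || die ("Can't open $name: $!");
    while (my $line = <MYFILE>) {
        chomp ($line);
        push (@marked, eof (MYFILE) ? "$line (last)" : $line);
    }
    close (MYFILE);
    return @marked;
}

my ($tmpfh, $file) = tempfile (UNLINK => 1);
print $tmpfh ("one\ntwo\nthree\n");
close ($tmpfh);
print (join ("\n", mark_last_line ($file)), "\n");
```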
When you call any of the functions described so far in today's
lesson, you can indicate which file to use by specifying a file variable.
However, these functions also enable you to supply a scalar variable
in place of a file variable; when you do, the Perl interpreter
treats the value stored in the scalar variable as the name of the
file variable. For example, consider the following:
$filename = "MYFILENAME";
open ($filename, ">file1");
This call to open takes the value stored in $filename--MYFILENAME--and
uses it as the file-variable name. This means that the file
variable MYFILENAME is now associated with the output file file1.
Listing 12.6 is an example of a program that stores a
file-variable name in a scalar variable and passes the scalar
variable to Perl input and output functions.
Listing 12.6. A program that uses a
scalar variable to store a file variable name.
1: #!/usr/local/bin/perl
2:
3: &open_file("INFILE", "", "file1");
4: &open_file("OUTFILE", ">", "file2");
5: while ($line = &read_from_file("INFILE")) {
6: &print_to_file("OUTFILE", $line);
7: }
8:
9: sub open_file {
10: local ($filevar, $filemode, $filename) = @_;
11:
12: open ($filevar, $filemode . $filename) ||
13: die ("Can't open $filename");
14: }
15: sub read_from_file {
16: local ($filevar) = @_;
17:
18: <$filevar>;
19: }
20: sub print_to_file {
21: local ($filevar, $line) = @_;
22:
23: print $filevar ($line);
24: }
This program produces no output.
This program is just a fancy way of copying the contents of file1
to file2. Line 3 opens the input file, file1, for reading
by calling the subroutine open_file. This subroutine is
passed the name of the file variable to use, which is INFILE.
Line 4 uses the same subroutine, open_file, to open the
output file, file2, for writing. The file variable OUTFILE
is used in this open operation.
Line 5 calls read_from_file to read a line of input and
passes it the file variable name INFILE. Line 18
substitutes the value of $filevar, INFILE, into <$filevar>,
yielding the result <INFILE>; then, it reads a line
from this input file. Because this line-reading operation is the
last expression evaluated in the subroutine, the line read is
returned by the subroutine and assigned to $line.
Line 6 then passes OUTFILE and the input line just read
to the subroutine print_to_file.
Note: All of the functions you've seen so far in this chapter--open, close, print, printf, write, select, and eof--enable you to use a scalar variable in place of a file variable.
The functions open, close, write, select, and eof also enable you to use an expression in place of a file variable. The value of the expression must be a character string that can be used as a file variable.
In the programs you've seen so far, input files have always
been read in order, starting with the first line of input and
continuing on to the end. Perl provides two special functions, seek
and tell, which enable you to skip forward or backward in
a file so that you can skip or re-read data.
The seek function moves backward or forward in a file.
The syntax for the seek function is
seek (filevar, distance, relative_to);
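As you can see, seek requires three arguments: filevar, the file variable representing the file in which to skip; distance, an integer giving the number of bytes to skip; and relative_to, which is 0, 1, or 2.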
If relative_to is 0, the number of bytes to skip
is relative to the beginning of the file. If relative_to is 1,
the skip is relative to the current position in the file (the
current position is the location of the next line to be read). If relative_to
is 2, the skip is relative to the end of the file.
For example, to skip back to the beginning of the file MYFILE,
use the following:
seek(MYFILE, 0, 0);
The following statement skips forward 80 bytes:
seek(MYFILE, 80, 1);
The following statement skips backward 80 bytes:
seek(MYFILE, -80, 1);
And the following statement skips to the end of the file
(which is useful when the file has been opened for reading and
writing):
seek(MYFILE, 0, 2);
The seek function returns true (nonzero) if the skip
was successful, and 0 if it failed. It is often used in
conjunction with the tell function, described in the next
section.
The tell function returns the distance, in bytes,
between the beginning of the file and the current position of the
file (the location of the next line to be read).
The syntax for the tell function is
tell (filevar);
filevar, which is required, represents the file
whose current position is needed.
For example, the following statement retrieves the current
position of the file MYFILE:
$offset = tell (MYFILE);
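The effect of the three relative_to values can be checked with tell. The following sketch performs one seek from a known starting position and reports where the file ends up. The subroutine name position_after_seek and the ten-byte sample file are assumptions, as is the use of File::Temp to create the scratch file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);   # assumption: used only to get a scratch file

# Apply one seek to a file and report the position that results.
# $relative_to is 0 (start), 1 (current position), or 2 (end).
sub position_after_seek {
    my ($name, $start, $distance, $relative_to) = @_;
    open (MYFILE, $name) || die ("Can't open $name: $!");
    seek (MYFILE, $start, 0);                 # set a known starting point
    seek (MYFILE, $distance, $relative_to) || die ("seek failed: $!");
    my $offset = tell (MYFILE);
    close (MYFILE);
    return $offset;
}

my ($tmpfh, $file) = tempfile (UNLINK => 1);
print $tmpfh ("0123456789");
close ($tmpfh);

print (position_after_seek ($file, 5, 3, 0), "\n");    # from the start
print (position_after_seek ($file, 5, 3, 1), "\n");    # from position 5
print (position_after_seek ($file, 5, -3, 2), "\n");   # from the end
```

With a ten-byte file and a starting position of 5, the three calls report positions 3, 8, and 7.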
Note: tell and seek accept an expression in place of a file variable, provided the value of the expression is the name of a file variable.
You can use tell and seek to skip to a
particular position in a file. For example, Listing 12.7 uses these
functions to print pairs of lines twice each. (This is, of
course, not the fastest way to do this.)
Listing 12.7. A program that
demonstrates seek and tell.
1: #!/usr/local/bin/perl
2:
3: @array = ("This", "is", "a", "test");
4: open (TEMPFILE, ">file1");
5: foreach $element (@array) {
6: print TEMPFILE ("$element\n");
7: }
8: close (TEMPFILE);
9: open (TEMPFILE, "file1");
10: while (1) {
11: $skipback = tell(TEMPFILE);
12: $line = <TEMPFILE>;
13: last if ($line eq "");
14: print ($line);
15: $line = <TEMPFILE>; # assume the second line exists
16: print ($line);
17: seek (TEMPFILE, $skipback, 0);
18: $line = <TEMPFILE>;
19: print ($line);
20: $line = <TEMPFILE>;
21: print ($line);
22: }
$ program12_7
This
is
This
is
a
test
a
test
$
Lines 3-8 of this program create a temporary file named file1
consisting of four lines: This, is, a, and test.
Line 9 opens this temporary file for reading.
Lines 10-22 loop through the test file. Line 11 calls tell
to obtain the current position of the file before reading the
pair of lines. Lines 12-16 read the lines and print them (first
testing whether the end of the file has been reached).
Line 17 then calls seek, which positions the file at
the point returned by tell in line 11. This means that the
pair of lines read by lines 12 and 15 are read again by lines 18
and 20. Therefore, lines 19 and 21 print a second copy of the
input lines.
Caution: You cannot use seek and tell if the file variable actually refers to a pipe. For example, if you open a pipe using the statement
open (MYPIPE, "cat file*|");
then the following statement makes no sense:
$illegal = tell (MYPIPE);
In Perl, the easiest way to read input from a file is to use
the <filevar> operator, where filevar
is the file variable representing the file to read. Perl also
provides two other functions that read from an input file: read and sysread.
Perl also enables you to write output using the built-in
function syswrite, which calls the UNIX write function.
These functions are described in the following sections.
The read function is designed to be equivalent to the
UNIX function fread. It enables you to read an arbitrary
number of characters (bytes) into a scalar variable.
The syntax for the read function is
read (filevar, result, length, skipval);
Here, filevar is the file variable representing
the file to read, result is the scalar variable (or
array variable element) into which the bytes are to be stored,
and length is the number of bytes to read.
skipval is an optional argument which specifies
the number of bytes to skip in result before storing the bytes read.
For example:
read (MYFILE, $scalar, 80);
This call to read tries to read 80 bytes from the file
represented by the file variable MYFILE, storing the
resulting character string in $scalar. It returns the
number of bytes actually read; if MYFILE is at end-of-file,
it returns 0. If an error occurs, read returns the undefined
value.
You can use read to append to an existing scalar
variable by specifying a fourth argument, which indicates the
number of bytes to skip in the scalar variable.
read (MYFILE, $scalar, 40, 80);
This call to read reads another 40 bytes from MYFILE.
When copying these bytes into $scalar, read first
skips the first 80 bytes already stored there.
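The following sketch shows the fourth argument in action on a small scale: two calls to read fill one scalar variable in two pieces. The subroutine name read_in_two_steps and the sample data are assumptions, and File::Temp is used only to create a scratch file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);   # assumption: used only to get a scratch file

# Read a file in two pieces into one scalar: the first call fills
# bytes 0 through $first-1, and the second call stores its bytes
# starting at offset $first, so nothing is overwritten.
sub read_in_two_steps {
    my ($name, $first, $second) = @_;
    my $scalar = "";
    open (MYFILE, $name) || die ("Can't open $name: $!");
    my $count1 = read (MYFILE, $scalar, $first);
    my $count2 = read (MYFILE, $scalar, $second, $first);
    close (MYFILE);
    return ($scalar, $count1, $count2);
}

my ($tmpfh, $file) = tempfile (UNLINK => 1);
print $tmpfh ("abcdefghij");
close ($tmpfh);

my ($scalar, $count1, $count2) = read_in_two_steps ($file, 4, 3);
print ("Read $count1 bytes, then $count2 more: $scalar\n");
```

Without the fourth argument, the second call would replace the contents of the scalar variable rather than extend them.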
If you want to read data as quickly as possible, you can call sysread
instead of read.
The syntax for the sysread function is
sysread (filevar, result, length, skipval);
These arguments are the same as for read.
For example:
sysread (MYFILE, $scalar, 80);
sysread (MYFILE, $scalar, 40, 80);
sysread is equivalent to the UNIX function read.
To write as quickly as possible, call the syswrite
function, which is equivalent to the UNIX function write.
The syntax of the syswrite function is
syswrite (filevar, data, length, skipval);
Here, filevar is the file to write to, data
is the scalar variable that holds the data, length is
the number of bytes to write, and skipval is the
number of bytes to skip in data before writing.
For instance, the following call writes the first 80 bytes of $scalar
to the file specified by MYFILE:
syswrite (MYFILE, $scalar, 80);
Similarly, the following statement skips the first 80 bytes
stored in $scalar, and then writes the next 40 bytes to
the file specified by MYFILE:
syswrite (MYFILE, $scalar, 40, 80);
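Here is a minimal sketch that pairs the two functions: it writes part of a string with syswrite and reads it back with sysread, checking the return value of each call. The subroutine names write_slice and read_bytes are hypothetical, and File::Temp is an assumption used only to provide a scratch file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);   # assumption: used only to get a scratch file

# Write $length bytes of $data, starting $skip bytes into $data,
# and return the number of bytes actually written.
sub write_slice {
    my ($name, $data, $length, $skip) = @_;
    open (MYFILE, ">$name") || die ("Can't open $name: $!");
    my $written = syswrite (MYFILE, $data, $length, $skip);
    close (MYFILE);
    die ("syswrite failed: $!") unless defined ($written);
    return $written;
}

# Read up to $length bytes back with sysread.
sub read_bytes {
    my ($name, $length) = @_;
    my $scalar = "";
    open (MYFILE, $name) || die ("Can't open $name: $!");
    my $count = sysread (MYFILE, $scalar, $length);
    close (MYFILE);
    die ("sysread failed: $!") unless defined ($count);
    return $scalar;
}

my ($tmpfh, $file) = tempfile (UNLINK => 1);
close ($tmpfh);
my $written = write_slice ($file, "Hello, world", 5, 7);
print ("$written bytes written: ", read_bytes ($file, 80), "\n");
```

Note that neither subroutine mixes these calls with print or the <filevar> operator on the same file variable, because sysread and syswrite bypass the buffer that those operations use.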
Caution: Don't use sysread and syswrite unless you know what you are doing. For more information on these functions, refer to the UNIX system manual pages for the read and write functions.
Perl provides one other built-in function, getc, which
reads a single character of input from a file.
The syntax for calls to the getc function is
char = getc (infile);
infile is the file from which to read, and char
is the character returned.
For example:
$singlechar = getc(INFILE);
This statement reads a character from the file represented by INFILE
and stores it (as a character string) in the scalar variable $singlechar.
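getc also works on ordinary files. The following sketch reads a file one character at a time and counts how often each character occurs. The subroutine name count_chars and the sample text are assumptions, and File::Temp is used only to create a scratch file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);   # assumption: used only to get a scratch file

# Count how often each character occurs, reading the file one
# character at a time. getc returns an undefined value at end-of-file.
sub count_chars {
    my ($name) = @_;
    my %count;
    open (INFILE, $name) || die ("Can't open $name: $!");
    while (defined (my $singlechar = getc (INFILE))) {
        $count{$singlechar}++;
    }
    close (INFILE);
    return %count;
}

my ($tmpfh, $file) = tempfile (UNLINK => 1);
print $tmpfh ("banana");
close ($tmpfh);

my %count = count_chars ($file);
foreach my $char (sort (keys (%count))) {
    print ("$char: $count{$char}\n");
}
```

The loop tests the result with defined rather than for truth, so a "0" character in the input does not end the loop early.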
The getc function is useful for "hot key"
applications. These applications accept and process input one character
at a time rather than one line at a time. Listing 12.8 is an
example of such a program. It reads one character at a time and
checks whether the character is alphanumeric. If it is, it writes out the
next higher letter or number. For example, when you enter a,
the program prints out b, and so on. In this example, the
alphabetic letters a through z and the digits 0
through 9 are typed in.
Listing 12.8. A program that
demonstrates the use of getc.
1: #!/usr/local/bin/perl
2:
3: &start_hot_keys;
4: while (1) {
5: $char = getc(STDIN);
6: last if ($char eq "\\");
7: $char =~ tr/a-zA-Z0-9/b-zaB-ZA1-90/;
8: print ($char);
9: }
10: &end_hot_keys;
11: print ("\n");
12:
13: sub start_hot_keys {
14: system ("stty cbreak");
15: system ("stty -echo");
16: }
17:
18: sub end_hot_keys {
19: system ("stty -cbreak");
20: system ("stty echo");
21: }
$ program12_8
bcdefghijklmnopqrstuvwxyza1234567890
$
The subroutine start_hot_keys modifies the runtime
environment to support hot-key input. To do this, it uses two calls
to the built-in function system, which simply takes its
argument and executes it. The command stty cbreak tells
the system to process input one character at a time, and the
command stty -echo tells the system not to display
characters typed at the keyboard.
Note: Some machines might not support hot keys or might use different commands to establish the hot-key environment. If you are on a machine that uses different commands to establish the environment, you still can run this program; just change the stty commands to whatever works on your machine.
The loop in lines 4-9 reads and writes one character per loop
iteration. Line 5 starts off by reading a character from the standard
input file using getc.
Line 6 tests whether the character read is a backslash. If it
is, the loop terminates. If the character is not a backslash, the program
continues with line 7. This line translates all alphanumeric characters
to the next-highest letter or number; for example, it translates g
to h, E to F, and 7 to 8. The
characters z, Z, and 9 are translated to a, A,
and 0, respectively.
Line 8 prints out the translated character. Because the
characters you type at the keyboard are not displayed, the
program makes it look like your keyboard is malfunctioning. (It's
quite disorienting!)
The subroutine end_hot_keys restores the normal working
environment by undoing the system calls that are performed by start_hot_keys.
Caution: If you are using hot keys, when you clean up make sure you call stty -cbreak before calling stty echo. If you call stty echo first, your terminal might wind up not printing newline characters properly.
If your machine distinguishes between text files and binary
files (files that contain unprintable characters), your Perl
program can tell the system that a particular file is a binary
file. To do this, call the built-in function binmode.
The syntax for calling the binmode function is
binmode (filevar);
filevar is a file variable (or an expression whose
value is the name of a file variable). binmode must be called after
the file is opened, but before the file is read.
The following is an example of a call to binmode:
binmode (MYFILE);
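binmode can be applied to output files in the same way. The following sketch, offered under that assumption, writes a few raw bytes and then checks the size of the resulting file. The subroutine name write_binary and the sample bytes are hypothetical, and File::Temp is used only to create a scratch file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);   # assumption: used only to get a scratch file

# Write raw bytes to a file. binmode is called after open and
# before any data is written, so no newline translation occurs.
sub write_binary {
    my ($name, $bytes) = @_;
    open (MYFILE, ">$name") || die ("Can't open $name: $!");
    binmode (MYFILE);
    print MYFILE ($bytes);
    close (MYFILE);
}

my ($tmpfh, $file) = tempfile (UNLINK => 1);
close ($tmpfh);
write_binary ($file, "\x0D\x0A\x00\xFF");
my $size = -s $file;
print ("File size in bytes: $size\n");
```

On a UNIX machine the call to binmode makes no difference; on a system that translates line endings in text files, it keeps the four bytes from being altered.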
Note: Normally, you won't need to use this function unless you are running in a DOS-like environment.
The input and output functions that you have seen earlier read
and write data to files. Perl also provides a group of functions that
enable you to manipulate UNIX directories. Functions exist that enable
you to create, read, open, close, delete, and skip around in
directories. The following sections describe these functions.
To create a new directory, call the function mkdir.
The syntax for the mkdir function is
mkdir (dirname, permissions);
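mkdir requires two arguments: dirname, the name of the directory to be created (a character string or an expression whose value is a directory name), and permissions, an octal number specifying the access permissions for the new directory.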
For example, to create a directory named /u/jqpublic/newdir,
you can use the following statement:
mkdir ("/u/jqpublic/newdir", 0777);
To create a subdirectory of the current working directory,
just specify the new directory name, as follows:
mkdir ("newdir", 0777);
If the current working directory is /u/janedoe/mydir,
this creates a subdirectory named /u/janedoe/mydir/newdir.
The permissions value of 0777 in both these examples
grants read, write, and execute permissions to everybody. Table
12.1 lists each possible access permission and the octal number
associated with it.
Table 12.1. Access
permissions for the mkdir function.
Value   Permission
4000    Set user ID on execution
2000    Set group ID on execution
1000    Sticky bit (see the UNIX chmod manual page)
0400    Read permission for file owner
0200    Write permission for file owner
0100    Execute permission for file owner
0040    Read permission for owner's group
0020    Write permission for owner's group
0010    Execute permission for owner's group
0004    Read permission for world
0002    Write permission for world
0001    Execute permission for world
You can combine access permissions by adding (or doing a
logical "or" operation on) the appropriate octal values
in the table. For example, to grant read, write, and execute
permission to the owner but only read permission to everybody
else, specify 0744 as the permission value.
Note: All of the permission values shown here are in octal notation, because a leading zero is specified. If you like, you can use decimal or hexadecimal here, but it won't be as easy to read.
Also note that the permission value set here is affected by the current value of umask. See the description of the umask function later today for more information.
mkdir returns true (nonzero) if the directory is
successfully created. It returns false (0) if the directory is
not.
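As a brief sketch of combining permission values and checking the result of mkdir, the following program builds 0744 from the individual values in Table 12.1 and tries to create the same directory twice. The subroutine name make_dir is hypothetical, and the File::Temp module is an assumption used only to provide a scratch parent directory.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);    # assumption: used only to get a scratch directory

# Combine owner read/write/execute with read-only access for the
# group and for everybody else, using the values from Table 12.1.
my $permissions = 0400 | 0200 | 0100 | 0040 | 0004;

# Create a directory; return 1 if it was created and 0 if not.
sub make_dir {
    my ($dirname, $mode) = @_;
    return mkdir ($dirname, $mode) ? 1 : 0;
}

my $parent = tempdir (CLEANUP => 1);
printf ("Requested permissions: %04o\n", $permissions);
print (make_dir ("$parent/newdir", $permissions) ? "created\n" : "failed: $!\n");
# The second attempt fails because the directory already exists.
print (make_dir ("$parent/newdir", $permissions) ? "created\n" : "failed: $!\n");
```

When mkdir fails, the system variable $! holds the reason, which is why the failure message includes it.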
To set a directory to be the current working directory, use
the function chdir.
The syntax for the chdir function is
chdir (dirname);
dirname is the name of the new current working
directory.
chdir returns true if the current directory is set
properly, false if an error occurs.
For example, to set the current working directory to /u/jqpublic/newdir,
use the following statement:
chdir ("/u/jqpublic/newdir");
Note: As with mkdir, the directory name passed to chdir can be either a character string or an expression whose value is a directory name. For example, the following sets the current directory to be /u/jqpublic/newdir:
$dir = "/u/jqpublic/"; chdir ($dir . "newdir");
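As a rough illustration, the sketch below changes into a scratch directory, creates a file using a relative name, and then shows chdir failing for a directory that does not exist. The Cwd and File::Temp modules are standard; the file and directory names are hypothetical.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

my $start = getcwd ();                 # remember where the program started
my $dir = tempdir(CLEANUP => 1);       # scratch directory for this sketch

chdir ($dir) || die ("Unable to change directory: $!");

# A relative filename is now resolved inside $dir.
open (OUT, ">relative_name") || die ("Unable to create file");
close (OUT);
my $found = (-e "$dir/relative_name") ? 1 : 0;

# chdir returns false when the directory does not exist.
my $failed = chdir ("$dir/nosuchdir") ? 0 : 1;

chdir ($start);                        # go back to the original directory
print ("file created in new directory: $found; bad chdir refused: $failed\n");
```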
You can have your program examine a list of the files
contained in a directory. To do this, the first step is to call
the built-in function opendir.
The syntax for the opendir function is
opendir (dirvar, dirname);
dirvar is the name the program is to use to
represent the directory, also known as a directory variable,
and dirname is the name of the directory to open
(which can be a character string or the value of an expression).
opendir returns true if the open operation is
successful, and it returns false otherwise.
For example, to open the directory named /u/janedoe/mydir,
you can use the following statement:
opendir (DIR, "/u/janedoe/mydir");
This associates the directory variable DIR with the
opened directory.
Note: If you like, you can use the same name as both a directory variable and a file variable.
opendir (MYNAME, "/u/jqpublic/dir"); open (MYNAME, "/u/jqpublic/dir/file");
The Perl interpreter always can tell from context whether a name is being used as a directory variable or as a file variable. (However, there is no real reason to do so. Your programs will be easier to read if you use different names to represent files and directories.)
To close an opened directory, call the closedir
function.
The syntax for the closedir function is
closedir (mydir);
closedir expects one argument: the directory variable
associated with the directory to be closed.
After opendir has opened a directory, you can access
the name of each file or subdirectory stored in the directory by
calling the function readdir.
The syntax for the readdir function is
readdir (mydir);
Like closedir, readdir is passed the directory
variable that is associated with the open directory.
If the value returned from readdir is assigned to a
scalar variable, readdir returns the name of the first
file or subdirectory stored in the directory. For example:
$filename = readdir(MYDIR);
A single name is also returned if the return value from readdir
is assigned to an element of an array or associative array variable. For example:
$filearray[3] = readdir(MYDIR);
$filearray{"foo"} = readdir(MYDIR);
If readdir is called again, it returns the next name in
the directory; subsequent calls return other names, continuing
until the directory is exhausted. Listing 12.9 uses readdir
to list the files and subdirectories in a directory.
Listing 12.9. A program that lists
the files and subdirectories in a directory.
1: #!/usr/local/bin/perl
2:
3: opendir(HOMEDIR, "/u/jqpublic") ||
4: die ("Unable to open directory");
5: while ($filename = readdir(HOMEDIR)) {
6: print ("$filename\n");
7: }
8: closedir(HOMEDIR);
$ program12_9
.
..
.cshrc
.Xresources
.xsession
test
bin
letter
file1
$
Line 3 opens the directory /u/jqpublic, which is the
home directory for user jqpublic. The opendir
function associates the directory variable HOMEDIR with /u/jqpublic.
Lines 5-7 read the name of each file in the directory in turn.
Line 6 prints each filename as it is read in.
Note that, on a UNIX system, the list of names includes two
special files: . (the directory itself) and .. (its parent directory).
As you can see, readdir reads the names in the order in
which they appear in the directory.
Listing 12.10 shows how you can display the names in
alphabetical order.
Listing 12.10. A program that lists
the files and subdirectories in a directory in alphabetical
order.
1: #!/usr/local/bin/perl
2:
3: opendir(HOMEDIR, "/u/jqpublic") ||
4: die ("Unable to open directory");
5: @files = readdir(HOMEDIR);
6: closedir(HOMEDIR);
7: foreach $file (sort @files) {
8: print ("$file\n");
9: }
$ program12_10
.
..
.Xresources
.cshrc
.xsession
bin
file1
letter
test
$
The readdir function behaves differently when its
return value is assigned to an array; in this case, the entire
list of files and subdirectories in the directory is assigned to
the array variable @files by line 5.
After the entire list is stored, sort can be called to
sort the list into alphabetical order. The foreach loop in
lines 7-9 then prints the sorted list one name at a time.
As you've seen, the library functions tell and seek
enable you to skip backward and forward in a file. Similarly, the
library functions telldir and seekdir enable you to
skip backward and forward in the list of names stored in a directory.
To use telldir, pass it the directory variable defined
by opendir. telldir returns the current directory location
(where you are in the list of files).
The syntax for the telldir function is
location = telldir (mydir);
Here, mydir is the directory variable
corresponding to the directory whose file list you are examining,
and location is assigned the current directory
location.
To skip to the directory location returned by telldir,
call seekdir.
The syntax for the seekdir function is
seekdir(mydir, location);
This call to seekdir sets the current directory
location to the location specified by location.
Caution: seekdir works only with directory locations returned by telldir.
Although being able to skip anywhere you like in a directory
list is useful, the most common skipping operation in directory lists
is rewinding the directory list, or starting over again.
Because of this, Perl provides a special function, rewinddir,
that handles the rewind operation.
The syntax for the rewinddir function is
rewinddir (mydir);
rewinddir sets the current directory location to the
beginning of the list of files, which lets you read the entire
list of files again. As with the other directory functions, mydir
is the directory variable defined by opendir.
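The three positioning functions can be sketched together as follows. This example assumes a scratch directory holding three hypothetical files; it reads the name list once, returns to a saved location with seekdir, and then starts over with rewinddir.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
foreach my $name ("file1", "file2", "file3") {
    open (OUT, ">$dir/$name") || die ("Unable to create $name");
    close (OUT);
}

opendir (MYDIR, $dir) || die ("Unable to open directory");

my $start = telldir (MYDIR);        # location before anything is read
my @firstpass = readdir (MYDIR);    # reads to the end of the list

seekdir (MYDIR, $start);            # return to the saved location
my @secondpass = readdir (MYDIR);

rewinddir (MYDIR);                  # start over without a saved location
my @thirdpass = readdir (MYDIR);

closedir (MYDIR);
print ("each pass saw ", scalar(@firstpass), " names\n");
```

Each pass sees five names here: the three files plus the . and .. entries.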
The final directory function supplied by Perl is rmdir,
which deletes an empty directory.
The syntax for calling the rmdir function is
rmdir (dirname);
rmdir returns true (nonzero) if the directory dirname
is deleted successfully, and false if the directory is not empty
or cannot be deleted.
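A short sketch of both outcomes, assuming a scratch directory and hypothetical names: the first call to rmdir is refused because the directory still contains a file, and the second succeeds once the file has been deleted.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $base = tempdir(CLEANUP => 1);
my $dir = "$base/olddir";
mkdir ($dir, 0755) || die ("Unable to create directory");
open (OUT, ">$dir/file1") || die ("Unable to create file");
close (OUT);

my $first = rmdir ($dir);       # false: the directory still contains file1
unlink ("$dir/file1");
my $second = rmdir ($dir);      # true: the directory is now empty

print ("first attempt: ", ($first ? "deleted" : "refused"), "\n");
print ("second attempt: ", ($second ? "deleted" : "refused"), "\n");
```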
Perl provides several library functions that modify the
attributes or behavior of files. These functions can be divided
into the following groups:
These groups of functions are described in the following
sections.
Perl provides the following file-relocation functions:
The built-in function rename changes the name of a
file.
The syntax for the rename function is
rename (oldname, newname);
oldname is the old file name, and newname
is the new file name.
The rename function returns true if the rename
succeeds, and false if an error occurs.
For example, to change a file named name1 to name2,
use the following:
rename ("name1", "name2");
You can use the value stored in a scalar variable as an
argument to rename, or any variable or expression whose
value is a character string, as follows:
rename ($oldname, &get_new_name);
You can also use rename to move a file from one
directory to another (provided both directories are in the same
file system). For example:
rename ("/u/jqpublic/name1", "/u/janedoe/name2");
Caution: When rename moves a file, as in
rename ("name1", "name2");
it does not check whether a file named name2 already exists. Any existing name2 is destroyed by the rename operation.
To get around this problem, use the -e file-test operator, which checks whether a named file exists, as follows:
-e "name2" || rename ("name1", "name2");
Here, the || operator ensures that rename is called only when no file named name2 already exists.
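The same check can be wrapped in a subroutine. The sketch below uses a hypothetical helper named safe_rename (not a Perl built-in) and a scratch directory; it refuses to rename onto an existing file and otherwise passes the request to rename.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
foreach my $name ("name1", "name2") {
    open (OUT, ">$dir/$name") || die ("Unable to create $name");
    print OUT ("contents of $name\n");
    close (OUT);
}

# Hypothetical helper: rename only when the destination does not exist.
sub safe_rename {
    my ($old, $new) = @_;
    return 0 if (-e $new);
    return rename ($old, $new);
}

my $blocked = safe_rename ("$dir/name1", "$dir/name2");   # name2 exists
my $moved   = safe_rename ("$dir/name1", "$dir/name3");   # name3 does not
print ("first rename: ", ($blocked ? "done" : "refused"), "\n");
print ("second rename: ", ($moved ? "done" : "refused"), "\n");
```

Note that another program could create the destination file between the -e test and the call to rename, so this check reduces the risk of overwriting a file but does not remove it entirely.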
To delete a file, use the unlink function.
The syntax for the unlink function is
num = unlink (filelist);
This function takes a list as its argument and deletes all the
files named in that list.
unlink returns the number of files actually deleted.
The following is an example of a call to unlink:
@deletelist = ("file1", "file2");
unlink (@deletelist);
The function is called unlink, instead of delete,
because what it is actually doing is removing a reference, or link,
to the particular file. See the following section for more
details on links in Perl.
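Because unlink returns a count, you can compare that count with the size of the list to detect files that were not deleted. The following sketch assumes a scratch directory containing two hypothetical files and asks unlink to delete three names.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
foreach my $name ("file1", "file2") {
    open (OUT, ">$dir/$name") || die ("Unable to create $name");
    close (OUT);
}

# The third name does not exist, so it is not counted.
my @deletelist = ("$dir/file1", "$dir/file2", "$dir/nosuchfile");
my $num = unlink (@deletelist);
print ("deleted $num of ", scalar(@deletelist), " files\n");
```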
In the UNIX environment, files can be "contained" in
more than one directory at a time. Each directory contains a
reference, or link, to the file.
The following sections describe how to create and access
links.
Note: If a file is referenced by multiple links, unlink removes only one of the links, and the file can still be referenced.
To create a link to an existing file, use the built-in
function link.
The syntax for the link function is
link (file, newlink);
file is the existing file being linked to, and newlink
is the link being created.
link returns true if the link is created, and false if
an error occurs.
For example:
link ("/u/jqpublic/file", "/u/janedoe/newfile");
After link has been called, the file /u/jqpublic/file
also can be thought of as the file /u/janedoe/newfile. If unlink
is called using /u/jqpublic/file, as in
unlink ("/u/jqpublic/file");
you can still reference the file by specifying the name /u/janedoe/newfile.
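The following sketch walks through the same sequence in a scratch directory (the paths are hypothetical, and it assumes a file system that supports hard links). It also reads the link count, which is the fourth element of the list returned by the stat function.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my $file = "$dir/file";
my $newlink = "$dir/newfile";

open (OUT, ">$file") || die ("Unable to create file");
print OUT ("some data\n");
close (OUT);

link ($file, $newlink) || die ("Unable to create link: $!");
my $nlink = (stat ($file))[3];     # number of links that refer to the file
unlink ($file);                    # removes one name, not the data

open (IN, $newlink) || die ("Unable to open $newlink");
my $line = <IN>;
close (IN);
print ("links before unlink: $nlink; still readable: $line");
```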
The link created by the link function is called a hard
link, which means that it actually references the file
itself. Many operating systems also support symbolic links,
which are references to the filename, not to the file itself.
To create a symbolic link, use the function symlink.
The syntax for the symlink function is
symlink (file, newlink);
file is the file being linked to, and newlink
is the symbolic link being created.
symlink, like link, returns true if the link is
created, and false if an error occurs.
The following is an example of symlink:
symlink("/u/jqpublic/file", "/u/janedoe/newfile");
Here, /u/janedoe/newfile is symbolically linked to /u/jqpublic/file.
Now, when the following statement is executed, the file is actually
deleted:
unlink ("/u/jqpublic/file");
/u/janedoe/newfile now references nothing at all. (In
this case, /u/janedoe/newfile is an example of an unresolved
symbolic link.) When /u/jqpublic/file is created
again, you will be able to access the new file using /u/janedoe/newfile.
If a filename, such as /u/janedoe/newfile, is actually
a symbolic link to another filename, the function readlink
returns the filename to which it is linked.
The syntax for the readlink function is
filename = readlink (linkname);
linkname is the symbolic link, and filename
is the equivalent filename.
readlink returns the undefined value if the filename is not
a symbolic link. (In particular, readlink fails if the
filename is actually a hard link.)
For example:
$linkname = readlink("/u/janedoe/newfile");
# $linkname now contains "/u/jqpublic/file"
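As a sketch of how a symbolic link behaves when its target disappears, the example below creates a link in a scratch directory (hypothetical paths, on a system that supports symbolic links), deletes the target, and then creates it again. It relies on the -l test examining the link itself and the -e test following the link to its target.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my $file = "$dir/file";
my $newlink = "$dir/newfile";

open (OUT, ">$file") || die ("Unable to create file");
close (OUT);
symlink ($file, $newlink) || die ("Unable to create symbolic link: $!");

unlink ($file);                              # the link is now unresolved
my $is_link  = (-l $newlink) ? 1 : 0;        # the link itself still exists
my $resolves = (-e $newlink) ? 1 : 0;        # but it points at nothing
my $target   = readlink ($newlink);          # the stored name is unchanged

# Creating the target again makes the link usable once more.
open (OUT, ">$file") || die ("Unable to create file");
close (OUT);
my $resolves_again = (-e $newlink) ? 1 : 0;

print ("link: $is_link, resolves: $resolves, later: $resolves_again\n");
```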
Listing 12.11 is an example of a program that prints all the
symbolic links in a particular directory.
Listing 12.11. A program that prints
symbolic links.
1: #!/usr/local/bin/perl
2:
3: $dir = "/u/janedoe";
4: opendir(MYDIR, $dir);
5: while ($name = readdir(MYDIR)) {
6: if (-l $dir . "/" . $name) {
7: print ("$name is linked to ");
8: print (readlink($dir . "/". $name) . "\n");
9: }
10: }
11: closedir(MYDIR);
$ program12_11
newfile is linked to /u/jqpublic/file
$
This program uses opendir and readdir to examine
each file in the directory in turn. Line 6 uses the -l
file-test operator to determine whether the filename is actually
a symbolic link. If the filename is a symbolic link, the
following expression becomes true, and the program executes the calls
to print in lines 7 and 8:
-l $dir . "/" . $name
Line 8 calls readlink, passing it the directory name
and the filename stored in $name. Because readlink
is only called if the expression in line 6 is true, $name
is always a symbolic link.
As you've seen, the built-in function mkdir requires
you to specify the access permissions for the directory you are
creating. These permissions indicate, for example, whether
particular users are allowed to read files from the directory or
write into the directory.
In the UNIX environment, each individual file has its own set
of access permissions. The set of possible permissions is the same
as for directories. (Refer to Table 12.1 in the section titled
"The mkdir Function" earlier in today's lesson
for a complete list of the possible permissions.)
In Perl, three functions are defined that deal with access
permissions.
To change the access permissions for a list of files, call the chmod
function.
The syntax for the chmod function is
chmod (permissions, filelist);
permissions is the set of access permissions you
want to give, and is a standard UNIX file permissions mask. (For
example, setting permissions to 0777 gives
read, write, and execute permission to everybody. See the section
called "The mkdir Function" for a description of
the set of permissions.) filelist is the list of
files whose permissions you want to change.
The chmod function returns the number of files whose
permissions were successfully set.
The following is an example of a call to chmod:
@filelist = ("file1", "file2");
chmod (0777, @filelist);
In this example, the files file1 and file2 are
assigned global read, write, and execute permissions.
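To confirm that chmod did what you asked, you can compare its return value with the number of files and read the new mode back with stat. The sketch below assumes a scratch directory and hypothetical filenames, and uses 0644 rather than 0777.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my @filelist = ("$dir/file1", "$dir/file2");
foreach my $name (@filelist) {
    open (OUT, ">$name") || die ("Unable to create $name");
    close (OUT);
}

# The count covers only the files whose permissions were actually changed.
my $changed = chmod (0644, @filelist, "$dir/nosuchfile");

# Element 2 of the stat list is the mode; 07777 keeps the permission bits.
my $mode = (stat ($filelist[0]))[2] & 07777;
printf ("%d files changed, file1 is now %04o\n", $changed, $mode);
```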
Note: You cannot change access permissions using chmod unless you have permission to do so. On UNIX systems, only the owner of a file (or the superuser) can change its permissions.
Normally, the owner of a file is the person who created it. To
change the owner of a file, use the function chown.
The syntax for the chown function is
chown (userid, groupid, filelist);
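The chown function requires three arguments: userid, the numeric user ID of the new owner; groupid, the numeric group ID of the new group (on most systems, -1 leaves the existing value unchanged); and filelist, the list of files whose ownership is to be changed.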
The chown function returns the number of files changed.
The following is an example of a call to chown:
@filelist = ("file1", "file2");
chown (17, -1, @filelist);
Note: On most UNIX systems, you can retrieve a user ID or group ID from the /etc/passwd file. You can use the Perl function getpwnam to retrieve information from this file. For more information on getpwnam, refer to Day 15, "System Functions."
Also, the superuser (system administrator) is usually the only user allowed to change the owner of a file.
As you've seen, you can change the access permissions for a
file using chmod. To specify permissions that are to be withheld
from files and directories when they are created, use the umask function.
The syntax for calls to umask is
oldmaskval = umask (maskval);
maskval is the new umask value, and umask
returns the previous (superseded) umask value in oldmaskval.
Each umask value is a file creation mask, and is used to
set the default permissions for files and directories. (See the umask
manual page for more details on file creation masks.)
For example, the following statement disables group and world
write permission for files created afterward:
$oldperms = umask(0022);
Note: You can determine the current umask value by passing no arguments to umask, as follows:
$currperms = umask();
This statement assigns the current umask value to $currperms.
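The effect of the mask can be seen by creating a file and reading its mode back. This sketch assumes a scratch directory and a hypothetical filename, and assumes that open requests read and write permission for everybody (0666) when it creates a file, which is the usual behavior on UNIX systems.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);

my $oldperms = umask (0022);       # withhold write from group and world
open (OUT, ">$dir/newfile") || die ("Unable to create file");
close (OUT);
my $currperms = umask ();          # query the mask without changing it
umask ($oldperms);                 # restore the original mask

# The requested 0666 with the 0022 bits removed leaves 0644.
my $mode = (stat ("$dir/newfile"))[2] & 0777;
printf ("umask %04o gave the file mode %04o\n", $currperms, $mode);
```

Saving the old mask and restoring it afterward keeps the change from affecting files created later in the program.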
Some file-test operators in Perl are designed to test for
various permissions. Table 12.2 lists these file-test operators;
in each case, filename is the name of the file
being tested.
Table 12.2. File-test operators that test for permissions.
Operator  Description
-g        Does filename have its set group ID bit set?
-k        Does filename have its "sticky bit" set?
-r        Is filename a readable file?
-u        Does filename have its set user ID bit set?
-w        Is filename a writable file?
-x        Is filename an executable file?
-R        Is filename readable by the real user ID?
-W        Is filename writable by the real user ID?
-X        Is filename executable by the real user ID?
In this case, the real user ID is the userid specified at
login, as opposed to the effective user ID, which is the userid
under which you are currently running. (On some machines, a
command such as /usr/local/etc/suid enables you to change
your effective user ID.)
(See Day 6 for more information on how to use file-test
operators.)
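As a small illustration of these operators, the sketch below changes the permissions of a hypothetical file in a scratch directory and applies the -x and -u tests before and after.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my $file = "$dir/script";
open (OUT, ">$file") || die ("Unable to create file");
close (OUT);

chmod (0644, $file);                      # no execute permission for anyone
my $before = (-x $file) ? "yes" : "no";

chmod (0755, $file);                      # execute permission for everybody
my $after = (-x $file) ? "yes" : "no";

my $setuid = (-u $file) ? "yes" : "no";   # set user ID bit was never set

print ("executable before: $before, after: $after, setuid: $setuid\n");
```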
The following sections describe other Perl functions that
manipulate files.
The truncate function enables you to reduce the size of
a specified file to a particular length.
The syntax for the truncate function is
truncate (filename, length);
filename is the name of the file to reduce, and length
is the new length of the file.
For example, the statement
truncate ("/u/jqpublic/longfile", 5000);
reduces the size of /u/jqpublic/longfile to 5000 bytes
in length. (If the file is already smaller than 5000 bytes, the result
depends on your system; Perl leaves this case undefined.)
Note: You can use a file variable in place of the filename.
truncate (MYFILE, 5000);
The file variable must refer to a file opened for writing by the open function.
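A brief sketch of truncate at work, assuming a scratch directory and a hypothetical 10,000-byte file; the -s file-test operator is used here to read the size of the file before and after.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my $file = "$dir/longfile";
open (OUT, ">$file") || die ("Unable to create file");
print OUT ("x" x 10000);           # write 10,000 bytes
close (OUT);

my $before = -s $file;
truncate ($file, 5000) || die ("Unable to truncate: $!");
my $after = -s $file;
print ("size went from $before to $after bytes\n");
```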
The stat function retrieves information about a
particular file when given its name or a file variable representing
its name.
The syntax for the stat function is
stat (file);
Here, file is either a filename or a file
variable.
stat returns a list containing the following elements,
in this order:
Some of the items returned by stat can be obtained
using file test operators. Table 12.3 lists these items.
Table 12.3. File-test operators that check information returned by stat.
Operator  Description
-b        Is filename a mountable disk (block device)?
-c        Is filename an I/O device (character device)?
-s        Is filename a non-empty file?
-t        Does filename represent a terminal?
-A        How long since filename accessed?
-C        How long since filename's inode changed?
-M        How long since filename modified?
-S        Is filename a socket?
For more information on stat or the information it
returns, see the UNIX manual page for the stat command on
your machine.
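The list returned by stat is usually assigned to a list of scalar variables. In the sketch below, the thirteen variable names follow the element names used in the Perl documentation for stat; the file is a hypothetical one created in a scratch directory.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my $file = "$dir/file1";
open (OUT, ">$file") || die ("Unable to create file");
print OUT ("hello\n");
close (OUT);

# Device, inode, mode, link count, owner, group, device type, size,
# access time, modification time, inode change time, block size, blocks.
my ($dev, $ino, $mode, $nlink, $uid, $gid, $rdev, $size,
    $atime, $mtime, $ctime, $blksize, $blocks) = stat ($file);

printf ("size %d bytes, %d link(s), permissions %04o\n",
        $size, $nlink, $mode & 07777);
print ("last modified ", time () - $mtime, " seconds ago\n");
```

Some of these elements (such as the block size) are not meaningful on every operating system.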
The lstat function returns the same information as stat,
but if the name being passed as an argument is a symbolic link,
lstat describes the link itself rather than the file to which it points.
The syntax for lstat is the same as that for stat.
lstat (file);
file is either a filename or a file variable.
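The access and modification times returned by stat are integers
representing the number of elapsed seconds from January 1, 1970,
to the time the file was accessed or modified. (The -A and -M
file-test operators report these times differently: each returns the
age of the file in days, measured from the time the program started.)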
To obtain the number of elapsed seconds from January 1, 1970,
to the present time, call the built-in function time.
The syntax for calls to the time function is
currtime = time();
currtime is the returned elapsed-seconds value.
The value returned by time can be converted to either
Greenwich Mean Time or your computer's local time.
To convert to Greenwich Mean Time, call the gmtime
function. To convert to local time, call the localtime
function.
The syntax for the gmtime and localtime
functions is identical:
timelist = gmtime (timeval);
timelist = localtime (timeval);
Both functions accept a time value in elapsed seconds, such as the
value returned by time or an access or modification time returned by stat.
Both functions return a list consisting of the following nine
elements:
For more information on the list returned by gmtime or localtime,
refer to the UNIX manual pages for the system functions with the
same names.
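As a sketch, the nine elements can be assigned to scalar variables as shown below; the variable names follow the Perl documentation for gmtime and localtime. The example converts the time value 0, which corresponds to the start of January 1, 1970, in Greenwich Mean Time.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Seconds, minutes, hours, day of month, month, year, day of week,
# day of year, and a daylight-saving-time flag.
my ($sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst) = gmtime (0);

# The month counts from 0 and the year counts from 1900.
printf ("%04d-%02d-%02d %02d:%02d:%02d\n",
        $year + 1900, $mon + 1, $mday, $hour, $min, $sec);

# The same nine elements for the current local time.
my @now = localtime (time ());
print ("localtime returned ", scalar(@now), " elements\n");
```

The day of the week also counts from 0, with 0 representing Sunday.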
The time values returned by stat and time can be used to set the access
and modification times of other files. To do this, use the utime
function.
The syntax for the utime function is
utime (acctime, modtime, filelist);
acctime is the new access time, modtime
is the new modification time, and filelist is the
list of files.
utime returns the number of files whose access and
modification times have been successfully changed.
The following is an example of a call to utime:
# elements 8 and 9 of the stat list are the access and modification times
($acctime, $modtime) = (stat ("file1"))[8, 9];
@filelist = ("file2", "file3");
utime ($acctime, $modtime, @filelist);
Here, the files file2 and file3 have their
access and modification times changed to those of file1.
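To check the result of utime, you can read the times back with stat. The following sketch assumes a scratch directory with two hypothetical files and uses fixed time values (one and two days after January 1, 1970) so that the outcome is easy to recognize.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my @filelist = ("$dir/file2", "$dir/file3");
foreach my $name (@filelist) {
    open (OUT, ">$name") || die ("Unable to create $name");
    close (OUT);
}

my $acctime = 86400;       # illustrative value: one day, in seconds
my $modtime = 172800;      # illustrative value: two days, in seconds
my $count = utime ($acctime, $modtime, @filelist);

# Elements 8 and 9 of the stat list hold the access and modification times.
my ($newacc, $newmod) = (stat ($filelist[0]))[8, 9];
print ("$count files updated; access $newacc, modification $newmod\n");
```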
The fileno function returns the internal UNIX file
descriptor associated with a particular file variable.
The syntax for the fileno function is
filedesc = fileno (filevar);
Here, filevar is the file variable whose
descriptor is to be retrieved.
The file descriptor returned by fileno is used in
various UNIX system calls; these calls can be accessed using the syscall function
(as described on Day 15).
The flock and fcntl functions call the UNIX
system calls of the same name.
The syntax for the flock and fcntl functions is
fcntl (filevar, fcntlrtn, value);
flock (filevar, flockop);
Here, filevar is a file variable representing an
open file. fcntlrtn is a fcntl function as
defined in the UNIX fcntl manual page, and value
is the value passed to the function, if appropriate. Similarly, flockop
is a file-locking operation, as defined in the UNIX flock
manual page.
For more information on these functions, refer to the manual
pages or to a book about UNIX. (You won't really be able to use
these functions effectively unless you know a fair bit about how your operating
system works.)
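One common use of flock is to keep two programs from writing to the same file at the same time. The sketch below is a minimal illustration under stated assumptions: it takes the operation names LOCK_EX and LOCK_UN from the standard Fcntl module, writes to a hypothetical log file in a scratch directory, and runs on a system that supports flock.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use Fcntl qw(:flock);              # supplies LOCK_SH, LOCK_EX, LOCK_UN, LOCK_NB
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
open (LOGFILE, ">>$dir/logfile") || die ("Unable to open log file");

# LOCK_EX requests an exclusive lock and waits until it is granted.
my $locked = flock (LOGFILE, LOCK_EX);
print LOGFILE ("one complete record\n");

# LOCK_UN releases the lock so that other programs can proceed.
my $unlocked = flock (LOGFILE, LOCK_UN);
close (LOGFILE);

print ("locked: $locked, unlocked: $unlocked\n");
```

Locks of this kind are advisory: they affect only programs that also call flock on the same file.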
Many systems on which Perl is available support files that are
created using the Data Base Management (DBM) library. Perl enables
you to use an associative array to access a particular DBM file.
The following sections describe how to access DBM files from
Perl programs using the dbmopen and dbmclose
functions. If you are running Perl 5, these functions have been
superseded by the tie and untie functions; see Day
18, "Object-Oriented Programming," for more details.
For more information on DBM, refer to your system's
appropriate manual pages.
To associate an associative array with a DBM file, use the dbmopen
function.
The syntax for the dbmopen function is
dbmopen (array, dbmfilename, permissions);
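This function requires three arguments: array, the associative array to use; dbmfilename, the name of the DBM file to open; and permissions, the access permissions to give the DBM file if it has to be created.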
After the DBM file has been opened, the subscripts for the
associative array represent the DBM file keys, and the values of the
array represent the values associated with the keys.
Caution: Calling dbmopen destroys any existing values in the associative array.
To close a DBM file opened by dbmopen, use dbmclose.
The syntax for the dbmclose function is
dbmclose (array);
Here, array is the associative array specified
in the call to dbmopen.
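The following sketch stores two values in a DBM file and reads them back after reopening it. It assumes that your copy of Perl was built with at least one DBM library, and it uses a hypothetical file named phonebook in a scratch directory; the exact files created on disk depend on which library is used.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my $dbmfile = "$dir/phonebook";    # hypothetical DBM file name
my %phones;

dbmopen (%phones, $dbmfile, 0644) || die ("Unable to open DBM file: $!");
$phones{"dave"} = "555-1234";      # each assignment is written to the file
$phones{"jane"} = "555-9876";
dbmclose (%phones);

# Reopening the file shows that the values were stored on disk.
my %again;
dbmopen (%again, $dbmfile, 0644) || die ("Unable to reopen DBM file: $!");
my $dave = $again{"dave"};
my @keys = sort (keys (%again));
dbmclose (%again);

print ("dave: $dave; keys: @keys\n");
```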
Today, you learned how to open a pipe that directs input to
the program, how to open a file for both reading and writing, and how
to associate multiple file variables with a single file. You also learned
how to test for the end of a particular input file or for the end
of the last input file.
You also learned how to skip backward and forward in files and
how to read single characters from a file using getc. You
can use getc to build hot-key applications, which act as
soon as they read a single character from the keyboard.
Perl provides several functions for manipulating directories.
They enable you to create, open, read, close, delete, and skip around
in directories. Other Perl functions enable you to move a file from
one directory to another, create hard and symbolic links from one
location to another, and delete a hard link (or a file).
You learned about the Perl functions that enable you to change
the file owner or file permissions, truncate a file, retrieve
file information, set file access and modification times,
retrieve the file descriptor, and call the flock and fcntl
system calls.
Finally, Perl provides an interface to the DBM library that
enables you to associate DBM files with associative arrays.
while ($line = <>) ...
The Workshop provides quiz questions to help you solidify your
understanding of the material covered and exercises to give you
experience in using what you've learned. Try to understand the quiz
and exercise answers before you go on to tomorrow's lesson.
#!/usr/local/bin/perl
while ($line = <>) {
print ($line);
if (eof()) {
print (" end of current file \n");
}
}