Today's lesson describes Perl references and the
concept of pointers. It also shows you how to use references
to create complex data structures, pass pointers around, and work
with subroutines.
A reference is simply a pointer to something, such as a
Perl variable, array, hash (also known as an associative array),
or even a subroutine. The concept of a reference is probably
familiar to Pascal or C programmers. A reference is simply an address
to a value. How you use that value is up to you as the programmer
and what the language lets you get away with. In Perl, you can
refer to a pointer as a reference; in fact, you can use the terms
pointer and reference interchangeably without any loss of meaning.
References are useful in creating complex data structures in
Perl. In fact, you cannot really define any complicated
structures in Perl without using references.
The two types of references in Perl 5 are hard and symbolic.
A symbolic reference contains the name of a variable. Symbolic references
are useful for creating variable names and addressing them at run
time. Basically, a symbolic reference is like the name of a file or a
soft link on a UNIX system. Hard references are more like hard
links in the file system (that is, merely another path to the
same underlying item).
Perl 4 permitted only symbolic references, which were
difficult to use. For example, in Perl 4, you would have to use
names to index to an associative array called _main{} of
symbol names for a package. Perl 5 now lets you have hard
references to data.
Hard references keep track of reference counts. When the
reference count becomes zero, Perl automatically frees the item referred
to. If that item happens to be a Perl object, the object is destructed, that is, freed
to the memory pool. Perl 5 supports object-oriented programming, and
packages and modules make it
much easier to use objects in Perl.
Hard references are easy to use in Perl as long as you use
them as scalars. To use hard references as anything but scalars, you
have to explicitly de-reference the variable and tell it how you
want it to behave. If this sounds confusing, don't worry; references
are covered in Day 19, "Object-Oriented Programming in
Perl" to help make this concept clearer.
In today's lesson, a scalar value refers to a variable such as $pointer.
The variable $pointer contains one data item; whether the
item is a number, string, or an address is determined by how you
use it.
Any scalar can hold a hard reference, and because arrays and
hashes do contain scalars, it follows that you can now easily build
complex data structures of different combinations of arrays of arrays, arrays
of hashes, hashes of functions, and so on. As long as you
understand that you are working only with scalars, you should be
able to navigate through the most complex structures with proper dereferencing.
Let's cover some of the basics first before we get too deep
into the chapter.
To use the value of $pointer as the pointer to an
array, you reference the items in the array as @$pointer.
This notation of "@$pointer" roughly translates
to "take the address in $pointer and then use it as
an array." Similarly for hashes, you would use %$pointer
to refer to the entire hash that $pointer points to.
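The notation just described can be tried out in a few lines. The following is only a minimal sketch; the sample data (@colors and %ages) and the variable names are made up for illustration and are not part of the lesson's listings.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data, not taken from the lesson.
my @colors = ('red', 'green', 'blue');
my %ages   = (tom => 30, ann => 25);

my $aref = \@colors;    # reference to the array
my $href = \%ages;      # reference to the hash

my $count = scalar(@$aref);     # @$aref is the whole array
my $first = $$aref[0];          # $$aref[0] is one element of it
my @names = sort keys %$href;   # %$href is the whole hash
my $age   = $$href{ann};        # $$href{ann} is one value in it

print "count=$count first=$first names=@names age=$age\n";
```

In each case the scalar holds only the address; the @, %, or $ in front of it tells Perl how to treat whatever lives at that address.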
Because there are several ways to construct references, you
can have references to just about anything, such as arrays,
scalar variables, subroutines, file handles, and, yes (to the
delight of C programmers), even other references. Perl gives you
the power to write code complicated enough to hang yourself.
Now look at some of the ways that you can create and use
references in Perl.
Using the backslash operator is analogous to using the
ampersand (&) operator in C to pass the address of a
variable. Usually, you use the backslash operator to create a
second, new reference to a variable. The following code shows how
to create a reference to a scalar variable:
$variable = 22;
$pointer = \$variable;
$ice = "jello";
$iceptr = \$ice;
$pointer points to the location that contains the value
of $variable. The pointer $iceptr points to "jello".
Even if the original variable $variable gets destroyed,
you can still access the value from the $pointer
reference. There is a hard reference at work here, so you will
have to get rid of both $pointer and $variable for
the space in which 22 is allocated to be freed back to the
memory pool.
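To see the reference count at work, here is a small sketch in which the original variable goes out of scope while a reference to its value survives. The use of a bare block and of my variables is an assumption made for the sake of the demonstration; the lesson's own code uses plain package variables.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $pointer;
{
    my $variable = 22;        # exists only inside this block
    $pointer = \$variable;    # a second way to reach the same value
}
# $variable is out of scope here, but the value is still reachable
# because $pointer holds a hard reference to it.
print "Value via pointer: $$pointer\n";

$$pointer = 23;               # the reference can also be used to modify it
print "After update: $$pointer\n";
```

The storage is released only when $pointer itself goes away or is pointed somewhere else.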
In the preceding code, the variable $pointer contains
the address of $variable, not the value itself. To get the
value, you have to de-reference $pointer with two dollar signs ($$pointer).
The following sample script shows how this works:
#!/usr/bin/perl
$value = 10;
$pointer = \$value;
printf "\n Pointer Address $pointer of $value \n";
printf "\n What Pointer *($pointer) points to $$pointer\n";
The $value in the script is set to 10. The $pointer
is set to point to the address of $value. The two printf
statements show how the value of the variable is referenced. If
you run the script shown, you see something very close to the
following output:
Pointer Address SCALAR(0x806c520) of 10
What Pointer *(SCALAR(0x806c520)) points to 10
The address in the output from your script will probably be
different from what's shown. However, you can see that $pointer gave
the address and $$pointer gave the value of the scalar that $pointer points
to.
Pay attention to how the address is shown in the pointer
variable. The word SCALAR is followed by a long
hexadecimal number. The word SCALAR tells you that the
address points to a scalar variable. The number following the SCALAR
is the address where the actual value of the scalar variable is
kept.
Note: A pointer is an address. The data at that address is referred to by a pointer. If the pointer happens to point to an invalid address, you can get bad data. Generally, Perl simply returns an undefined value, but you should not rely on this; initialize all your pointers to refer to valid data items.
Perhaps the most important point you must remember about Perl
is that all Perl @ARRAYs and %HASHes are always one-dimensional.
As such, the arrays and hashes hold scalar values only and do not
directly contain other arrays or complex data structures. A
member of an array is either a number, a string, or a
reference.
You can use the backslash operator on arrays and hashes just
as you would for scalar variables. You would use something like
Listing 18.1 for arrays. (The line numbers are for illustrative purposes only.)
Listing 18.1. Using the backslash
operator on arrays.
1 #!/usr/bin/perl
2 #
3 # Using Array references
4 #
5 $pointer = \@ARGV;
6 printf "\n Pointer Address of ARGV = $pointer\n";
7 $i = scalar(@$pointer);
8 printf "\n Number of arguments : $i \n";
9 $i = 0;
10 foreach (@$pointer) {
11 printf "$i : $$pointer[$i++]; \n";
12 }
$ test 1 2 3 4
Pointer Address of ARGV = ARRAY(0x806c378)
Number of arguments : 4
0 : 1;
1 : 2;
2 : 3;
3 : 4;
Examine the lines that pertain to references in the Perl
script shown, which prints the contents of the input argument
array @ARGV. Line 5 is where the reference $pointer
is set to point to the array @ARGV. Line 6 simply prints
the address of ARGV. You probably will never have to use
the address of ARGV, but had you been using another array,
this is a quick way to see the address of
the array.
Note: Pointers are referred to as references, and vice versa.
The $pointer holds a reference to the array as a whole.
In Listing 18.1, the array happened to be @ARGV. A pointer
to an array should sound familiar to C programmers, for whom a
pointer to a one-dimensional array is simply a pointer to the
first element of the array. In Perl, you de-reference the pointer and index into the array rather than doing arithmetic on the address.
Line 7 calls the function scalar() (not to be
confused with the type of variable scalar) to get the count of
the number of elements in an array. The parameter passed in could
be @ARGV, but with the pointer $pointer, you must
specify the type of parameter that is expected by the scalar() function. Therefore,
you specify the type of parameter as an array by using @$pointer.
The type of $pointer in this case is a pointer
to the array whose number of elements you must return from the scalar()
function. The call to the function has @$pointer as the
passed parameter. The $pointer holds the reference to the
array, and the @ sign de-references it so that
scalar() sees the array itself.
Line 10 contains the same reference to the array that line 7
contains. Line 11 lists all the elements of the array using the $$pointer[$i]
item. How do you interpret this? The $pointer refers to
the array. The program gets the item at index $i
in the array ($$pointer[$i++]) and then increments $i.
The value at that index is returned as a
scalar. Because the postfix autoincrement operator returns the value $i had
before the increment, the element is fetched first and $i is incremented afterward.
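The order of events in an expression such as $$pointer[$i++] is easy to check for yourself. This is a minimal sketch with a made-up array (@args) standing in for @ARGV:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @args    = ('a', 'b', 'c');   # hypothetical stand-in for @ARGV
my $pointer = \@args;
my $i       = 0;

# Postfix ++ hands back the old value of $i for the index,
# and only then adds one to $i.
my $item = $$pointer[$i++];

print "item=$item i=$i\n";
```

The element at index 0 is fetched, and $i is 1 by the time the print statement runs.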
You can also use the backslash operator with associative
arrays. The idea is the same: you are substituting the $pointer
for all references to the name of the associative array. The
number following the word ARRAY in the pointer address of ARGV in
the previous example is the address of ARGV. The address
itself won't do you any good, because most programs do not need
this information, but just realize that references to arrays and
scalars are displayed with the type that they happen to be pointing
to.
For pointers to functions, the address is printed with the
word CODE, and for a hash, it is printed as HASH.
See Listing 18.2 for an example of how to print out an address to
a hash.
Listing 18.2. Using references to a
hash.
#!/usr/bin/perl
1#
2 # Using Associative Array references
3 #
4 %month = (
5 '01', 'Jan',
6 '02', 'Feb',
7 '03', 'Mar',
8 '04', 'Apr',
9 '05', 'May',
10 '06', 'Jun',
11 '07', 'Jul',
12 '08', 'Aug',
13 '09', 'Sep',
14 '10', 'Oct',
15 '11', 'Nov',
16 '12', 'Dec',
17 );
18
19 $pointer = \%month;
20
21 printf "\n Address of hash = $pointer\n ";
22
23 #
24 # The following lines would be used to print out the
25 # contents of the associative array if %month was used.
26 #
27 # foreach $i (sort keys %month) {
28 # printf "\n $i $$pointer{$i} ";
29 # }
30
31 #
32 # The reference to the associative array via $pointer
33 #
34 foreach $i (sort keys %$pointer) {
35 printf "$i is $$pointer{$i} \n";
36 }
$ mth
Address of hash = HASH(0x806c52c)
01 is Jan
02 is Feb
03 is Mar
04 is Apr
05 is May
06 is Jun
07 is Jul
08 is Aug
09 is Sep
10 is Oct
11 is Nov
12 is Dec
The reference to the associative array is made with the code
in the line $pointer = \%month;. As with ordinary arrays,
the references to the elements of the array are made with the $$pointer{$index}
construct. Of course, because the array is really a hash, the $index
is the key into the hash and not a number. See lines 34 and 35 to
see how elements in the array are being referenced.
You don't have to construct associative arrays using the comma
operator. You can use the => operator instead. In the
later Perl module and sample code in this chapter, you will see
the => operator, which is the same as the comma
operator. Using => makes the code a bit easier to read. See
Listing 18.3 for a sample usage of the => operator.
Listing 18.3. Using the =>
operator.
1 #!/usr/bin/perl
2 #
3 # Using Array references
4 #
5 %weekday = (
6 '01' => 'Mon',
7 '02' => 'Tue',
8 '03' => 'Wed',
9 '04' => 'Thu',
10 '05' => 'Fri',
11 '06' => 'Sat',
12 '07' => 'Sun',
13 );
14 $pointer = \%weekday;
15 $i = '05';
16 printf "\n ================== start test ================= \n";
17 #
18 # These next two lines should show an output
19 #
20 printf '$$pointer{$i} is ';
21 printf "$$pointer{$i} \n";
22 printf '${$pointer}{$i} is ';
23 printf "${$pointer}{$i} \n";
24 printf '$pointer->{$i} is ';
25
26 printf "$pointer->{$i}\n";
27 #
28 # These next two lines should not show anything
29 #
30 printf '${$pointer{$i}} is ';
31 printf "${$pointer{$i}} \n";
32 printf '${$pointer->{$i}} is ';
33 printf "${$pointer->{$i}}";
34 printf "\n ================== end of test ================= \n";
35
================== start test =================
$$pointer{$i} is Fri
${$pointer}{$i} is Fri
$pointer->{$i} is Fri
${$pointer{$i}} is
${$pointer->{$i}} is
================== end of test =================
As you can see, the first three lines provided the expected
output. The first reference is used in the same way as references
to regular arrays. The second line uses ${$pointer} and then indexes
using {$i}, and the leftmost $ de-references (gets)
the value at the location reached after the indexing. The third line
does the same thing with the -> operator. See lines
20 through 26.
Note: When in doubt, print it out. Always use the print statements in Perl to print out values of suspect code. This way you can be sure of how Perl is interpreting your code. Print statements are a cheap tool to use for learning how the Perl interpreter works.
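Then, two lines of the output didn't work as expected. In the
fourth line, $pointer{$i} looks up the key $i in a hash named
%pointer, which is a different variable from the scalar $pointer and has
no elements. Because the lookup does not yield a valid reference, nothing is printed.
In the fifth line, $pointer->{$i} does return the string Fri, but the outer ${ }
then treats that string as the name of a scalar variable, which has no value,
so again nothing is printed. See lines
30 through 33.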
You create an array through the statement @array
= list. You use square brackets to create a reference to a complex
anonymous array. Consider the following statement, which sets the parameters
for a three-dimensional drawing program:
$line = ['solid', 'black', ['1','2','3'] , ['4', '5', '6']];
The preceding statement constructs an array of four elements.
The array is referred to by the scalar $line. The first
two elements are scalars, indicating the type and color of the
line to draw. The next two elements are references to anonymous arrays
and contain the starting and ending points of the line.
To get to the elements of the inner array elements, you can
use the following multidimensional syntax:
$arrayReference->[$index1][$index2][$index3]   # three-dimensional array
You can create as complex a structure as your sanity, design
practices, and computer memory allow. Be kind to the person who
might have to manage your code: please keep it as simple as possible.
On the other hand, if you are just trying to impress someone with
your coding ability, Perl gives you a lot of opportunity to
mystify yourself and improve your social life.
Tip: When you have more than three dimensions for any array, consider using a different data structure to simplify the code.
Let's see how creating arrays within arrays works in practice.
See Listing 18.4 to see how to print out the information pointed at
by the $line reference.
Listing 18.4. Using multi-dimensional
array references.
#!/usr/bin/perl
#
# Using Multi-dimensional Array references
#
$line = ['solid', 'black', ['1','2','3'] , ['4', '5', '6']];
print "\$line->[0] = $line->[0] \n";
print "\$line->[1] = $line->[1] \n";
print "\$line->[2][0] = $line->[2][0] \n";
print "\$line->[2][1] = $line->[2][1] \n";
print "\$line->[2][2] = $line->[2][2] \n";
print "\$line->[3][0] = $line->[3][0] \n";
print "\$line->[3][1] = $line->[3][1] \n";
print "\$line->[3][2] = $line->[3][2] \n";
print "\n"; # The obligatory output beautifier.

$line->[0] = solid
$line->[1] = black
$line->[2][0] = 1
$line->[2][1] = 2
$line->[2][2] = 3
$line->[3][0] = 4
$line->[3][1] = 5
$line->[3][2] = 6
What about the third dimension for an array? Look at a
modified version of the same program but add a new twist to the
list just created. See Listing 18.5.
Listing 18.5. Using multi-dimensional
array references again.
#!/usr/bin/perl
#
# Using Multi-dimensional Array references again
#
$line = ['solid', 'black', ['1','2','3', ['4', '5', '6']]];
print "\$line->[0] = $line->[0] \n";
print "\$line->[1] = $line->[1] \n";
print "\$line->[2][0] = $line->[2][0] \n";
print "\$line->[2][1] = $line->[2][1] \n";
print "\$line->[2][2] = $line->[2][2] \n";
print "\$line->[2][3][0] = $line->[2][3][0] \n";
print "\$line->[2][3][1] = $line->[2][3][1] \n";
print "\$line->[2][3][2] = $line->[2][3][2] \n";
print "\n";
The output for this listing is not shown.
In this example of an array that's three deep, you must use a
reference such as $line->[2][3][0]. For a C programmer,
this is akin to the statement Array_pointer[2][3][0],
where the pointer is pointing to what's declared as an array with
three indices.
Can you see how easy it is to set up complex structures of
arrays within arrays? The examples shown thus far have used only hard-coded
numbers as the indices. There is nothing preventing you from
using variables instead.
As with array constructors, you can mix and match hashes and
arrays to create as complex a structure as you want.
Let's see how hashes and arrays can be combined.
Listing 18.6 uses the point numbers and coordinates to define a cube.
Listing 18.6. Defining a cube.
#!/usr/bin/perl
#
# Using Multi-dimensional Array and Hash references
#
%cube = (
'0', ['0', '0', '0'],
'1', ['0', '0', '1'],
'2', ['0', '1', '0'],
'3', ['0', '1', '1'],
'4', ['1', '0', '0'],
'5', ['1', '0', '1'],
'6', ['1', '1', '0'],
'7', ['1', '1', '1']
);
$pointer = \%cube;
print "\n Da Cube \n";
foreach $i (sort keys %$pointer) {
$list = $$pointer{$i};
$x = $list->[0];
$y = $list->[1];
$z = $list->[2];
printf " Point $i = $x,$y,$z \n";
}
The output for this listing is not shown.
In Listing 18.6, %cube contains point numbers and
coordinates in a hash. Each coordinate itself is an array of three
numbers. The $list variable is used to get a reference to
each coordinate definition with the following statement:
$list = $$pointer{$i};
After you get the list, you can reference off of it to get to
each element in the list with the following statement:
$x = $list->[0]; $y = $list->[1];
The same result (assigning values to $x, $y,
and $z) could be achieved with the first of the following two lines
of code:
($x,$y,$z) = @$list;
$x = $list->[0];
This works because you are de-referencing what $list
points to and using it as an array, which in turn is assigned to
the list ($x,$y,$z). The second line is not needed; it simply shows that
$x can still be assigned individually with
the -> operator.
When you're working with hashes or arrays, de-referencing by ->
is similar to de-referencing by $. When you are accessing individual
array elements, you are often faced with writing statements such as
the following:
$$names[0] = "Kamran";
$names->[0] = "Kamran";
Both lines are equivalent. The leading $ that de-references $names in the first line
has been replaced with the -> operator in the second
line. In the case of hashes, the two statements that do the same
type of referencing are listed as shown in the following code:
$$lastnames{"Kamran"} = "Husain";
$lastnames->{"Kamran"} = "Husain";
Array references are created automatically when they are first
referenced on the left side of an assignment. Using a reference such
as $array[$i] creates an array into which you can
index with $i. Scalars and even multidimensional arrays
are created the same way. The following statement creates the contours
array if it did not already exist:
$contours[$x][$y][$z] = &xlate($mouseX,$mouseY);
Arrays in Perl can be created and grown on demand. Referencing
them for the first time creates the array. Referencing them again
at different indices creates the referenced elements for you.
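You can watch arrays and hashes being created on demand with a short experiment. The following is a sketch only; the indices and the %phone hash are invented for the demonstration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @contours;                      # starts out empty
$contours[1][2][3] = 'peak';       # the inner arrays spring into existence

my %phone;                         # hypothetical nested hash
$phone{mickey}{home} = 5551234;    # the same idea works for hashes

print "rows: ", scalar(@contours), "\n";        # indices 0 and 1 now exist
print "type: ", ref($contours[1]), "\n";        # an array reference
print "value: $contours[1][2][3]\n";
print "home: $phone{mickey}{home}\n";
```

Only the path that was actually assigned to is filled in; $contours[0] is still undefined.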
In the same way you reference individual items such as arrays
and scalar variables, you can also point to subroutines. This is similar
to pointing to a function in C. To construct such a reference, you use
the following type of statement:
$pointer_to_sub = sub { ... declaration of sub ... } ;
Notice the use of the semicolon at the end of the sub
declaration. The subroutine pointed to by $pointer_to_sub points
to the same function reference even if this statement is placed
in a loop. This feature of Perl enables you to declare anonymous sub() functions
in a loop without worrying about whether you are chewing up
memory by declaring the same function over and over.
To call a subroutine by reference, you must use the following
type of reference:
&$pointer_to_sub( parameters );
This code works because you are de-referencing the $pointer_to_sub
and using it with the ampersand (&) as a pointer to a function.
The parameters portion might or might not be empty depending
on how your function is defined.
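As a brief illustration of calling through a subroutine reference, consider the sketch below. The subroutines ($add and triple) are hypothetical examples, and the arrow form $ref->( ) is shown alongside the ampersand form as an equivalent way to make the call.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# An anonymous subroutine stored in a scalar.
my $add = sub {
    my ($x, $y) = @_;
    return $x + $y;
};

# A reference to a named subroutine, taken with the backslash operator.
sub triple { return 3 * $_[0]; }
my $triple_ref = \&triple;

my $sum1 = &$add(2, 3);          # ampersand form used in the text
my $sum2 = $add->(2, 3);         # equivalent arrow form
my $nine = $triple_ref->(3);

print "$sum1 $sum2 $nine\n";
```

Both calling styles reach the same code; which one you use is largely a matter of taste.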
The code within a sub is simply a declaration created
through a previous statement. The code within the sub is
not executed immediately, however. It is compiled and set for
each use. Consider Listing 18.7.
Listing 18.7. References to
subroutines.
#!/usr/bin/perl
sub print_coor {
my ($x,$y,$z) = @_;
print "$x $y $z \n";
return $x;
}
$k = 1;
$j = 2;
$m = 3;
$this = print_coor($k,$j,$m);
$that = print_coor(4,5,6);
When you execute Listing 18.7, you get the following output:
$ test
1 2 3
4 5 6
This output reflects that the assignment of $x, $y,
and $z was done each time print_coor
was called. In Listing 18.7, $this
and $that each receive the value returned by a separate call
(the first argument, $x), and the arguments
to each call were passed at run time.
Subroutines are not limited to returning data types only; they
can also return references to other subroutines. The returned subroutines
run in the context of the calling routine but are set up in the original
call that created them. This behavior is due to the way closure
is handled in Perl. Closure means that if you define a
function in one context, it runs in that particular context where
it was first defined. (See a book on object-oriented programming
to get more information on closure.)
For an example of how closure works, Listing 18.8 shows code
that you could use to set up different types of error messages.
Such subroutines are useful in creating templates of all error messages.
Listing 18.8. Using closures.
#!/usr/bin/perl
sub errorMsg {
my $lvl = shift;
#
# define the subroutine to run when called.
#
return sub {
my $msg = shift; # Define the error type now.
print "Err Level $lvl:$msg\n"; }; # print later.
}
$severe = errorMsg("Severe");
$fatal = errorMsg("Fatal");
$annoy = errorMsg("Annoying");
&$severe("Divide by zero");
&$fatal("Did you forget to use a semi-colon?");
&$annoy("Uninitialized variable in use");
The subroutine errorMsg declared here uses a local
variable called $lvl. After this declaration, errorMsg
uses $lvl in the subroutine it returns to the caller. The
value of $lvl is therefore set in the context when the
subroutine errorMsg is first called, even though the
keyword my is used. The three calls that follow set up
three different $lvl variable values, each in their own context:
$severe = errorMsg("Severe");
$fatal = errorMsg("Fatal");
$annoy = errorMsg("Annoying");
When the subroutine, errorMsg, returns, the value of $lvl
is retained for each context in which $lvl was declared.
The $msg value from the referenced call is used, but the
value of $lvl remains what was first set in the actual
creation of the function.
Sounds confusing? It is. This is primarily the reason you do
not see such code in most Perl programs.
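If the idea still feels slippery, a smaller example may help. The sketch below is not from the lesson; make_counter is a hypothetical helper that returns a subroutine, and each returned subroutine keeps its own private copy of $count.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# make_counter is a made-up name used only for this illustration.
sub make_counter {
    my $count = shift;                  # private to this particular call
    return sub { return $count++; };    # the closure remembers $count
}

my $from_one = make_counter(1);
my $from_ten = make_counter(10);

my @seen;
push @seen, $from_one->();    # uses the $count that started at 1
push @seen, $from_one->();
push @seen, $from_ten->();    # uses a separate $count that started at 10
push @seen, $from_one->();

print "@seen\n";
```

The two counters never interfere with each other, which is the same effect that gives each error subroutine in Listing 18.8 its own $lvl.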
Using arrays is great for collecting relevant
information in one place. Now let's see how we can work with
multiple arrays through subroutines. You pass one or more arrays
into Perl subroutines by reference. However, you have to keep in
mind a few subtle things about using the @_ symbol when
you process these arrays in the subroutine. Look at Listing 18.9,
which is an example of a subroutine that expects a list of names
and a list of phone numbers.
Listing 18.9. Passing multiple
arrays.
#!/usr/bin/perl
@names = ('mickey', 'goofy', 'daffy');
@phones = (5551234, 5554321, 666);
$i = 0;
sub listem {
my (@a,@b) = @_; # @a takes every parameter; @b is left empty
foreach (@a) {
print "a[$i] = ". $a[$i] . " " . "\tb[$i] = " . $b[$i] ."\n";
$i++;
}
}
&listem(@names, @phones);
a[0] = mickey b[0] =
a[1] = goofy b[1] =
a[2] = daffy b[2] =
a[3] = 5551234 b[3] =
a[4] = 5554321 b[4] =
a[5] = 666 b[5] =
Whoa! What happened to the @b array, and why is the
rest of @a just like the array @phones? This result
occurs because the array @_ of parameters in a subroutine
is one (I repeat, only one) long list of parameters. If you
pass in fifty arrays, the @_ is one array of all
the elements of the fifty arrays concatenated together.
In the subroutine in Listing 18.9, the assignment my (@a,
@b) = @_ gets loosely interpreted by your Perl interpreter
as, "Let's see, @a is an array, so assign one array
from @_ to @a and then assign everything else to @b."
Never mind that the @_ is itself an array and will
therefore get assigned to @a, leaving nothing to assign to @b.
To illustrate this point, let's change the script to how it
appears in Listing 18.10.
Listing 18.10. Passing a scalar and
an array.
#!/usr/bin/perl
@names = ('mickey', 'goofy', 'daffy');
@phones = (5551234, 5554321, 666);
$i = 0;
sub listem {
my ($a,@b) = @_;
print " \$a is " . $a . "\n";
foreach (@b) {
print "b[$i] = $b[$i] \n";
$i++;
}
# ---------------------------------------------------
# Actually, you could write the for loop as
# foreach (@b) {
# print $_ . "\n" ;
# }
# This is your secret answer to Quiz question 18.4.
# ----------------------------------------------------
}
&listem(@names, @phones);
$ testArray
$a is mickey
b[0] = goofy
b[1] = daffy
b[2] = 5551234
b[3] = 5554321
b[4] = 666
Do you see how $a was assigned the first value and then @b
was assigned the rest of the values? In order to get around this @_
interpretation feature and pass arrays into subroutines, you have
to pass arrays in by reference, which you do by modifying the
script to look like the following:
#!/usr/bin/perl
@names = ('mickey', 'goofy', 'daffy');
@phones = (5551234, 5554321, 666);
$i = 0;
sub listem {
my ($a,$b) = @_;
foreach (@$a) {
print "a[$i] = " . $$a[$i] . " " . "\tb[$i] = " . $$b[$i] ."\n";
$i++;
}
}
&listem(\@names, \@phones);
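The following major changes were necessary to bring the
original script to this point:
The call to listem passes references to the arrays (\@names and \@phones) instead of the arrays themselves.
The subroutine receives the two references in the scalars $a and $b through my ($a,$b) = @_.
The loop de-references $a and $b to reach the individual elements of each array.
The following output matches what we expected: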
$ testArray2
a[0] = mickey b[0] = 5551234
a[1] = goofy b[1] = 5554321
a[2] = daffy b[2] = 666
DO pass by reference whenever possible.
DO pass arrays by reference when you are passing more than one array to a subroutine.
DON'T use (@variable)=@_ in a subroutine unless you want to concatenate all the passed parameters into one long array.
When used in a subroutine argument list, scalar variables are
always passed by reference. You do not have a choice here. You
can, however, modify the values of these variables if you really
want to. To access these variables, you can use the @_ array
and index each individual element in it using $_[$index],
where $index counts from zero up.
Arrays and hashes are different beasts altogether. You can
either pass them as references once or pass references to each element
in the array. For long arrays, the choice should be fairly
obvious: pass the reference to the array only. In either case,
you can use the references to modify what you want in the
original array.
The @_ mechanism concatenates all the input arrays in a
subroutine into one long array. This feature is nice if you do
want to process the incoming arrays as one long array. Usually,
you want to keep the arrays separate when you process them in a subroutine,
and passing by reference is the best way to do that. Hold that
thought: Don't use globals.
In short, pass by reference and respect the value of any
global variable unless there is a strong compelling reason not
to.
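Both points, that scalars in @_ refer back to the caller's variables and that an array reference lets you change the original array, can be checked with a few lines. This is a minimal sketch; the subroutine names double_all and bump are invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: doubles every element of the caller's array.
sub double_all {
    my ($list) = @_;            # $list is a reference to the caller's array
    $_ *= 2 foreach @$list;     # changes the caller's elements in place
    return;
}

# Hypothetical helper: adds one to the caller's scalar.
sub bump {
    $_[0]++;                    # $_[0] is an alias for the caller's variable
    return;
}

my @numbers = (1, 2, 3);
my $total   = 10;

double_all(\@numbers);
bump($total);

print "@numbers $total\n";
```

Copying the parameters into my variables first, as most of the listings in this lesson do, is what protects the caller's scalars from accidental changes.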
Sometimes, you have to write the same output to different
output files. For example, an application programmer might want the
output to go to the screen in one instance, the printer in another,
and a file in another, or even all three at the same time.
Rather than make separate statements for each handle, it would be
nice to write something like the following:
spitOut(\*STDOUT);
spitOut(\*LPHANDLE);
spitOut(\*LOGHANDLE);
Notice that the file handle reference is sent with the \*FILEHANDLE syntax
because you refer to the symbol table in the current package. In
the subroutine that handles the output to the file handle, you
would have code that looks something like the following:
sub spitOut {
my $fh = shift;
print $fh "Gee Wilbur, I like this lettuce\n";
}
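Here is one way to try the subroutine without a printer or a log file at hand. The sketch assumes a Perl recent enough to open a file handle on a scalar (an in-memory file); the $buffer and $mem names are made up for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

sub spitOut {
    my $fh = shift;
    print $fh "Gee Wilbur, I like this lettuce\n";
}

# An in-memory file stands in for the printer or log file handle.
my $buffer = '';
open(my $mem, '>', \$buffer) or die "Cannot open in-memory file: $!";

spitOut(\*STDOUT);    # typeglob reference to a predefined handle
spitOut($mem);        # a handle stored in a scalar can be passed as is
close($mem);

print "Captured: $buffer";
```

The subroutine neither knows nor cares where the text ends up, which is exactly the point of passing the handle in.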
In UNIX (and other operating systems), the asterisk is a sort
of wildcard operator. In Perl, you can refer to other variables and
so on by using the asterisk operator:
*iceCream;
When used in this manner, the asterisk is also known as a typeglob.
The asterisk at the beginning of a term can be thought of as a
wildcard match for all the mangled names generated internally by Perl.
You can use a typeglob in the same way you use a reference
because the de-reference syntax always indicates the kind of reference
you want. ${*iceCream} and ${\$iceCream} both
indicate the same scalar variable. Basically, *iceCream
refers to the entry in the internal _main associative array
of all symbol names for the _main package. *kamran
really translates to $_main{'kamran'} if you are in the _main
package context. If you are in another package, the _packageName{}
hash is used.
When evaluated, a typeglob produces a scalar value that
represents all the objects of that name. This includes file
handles, format specifiers, and subroutines.
Using curly braces around variable names makes constructing strings
easier:
$road = ($w) ? "free":"high";
print "${road}way";
The preceding line prints highway or freeway
depending on the value of $w. This syntax will be familiar
to you if you write make files or shell scripts. In fact, you can
use this ${variable} construct outside of double quotes,
as in the following example:
print ${road};
print ${road} . "way";
print ${ road } . "way";
You can also use reserved words in the ${ } brackets.
Check out the following lines:
$if = "road";
print "\n ${if} way \n";
Using reserved words for anything other than their intended
purpose, however, is playing with fire. Be imaginative and make up
your own variables. You can use reserved words, but will have to remember
to force interpretation as a reserved word by adding anything
that makes it more than an identifier. It's generally not a good
idea to use a variable called ${while}, because it is
confusing to read.
When you work with hashes, you have to create an extra
reference to the index. In other words, you cannot use something like
this:
$clients { \$credit } = "despicable" ;
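The reference \$credit is converted to a string (something like SCALAR(0x806c520))
when it is used as a key, so you cannot get the reference back from the key and
de-reference it. If you need the reference later, keep it in a scalar of its own, as in
a two-step procedure such as this: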
$chist = \@credit;
$x{ $chist } = "despicable";
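The stringifying of keys is easy to confirm. In the following sketch, which uses made-up data in @credit, the reference is stored as the value as well as being used as the key, so that the data can still be reached afterward.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @credit = (100, 200);     # hypothetical data
my $chist  = \@credit;

my %x;
$x{$chist} = $chist;         # key: stringified reference; value: real reference

my ($key) = keys %x;
print "key looks like: $key\n";
print "key is a reference? ", (ref($key) ? "yes" : "no"), "\n";
print "value still works: $x{$key}[1]\n";
```

The key is just text that happens to look like an address; only the stored value can be de-referenced.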
The preceding section brings up an interesting point about
curly braces for a use other than indexing into hashes. In Perl,
curly braces are usually reserved for delimiting blocks of code. Assume
you were returning the passed list by sorting it in reverse order.
The passed list is in @_ of the called subroutine, so the
following two statements are equivalent:
sub backward {
{ reverse sort @_ ; }
};
sub backward {
reverse sort @_ ;
};
When preceded by the @ operator, curly braces enable
you to set up small blocks of evaluated code.
#!/usr/bin/perl
sub average {
($a,$b,$c) = @_;
$x = $a + $b + $c;
$x2 = $a*$a + $b*$b + $c*$c;
return ($x/3, $x2/3 ); }
$x = 1;
$y = 34;
$z = 47;
print "The midpt is @{[&average($x,$y,$z)]} \n";
This script prints approximately 27.33 and 1122. In the last line of code
with the @{} in the double-quoted string, the contents of
the @{} are evaluated as a block of code. The block
creates a reference to an anonymous array that contains the
results of the call to the subroutine average($x,$y,$z).
The array is constructed because of the brackets around the call.
As a result, the [] construct returns a reference to an
array, which in turn is de-referenced by @{} into a list and
interpolated into the double-quoted string.
By now, you should be able to see the difference between hard
and symbolic references. Let's look at some of the minor details of the
two types of references and how they are handled in Perl.
When you use a symbolic reference that does not exist, Perl
creates the variable for you and uses it. For variables that
already exist, the value of the variable is substituted for the $variable
string. This substitution is a powerful feature of Perl because
you can construct variable names from variable names.
Consider the following example:
$lang = "java";
$java = "coffee";
print "${lang}\n";
print "hot${lang}\n";
print "$$lang \n";
Look at the last print statement. The value of $lang ("java") is
taken to be the name of another variable, so $$lang is first
reduced to $java. The value of $java ("coffee") is then used.
The following is the output from this example:
java
hotjava
coffee
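Building variable names at run time can be sketched as follows. The second variable, $tea, and the @found array are made up for illustration. The variables are declared with our so that they live in the package symbol table, which is where symbolic references look names up.

```perl
no strict 'refs';        # symbolic references need strict refs switched off

our $java = "coffee";    # package variables, so they have symbol table entries
our $tea  = "leaves";    # hypothetical second variable for the loop

my @found;
for my $name ("java", "tea") {
    # ${$name} treats the string in $name as the name of a variable
    push @found, "$name => " . ${$name};
}
print "$_\n" for @found;    # prints "java => coffee" then "tea => leaves"
```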
The difference between a hard reference and a symbolic reference
is how the referenced variable is found. A hard reference, such as
the one created by \$java, points directly at the variable's value,
so no name is looked up. A symbolic reference, such as $$lang when
$lang holds the string "java", adds another level of indirection:
Perl derives a symbol name from the contents of an existing
variable and looks that name up in the symbol table for the package
you are in. Either a variable with that name exists in the symbol
table, or it does not.
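The contrast can be sketched side by side. In this minimal example the names $hard, $sym, $via_hard, and $via_sym are made up for illustration; ref() returns the type for a hard reference and an empty string for a plain string holding a name.

```perl
use strict;
use warnings;

our $java = "coffee";

my $hard = \$java;    # hard reference: points straight at the variable
my $sym  = "java";    # plain string: holds only the variable's name

print ref($hard), "\n";                                     # prints "SCALAR"
print ref($sym) eq '' ? "plain string\n" : "reference\n";   # prints "plain string"

my $via_hard = $$hard;    # no name lookup involved
my $via_sym;
{
    no strict 'refs';
    $via_sym = $$sym;     # name "java" is looked up in the symbol table
}
print "$via_hard $via_sym\n";    # prints "coffee coffee"
```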
To force only hard references in a program and protect
yourself from accidentally creating symbolic references, you can
use the pragma called strict. With the 'refs' option, strict makes
Perl check every dereference and stop with a run-time error when a
plain string is used as a reference. To use this checking, place
the following statement at the top of your Perl script:
use strict 'refs';
From this point on, only hard references are allowed for the
rest of the script. You can also place this use strict ...
statement within curly braces to limit the type checking to the
code block within the braces. For example, in the following code,
the type checking would be limited to the code in the subroutine java():
sub java {
    use strict "refs";
    #
    # type checking here.
}
...
# no type checking here.
To turn off the strict type checking at any time within a code
block, use this statement:
no strict 'refs';
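The effect of both statements can be seen in a small sketch. It uses eval to catch the run-time error that strict raises; the variable names $name, $value, $error, and $allowed are made up for illustration.

```perl
use strict;
use warnings;

our $java = "coffee";
my  $name = "java";

# Under strict refs, dereferencing a plain string is a run-time error,
# which eval traps here so the script can continue.
my $value = eval { ${$name} };
my $error = $@;
print "blocked by strict refs\n" if $error;

my $allowed;
{
    no strict 'refs';           # relaxed only inside this block
    $allowed = ${$name};        # symbolic lookup of $java
}
print "allowed: $allowed\n";    # prints "allowed: coffee"
```

Keeping no strict 'refs' inside the smallest possible block leaves the rest of the script protected.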
One last point: Symbolic references cannot be used on
variables declared with the my construct because these
variables are not kept in any symbol table. Variables declared
with the my construct are only valid for the block in
which they are created. Variables given a temporary value with the
local word are package variables, so they are in a symbol table and
are visible to all ensuing lower code blocks, including any
subroutines called from those blocks.
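A minimal sketch of this point follows. The variable names are made up for illustration. The symbolic lookup of "shade" does not find the my variable; it looks for a package variable of that name, which was never set.

```perl
use strict;
use warnings;

our $color = "red";     # package variable: has a symbol table entry
my  $shade = "blue";    # lexical variable: no symbol table entry

my ($name1, $name2) = ("color", "shade");
my ($by_color, $by_shade);
{
    no strict 'refs';
    $by_color = ${$name1};    # finds the package variable $color
    $by_shade = ${$name2};    # looks for a package variable $shade: not set
}
print "color: $by_color\n";    # prints "color: red"
print "shade: ", (defined $by_shade ? $by_shade : "(undefined)"), "\n";
```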
In addition to consulting the obvious documents such as the
Perl man pages, take a look at the Perl source code for more information.
The 't/op' directory in the Perl source tree has some regression test
routines that should definitely get you thinking. A lot of
documents and references are available at the Web sites www.perl.com and www.metronet.com.
The two types of references in Perl 5 are hard and symbolic.
Hard references work like hard links in UNIX file systems. You can
have more than one hard reference to the same item; Perl keeps a
reference count for you. This reference count is incremented or
decremented as references to the item are created or destroyed.
When the count goes to zero, the item being pointed to is destroyed.
Symbolic references, which are created through the ${}
construct, are useful in providing multiple stages of references
to objects.
You can have references to scalars, arrays, hashes,
subroutines, and even other references. References themselves are
scalars and have to be dereferenced to the appropriate type before
being used. Use @$pointer for an array, %$pointer for a
hash, &$pointer for a subroutine, and so on for dereferencing.
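The dereferencing forms can be sketched together in one short script. All variable names and values here are made up for illustration.

```perl
use strict;
use warnings;

my @langs  = ("perl", "c");
my %ages   = (perl => 5);
my $double = sub { return 2 * $_[0] };    # reference to an anonymous subroutine

my $aref = \@langs;
my $href = \%ages;
my $rref = \$aref;            # a reference to a reference

my @copy  = @$aref;           # dereference as an array
my %hcopy = %$href;           # dereference as a hash
my $eight = &$double(4);      # call through the subroutine reference
my $first = $$rref->[0];      # $$rref is $aref, then index into the array

print "@copy $hcopy{perl} $eight $first\n";    # prints "perl c 5 8 perl"
```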
Multidimensional arrays are possible using references in
arrays and hashes.
Parameters are passed into a subroutine through references.
The @_ array is really all the passed parameters
concatenated in one long array. To send separate arrays, use the
references to the individual items.
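The difference between flattening and passing references can be sketched as follows. The subroutine lengths is a hypothetical helper written for this illustration; it receives two array references and reports the size of each array.

```perl
use strict;
use warnings;

# Hypothetical helper: takes two array references, returns both lengths.
sub lengths {
    my ($first, $second) = @_;    # two scalars, each an array reference
    return (scalar @$first, scalar @$second);
}

my @one = (1, 2, 3);
my @two = ("a", "b");

my @flat  = (@one, @two);             # what @_ holds if the arrays are passed directly
my @sizes = lengths(\@one, \@two);    # the arrays stay separate

print scalar(@flat), " elements when flattened\n";    # prints "5 elements when flattened"
print "sizes: @sizes\n";                               # prints "sizes: 3 2"
```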
Tomorrow's lesson covers Perl objects and references to
objects. We have deliberately not covered Perl objects in this chapter
because they require some knowledge of references. References are
used to create and refer to objects, constructors, and packages.
mysub(\@one, \@two);
my ($a, $b) = @_;
The Workshop provides quiz questions to help you solidify your
understanding of the material covered and exercises to give you
experience in using what you've learned. Try to understand the quiz
and exercise answers before you go on to tomorrow's lesson.
$x= ${$pointer->{$i}};
sub xxx {
my ($a, $b) = @_;
}
printf "$i : $$pointer[$i++]; "; printf " and $i : $pointer->[$i++]; \n";
$HelpHelpHelp = \\\"Help"; print $$$$HelpHelpHelp;
$name = ${$scalarref};
draw(@{$coordinates}, $display);
${$months}[0] = "March";
$foo = sub foo { print "foo\n"; }
$bar = sub bar { print "bar\n"; }
$yuk = sub yuk { print "yuk\n"; }
$huh = sub huh { print "huh\n"; }
@list = ($foo, $bar, $yuk, $huh);