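Today's lesson describes everything you need to know about
scalar values in Perl: integers, floating-point numbers, octal and
hexadecimal notation, character strings and their escape sequences,
conversion between strings and numbers, and the initial values of
scalar variables.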
Basically, a scalar value is one unit of data. This
unit of data can be either a number or a chunk of text.
There are several types of scalar values that Perl
understands. Today's lesson describes each of them in turn and
shows you how you can use them.
The most common scalar values in Perl programs are integer
scalar values, also known as integer constants or integer literals.
An integer scalar value consists of one or more digits,
optionally preceded by a plus or minus sign and optionally
containing underscores.
Here are a few examples:
14 10000000000 -27 1_000_000
You can use integer scalar values in expressions or assign
them to scalar variables, as follows:
$x = 12345;
if (1217 + 116 == 1333) {
# statement block goes here
}
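As a minimal sketch of the rules above, the following fragment checks that underscores inside an integer constant do not change its value. The helper name same_number and the variable names are hypothetical and are used only for illustration.

```perl
use strict;
use warnings;

# Underscores inside a numeric constant are ignored by the interpreter;
# they only make long numbers easier to read.
my $plain    = 1000000;
my $grouped  = 1_000_000;
my $negative = -27;

# Hypothetical helper: returns 1 if two numbers are equal, 0 otherwise.
sub same_number {
    my ($first, $second) = @_;
    return $first == $second ? 1 : 0;
}

print "grouped equals plain\n" if same_number($plain, $grouped);
print "sum is ", $negative + 14, "\n";
```

This applies only to constants written in the program text. Underscores are not ignored when a character string is converted to a number.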
In Perl, there is a limit on the size of integers included in
a program. To see what this limit is and how it works, take a
look at Listing 3.1, which prints out integers of various sizes.
Listing 3.1. A program that displays integers and illustrates their size limitations.
1: #!/usr/local/bin/perl
2:
3: $value = 1234567890;
4: print ("first value is ", $value, "\n");
5: $value = 1234567890123456;
6: print ("second value is ", $value, "\n");
7: $value = 12345678901234567890;
8: print ("third value is ", $value, "\n");
$ program3_1
first value is 1234567890
second value is 1234567890123456
third value is 12345678901234567168
$
This program assigns integer scalar values to the variable $value,
and then prints $value.
Lines 3 and 4 store and print the value 1234567890
without any difficulty. Similarly, lines 5 and 6 successfully
store and print the value 1234567890123456.
Line 7 attempts to assign the value 12345678901234567890
to $value. Unfortunately, this number is too big for Perl
to understand. When line 8 prints out the value assigned to $value,
it prints out
12345678901234567168
As you can see, the last three digits have been replaced with
different values.
Here's what has happened: Perl actually stores integers in the
floating-point registers on your machine. In other words, integers
are treated as if they are floating-point numbers (numbers containing
decimal points).
On most machines, floating-point registers can store
approximately 16 digits before running out of space. As the
output from line 8 shows, the first 17 digits of the number 12345678901234567890
are remembered and stored by the Perl interpreter, and the rest
are thrown away. This means that the value printed by line 8 is
not the same as the value assigned in line 7.
This somewhat annoying limitation on the number of digits in
an integer can be found in almost all programming languages. In fact,
many programming languages have an upper integer limit of 4294967295
(which is equal to 2 to the power of 32, minus 1).
The number of digits that can be stored varies from machine to
machine. For a more detailed explanation, refer to the discussion
of precision in the following section, "Floating-Point
Scalar Values."
Caution: An integer constant that starts with a 0 is a special case:
$x = 012345;
The 0 at the beginning of the constant (also known as a leading zero) tells the Perl interpreter to treat this as an octal integer constant. To find out about octal integer constants, refer to the section called "Using Octal and Hexadecimal Notation" later today.
As you have just seen, integers in Perl actually are
represented as floating-point numbers. This means that an integer
scalar value is actually a special kind of floating-point scalar
value.
In Perl, a floating-point scalar value consists of an optional
minus sign, a sequence of digits that optionally contains a decimal
point, and an optional exponent.
Here are some simple examples of floating-point scalar values:
11.4 -275 -0.3 .3 3.
The optional exponent tells the Perl interpreter to multiply
or divide the scalar value by a power of ten. An exponent
consists of the letter e, an optional + or - sign, and a number
of one or more digits.
The number in the exponent represents the value by which to
multiply or divide, represented as a power of ten. For example, the
exponent e+01 tells the Perl interpreter to multiply the
scalar value by 10 to the power of 1, or 10. This means that the scalar
value 8e+01 is equivalent to 8 multiplied by 10, or 80.
Similarly, the exponent e+02 is equivalent to
multiplying by 100, e+03 is equivalent to multiplying by 1,000,
and so on. The following scalar values are all equal:
541e+01 54.1e+02 5.41e+03
A negative exponent tells the Perl interpreter to divide by a
power of ten. For example, the value 54e-01 is equivalent to 54
divided by 10, or 5.4. Similarly, e-02 tells the Perl
interpreter to divide by 100, e-03 to divide by 1,000, and
so on.
The exponent e+00 is equivalent to multiplying by 1,
which does nothing. Therefore, the following values are equal:
5.12e+00 5.12
If you like, you can omit the + when you multiply by a
power of ten.
5.47e+03 5.47e03
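The equivalences described above can be checked with a short sketch such as the following. The helper nearly_equal is hypothetical. It compares within a small tolerance rather than testing exact equality, because floating-point values are subject to round-off.

```perl
use strict;
use warnings;

# Hypothetical helper: compares two floating-point values within a small
# tolerance, because exact equality can fail after round-off.
sub nearly_equal {
    my ($first, $second) = @_;
    return abs($first - $second) < 1e-9 ? 1 : 0;
}

# Four ways of writing the same value, with and without the + sign.
my @same_value = (541e+01, 54.1e+02, 5.41e+03, 5.41e03);
foreach my $value (@same_value) {
    print "$value\n";
}

print "54e-01 is 5.4\n" if nearly_equal(54e-01, 5.4);
print "5.12e+00 is 5.12\n" if nearly_equal(5.12e+00, 5.12);
```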
Listing 3.2 shows how Perl works with and prints out
floating-point scalar values.
Listing 3.2. A program that displays various floating-point scalar values.
1: #!/usr/local/bin/perl
2:
3: $value = 34.0;
4: print ("first value is ", $value, "\n");
5: $value = 114.6e-01;
6: print ("second value is ", $value, "\n");
7: $value = 178.263e+19;
8: print ("third value is ", $value, "\n");
9: $value = 123456789000000000000000000000;
10: print ("fourth value is ", $value, "\n");
11: $value = 1.23e+999;
12: print ("fifth value is ", $value, "\n");
13: $value = 1.23e-999;
14: print ("sixth value is ", $value, "\n");
$ program3_2
first value is 34
second value is 11.460000000000001
third value is 1.7826300000000001e+21
fourth value is 1.2345678899999999e+29
fifth value is Infinity
sixth value is 0
$
As in Listing 3.1, this program stores and prints various
scalar values. Line 3 assigns the floating-point value 34.0
to $value. Line 4 then prints this value. Note that
because there are no significant digits after the decimal point,
the Perl interpreter treats 34.0 as if it is an integer.
Line 5 assigns 114.6e-01 to $value, and line 6
prints this value. Whenever possible, the Perl interpreter
removes any exponents, shifting the decimal point appropriately.
As a result, line 6 prints out
11.460000000000001
which is 114.6e-01 with the exponent e-01
removed and the decimal point shifted one place to the left
(which is equivalent to dividing by 10).
Note that the number printed by line 6 is not exactly equal to
the value assigned in line 5. This is a result of round-off
error. The floating-point register cannot contain the exact
value 11.46, so it comes as close as it can. It comes
pretty close--in fact, the first 16 digits are correct. This
number of correct digits is known as the precision, and it
is a property of the machine on which you are working; the
precision of a floating-point number varies from machine to
machine. (The machine on which I ran these test examples supports
a floating-point precision of 16 or 17 digits. This is about normal.)
Note: The size of an integer is roughly equivalent to the supported floating-point precision. If a machine supports a floating-point precision of 16 digits, an integer can be approximately 16 digits long.
Line 6 shows that a floating-point value has its exponent
removed whenever possible. Lines 7 and 8 show what happens when a
number is too large to be conveniently displayed without the exponent. In
this case, the number is displayed in scientific notation.
In scientific notation, one digit appears before the
decimal point, and all the other significant digits (the rest of
the machine's precision) follow the decimal point. The exponent
is adjusted to reflect this. In this example, the number
178.263e+19
is converted into scientific notation and becomes
1.7826300000000001e+21
As you can see, the decimal point has been shifted two places
to the left, and the exponent has, as a consequence, been adjusted
from 19 to 21. As before, the 1 at the end
is an example of round-off error.
If an integer is too large to be displayed conveniently, the
Perl interpreter converts it to scientific notation. Lines 9 and
10 show this. The number
123456789000000000000000000000
is converted to
1.2345678899999999e+29
Here, scientific notation becomes useful. At a glance, you can
tell approximately how large the number is. (In conventional notation,
you can't do this without counting the zeros.)
Lines 11 and 12 show what happens when the Perl interpreter is
given a number that is too large to fit into the machine's floating-point
register. In this case, Perl just prints the word Infinity.
The maximum size of a floating-point number varies from
machine to machine. Generally, the largest possible exponent that can
be stored is about e+308.
Lines 13 and 14 illustrate the case of a number having a
negative exponent that is too large (that is, it's too small to
store). In such cases, Perl either gets as close as it can or
just prints 0.
The largest negative exponent that produces reliable values is
about e-309. Below that, accuracy diminishes.
The arithmetic operations you saw on Day 2, "Basic
Operators and Control Flow," also work on floating-point
values. On that day, you saw an example of a miles-to-kilometers
conversion program that uses floating-point arithmetic.
When you perform floating-point arithmetic, you must remember
the problems with precision and round-off error. Listing 3.3 illustrates
what can go wrong and shows you how to attack this problem.
Listing 3.3. A program that illustrates round-off error problems in floating-point arithmetic.
1: #!/usr/local/bin/perl
2:
3: $value = 9.01e+21 + 0.01 - 9.01e+21;
4: print ("first value is ", $value, "\n");
5: $value = 9.01e+21 - 9.01e+21 + 0.01;
6: print ("second value is ", $value, "\n");
$ program3_3
first value is 0
second value is 0.01
$
Line 3 and line 5 both subtract 9.01e+21 from itself
and add 0.01. However, as you can see when you examine the
output produced by line 4 and line 6, the order in which you
perform the addition and subtraction has a significant effect.
In line 3, a very small number, 0.01, is added to a
very large number, 9.01e+21. If you work it out yourself,
you see that the result is 9.01000000000000000000001e+21.
The final 1 in the preceding number can be retained
only on machines that support 24 digits of precision in their
floating-point numbers. Most machines, as you've seen, handle
only 16 or 17 digits. As a result, the final 1, along with
some of the zeros, is lost, and the number instead is stored as 9.0100000000000000e+21.
This is the same as 9.01e+21, which means that
subtracting 9.01e+21 yields zero. The 0.01 is lost along
the way.
Line 5, however, doesn't have this problem. The two large
numbers are operated on first, yielding 0, and then 0.01
is added. The result is what you expect: 0.01.
The moral of the story: Floating-point arithmetic is accurate
only when you bunch together operations on large numbers. If the arithmetic
operations are on values stored in variables, it might not be as
easy to spot this problem.
$result = $number1 + $number2 - $number3;
If $number1 and $number3 contain large numbers
and $number2 is small, $result is likely to contain an
incorrect value because of the problem demonstrated in Listing
3.3.
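One way to guard against this is to write the expression so that the large values cancel before the small value is added. The following sketch wraps both orderings in hypothetical helper functions so they can be compared. It assumes the usual 64-bit double-precision floating point; on a machine with much higher precision, the first ordering might not lose the small value.

```perl
use strict;
use warnings;

# Adds the small value to the large one first, so the small value is
# lost to round-off before the large value is subtracted again.
sub small_value_first {
    my ($large, $small) = @_;
    return $large + $small - $large;
}

# Cancels the two large values first, so the small value survives.
sub large_values_first {
    my ($large, $small) = @_;
    return $large - $large + $small;
}

print "small value first:  ", small_value_first(9.01e+21, 0.01), "\n";
print "large values first: ", large_values_first(9.01e+21, 0.01), "\n";
```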
So far, all the integer scalar values you've seen have been in
what normally is called base 10 or decimal notation.
Perl also enables you to use two other notations to represent
integer scalar values: base 8 (octal) notation and base 16
(hexadecimal) notation.
To use octal notation, put a zero in front of your integer
scalar value:
$result = 047;
This assigns 47 octal, or 39 decimal, to $result.
To use hexadecimal notation, put 0x in front of your
integer scalar value, as follows:
$result = 0x1f;
This assigns 1f hexadecimal, or 31 decimal, to $result.
Perl accepts either uppercase letters or lowercase letters as
representations of the digits a through f:
$result = 0xe;
$result = 0xE;
Both of the preceding statements assign 14 (decimal) to $result.
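As a brief illustration of the three notations side by side, consider the following sketch. The variable names are chosen only for this example.

```perl
use strict;
use warnings;

my $octal   = 047;     # leading 0: octal, 4 * 8 + 7 = 39
my $hex     = 0x1f;    # leading 0x: hexadecimal, 1 * 16 + 15 = 31
my $decimal = 47;      # no prefix: decimal

print "047 is $octal, 0x1f is $hex, 47 is $decimal\n";
print "0xe and 0xE are equal\n" if 0xe == 0xE;
```

Whatever notation is used in the program text, the value is printed in decimal.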
If you are not familiar with octal and hexadecimal notations
and would like to learn more, read the following sections. These sections
explain how to convert numbers to different bases. If you are familiar
with this concept, you can skip to the section called
"Character Strings."
To understand how the octal and hexadecimal notations work,
take a closer look at what the standard decimal notation actually
represents.
In decimal notation, each digit in a number has one of ten
values: the standard numbers 0 through 9. Each digit in a number
in decimal notation corresponds to a power of ten.
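Mathematically, the value of a digit x in a number is
x * 10 to the exponent n,
where n is the number of digits you have to skip, counting
from the right, before reaching x.
This might sound complicated, but it's really straightforward.
For example, the number 243 can be expressed as follows:
2 * 10 to the exponent 2 (which is 200), plus
4 * 10 to the exponent 1 (which is 40), plus
3 * 10 to the exponent 0 (which is 3 * 1, or 3)
Adding the three numbers together yields 243.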
Working through these steps might seem like a waste of time
when you are dealing with decimal notation. However, once you understand
this method, reading numbers in other notations becomes simple.
For example, in octal notation, each digit x in a
number is
x * 8 to the exponent n
where x is the value of the digit, and n
is the number of digits to skip, counting from the right, before
reaching x.
This is the same formula as in decimal notation, but with the 10
replaced by 8.
Using this method, here's how to determine the decimal
equivalent of 243 octal:
2 * 8 to the exponent 2 (which is 2 * 64, or 128), plus
4 * 8 to the exponent 1 (which is 4 * 8, or 32), plus
3 * 8 to the exponent 0 (which is 3 * 1, or 3)
Adding 128, 32 and 3 yields 163, which is the decimal notation
equivalent of 243 octal.
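Hexadecimal notation works the same way, but with 16 as the
base instead of 10 or 8. For example, here's how to convert 243
hexadecimal to decimal notation:
2 * 16 to the exponent 2 (which is 2 * 256, or 512), plus
4 * 16 to the exponent 1 (which is 4 * 16, or 64), plus
3 * 16 to the exponent 0 (which is 3 * 1, or 3)
Adding these three numbers together yields 579.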
Note that the letters a through f represent the numbers 10
through 15, respectively. For example, here's the hexadecimal number
fe in decimal notation:
f * 16 to the exponent 1 (which is 15 * 16, or 240), plus
e * 16 to the exponent 0 (which is 14 * 1, or 14)
Adding 240 and 14 yields 254, which is the decimal equivalent
of fe.
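The conversion method described above can be sketched as a small function. The name to_decimal is hypothetical, and the function does no error checking: it assumes every character in the string is a valid digit for the given base.

```perl
use strict;
use warnings;

# Hypothetical helper: converts a string of digits in the given base
# (2 through 16) to its decimal value, using the positional formula
# "digit * base to the exponent n".
sub to_decimal {
    my ($digits, $base) = @_;

    # Map each digit symbol to its value: 0-9, then a-f for 10-15.
    my %digit_value;
    my $count = 0;
    foreach my $symbol (0 .. 9, 'a' .. 'f') {
        $digit_value{$symbol} = $count++;
    }

    my $total    = 0;
    my $exponent = 0;
    # Work from the rightmost digit, which has exponent 0.
    foreach my $symbol (reverse split(//, lc $digits)) {
        $total += $digit_value{$symbol} * $base ** $exponent;
        $exponent++;
    }
    return $total;
}

print "243 octal is ", to_decimal("243", 8), "\n";
print "243 hexadecimal is ", to_decimal("243", 16), "\n";
print "fe hexadecimal is ", to_decimal("fe", 16), "\n";
```

The three calls reproduce the worked examples in the text: 163, 579, and 254.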
You might be wondering why Perl bothers supporting octal and
hexadecimal notation. Here's the answer: Computers store numbers
in memory in binary (base 2) notation, not decimal (base 10) notation.
Because 8 and 16 are powers of 2, it is easier to represent
stored computer memory in base 8 or base 16 than in base 10. (You
could use base 2, of course; however, base 2 numbers are clumsy
because they are very long.)
Note: Perl supports base-2 operations on integer scalar values. These operations, called bit-manipulation operations, are discussed on Day 4, "More Operators."
On previous days, you've seen that Perl enables you to assign
text to scalar variables. In the following statement, for
instance
$var = "This is some text";
the text This is some text is an example of what is
called a character string (frequently shortened to just string).
A character string is a sequence of one or more letters, digits,
spaces, or special characters.
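The following subsections show you how scalar variables are
substituted in character strings, how to use escape sequences, and
how single-quoted strings differ from double-quoted strings.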
Note: C programmers should be advised that character strings in Perl do not contain a hidden null character at the end of the string. In Perl, null characters can appear anywhere in a string. (See the discussion of escape sequences later today for more details.)
Perl supports scalar variable substitution in character
strings enclosed by double quotation-mark characters. For
example, consider the following assignments:
$number = 11;
$text = "This text contains the number $number.";
When the Perl interpreter sees $number inside the
string in the second statement, it replaces $number with
its current value. This means that the string assigned to $text
is actually
This text contains the number 11.
The most immediate practical application of this is in the print
statement. So far, many of the print statements you have
seen contain several arguments, as in the following:
print ("The final result is ", $result, "\n");
Because Perl supports scalar variable substitution, you can
combine the three arguments to print into a single
argument, as in the following:
print ("The final result is $result\n");
Note: From now on, examples and listings that call print use scalar variable substitution because it is easier to read.
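The following sketch shows substitution both in a print statement and in a string that is built and returned for later use. The helper describe_number is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: builds a message by substituting the value of
# $number into a double-quoted string.
sub describe_number {
    my ($number) = @_;
    return "This text contains the number $number.";
}

my $result = 14;
print "The final result is $result\n";

my $text = describe_number(11);
print "$text\n";
```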
Character strings that are enclosed in double quotes accept escape
sequences for special characters. These escape sequences
consist of a backslash (\) followed by one or more characters. The
most common escape sequence is \n, which represents the
newline character as shown in this example:
$text = "This is a string terminated by a newline\n";
Table 3.1 lists the escape sequences recognized in
double-quoted strings.
Table 3.1. Escape sequences in strings.
Escape Sequence   Description
\a                Bell (beep)
\b                Backspace
\cn               The Ctrl+n character
\e                Escape
\E                Ends the effect of \L, \U or \Q
\f                Form feed
\l                Forces the next letter into lowercase
\L                All following letters are lowercase
\n                Newline
\r                Carriage return
\Q                Do not look for special pattern characters
\t                Tab
\u                Force next letter into uppercase
\U                All following letters are uppercase
\v                Vertical tab
The \Q escape sequence is useful only when the string
is used as a pattern. Patterns are described on Day 7,
"Pattern Matching."
The escape sequences \L, \U, and \Q can
be turned off by \E, as follows:
$a = "T\LHIS IS A \ESTRING"; # same as "This is a STRING"
To include a backslash or double quote in a double-quoted
string, precede the backslash or quote with another backslash:
$result = "A quote \" in a string";
$result = "A backslash \\ in a string";
A backslash also enables you to include a $ character
in a string. For example, the statements
$result = 14;
print("The value of \$result is $result.\n");
print the following on your screen:
The value of $result is 14.
You can specify the ASCII value for a character in base 8 or
octal notation using \nnn, where each n
is an octal digit; for example:
$result = "\377"; # this is the character with value 255
You can also use hexadecimal notation to specify the ASCII
value for a character. To do this, use the sequence \xnn,
where each n is a hexadecimal digit.
$result = "\xff"; # this is also 255
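To confirm which character an octal or hexadecimal escape produces, you can use the built-in ord function, which returns the numeric value of the first character of a string. The following is a small sketch; the variable names are for illustration only.

```perl
use strict;
use warnings;

# ord returns the numeric value of the first character of a string.
my $from_octal = "\101";   # octal 101 is 65, the letter A
my $from_hex   = "\x41";   # hexadecimal 41 is also 65

print "both escapes give the letter $from_octal\n" if $from_octal eq $from_hex;
print "value of \\377 is ", ord("\377"), "\n";
print "value of \\xff is ", ord("\xff"), "\n";
```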
Listing 3.4 is an example of a program that uses escape
sequences. This program takes a line of input and converts it to
a variety of cases.
Listing 3.4. A case-conversion program.
1: #!/usr/local/bin/perl
2:
3: print ("Enter a line of input:\n");
4: $inputline = <STDIN>;
5: print ("uppercase: \U$inputline\E\n");
6: print ("lowercase: \L$inputline\E\n");
7: print ("as a sentence: \L\u$inputline\E\n");
$ program3_4
Enter a line of input:
tHis Is My INpUT LiNE.
uppercase: THIS IS MY INPUT LINE.
lowercase: this is my input line.
as a sentence: This is my input line.
$
Line 4 of this program reads a line of input and stores it in
the scalar variable $inputline.
Line 5 replaces the string $inputline with the current
value of the scalar variable $inputline. The escape
sequence \U tells the Perl interpreter to convert
everything in the string into uppercase until it sees a \E
sequence; as a result, line 5 writes the contents of $inputline
in uppercase.
Similarly, line 6 writes the input line in all lowercase
characters by specifying the escape character \L in the
string.
Line 7 combines the escape characters \L and \u.
The \L specifies that everything in the string is to be in
lowercase; however, the \u special character temporarily
overrides this and tells the Perl interpreter that the next
character is to be in uppercase. When this character--the first
character in the line--is printed, the \L escape character
remains in force, and the rest of the line is printed in lowercase.
The result is as if the input line is a single sentence in
English. The first character is capitalized, and the remainder is
in lowercase.
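The sentence-style conversion in line 7 can be packaged as a function, as in the following sketch. The function names are hypothetical. The second version uses the built-in functions lc and ucfirst, available in Perl 5, which give the same result without escape sequences.

```perl
use strict;
use warnings;

# Hypothetical helper: capitalizes the first character and lowercases
# the rest, using the escape sequences from Listing 3.4.
sub sentence_case {
    my ($text) = @_;
    return "\L\u$text\E";
}

# The same result using the built-in functions lc and ucfirst.
sub sentence_case_builtin {
    my ($text) = @_;
    return ucfirst(lc($text));
}

my $sentence = sentence_case("tHis Is My INpUT LiNE.");
print "$sentence\n";

# \E ends the effect of \U or \L partway through a string.
print "\Ushout\E and \LWHISPER\E\n";
```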
Perl also enables you to enclose strings using the '
(single quotation mark) character:
$text = 'This is a string in single quotes';
There are two differences between double-quoted strings and
single-quoted strings. The first difference is that scalar
variables are replaced by their values in double-quoted strings
but not in single-quoted strings. The following is an example:
$string = "a string";
$text = "This is $string"; # becomes "This is a string"
$text = 'This is $string'; # remains 'This is $string'
The second difference is that the backslash character, \,
does not have a special meaning in single-quoted strings. This
means that the statement
$text = 'This is a string.\n';
assigns the following string to $text:
This is a string.\n
The \ character is special in only two instances for
single-quoted strings. The first is when you want to include a
single-quote character ' in a string.
$text = 'This string contains \', a quote character';
The preceding line of code assigns the following string to $text:
This string contains ', a quote character
The second instance is to escape the backslash itself.
$text = 'This string ends with a backslash \\';
The preceding code line assigns the following string to $text:
This string ends with a backslash \
As you can see, the double backslash makes it possible for the
backslash character (\) to be the last character in a
string.
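The differences between the two kinds of quotes can be seen by measuring the resulting strings with the built-in length function, as in this brief sketch. The variable names are for illustration only.

```perl
use strict;
use warnings;

my $string = "a string";

my $double = "This is $string\n";   # variable and \n are both processed
my $single = 'This is $string\n';   # both are kept as literal text

print $double;
print $single, "\n";

# Only \' and \\ are special inside single quotes.
my $quote     = 'It\'s';    # four characters: I t ' s
my $backslash = '\\';       # one character: a single backslash
print length($quote), " ", length($backslash), "\n";
```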
Caution: Single-quoted strings can be spread over multiple lines. The statement
$text = 'This is two
lines of text
';
is equivalent to the statement
$text = "This is two\nlines of text\n";
This means that if you forget the closing ' for a string, the Perl interpreter is likely to get quite confused because it won't detect an error until after it starts processing the next line.
As you've seen, you can use a scalar variable to store a
character string, an integer, or a floating-point value. In
scalar variables, a value that was assigned as a string can be
used as an integer whenever it makes sense to do so, and vice
versa. In the following example
$string = "43";
$number = 28;
$result = $string + $number;
the value of $string is converted to an integer and
added to the value of $number. The result of the addition,
71, is assigned to $result.
Another instance in which strings are converted to integers is
when you are reading a number from the standard input file. The following
is some code similar to code you've seen before:
$number = <STDIN>;
chop ($number);
$result = $number + 1;
This is what is happening: When $number is assigned a
line of standard input, it really is being assigned a string. For
instance, if you enter 22, $number is assigned the string 22\n
(the \n represents the newline character). The chop
function removes the \n, leaving the string 22, and
this string is converted to the number 22 in the
arithmetic expression.
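The same steps can be sketched as a function that takes a line in the form in which it arrives from the standard input file. The function name is hypothetical. This sketch uses chomp, a Perl 5 function that removes a trailing newline only if one is present, whereas chop always removes the last character.

```perl
use strict;
use warnings;

# Hypothetical helper: takes a line as read from <STDIN>, removes the
# trailing newline, and returns the number it contains plus one.
sub add_one_to_line {
    my ($line) = @_;
    chomp($line);          # removes a trailing newline, if there is one
    return $line + 1;
}

my $result = add_one_to_line("22\n");
print "$result\n";
```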
Caution: If a string does not begin with a number, the string is converted to 0 when used in an integer context. For example:
$result = "hello" * 5; # this assigns 0 to $result, since "hello" becomes 0
This is true even if the string would be a valid hexadecimal integer with the quotes removed, as in the following, where the string is converted to 0 and $result is assigned 1:
$result = "0xff" + 1;
In cases like this, Perl does not tell you that anything has gone wrong, and your results might not be what you expect.
Also, strings containing misprints might not contain what you expect. For example
$result = "12O34"; # the letter O, not the number 0
When converting from a string to an integer, Perl starts at the left and continues until it sees a character that is not a digit. In the preceding instance, 12O34 is converted to the integer 12, not 12034.
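The following sketch collects the conversions described in this Caution. Warnings are switched off here on purpose: when they are enabled, Perl reports each string that is not a clean number. The built-in hex function is shown as one way to interpret a string as a hexadecimal number; the variable names are for illustration only.

```perl
use strict;
no warnings;   # with warnings on, Perl reports each non-numeric string

my $from_word    = "hello" * 5;      # "hello" becomes 0
my $from_hex_str = "0xff" + 1;       # conversion stops at the x, leaving 0
my $from_typo    = "12O34" + 0;      # conversion stops at the letter O
my $with_hex     = hex("0xff") + 1;  # hex interprets the string as base 16

print "$from_word $from_hex_str $from_typo $with_hex\n";
```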
In Perl, all scalar variables have an initial value of the
null string, "". This means that you do not need to
define a value for a scalar variable.
#!/usr/local/bin/perl
$result = $undefined + 2; # $undefined is not defined
print ("The value of \$result is $result.\n");
This short program is perfectly legal Perl. The output is
The value of $result is 2.
Because $undefined is not defined, the Perl interpreter
assumes that its value is the null string. This null string is
then converted to 0, because it is being used in an addition
operation. The result of the addition, 2, is assigned to $result.
Tip: Although you can use uninitialized variables in your Perl programs, you shouldn't. If your Perl program gets to be large (as many complicated programs do), it might be difficult to determine whether a particular variable is supposed to be appearing for the first time or whether it is a spelling mistake that should be fixed. To avoid ambiguity and to make life easier for yourself, initialize every scalar variable before using it.
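As a sketch of this advice, the fragment below initializes a variable before using it and shows one way to check whether a variable has ever been assigned a value, using the built-in defined function. The helper value_or_zero is hypothetical; use strict and my are Perl 5 features that make the interpreter reject misspelled variable names.

```perl
use strict;
use warnings;

# Initializing at declaration makes the intended starting value explicit.
my $total = 0;
$total = $total + 2;
print "The value of \$total is $total.\n";

# Hypothetical helper: returns the value, or 0 if none was ever assigned.
sub value_or_zero {
    my ($value) = @_;
    return defined($value) ? $value : 0;
}

my $unset;                           # declared but never assigned
my $sum = value_or_zero($unset) + 2;
print "The sum is $sum.\n";
```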
Perl supports three kinds of scalar values: integers,
floating-point numbers, and character strings.
Integers can be in three notations: standard (decimal)
notation, octal notation, and hexadecimal notation. Octal
notation is indicated by a leading 0, and hexadecimal
notation is indicated by a leading 0x. Integers are stored
as floating-point values and can be as long as the machine's floating-point
precision (usually 16 digits or so).
Floating-point numbers can consist of a string of digits that
contain a decimal point and an optional exponent. The exponent's range
can be anywhere from about e-309 to e+308. (This
value might be different on some machines.) When possible, floating-point
numbers are displayed without the exponent; failing that, they
are displayed in scientific notation (one digit before the
decimal point).
When you use floating-point arithmetic, be alert for round-off
errors. Performing arithmetic operations in the proper order--operating
on large numbers first--might yield better results.
You can enclose character strings in either double quotes (")
or single quotes ('). If a scalar variable name appears in
a character string enclosed in double quotes, the value of the
variable is substituted for its name. Escape characters are recognized
in strings enclosed in double quotes; these characters are
indicated by a backslash \.
Character strings in single quotes do not support escape
characters, with the exception of \\ and \'. Scalar
variable names are not replaced by their values.
Strings and integers are freely interchangeable in Perl
whenever it is logically possible to do so.
The Workshop provides quiz questions to help you solidify your
understanding of the material covered and exercises to give you
experience in using what you've learned. Try to understand the quiz
and exercise answers before you go on to tomorrow's lesson.
print ("I am bored\b\b\b\b\bhappy!\n");
#!/usr/local/bin/perl
$inputline = <STDIN>;
print ('here is the value of \$inputline\', ": $inputline");
$num1 = 6.02e+23;
$num2 = 11.4;
$num3 = 5.171e+22;
$num4 = -2.5;
$result = $num1 + $num2 - $num3 + $num4;