Issue #008
March, 1996


Contents:

Virtual Functions in C++
Introduction to Stream I/O Part 3 - Copying Files
Using C++ as a Better C Part 8 - Type Names
Answer to Programming Quiz


INTRODUCTION

In this issue we'll talk about C++ virtual functions and related issues
like polymorphism, present several ways of copying files using stream
I/O, and discuss some differences between C and C++ in the use of type
names and name hiding.


VIRTUAL FUNCTIONS IN C++

Imagine that you are doing some graphics programming, with a variety
of shapes to be output to the screen.  Initially, you want to support
Line, Circle, and Text.  Each shape has an X,Y origin and a color.

How might this be done in C++?  One way is to use virtual functions.
A virtual function is a function member of a class, declared using the
"virtual" keyword.  A pointer to a derived class object may be
assigned to a base class pointer, and a virtual function called
through the pointer.  If the function is virtual and occurs both in
the base class and in derived classes, then the right function will be
picked up based on what the base class pointer "really" points at.

For graphics, we can use a base class called Shape, with derived
classes named Line, Circle, and Text.  Shape declares a virtual
function draw(), and each derived class overrides it.  We create new
objects and
point at them using Shape* pointers.  But when we call a draw()
function, as in:

        Shape* p = new Line(0.1, 0.1, Co_blue, 0.4, 0.4);

        p->draw();

the draw() function for a Line is called, not the draw() function for
Shape.  This style of programming is very common and goes by names
like "polymorphism" and "object-oriented programming".  To illustrate
it further, here is an example of this type of programming for a
graphics application.  Annotations in /* */ explain in some detail
what is going on.

        #include <string.h>
        #include <assert.h>
        #include <iostream.h>
        
        typedef double Coord;

        /*
        The type of X/Y points on the screen.
        */
        
        enum Color {Co_red, Co_green, Co_blue};

        /*
        Colors.
        */
        
        // abstract base class for all shape types
        class Shape {
        protected:
                Coord xorig; // X origin
                Coord yorig; // Y origin
                Color co; // color

        /*
        These are protected so that they can be accessed
        by derived classes.  Private wouldn't allow this.
        
        These data members are common to all shape types.
        */

        public:
                Shape(Coord x, Coord y, Color c) :
                    xorig(x), yorig(y), co(c) {} // constructor

        /*
        Constructor to initialize data members common to
        all shape types.
        */

                virtual ~Shape() {} // virtual destructor

        /*
        Destructor for Shape.  It's a virtual function.
        Destructors in derived classes are virtual also
        because this one is declared so.
        */

                virtual void draw() = 0; // pure virtual draw() function

        /*
        Similarly for the draw() function.  It's a pure virtual, so
        Shape supplies no definition and each derived class must
        provide its own.
        */

        };
        
        // line with X,Y destination
        class Line : public Shape {

        /*
        Line is derived from Shape, and picks up its
        data members.
        */

                Coord xdest; // X destination
                Coord ydest; // Y destination

        /*
        Additional data members needed only for Lines.
        */

        public:
                Line(Coord x, Coord y, Color c, Coord xd, Coord yd) :
                    Shape(x, y, c), // base is always initialized first
                    xdest(xd), ydest(yd) {} // constructor with base initialization

        /*
        Construct a Line, calling the Shape constructor as well
        to initialize data members of the base class.
        */

                ~Line() {cout << "~Line\n";} // virtual destructor

        /*
        Destructor.
        */

                void draw() // virtual draw function
                {
                        cout << "Line" << "(";
                        cout << xorig << ", " << yorig << ", " << int(co);
                        cout << ", " << xdest << ", " << ydest;
                        cout << ")\n";
                }

        /*
        Draw a line.
        */

        };
        
        // circle with radius
        class Circle : public Shape {
                Coord rad; // radius of circle

        /*
        Radius of circle.
        */

        public:
                Circle(Coord x, Coord y, Color c, Coord r) :
                    Shape(x, y, c), // base is always initialized first
                    rad(r) {} // constructor with base initialization

                ~Circle() {cout << "~Circle\n";} // virtual destructor
                void draw() // virtual draw function
                {
                        cout << "Circle" << "(";
                        cout << xorig << ", " << yorig << ", " << int(co);
                        cout << ", " << rad;
                        cout << ")\n";
                }
        };
        
        // text with characters given
        class Text : public Shape {
                char* str; // copy of string
        public:
                Text(Coord x, Coord y, Color c, const char* s) :
                    Shape(x, y, c) // constructor with base initialization
                {
                        str = new char[strlen(s) + 1];
                        assert(str);
                        strcpy(str, s);

        /*
        Copy out text string.  Note that this would be done differently
        if we were taking advantage of some newer C++ features like
        exceptions and strings.
        */

                }
                ~Text() {delete [] str; cout << "~Text\n";} // virtual dtor

        /*
        Destructor; delete text string.
        */

                void draw() // virtual draw function
                {
                        cout << "Text" << "(";
                        cout << xorig << ", " << yorig << ", " << int(co);
                        cout << ", " << str;
                        cout << ")\n";
                }
        };
        
        int main()
        {
                const int N = 5;
                int i;
                Shape* sptrs[N];

        /*
        Vector of Shape* pointers.  Pointers to classes
        derived from Shape can be assigned to Shape* pointers.
        */

                // initialize set of Shape object pointers
        
                sptrs[0] = new Line(0.1, 0.1, Co_blue, 0.4, 0.5);
                sptrs[1] = new Line(0.3, 0.2, Co_red, 0.9, 0.75);
                sptrs[2] = new Circle(0.5, 0.5, Co_green, 0.3);
                sptrs[3] = new Text(0.7, 0.4, Co_blue, "Howdy!");
                sptrs[4] = new Circle(0.3, 0.3, Co_red, 0.1);

        /*
        Create some shape objects.
        */

                // draw set of shape objects
        
                for (i = 0; i < N; i++)
                        sptrs[i]->draw();

        /*
        Draw them using virtual functions to pick up the
        right draw() function based on the actual object
        type being pointed at.
        */

                // cleanup
        
                for (i = 0; i < N; i++)
                        delete sptrs[i];        

        /*
        Clean up the objects using virtual destructors.
        */

                return 0;
        }

When we run this program, the output is:

        Line(0.1, 0.1, 2, 0.4, 0.5)
        Line(0.3, 0.2, 0, 0.9, 0.75)
        Circle(0.5, 0.5, 1, 0.3)
        Text(0.7, 0.4, 2, Howdy!)
        Circle(0.3, 0.3, 0, 0.1)
        ~Line
        ~Line
        ~Circle
        ~Text
        ~Circle

with enum color values represented by small integers.

A few additional comments.  Virtual functions typically are
implemented by placing a pointer to a jump table in each object
instance.  This table pointer represents the "real" type of the
object, even though the object is being manipulated through a base
class pointer.
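
To make the jump table idea concrete, here is a hand-written sketch of
roughly what a compiler might generate for a class with one virtual
function.  The names VTable, ShapeRep, and call_draw are invented for
this illustration, and real compilers differ in layout and detail.
The sketch uses the standard header names rather than the older .h
forms.

```cpp
#include <cassert>
#include <string>

// Hypothetical hand-rolled equivalent of a one-entry virtual table.
struct ShapeRep;

struct VTable {
        std::string (*draw)(const ShapeRep*);   // slot for draw()
};

struct ShapeRep {
        const VTable* vptr;     // table pointer stored in each object
};

static std::string line_draw(const ShapeRep*) { return "Line"; }
static std::string circle_draw(const ShapeRep*) { return "Circle"; }

// one table per "class", shared by all objects of that class
static const VTable line_vtable = {line_draw};
static const VTable circle_vtable = {circle_draw};

// what a call like p->draw() boils down to: an indirect call
// through the table that the object points at
std::string call_draw(const ShapeRep* p)
{
        assert(p && p->vptr);
        return p->vptr->draw(p);
}
```

An object whose vptr is &line_vtable yields "Line" from call_draw(),
and one whose vptr is &circle_vtable yields "Circle", even though both
are handled through the same ShapeRep* type.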

Because virtual functions usually need to have their function address
taken, to store in a table, declaring them inline as the above example
does often gains nothing.  A call through a base class pointer cannot
normally be expanded inline, and the function bodies will be laid down
as static copies per object file.  There are some advanced techniques
for optimizing virtual functions, but you can't count on these being
available.

Note that we declared the Shape destructor virtual (there are no
virtual constructors).  If we had not done this, then deleting each
object through its Shape* pointer would have undefined behavior.  In
practice the destructors for the actual object types derived from
Shape typically would not be called, and in the case above this would
result in a memory leak in the Text class.
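
A small sketch can confirm that a virtual destructor in the base class
causes the derived destructor to run.  Base, Derived, and the counter
derived_dtor_calls are names made up for this example; the counter
exists only so that the effect can be observed.

```cpp
#include <cassert>

static int derived_dtor_calls = 0;      // illustration only

class Base {
public:
        virtual ~Base() {}      // virtual, so delete finds the real type
};

class Derived : public Base {
        char* buf;              // resource that must be released
public:
        Derived() : buf(new char[16]) {}
        ~Derived() { delete [] buf; derived_dtor_calls++; }
};

// Delete a Derived through a Base* and report how many
// Derived destructor calls resulted.
int destroy_through_base()
{
        int before = derived_dtor_calls;
        Base* p = new Derived;

        assert(derived_dtor_calls == before);   // not destroyed yet
        delete p;       // runs ~Derived(), then ~Base()

        return derived_dtor_calls - before;
}
```

With the virtual keyword in place, destroy_through_base() reports one
call to the Derived destructor, and buf is released.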

Shape is an example of an abstract class, whose purpose is to serve as
a base for derived classes that actually do the work.  It is not
possible to create an actual object instance of Shape, because it
contains at least one pure virtual function.


INTRODUCTION TO STREAM I/O PART 3 - COPYING FILES

Suppose that you're writing a program to copy one file to another,
with the file names given on the command line.  A common way of doing
this is to say:

        #include <stdio.h>
        #include <assert.h>

        int main(int argc, char* argv[])
        {
                FILE* fpin;
                FILE* fpout;
                int c;

                assert(argc == 3);

                fpin = fopen(argv[1], "r");
                fpout = fopen(argv[2], "w");

                assert(fpin && fpout);

                while ((c = getc(fpin)) != EOF)
                        putc(c, fpout);

                fclose(fpin);
                fclose(fpout);

                return 0;
        }

EOF is a marker used to signify the end of file; its value typically
is -1.  In most commonly-used operating systems there is no actual
character in a file to signify end of file.

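This approach works on text files.  Unfortunately, however, for binary
files, an attempt to copy a 10406-byte file resulted in output of only
383 bytes.  Why?  Not because of EOF itself: getc() returns each byte
as an unsigned char converted to int, so a byte of 255 (0377 or 0xff)
stays distinct from an EOF of -1 as long as c is declared int rather
than char.  The likely culprit is that the files were opened in text
mode.  On some systems, text mode translates line endings and treats a
particular byte (Control-Z under DOS, for example) as end of file.  A
version that opens the files in binary mode and tests for end of file
explicitly is: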

        #include <stdio.h>
        #include <assert.h>

        int main(int argc, char* argv[])
        {
                FILE* fpin;
                FILE* fpout;
                int c;

                assert(argc == 3);

                fpin = fopen(argv[1], "rb");
                fpout = fopen(argv[2], "wb");

                assert(fpin && fpout);

                for (;;) {
                        c = getc(fpin);
                        if (feof(fpin))
                                break;
                        fputc(c, fpout);
                }

                fclose(fpin);
                fclose(fpout);

                return 0;
        }

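feof() tells whether a previous read operation, in this case getc(),
hit end of file.  Note also that we open the files in binary mode,
which matters on systems that distinguish text from binary files.  A
more careful program would also check ferror(), because a read error
makes getc() return EOF without setting the end-of-file indicator.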
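
As a quick check of how getc() reports data bytes and end of file, the
following sketch writes some 0xff bytes to a temporary file and counts
how many come back before EOF.  The function name count_ff_bytes is
made up for this example, and tmpfile() is assumed to be able to
create a file on your system.

```cpp
#include <stdio.h>
#include <assert.h>

// Write n bytes of value 0xff to a temporary file, then count how
// many bytes getc() delivers before returning EOF.  Returns -1 if
// the temporary file cannot be created.
long count_ff_bytes(int n)
{
        FILE* fp = tmpfile();   // opened in binary update mode
        if (fp == 0)
                return -1;

        for (int i = 0; i < n; i++)
                putc(0xff, fp);
        rewind(fp);

        long count = 0;
        int c;          // int, not char, so that 255 and EOF stay distinct

        while ((c = getc(fp)) != EOF) {
                assert(c == 0xff);
                count++;
        }

        fclose(fp);
        return count;
}
```

With c declared as int, each 0xff byte is returned as 255 and the loop
stops only at the real end of file.  Declaring c as char instead is
the classic way to make a 0xff byte look like EOF.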

How would we do the equivalent in C++?  One way would be to say:

        #include <fstream.h>
        #include <assert.h>

        int main(int argc, char* argv[])
        {
                assert(argc == 3);

                ifstream ifs(argv[1], ios::in | ios::binary);
                ofstream ofs(argv[2], ios::out | ios::binary);
                assert(ifs && ofs);

                char c;

                while (ifs.get(c))
                        ofs.put(c);

                return 0;
        }

ifstream and ofstream are input and output file streams.  The
constructors used here take a file name (a char*) and a set of
open-mode flags.

These classes are derived from ios, which has an operator conversion
function (from a stream object to void*).  If a statement like:

        assert(ifs && ofs);

is specified, then this conversion function is called.  It returns 0
if there's something wrong with the stream.  In other words, an object
like "ifs" is converted to a void* automatically, and the value of the
void* pointer tells the stream status (non-zero for a good state, zero
for bad).
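
The same test can be packaged as a small function.  This is a sketch
only: can_open is a made-up name, and the code assumes a compiler that
supplies the standard <fstream> header rather than <fstream.h>.  In
the standard library the void* conversion was later replaced by a
conversion to bool, but a stream is tested in a condition in the same
way.

```cpp
#include <cassert>
#include <fstream>

// Return true if the named file can be opened for reading.
// can_open is a hypothetical helper for this sketch.
bool can_open(const char* path)
{
        assert(path != 0);

        std::ifstream ifs(path, std::ios::in | std::ios::binary);

        // using the stream as a condition invokes its conversion
        // function; the result is false if the open failed
        if (ifs)
                return true;
        return false;
}
```

A program could call can_open(argv[1]) and print a message on
failure, which is friendlier than an assert for a missing file.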

The actual copying is straightforward, using the get() member
function.  It accepts a reference to a character, so there's no need
to use the return value to pass back the character that was read.

A somewhat terser approach would be to say:

        #include <fstream.h>
        #include <assert.h>

        int main(int argc, char* argv[])
        {
                assert(argc == 3);

                ifstream ifs(argv[1], ios::in | ios::binary);
                ofstream ofs(argv[2], ios::out | ios::binary);
                assert(ifs && ofs);

                ofs << ifs.rdbuf();

                return 0;
        }

with no loop involved.  The expression:

        ifs.rdbuf()

returns a filebuf*, a pointer to an object that actually represents
the low-level buffering for the file.  filebuf is derived from a class
streambuf, and ofstream is derived from ostream, and ostream has an
operator<< defined for streambufs.  So the looping over the input file
occurs within operator<<.  We are "outputting" a filebuf/streambuf.
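
The same technique works for any pair of streams, not just files.  As
a sketch, the function below (slurp is a made-up name) uses it to
collect the rest of an input stream into a string.  It assumes the
standard <sstream> header is available.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Copy everything remaining on "in" into a string by inserting
// the input stream's buffer into an output stream.
std::string slurp(std::istream& in)
{
        assert(in.rdbuf() != 0);        // need a buffer to read from

        std::ostringstream out;
        out << in.rdbuf();      // the copy loop runs inside operator<<
        return out.str();
}
```

Because the characters are moved at the buffer level, whitespace and
newlines come through unchanged.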

Finally, how about code for copying standard input to output:

        #include <iostream.h>

        int main()
        {
                char c;

                while (cin >> c)
                        cout << c;

                return 0;
        }

If you run this program on text input, you will notice that the
output's pretty jumbled.  This is because by default whitespace is
skipped on input.  To fix this problem, you can say:

        #include <iostream.h>

        int main()
        {
                char c;

                cin.unsetf(ios::skipws);

                while (cin >> c)
                        cout << c;

                return 0;
        }

to disable the skipws flag.  This program does not, however, work with
binary files.  Making it work raises a tricky issue: binary mode is
specified when a file is opened, and in this example standard input
and output are already open.  This ties in with low-level buffering
and reading the first chunk of a file when it's opened.  By contrast,
skipping whitespace is a higher-level operation in the stream I/O
library.
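
The effect of the skipws flag is easy to see with a string stream in
place of standard input.  In this sketch read_chars is a made-up name,
and the standard <sstream> header is assumed to be available.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Read "in" one character at a time with operator>>, optionally
// clearing the skipws flag first, and return what was read.
std::string read_chars(std::istream& in, bool skip_whitespace)
{
        assert(in.good());      // expects a stream ready for reading

        if (!skip_whitespace)
                in.unsetf(std::ios::skipws);

        std::string result;
        char c;

        while (in >> c)
                result += c;

        return result;
}
```

Given the input "a b\nc", the function returns "abc" when whitespace
is skipped and the full "a b\nc" when the flag has been cleared.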


USING C++ AS A BETTER C PART 8 - TYPE NAMES

In C, a common style of usage is to say:

        struct A {
                int x;
        };
        typedef struct A A;

after which A can be used as a type name to declare objects:

        void f()
        {
                A a;
        }

In C++, class, struct, union, and enum names are automatically type
names, so you can say:

        struct A {
                int x;
        };

        void f()
        {
                A a;
        }

or:

        enum E {ee};

        void f()
        {
                E e;
        }

By using the typedef trick you can follow a style of programming in C
somewhat like that used in C++.

But there is a quirk or two when using C++.  Consider usage like:

        struct A {
                int x;
        };

        int A;

        void f()
        {
                A a;
        }

This is illegal because the int declaration A hides the struct
declaration.  The struct A can still be used, however, by specifying
it via an "elaborated type specifier":

        struct A

The same applies to other type names:

        class A a;

        union U u;

        enum E e;

Taking advantage of this feature, that is, giving a class type and a
variable or function the same name, isn't very good usage.  It's
supported for compatibility reasons with old C code; C puts structure
tags (names) into a separate namespace, but C++ does not.  Terms like
"struct compatibility hack" and "1.5 namespace rule" are sometimes
used to describe this feature.
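
Putting the pieces together, here is a small sketch that compiles with
both names in use.  The function name sum_both and the values are
arbitrary choices for the example.

```cpp
struct A {
        int x;
};

int A = 5;              // hides the struct name A from here on

// "A a;" would not compile at this point, but the elaborated
// form "struct A a;" does
int sum_both()
{
        struct A a;     // refers to the struct
        a.x = 37;
        return a.x + A; // plain A refers to the int
}
```

Here sum_both() returns 42.  It shows the rule at work, but as noted
above it is not a style to imitate.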


ANSWER TO PROGRAMMING QUIZ

The question was about how many lines this program produces for output:

        #include <iostream.h>

        int main()
        {
                unsigned int n;

                n = 10;

                do {
                        cout << "xxx" << "\n";
                        n--;
                } while (n >= 0);

                return 0;
        }

The answer is an infinite number.  The loop never terminates because
it continues while the condition:

        n >= 0

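is true.  n is unsigned and thus cannot have a negative value, so the
condition always holds.  When n reaches 0, the decrement:

        n--

wraps around to the largest unsigned int value rather than going
negative, so neither the initial value of n nor changes to it matter.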
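
One way to repair the loop is to test for zero before decrementing.
The sketch below is only one of several possible fixes; count_down and
zero_wraps_to_max are made-up names, and the iteration counter stands
in for the output statement of the quiz program.

```cpp
#include <cassert>
#include <climits>

// Run the loop body for n = start, start-1, ..., 0 and return
// the number of iterations.
unsigned int count_down(unsigned int start)
{
        unsigned int iterations = 0;
        unsigned int n = start;

        for (;;) {
                iterations++;   // stands in for the output statement
                if (n == 0)
                        break;  // stop before n-- could wrap around
                n--;
        }

        assert(n == 0);
        return iterations;
}

// Decrementing an unsigned zero wraps to the largest value.
bool zero_wraps_to_max()
{
        unsigned int n = 0;
        n--;
        return n == UINT_MAX;
}
```

With a starting value of 10, count_down() runs the body 11 times, once
for each value from 10 down to 0, and then stops.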


ACKNOWLEDGEMENTS

Thanks to Nathan Myers, Eric Nagler, Terry Rudd, Jonathan Schilling,
and Clay Wilson for help with proofreading.


SUBSCRIPTION INFORMATION / BACK ISSUES

To subscribe to the newsletter, send mail to majordomo@world.std.com
with this line as its message body:

subscribe c_plus_plus

Back issues are available via FTP from:

        rmii.com /pub2/glenm/newslett

or on the Web at:

        http://www.rmii.com/~glenm

There is also a Java newsletter.  To subscribe to it, say:

subscribe java_letter

using the same majordomo@world.std.com address.

-------------------------

Copyright (c) 1996 Glen McCluskey.  All Rights Reserved.

This newsletter may be further distributed provided that it is copied
in its entirety, including the newsletter number at the top and the
copyright and contact information at the bottom.

Glen McCluskey & Associates
Professional C++ Consulting
Internet: glenm@glenmccl.com
Phone: (800) 722-1613 or (970) 490-2462
Fax: (970) 490-2463
FTP: rmii.com /pub2/glenm/newslett (for back issues)
Web: http://www.rmii.com/~glenm
