Issue #004
January, 1996


Contents:

Introduction to C++ Namespaces Part 4 - using declarations
Clarification of New/Delete Example in Newsletter #003
Using C++ as a Better C Part 4 - Declaration Statements
C++ and Java
Writing Robust C++ Code Part 2 - Constructors and Integrity Checking
Performance - Declaration Statements


This is the fourth issue of the C++ newsletter.  If you have comments
or suggestions, please send them to glenm@glenmccl.com.  Back issues
are available via FTP or the Web (see below).


INTRODUCTION TO C++ NAMESPACES - PART 4

Previously we talked about how namespaces can be used to group names
into logical divisions:

        namespace Vendor1 {
                class String {
                        ...
                };
                int x;
        }

        namespace Vendor2 {
                class String {
                        ...
                };
                int x;
        }

and how those names can be accessed via qualification:

        Vendor2::String s;

or a using directive:

        using namespace Vendor2;

        String s;

Another way of accessing names is to employ a using declaration:

        using Vendor2::String;

        String s;

This can be a little confusing.  A using directive:

        using namespace X;

says that all the names in namespace X are available for use, but none
of them are actually declared or introduced.  A using declaration, on
the other hand, actually introduces a name into the current scope.  So
saying:

        using namespace Vendor2;

makes String and x available for use, but doesn't declare them.
Saying:

        using Vendor2::String;

actually introduces Vendor2::String into the current scope as a
declaration.  Saying:

        using Vendor1::String;
        using Vendor2::String;

will trigger a "duplicate declaration" compiler error, because both
declarations introduce the name String into the same scope.
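
The difference can be seen in a small sketch.  The member function
id(), the initial values of x, and the two helper functions are made
up for illustration and are not part of the earlier examples:

```cpp
#include <cassert>

namespace Vendor1 {
    struct String { int id() const { return 1; } };
    int x = 10;
}

namespace Vendor2 {
    struct String { int id() const { return 2; } };
    int x = 20;
}

// Using directive: Vendor2's names become visible here, but nothing
// is declared in this scope, so a local x is still allowed.
int with_directive()
{
    using namespace Vendor2;

    String s;                   // Vendor2::String
    int x = 5;                  // hides Vendor2::x, no conflict
    return s.id() + x;          // 2 + 5
}

// Using declaration: Vendor1::String is declared in this scope.
int with_declaration()
{
    using Vendor1::String;
    // using Vendor2::String;   // error if uncommented: String is
                                // already declared in this scope

    String s;                   // Vendor1::String
    return s.id() + Vendor1::x; // 1 + 10
}
```

Calling with_directive() returns 7 and with_declaration() returns 11.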

There are several other aspects of using declarations that are worth
learning about;  these can be found in a good C++ reference book.


CLARIFICATION OF NEW/DELETE EXAMPLE IN NEWSLETTER #003

In the previous issue of the newsletter, there was an example:

        int* ip;

        ip = new int[100];

        delete ip;

This code will appear to work with many compilers, but it is incorrect
(formally, its behavior is undefined) and should instead read:

        int* ip;

        ip = new int[100];

        delete [] ip;

This is an area of C++ that has changed several times in recent
years.  There are a number of issues to note.  The first is that new
and delete in C++ each do more than one job.  The new operator
allocates storage, just like malloc() in C, but it is also responsible
for calling the constructor for any class object that is being
allocated.  For example, if we have a String class, saying:

        String* p = new String("xxx");

will allocate space for a String object, and then call the constructor
to initialize the String object to the value "xxx".  In a similar way,
the delete operator arranges for the destructor to be called for an
object, and then the space is deallocated in a manner similar to the C
function free().

If we have an array of class objects, as in:

        String* p = new String[100];

then a constructor must be called for each array slot, since each is a
class object.  Typically this processing is handled by a C++ internal
library function that iterates over the array.

In a similar way, deallocation of an array of class objects can be
done by saying:

        delete [] p;

It used to be that you had to say:

        delete [100] p;

but this feature is obsolete.  The size of the array is recovered by
the library function that implements the delete operator for arrays.
The pointer/size pair can be stored in an auxiliary data structure or
the size can be stored in the allocated block before the first actual
byte of data.
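
To see the per-element constructor and destructor calls, here is a
minimal sketch.  The class Counted and the function
live_after_array_new() are hypothetical names used only for this
illustration:

```cpp
#include <cassert>

// Counts how many Counted objects currently exist.
struct Counted {
    static int live;
    Counted()  { ++live; }
    ~Counted() { --live; }
};

int Counted::live = 0;

// Allocates an array of n objects, records how many constructors
// have run, then releases the array with the array form of delete.
int live_after_array_new(int n)
{
    Counted* p = new Counted[n];    // one constructor call per slot
    int seen = Counted::live;
    delete [] p;                    // one destructor call per slot
    return seen;
}
```

Calling live_after_array_new(100) returns 100, and Counted::live is
back to 0 afterward, since delete [] ran a destructor for every slot.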

What makes this a bit tricky is that all of this work of calling
constructors and destructors doesn't matter for fundamental data types
like int:

        int* ip;

        ip = new int[100];

        delete ip;

This code will work in many cases, because there are no destructors to
call, and deleting a block of storage works pretty much the same
whether it's treated as an array of ints or a single large chunk of
bytes.

But more recently, the ANSI standardization committee has decided to
break out the new and delete operators for arrays as separate
functions, so that a program can control the allocation of arrays
separately from other types.  For example, you can say:

        #include <stddef.h>

        void* operator new(size_t) {/* ... */ return 0;}

        void* operator new[](size_t) {/* ... */ return 0;}

        void f()
        {
                int* ip;

                ip = new int;           // calls operator new()

                ip = new int[100];      // calls operator new[]()
        }

and the appropriate functions will be called in each case.  This is
kind of like defining your own versions of the malloc() and free()
library functions in C.
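
The same split can be tried out without touching the global
operators.  The sketch below defines the operators as members of a
class instead, which is an assumption made here so that the example
does not affect allocation elsewhere in a program.  The class Tracked,
its counters, and the function exercise() are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

struct Tracked {
    int value;

    static int scalar_calls;    // times operator new was called
    static int array_calls;     // times operator new[] was called

    static void* operator new(std::size_t n)
    {
        ++scalar_calls;
        return allocate(n);
    }
    static void* operator new[](std::size_t n)
    {
        ++array_calls;
        return allocate(n);
    }
    static void operator delete(void* p)   { std::free(p); }
    static void operator delete[](void* p) { std::free(p); }

private:
    // Shared helper: malloc-based storage, throws on failure.
    static void* allocate(std::size_t n)
    {
        void* p = std::malloc(n);
        if (!p)
            throw std::bad_alloc();
        return p;
    }
};

int Tracked::scalar_calls = 0;
int Tracked::array_calls = 0;

void exercise()
{
    Tracked* one = new Tracked;         // calls Tracked::operator new
    Tracked* many = new Tracked[100];   // calls Tracked::operator new[]

    delete one;
    delete [] many;
}
```

After one call to exercise(), each counter holds 1, showing that the
single-object and array forms were routed to different functions.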


USING C++ AS A BETTER C - PART 4

In C, when you write a function, all the declarations of local
variables must appear at the top of the function or at the beginning
of a block:

        void f()
        {
                int x;
                /* ... */
                while (x) {
                        int y;
                        /* ... */
                }
        }

Each such variable has a lifetime that corresponds to the lifetime of
the block it's declared in.  So in this example, x is accessible
throughout the whole function, and y is accessible inside the while
loop.

In C++, declarations of this type are not required to appear only at
the top of the function or block.  They can appear wherever C++
statements are allowed:

        class A {
        public:
                A(double);
        };
        void f()
        {
                int x;
                /* ... */
                while (x) {
                        /* ... */
                }
                int y;
                y = x + 5;
                /* ... */
                A aobj(12.34);
        }

and so on.  Such a construction is called a "declaration statement".
The lifetime of a variable declared in this way is from the point of
declaration to the end of the block.
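
A small sketch can make the lifetime rule visible.  The class Noisy,
the global trace string, and the function demo_lifetimes() are
invented for this illustration.  Each object appends a lowercase
letter when constructed and the matching uppercase letter when
destroyed:

```cpp
#include <cassert>
#include <string>

std::string trace;      // records the order of events

struct Noisy {
    char tag;
    Noisy(char t) : tag(t) { trace += tag; }
    ~Noisy() { trace += char(tag - 'a' + 'A'); }
};

void demo_lifetimes()
{
    trace += '1';
    Noisy a('a');               // a exists from here to end of function
    {
        trace += '2';
        Noisy b('b');           // b exists from here to end of this block
        trace += '3';
    }                           // b destroyed here
    trace += '4';
}                               // a destroyed here
```

After a call to demo_lifetimes(), trace holds "1a2b3B4A": each object
comes into being at its declaration statement, not at the top of the
block, and goes away at the end of the block that contains it.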

A special case is used with for statements:

        for (int i = 1; i <= 10; i++)
                ...

        /* i no longer available */

In this example the scope of i is the for statement.  The rule about
the scope of such variables has changed fairly recently as part of the
ANSI standardization process, so your compiler may have different
behavior.
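
One way to check which rule a compiler follows is to declare the same
loop variable in two consecutive for statements, as in this sketch
(the function name sum_twice() is made up).  Under the rule described
above this is legal, because the first i is gone by the time the
second is declared.  A compiler using the older rule would report a
redeclaration of i:

```cpp
#include <cassert>

int sum_twice()
{
    int total = 0;

    for (int i = 1; i <= 10; i++)
        total += i;

    for (int i = 1; i <= 10; i++)   // a new, unrelated i
        total += i;

    return total;                   // 55 + 55
}
```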

Why are declaration statements useful?  One benefit is that
introducing variables with shorter lifetimes tends to reduce errors.
You've probably encountered very large functions in C or C++ where a
single variable declared at the top of the function is used and reused
over and over for different purposes.  With the C++ feature described
here, you can introduce variables only when they're needed.

Another benefit is given below, in the section on performance.


C++ AND JAVA

You may have heard recently of the programming language Java, being
pushed by Sun Microsystems as the language for Internet programming on
the World Wide Web.  There is a lot of hoopla about this at present.
However, it's interesting to look at Java simply as a language,
divorced from its Internet context.  C++ was based on C, and Java is
based at least in part on C++.

Giving a detailed comparison of the languages is beyond the scope of
the newsletter, but if you wish to find out more, there are several
places to look.  Sun has a Web site:

        http://java.sun.com

with useful information in it, and an anonymous FTP site as well:

        java.sun.com

Another Web site with pointers to many Java resources is:

        http://www.gamelan.com

I've also looked at the book "Java!" by Tim Ritchey, which appears to
have a lot of useful information that gives some context to the
language and its use.  There are many more Java books in the works
that will be appearing in the next few months.


WRITING ROBUST C++ CODE - PART 2

Imagine that you want to devise a way to represent calendar dates for
use in a C program.  You come up with a struct:

        struct Date {
                int month;
                int day;
                int year;
        };

and a program using the Date struct can initialize a struct like so:

        struct Date d;

        d.month = 9;
        d.day = 25;
        d.year = 1956;

And you devise various functions, for example one to compute the
number of days between two dates:

        long days_b_dates(struct Date* d1, struct Date* d2);

This approach can work pretty well.  

But what happens if someone says:

        struct Date d;

        d.month = 9;
        d.day = 31;
        d.year = 1956;

and then calls a function like days_b_dates()?  The date in this
example is invalid, because month 9 (September) has only 30 days.
Once an invalid date is introduced, functions that use the date will
not work properly.  In C, one way to deal with this problem would be
to have a function to do integrity checking on each Date pointer
passed to a function like days_b_dates().
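
Such a checking function might look like the sketch below, written in
the C-compatible subset of C++.  The names date_is_valid() and
is_leap() are hypothetical, and the sketch checks only the month and
day ranges, placing no limit on the year:

```cpp
#include <cassert>

struct Date {
    int month;
    int day;
    int year;
};

/* return 1 if y is a leap year, else 0 */
static int is_leap(int y)
{
    return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
}

/* return 1 if *d is a valid calendar date, else 0 */
int date_is_valid(const struct Date* d)
{
    static const int days_in_month[12] = {31, 28, 31, 30, 31, 30,
        31, 31, 30, 31, 30, 31};

    if (d == 0)
        return 0;
    if (d->month < 1 || d->month > 12)
        return 0;
    if (d->day < 1)
        return 0;
    if (d->day <= days_in_month[d->month - 1])
        return 1;
    /* the only remaining valid case is February 29 in a leap year */
    return d->month == 2 && d->day == 29 && is_leap(d->year);
}
```

The drawback is that every function taking a Date pointer has to
remember to call date_is_valid() before trusting the fields.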

In C++, a simpler and cleaner approach is to use a constructor to
ensure the validity of an object.  A constructor is a function called
when an object is created.  So I could say:

        #include <assert.h>

        class Date {
                int month;
                int day;
                int year;
                static int isleap(int);
        public:
                Date(int, int, int);
        };

        const char days_in_month[12] = {31, 28, 31, 30, 31, 30,
            31, 31, 30, 31, 30, 31};

        // return 1 if year is a leap year, else 0
        int Date::isleap(int y)
        {
                if (y % 4)
                        return 0;
                if (y % 100)
                        return 1;
                if (y % 400)
                        return 0;
                return 1;
        }

        // constructor for Date class
        Date::Date(int m, int d, int y)
        {
                assert(m >= 1 && m <= 12);
                assert(d >= 1);
                assert(y >= 1800 && y <= 2099);
                assert(d <= days_in_month[m-1] ||
                    (m == 2 && d == 29 && isleap(y)));

                month = m;
                day = d;
                year = y;
        }

        Date d(9, 25, 1956);

This logic does a complete check of the date, ensuring that a Date
object has internal integrity.  (Note that assert() is compiled out
when NDEBUG is defined, so the checks are active only in builds that
leave assertions enabled.)  The three data members of the Date object
are private to the class, meaning that an arbitrary user of a Date
object cannot change them, and instead must rely on the constructor
for setting the value of a Date object.


PERFORMANCE

Suppose that you have a function to compute factorials (1 x 2 x ... x N):

        double fact(int n)
        {
                double f = 1.0;
                int i;
                for (i = 2; i <= n; i++)
                        f *= (double)i;
                return f;
        }

and you need to use this factorial function to initialize a constant
in another function, after doing some preliminary checks on the
function parameters to ensure that all are greater than zero.  In C
you can approach this a couple of ways.  In the first, you would say:

        /* return -1 on error, else 0 */
        int f(int a, int b)
        {
                const double f = fact(25);

                if (a <= 0 || b <= 0)
                        return -1;

                /* use f in calculations */

                return 0;
        }

This approach does an expensive computation each time, even under
error conditions.  A way to avoid this would be to say:

        /* return -1 on error, else 0 */
        int f(int a, int b)
        {
                const double f = (a <= 0 || b <= 0 ? 0.0 : fact(25));

                if (a <= 0 || b <= 0)
                        return -1;

                /* use f in calculations */

                return 0;
        }

but the logic is a bit tortuous.  In C++, using declaration
statements (see above), this problem can be avoided entirely, by
saying:

        /* return -1 on error, else 0 */
        int f(int a, int b)
        {
                if (a <= 0 || b <= 0)
                        return -1;

                const double f = fact(25);

                /* use f in calculations */

                return 0;
        }
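
To confirm that the expensive call really is skipped on the error
path, a call counter can be added, as in this sketch.  The counter
fact_calls, the function name compute(), and the out parameter are
additions made for illustration, and the multiplication is only a
stand-in for the real calculations:

```cpp
#include <cassert>

static int fact_calls = 0;      // how many times fact() has run

double fact(int n)
{
    ++fact_calls;
    double f = 1.0;
    for (int i = 2; i <= n; i++)
        f *= (double)i;
    return f;
}

/* return -1 on error, else 0 and store a result in *out */
int compute(int a, int b, double* out)
{
    if (a <= 0 || b <= 0)
        return -1;

    const double f = fact(25);  // reached only for valid arguments

    *out = f * a * b;           // stand-in for the real calculations

    return 0;
}
```

A call such as compute(-1, 2, &r) returns -1 and leaves fact_calls at
0, while a following call to compute(1, 2, &r) returns 0 and raises
fact_calls to 1.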

-------------------------

Copyright (c) 1996 Glen McCluskey.  All Rights Reserved.

This newsletter may be further distributed provided that it is copied
in its entirety, including the newsletter number at the top and the
copyright and contact information at the bottom.

Glen McCluskey & Associates
Professional C++ Consulting
Internet: glenm@glenmccl.com
Phone: (970) 490-2462
Fax: (970) 490-2463
FTP: rmii.com /pub2/glenm/newslett (for back issues)
Web: http://www.rmii.com/~glenm
