
What Is C++? C++ is a general-purpose, platform-neutral programming language that supports object-oriented programming and other useful programming paradigms, including procedural programming, object-based programming, generic programming, and functional programming. C++ is viewed as a superset of C, and thus offers backward compatibility with this language. This reliance on C provides important benefits:

Reuse of legacy C code in new C++ programs
Efficiency
Platform neutrality
Relatively quick migration from C to C++

Yet it also incurs certain complexities and ailments such as manual memory management, pointers, unchecked array bounds, and a cryptic declarator syntax, as described in the following sections. As opposed to many other programming languages, C++ doesn't have versions. Rather, it has an International ANSI/ISO Standard, ratified in 1998, that defines the core language, its standard libraries, and implementation requirements. The C++ Standard is treated as a skeleton on which vendors might add their own platform-specific extensions, mostly by means of code libraries. However, it's possible to develop large-scale applications using pure standard C++, thereby ensuring code portability and easier maintenance. The primary reason for selecting C++ is its support of object-oriented programming. Yet even as a procedural programming language, C++ is considered an improvement over ANSI C in several aspects. C programmers who prefer for various reasons not to switch to object-oriented programming can still benefit from the migration to C++ because of its tighter type-safety, strongly typed pointers, improved memory management, and many other features that make a programmer's life easier. Let's look at some of these improvements more closely:

Improved memory management. In C, you have to call library functions to allocate storage dynamically and release it afterwards. But C++ treats dynamically allocated objects as first-class citizens: it uses the keywords new and delete to allocate and deallocate objects dynamically.

User-defined types are treated as built-in types. For example, a struct or a union's name can be used directly in declarations and definitions just as a built-in type:

struct Date { int day; int month; int year; };

Date d;                 //In C, 'struct' is required before Date
void func(Date *pdate); //ditto
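As a quick illustration of the new/delete point above, here is a minimal sketch (my own example, not from the original text) that uses the Date type just defined:

//sketch: new and delete with a user-defined type
Date *pd = new Date;       //allocate a single Date on the free store
pd->day = 1;
delete pd;                 //release it

Date *parr = new Date[10]; //array form
delete [] parr;            //must use the matching array form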

Pass-by-reference. C has two types of argument passing: by address and by value. C++ defines a third argument-passing mechanism: passing by reference. When you pass an argument by reference, the callee gets an alias of the original object and can modify it. In fact, references are rather similar to pointers in their semantics; they're efficient because the callee doesn't get a copy of the original variable, but rather a handle that's bound to the original object. Syntactically, however, references look like variables that are passed by value. Here's an example:

Date date;

void func(Date &date_ref); //func takes Date by reference
func(date);                //date is passed by reference, not by value

Default argument values. C++ allows you to declare functions that take default argument values. When the function call doesn't provide such an argument, the compiler automatically inserts its respective default value into the function call. For example:

void authorize(const string & username, bool log=true);
authorize(user); //equivalent to: authorize(user, true);

In the function call above, the programmer didn't provide the second argument. Because this argument has a default value, the compiler silently inserted true as a second argument.

Mandatory function prototypes. In classic C, functions could be called without being previously declared. In C++, you must either declare or define a function before calling it. This way, the compiler can check the type and number of each argument. Without mandatory prototypes, passing arguments by reference or using default argument values wouldn't be possible because the compiler must replace the arguments with their references or add the default values as specified in the prototype.
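A short sketch of my own (hypothetical names, assuming std::string) showing why the visible prototype matters for default arguments:

#include <string>

void authorize(const std::string &username, bool log = true); //prototype must be visible here

void session(const std::string &user)
{
 authorize(user);        //compiler consults the prototype and inserts 'true' for 'log'
 authorize(user, false); //second argument supplied explicitly
}

void authorize(const std::string &username, bool log) { /*...*/ } //default value not repeated here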

A well-formed C++ program must contain a main() function and a pair of matching braces:

int main() {}

Though perfectly valid, this program doesn't really do anything. To get a taste of C++, let's look at a more famous example:

#include <iostream>
int main()
{
 std::cout << "Hello World!" << std::endl;
}

C++ lets you overload most of its built-in operators for user-defined types, including the assignment and compound assignment operators such as =, +=, <<= and >>=.

In addition, the following operators can be overloaded both in their unary and binary forms:

+

-

*

&
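For instance, here is a minimal sketch (my own example, not from the text) that overloads one of these operators, -, in both its binary and unary forms for a user-defined type:

//sketch: the same operator overloaded in binary and unary form
struct Money
{
 long cents;
 Money(long c = 0) : cents(c) {}
};

Money operator - (const Money &a, const Money &b) { return Money(a.cents - b.cents); } //binary -
Money operator - (const Money &m)                 { return Money(-m.cents); }          //unary -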

Derived classes inherit the overloaded operators of their base classes in the same manner as ordinary member functions are inherited (the assignment operator is an exception). An overloaded operator must take at least one argument of a user-defined type. This rule ensures that users cannot alter the meaning of expressions that contain only fundamental types. For example:

int i,j,k;
k = i + j; //always uses built-in = and +

An overloaded operator extends a built-in one, so you cannot introduce new operators into the language. Neither the precedence nor the number of arguments of an operator can be altered. For example, an overloaded && must have exactly two arguments, as does the built-in && operator, and it keeps the built-in operator's precedence.

A partial specialization of a class template must appear after the primary template. The following partial specialization of Vector defines an overriding implementation that the compiler uses for every template argument that is a pointer type (the definition of the primary template is repeated for convenience):

template <class T> class Vector //primary
{
public:
 size_t length() const;
 void push_back(const T &r);
 //..
};

template <class T> class Vector<T*> //partial specialization
{
public:
 void push_back(const T *p);
 size_t length() const {return vecp.size();}
private:
 std::vector<T*> vecp;
};

A partial specialization looks like a primary template, except that it contains a set of template arguments (T* in this example) after the class name, right before the opening brace. Consequently, whenever you instantiate a Vector whose template argument is a pointer, the compiler will choose the partial specialization instead of the primary template. It's possible to define a set of partial specializations, each applying to a different subset of template arguments. For example, you can define a partial specialization for all pointer types using T* as the argument list, and another for pointers to volatile objects (volatile T*), which are used in event-driven systems:

template <class T> class Vector<volatile T*> //another partial specialization
{
public:
 void push_back(const volatile T *p);
 //..
};

And so on. The partial specializations must be ordered: a least specialized version should be defined before a more specialized version. Intuitively, we know what "a more specialized version" means. However, there is also a formal definition: a partial specialization X is more specialized than a partial specialization Y if every valid template argument for X is also a valid argument for Y, but not vice versa. Let's see how this formalization works:

Vector<T>           //primary #1
Vector<T*>          //partial specialization #2
Vector<volatile T*> //another partial specialization #3

Let's use three arguments: int, int *, and volatile int * to instantiate Vectors:

Vector<int> vi;
Vector<int *> vpi;
Vector<volatile int *> vpiv;

vi uses the primary template because the argument int is not a pointer type. Notice that if we hadn't defined the partial specializations, the compiler would have picked the primary template for the pointer arguments as well, substituting 'int *' or 'volatile int *' for T. vpi uses partial specialization #2. What about vpiv? Here, the argument matches both partial specializations #2 and #3, but the compiler picks the Vector<volatile T*> partial specialization because it's a better match for the argument's type: pointers to volatile objects are a subset of all pointer types, and while every pointer to a non-volatile object can be converted to volatile T*, the reverse isn't true. Therefore, <volatile T*> is more specialized than <T*>.
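To see these matching rules in action, here is a self-contained sketch (my own illustration, using a small helper template named Which rather than Vector) that reports which version the compiler selects:

#include <iostream>

template <class T> struct Which              { static const char *name() { return "primary"; } };
template <class T> struct Which<T*>          { static const char *name() { return "T*"; } };
template <class T> struct Which<volatile T*> { static const char *name() { return "volatile T*"; } };

int main()
{
 std::cout << Which<int>::name() << '\n';           //primary
 std::cout << Which<int*>::name() << '\n';          //T*
 std::cout << Which<volatile int*>::name() << '\n'; //volatile T*
}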

Explicit Specializations of a Class Template An explicit specialization of a class template provides an alternative definition of the primary template. It's used instead of the primary definition (or a partial specialization) if the arguments match those that are given in the explicit specialization. When is an explicit specialization useful? Consider the Vector template: The code generated by the compiler for the specialization Vector<bool> is very inefficient. Instead of storing every Boolean value in a single bit, it occupies at least an entire byte. Following is an example of an explicit specialization Vector<bool> that manipulates bits rather than bytes:

template <> class Vector<bool> //explicit specialization
{
private:
 size_t sz;
 unsigned char * buff;
public:
 explicit Vector(size_t s = 1) : sz(s), buff (new unsigned char [(sz+7U)/8U] ) {}
 Vector (const Vector & v);
 Vector& operator= (const Vector& v);
 ~Vector();
 //...other member functions
};

Vector<bool> bits(8);
bits[0] = true;         //assign
bool seventh = bits[7];

The template <> prefix indicates an explicit specialization of an existing primary template. The template arguments for an explicit specialization are specified in the second pair of angle brackets -- those that immediately follow the class name. The specialization hierarchy of Vector that has been defined thus far is as follows:

template <class T> class Vector              //primary template #1
{...};
template <class T> class Vector<T*>          //partial specialization #2
{...};
template <class T> class Vector<volatile T*> //partial specialization #3
{...};
template <> class Vector<bool>               //explicit specialization #4
{};

An explicit specialization differs from a partial specialization in that it applies to a specific template argument, say bool, whereas the latter applies to a set of template arguments -- pointers, for example. An explicit specialization should be declared after the primary template and its partial specializations. NOTE As a side note, it's worth mentioning that the Standard Template Library already defines a specialization of std::vector<bool> for manipulating bits optimally. Alas, during the past five years, std::vector<bool> has fallen from grace in the eyes of the C++

standardization committee as well as other people, for various reasons that are discussed in The Standard Template Library (STL) and Generic Programming. A function template declaration contains the keyword template, followed by a list of template parameters and a function declaration. As opposed to ordinary functions, which are usually declared in a header file and defined in a separate .cpp file (unless the template is exported), a function template is defined inside a header file (which means that it may be #included in multiple compilation sessions):

//file max.h
template <class T> T max( T t1, T t2) //definition in a header file
{
 return (t1 > t2) ? t1 : t2;
}

Unlike class template parameters, function template parameters are implicitly deduced from the types of their matching arguments. In the following example, the compiler instantiates three distinct specializations of max(), according to the types of the arguments used in the place of instantiation:

int i = 0, j = 8;
char c = 'a', d = 'z';
std::string s1 = "first", s2 = "second";
int nmx = max(i, j);            // int max(int, int);
char cmx = max(c, d);           // char max(char, char);
std::string smax = max(s1, s2); // std::string max(std::string, std::string);

This implementation of max() isn't ideal, to say the least. The arguments and the return type are passed by value. When the template argument is a class object, say std::string, this causes the creation of multiple temporaries. Therefore, it's better to pass template parameters by reference. As usual, if the template function doesn't modify its arguments, pass them as reference to const:

template <class T> const T& max( const T & t1, const T & t2) //definition
{
 return (t1 > t2) ? t1 : t2;
}

In many cases, the implicit type deduction of a function template is satisfactory. However, when the function template takes no arguments or when you want to override the default type deduction, it is possible to specify the template's arguments explicitly as follows:

max<int> (true, false); //override the default deduction of type bool

template <class T> bool func() //no arguments
{
 T t1, t2;
 return t1==t2;
}

//enforce a compile-time constraint: "T must be a comparable type"
bool res=func<int>();    //explicit argument specification
res=func<std::string>(); //ditto

Overloading and Partial Specialization As opposed to class templates, a function template doesn't support partial specializations; instead, you use overloading. Let's look at an example:

const char *p1="string1";
const char *p2="string2";
const char *maxstr = max(p1, p2); //bad!

The max() call causes the compiler to generate the following specialization:

const char * max(const char *, const char *);

The problem is that this specialization compares pointer values, instead of performing a lexicographical comparison. To override this default behavior, you can add an overloaded version of max() that will be used for arguments of type char * (and their cv-qualified flavors):

#include <cstring>

const char * max(const char *p1, const char *p2)
{
 int val=std::strcmp(p1, p2);
 if (val>0)
  return p1;
 return p2;
}

int main()
{
 double d=max(5.0, 6.0);     //1
 const char * p=max(p1, p2); //2
}

The first max() call in main() uses the definition of the primary function template max(), generating the following specialization from it:

const double& max(const double&, const double&);

However, the second max() call uses the const char* overloaded version.

It looks as if we're splitting hairs here. Whether it's called partial specialization or overloading, the compiler picks the right version of the function, so is there really any difference between the two mechanisms? Yes, there is. In fact, there's more than one. Partial specialization doesn't introduce a new template; it merely extends an existing template. However, an overloaded version introduces a new function, completely independent of any other overloads. When the compiler has to choose which template version to use, it considers all of the overloaded versions together, using the overload resolution rules to select the best fit. By contrast, when a class template is looked up, only the primary template is considered first. Then, if the compiler finds a partial specialization of that template with template arguments that match the instantiation, the partial specialization's definition is used. There are other differences between partial specialization and overloading:

You can specialize a member template of a class (i.e., a member function that is a template) without changing that class' definition. However, it's impossible to add an overloaded version of a member function without editing the class' definition.
When selecting templates, the argument matching rules are stricter than those of function overloading. Therefore it is possible to define specializations that rely on subtle parameter differences, such as T* vs. const T*.
When you declare a function template f(T) as a friend, all instances thereof, say f<int>, f<char> etc., are granted the same privileges. However, you have to declare each overloaded version as a friend.

Considering these differences, the proposal to add partial specialization of function templates to C++ seems reasonable. However, permitting both partial specializations and overloading of function templates could lead to various ambiguity problems. At present, this issue is still being reviewed by the standardization committee.

Member Templates Two terms in the C++ template jargon, template member and member template, sound confusingly similar. However, they aren't interchangeable. A template member is a member declared inside a class template, for example:

template <class T> class C
{
 T * p;             // a template member
 void f(T &) const; // another template member
};

By contrast, a member template is a template declared inside a class or a class template. Templates and Implicit Conversions Member templates are useful for solving an onerous problem: the lack of implicit conversions between different specializations of the same class template. Let's look at an edifying example. In the following code fragment, the implicit conversion between numeric built-in types enables the programmer to use two variables of different types within the same expression:

int num=1;
double dn=num; //1 implicitly converted to 1.0
if (num==dn)   //same here

These implicit conversions are trivial so you rarely pay attention to the fact that the compiler actually converts int to double here. However, when templates are involved, such implicit conversions don't exist:

vector<int> ai;
vector<double> ad;
ai.push_back(10);
ad=ai; //compilation error

You can't assign vector<int> to vector<double> because there is no standard conversion between these two independent specializations. In most cases, this is well and good.

An auto_ptr object behaves much like an ordinary pointer: it overloads the operators -> and * while performing useful tasks under the hood such as allocating and deallocating storage automatically. In this regard, auto_ptr is an example of a smart pointer, although the restrictions on its usage usually drive programmers to either roll their own smart pointers or use third-party class libraries that offer more functionality. That said, auto_ptr excels at what it was originally designed for, namely simplifying the interaction of dynamic memory management and exception handling. When used properly, it can save you from a great deal of bugs and code maintenance problems.

The ability to extend the type system of a programming language by defining new types is called abstract data typing, or data abstraction for short. While classes are a common data abstraction mechanism, they are not the only one. In the following sections I will explain the properties of enum types, another data abstraction mechanism, and demonstrate some of their advanced uses. Properties of enum Types An enum is a set of integral constants with symbolic names called enumerators. For example:

enum ProcessState //ProcessState is an enum tag
{
 //enumerators:
 stopped,
 resumed,
 zombie,
 running
};

By default, the first enumerator's value is 0. Each consecutive enumerator is incremented by 1. Therefore, the enumerators stopped, resumed, zombie, and running have the values 0, 1, 2, and 3, respectively. You may override the default values, like this:

enum FlightMenu
{
 standard=100,
 vegetarian=200,
 vegetarian_ovo_lacto, //default, equals 201
 vegetarian_low_fat,   //equals 202
 diabetic=300,
 kosher=400
};

enums are strongly-typed. The only values you may assign to an enum variable are its enumerators:

ProcessState state=stopped; //fine
state=1;                    //error, type mismatch
state=kosher;               //error, type mismatch

Usually, the actual values of enumerators are immaterial. For instance, the following switch statement will function properly even if you change the values of FlightMenu's enumerators:

switch(menu)
{
 case standard:
  std_meals++;
  break;
 case vegetarian:
  veg++;
  break;
 //...all the rest
 default:
  cout << "...";
}

For a class D2 derived from D, where D derives from A and B (in that order), the construction order of d2 is A->B->D->D2 because A and B are considered "deeper" base classes than D: they appear on the base specifier list of D, which is itself a base class of D2. Fig. 2 shows the output of d2's construction order:

Figure 2

Notice that the order of destruction is always the opposite of the construction order; when d2 is destroyed, the destructors are called in the following order: D2->D->B->A (see fig. 3):

Figure 3

If you change the base specifier of D to:

struct D: public B, public A {int x;};

The construction order of d2 would change accordingly to B->A->D->D2, as would the destruction order (see fig. 4):

Figure 4

Construction and Destruction of Virtual Inheritance When virtual inheritance (VI) is used, things become slightly more complicated. Remember that the raison d'être of VI is to ensure that a single subobject of a virtual base class is present in the entire derivation

lattice. Most compilers use some form of indirection, namely storing a pointer to the virtual base subobject, to ensure this. To guarantee that all handles or pointers indeed point to the same virtual subobject, it must be constructed before any non-virtual base classes listed on the same base specifier list. In other words, the same ordering rules that I presented in the previous section still apply, except that in the presence of VI, virtual bases have a higher precedence. For example, if you change the base specifier list of D to:

struct D: public A, virtual public B //deepest base specifier list
{};
struct D2: public D //second base specifier list
{};
D2 d2;

The construction order of d2 will become B->A->D->D2. This is because the construction algorithm now works as follows:

Construct the virtual bases using the previous depth-first, left-to-right order of appearance. Since there's only one virtual base, B, it's constructed first.
Construct the non-virtual bases according to the depth-first, left-to-right order of appearance. The deepest base specifier list contains A. Consequently, it's constructed next.
Apply the same rule on the next depth level. Consequently, D is constructed.
Finally, D2's constructor is called.

The destruction order, as always, is the opposite of the construction order. Therefore, B's destructor will be the last to run. Fig. 5 shows the complete call chain of d2's bases' construction and destruction:

Figure 5

What happens if D becomes a virtual base class as well?

struct D: public A, virtual public B //deepest base specifier list
{};
struct D2: virtual public D
{};
D2 d2;

First, locate the virtual bases. This time we have B and D. Because B appears in a deeper base specifier list, it's constructed first. D should be constructed next because it's also a virtual base. However, because it has two base classes A and B, A is constructed next (you can't construct an object without constructing its base classes and embedded objects first). Once all the base classes of D have been constructed, D's construction takes place. The construction list now becomes B->A->D. Finally, the constructor of the most derived object, D2, is called. The result, amazing as it may seem, is the same as before. Making B a non-virtual base would change the construction order into A->B->D->D2. This is because A and B must be constructed before D. Summary The seemingly complicated rules of construction order in the presence of multiple inheritance aren't truly so unintuitive if you remember the following principles: virtual bases have the highest precedence. Depth comes next, and then the order of appearance: leftmost bases are constructed first. These rules aren't a C++-specific whim; in fact, every programming language that supports multiple interface inheritance uses more or less the same rules.
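To check these rules on your own compiler, here is a minimal, compilable sketch (my own example) that prints the construction and destruction order for the virtual-inheritance hierarchy above:

//sketch: trace construction/destruction order for the hierarchy discussed above
#include <iostream>

struct A { A() { std::cout << "A "; } ~A() { std::cout << "~A "; } };
struct B { B() { std::cout << "B "; } ~B() { std::cout << "~B "; } };
struct D : public A, virtual public B
{ D() { std::cout << "D "; } ~D() { std::cout << "~D "; } };
struct D2 : virtual public D
{ D2() { std::cout << "D2 "; } ~D2() { std::cout << "~D2 "; } };

int main()
{
 {
  D2 d2; //prints: B A D D2
 }       //then:   ~D2 ~D ~A ~B
 std::cout << std::endl;
}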

I've already discussed several flavors of operator new, including the ordinary versions of new and placement new. Standard C++ has another version of this operator, called nothrow new. In the following sections I will describe its raison d'être and usage. A Bit of History Until the early 1990s, operator new behaved very much like standard C's malloc(), as far as allocation failures were concerned. When it failed to allocate a memory block of the requested size, it would return a NULL pointer. In those days, a typical new expression would look like this:

//pre-standard C++
CWindow * p;
p = new DerivedWind;
if (!p) //NULL indicates a failure
{
 cout << "allocation failure";
 //..handle the error
}

A common style gaffe is passing arguments as pointers when references would do: the caller has to take the argument's address, and the callee has to use the -> operator to access its members. To add insult to injury, programmers who pass arguments as pointers often allocate the arguments dynamically, even when there's no reason to do so.

Declaring value arguments as const is a close relative of the first gaffe. Raise your hand if you haven't seen a hyper-corrective (yet utterly wrong) function declaration of this kind:

void doStuff(const int action_code);

Programmers who declare a value argument as const don't realize that by doing so, they expose themselves to mockery. Even if their aim is to document the fact that the function isn't allowed to modify its argument, declaring a value argument const is redundant and wrong. Remember: the const in this context applies to a local copy of the argument passed to the callee. It doesn't apply to the caller's argument. Hence, declaring value arguments as const is always a bad idea. Bottom line: never declare value arguments as const. Although it might seem harmless, it's plain wrong, and it might suggest that you don't understand the C++ argument passing mechanism at all. The "One Container Fits All" Syndrome Novices who've tasted a bit of STL find it exciting. Frankly, who doesn't? The problem is that the novices think that one container class can solve every programming task. I can't count how many times I've seen rookies looking for a "missing" operator [] in std::list. Worse yet, some of them even insist on rolling their own (buggy) list class, never forgetting to define operator [] as a member function! Bottom line: if you find the diversity of STL containers daunting at first, that's okay. You don't have to master all of them at once. However, at the minimum, you should familiarize yourself with std::vector and std::list. If you can't spare a few moments to study more than one container class, remember this rule of thumb: if it needs operator [], then it mustn't be std::list. Writing Redundant Code This category includes two common style blunders. Although they aren't dangerous, they clutter up the code, increase compilation time and make code maintenance more difficult than it should be. The empty constructor idiom. One of the first things C++ programmers learn is that every class has a constructor. To some extent, this is true. However, it doesn't mean every class must have a user-defined constructor. As a rule of thumb, if a constructor looks like this:

class Widget
{
 Widget() {} //empty braces and no member initialization list; useless
 //..
};

Then this constructor shouldn't be there in the first place. An empty constructor is pointless, and in some cases, it could even cause your code to break in mysterious ways or cause compilation errors. Avoid it. It's easy.
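One concrete way an empty user-defined constructor can bite (a sketch of my own; the point is that a user-declared constructor disqualifies a class from aggregate brace-initialization):

//sketch, not from the article
struct Point  { int x; int y; };             //aggregate: brace-initialization works
struct Point2 { Point2() {} int x; int y; }; //user-declared ctor: no longer an aggregate

Point  p1 = {1, 2};   //OK
//Point2 p2 = {1, 2}; //error: Point2 cannot be brace-initialized like this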

The vacuous this-> notation. How many member functions that look like this have you seen before?

Circle& Circle::operator=(const Circle & c)
{
 //redundant this-> appears in every line
 this->xPos = c.xPos;
 this->yPos = c.yPos;
 this->radius = c.radius;
 this->ptr=new Something;
 return *this;
}

This is a common idiom. Some poorly designed frameworks even advocate this coding style, although in practice, it's completely useless. Remember: the this-> prolog doesn't make the code clearer in any way. The compiler doesn't need it either. In fact, omitting this-> from this code would make it shorter, and easier to maintain and understand. As a bonus, it's even healthier, as it could reduce the risk of carpal tunnel syndrome due to excessive and useless typing. Isn't this version clearer?

Circle& Circle::operator=(const Circle & c)
{
 xPos = c.xPos;
 yPos = c.yPos;
 radius = c.radius;
 ptr=new Something;
 return *this;
}

Summary From my experience, style gaffes that don't bite are the hardest to eradicate. When a programmer goofs badly, say by deleting a pointer twice or writing to a file that isn't open, the implementation's response is a crash. However, passing huge arguments by value doesn't lead to such catastrophic sanctions. The compiler is silent, the code more or less does what it should do -- why kick the habit then? And yet, these stylistic mistakes exact a toll in the long run. Avoiding them is the first step towards becoming a pro.
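The argument-passing advice above can be condensed into a short sketch (the identifiers are hypothetical, not from the article):

#include <string>

void processByValue(std::string s);           //copies the whole string -- wasteful for large objects
void processByPointer(const std::string *ps); //forces callers to take addresses and callees to use ->
void processByRef(const std::string &s);      //no copy; the call site reads like pass-by-value

void doStuff(int action_code);                //value argument: adding 'const' here would be redundant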

friend declarations have a few unintuitive properties that might confuse both beginners and experienced programmers. Some of the declarations use special syntactic forms, while others have unusual semantics. Read all about friend declarations here. The Essence of Friendship A class may declare a function or another class as its friend. A friend has access to all the class' members, including private and protected ones. The name of a friend is not in the scope of the class. Here are two examples of friend declarations:

bool operator < (const A&, const A&);

class A
{
private:
 int x;
public:
 A();
 friend bool operator < (const A&, const A&); //friend function
 friend class B;                              //friend class
};
class B
{
public:
 void func(int);
};

A friend declaration of a function consists of the keyword friend followed by the function's declaration. This way, it's possible to specify a single friend function from a set of overloaded functions:

int func();             //1
int func(const char *); //2
struct A
{
 friend int func(); //applies only to #1
};

Any member of class B has full access to A's members. Similarly, the global overloaded operator < has access to every member of A. Notice that the access specifiers private, protected and public have no effect on friend declarations; the meaning of the friend declaration is the same in all three cases. However, it's customary to declare friends as public. Friendship is neither transitive nor inherited. Thus, if A is declared as a friend in B, B has no special privileges when referring to A's members. Similarly, classes derived from A cannot access non-public members of B, unless these derived classes are also declared as friends in B:

class D: public A { };
struct S{};
class B
{
 friend class A;  //base
 friend class D;  //derived
 friend struct S;
};

A friend declaration of a class must contain an elaborated type specifier, i.e. one of the keywords struct, class, or union followed by the type's name, unless the befriended class has been declared previously:

struct G{};
class F
{
 friend struct B; //OK, using elaborated type specifier
 friend G;        //OK, G has already been declared
};
struct B{};

Definitions in a friend Declaration Oddly enough, C++ permits definitions of functions within a friend declaration. However, you cannot define types within a friend declaration:

struct A
{
 friend bool operator < (const A& a1, const A& a2)
 { return a1.i < a2.i; } //OK, function defined within a friend declaration
 friend class X {};      //error, type definition isn't allowed here
private:
 int i;
};

Now there's an interesting question: What is the status of a function defined within a friend declaration? Outwardly, the operator < defined in class A looks like a member function of this class. However, this doesn't make sense at all, since every member function has unlimited access to members of its class anyway. Another hint pertaining to the status of such a function is the parameter list. The overloaded operator < takes two parameters, as does the global version thereof which you saw earlier. All right, I won't keep you in suspense any longer: the C++ standard says that every function defined within a friend declaration is never a member function of the class granting friendship. Thus, defining the overloaded operator< inside a friend declaration is the same as defining it outside the class. C++ imposes the following restrictions on a function defined within a friend declaration:

The function shall have namespace scope

The function shall not have a qualified name:

class X
{
 friend int A::func() {return 0;} //error
};

The class granting friendship shall not be a local class.

The common practice is to define such functions outside the class and use their prototype in a friend declaration. The Standard Library uses this style, as does every decent textbook. Combining the function definition and the friend declaration is bad programming practice, for at least two reasons:

Readability. Most readers (and quite a few C++ compilers, it appears) might mistake it for a member function.
Decoupling. Since the function isn't an integral part of a class' interface, there's no reason to define it inside the class -- doing so would force clients to recompile the class whenever the function is modified.
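For reference, here is what the common practice described above looks like for the operator < example (a sketch based on class A above):

class A
{
private:
 int i;
public:
 A() : i(0) {}
 friend bool operator < (const A& a1, const A& a2); //prototype only in the friend declaration
};

bool operator < (const A& a1, const A& a2) //definition outside the class
{
 return a1.i < a2.i;
}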

Fine-tuned friendship A class may grant friendship to a member function of another class. In this case, other member functions of the latter class have no special privileges:

struct A
{
 void f(struct B& );
 void g(struct B& );
};
struct B
{
 friend void A::f(struct B&);
};

A function first declared in a friend declaration has external linkage. Otherwise, the function retains its previous linkage:

static int func();
struct A
{
 friend int func();  //func retains its static linkage
 friend int func2(); //func2 assumed to be extern
};

In the second part of this article I will discuss template friend declarations, showing how to declare class templates, function templates, members of a class template and specializations of a class template as friends. Continuing our discussion of friend, it's time to see how to befriend templates. C++ enables you to control precisely which specialization(s) of a given template are granted friendship. However, this fine-grained control sometimes necessitates intricate syntax. Declaring non-template Functions and Classes as friends of a Class Template Declaring an ordinary class or function as friends of a class template isn't different from the friend declarations I've shown before. Here is an example of a class template that declares class Thing and func() as its friends:

class Thing {}; void func (int);

template <class T> class X
{
public:
 //...
 friend class Thing;
 friend void func (int);
};
X<int> xi;
X<std::string> xs;

In each specialization of X, func() and Thing are friends. Class Templates When you declare a class template X as a friend of class Y, every specialization of X is a friend of Y. If Y itself is a class template, every specialization of X is a friend of every specialization of Y:

template <class T> class D{/*...*/};
template <class T> class Vector
{
public:
 //...
 template <class U> friend class D;
};

Here, every specialization of D is a friend of every specialization of Vector. Template Specializations

You may restrict friendship to a particular specialization of a class template. In the following example, the class template Vector declares the specialization C<int> as its friend:

template <class T> class C{};
template <class T> class Vector
{
public:
 //...
 friend class C<int>;
};

In every specialization of Vector, e.g., Vector<int>, Vector<char> etc., C<int> is a friend. However, other specializations of C, such as C<char>, C<bool> etc., aren't. Function Templates You may also declare a function template as a friend. Suppose you want the overloaded operator== to be a function template used by Vector. This ensures that for every Vector<T>, the compiler will generate a matching operator==. Declaring a function template as friend consists of three steps. First, forward declare both the class template granting friendship and the function template:

template <class T> class Vector; //forward declaration of class template

// forward declaration of friend function template
template <class T> bool operator == (const Vector<T>& v1, const Vector<T>& v2);

Next, declare the function template as friend inside the class template:

template <class T> class Vector
{
public:
 friend bool operator== <T> (const Vector<T>& v1, const Vector<T>& v2);
};

Finally, define the function template:

template <class T> bool operator== (const Vector<T>& v1, const Vector<T>& v2)
{
 //..
}
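To make the three steps concrete, here is a short usage sketch of my own, assuming the Vector and operator== defined above:

int main()
{
 Vector<int> v1, v2;
 //..fill v1 and v2
 bool equal = (v1 == v2); //instantiates bool operator==(const Vector<int>&, const Vector<int>&)
}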

You can avoid these three steps by defining the function template within the friend declaration. In this case, the forward declarations aren't necessary. NOTE Recall, however, that this technique has a few drawbacks discussed in part I.

template <class T> class Vector
{
public:
 //defining the friend function template inside the class
 friend bool operator== (const Vector<T>& v1, const Vector<T>& v2)
 {
  //..
 }
};

The template parameter T in operator== and Vector co-varies. That is, the compiler generates operator== for the specialization Vector<int>, operator== for Vector<char>, and so on. What if you want to declare only one specialization of a function template as a friend? To do so, forward declare the class template and the function template as before. Then add a friend declaration to the class:

template <class T> class Vector;
template <class T> bool operator == (const Vector<T>& v1, const Vector<T>& v2);
template <class T> class Vector
{
public:
 friend bool operator== <Date> (const Vector<Date>& v1, //specialization
                                const Vector<Date>& v2);
};

Notice that the template argument Date appears in angle brackets after the function's name. Date also replaces the template parameter T in the function's parameter list. Finally, define the specialization somewhere in the program:

template <> bool operator== (const Vector<Date>& v1, const Vector<Date>& v2)
{
 //..
}

Remember that a definition of a specialization is preceded by the sequence 'template <>'. Unlike a primary function template, a specialization of a function template cannot be defined within a friend declaration. As usual, you may declare multiple specializations of the same function template as friends:

template <class T> class Vector
{
public:
 friend bool operator== <Date> (const Vector<Date>& v1, const Vector<Date>& v2);
 friend bool operator== <int> (const Vector<int>& v1, const Vector<int>& v2);
 friend bool operator== <std::string> (const Vector<std::string>& v1, const Vector<std::string>& v2);
};

Friendship and Design I've focused exclusively on the syntactic properties of friendship, but not on the design issues. friend declarations are traditionally frowned upon in the literature since they allegedly violate encapsulation. This criticism isn't justified, though. Unquestionably, judicious usage of friendship is necessary for robust design. However, in many cases, a friend declaration enables you to enhance encapsulation by restricting access to a class' implementation details. The alternative, i.e., defining a get() member function that every client can use indiscriminately, can be much worse. I will discuss friendship and design in an upcoming article. C++ has three storage types:

automatic storage, also known as stack memory
static storage, for namespace-scope objects and local statics
the free-store, or the heap, for dynamically-allocated objects
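A small sketch (my own example) with one object in each storage type:

#include <string>

std::string app_name("demo");         //static storage: namespace-scope object

void f()
{
 std::string local("temporary");      //automatic storage: destroyed when f() returns
 static int call_count = 0;           //static storage: local static, initialized once
 ++call_count;
 std::string *p = new std::string;    //free store: lives until explicitly deleted
 delete p;
}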

In certain applications and frameworks, it's necessary to restrict object allocation to a specific storage type. For instance, certain Design Patterns require that an object be allocated on the free-store exclusively. Similarly, you may need to block free-store allocation of smart pointer objects, handles and iterators. Let's see how to restrict the storage type of a certain class. Blocking Static and Automatic Storage In general, smart pointer classes require that their bound objects be allocated on the free-store. Unfortunately, the compiler isn't aware of this constraint:

#include <fstream>
#include <memory>

int main ()
{
 std::ofstream datafile;
 std::auto_ptr<std::ofstream> file_ptr(&datafile); //undefined behavior
}

auto_ptr's destructor always calls delete to destroy the pointer it owns. In this example, the result is disastrous, because the pointer points to an auto object. To enforce this constraint at compile time, declare the destructor of a class as private. As a result, any attempt to create static and auto objects of this class will fail:

class FreestoreOnly
{
private:
 ~FreestoreOnly(); //auto and static objects are blocked
public:
 //..
};

As we learned before, the compiler automatically calls the destructors of auto and static objects when they go out of scope or when the program terminates, respectively. For this to work, the destructor must be public. Otherwise, instantiating auto and static objects will fail:

FreestoreOnly global_fs; //compilation error: "Destructor for 'FreestoreOnly' is not accessible"
int main()
{
 FreestoreOnly auto_fs;                //same error here
 static FreestoreOnly local_static_fs; //and here
}

However, this raises another conundrum: how do you destroy such an object? An ordinary delete expression causes the same compilation error. Instead, delegate the destruction to another public member function:

class FreestoreOnly
{
private:
 ~FreestoreOnly ();
public:
 void destroy() {delete this;}
};

FreestoreOnly * p = new FreestoreOnly;
//..use p
p->destroy(); //destroy the object

Creating a shared_ptr that owns a FreestoreOnly object is still possible. All you need to do is install an explicit deleter that calls:

p->destroy();

NOTE Further information on shared_ptr and deleters is available here. Blocking Free-store Allocation What about the opposite, i.e., a class whose object may be created on the stack or in static memory but not on the free-store? To enforce this restriction, override the global new and delete by declaring them as nonpublic members of the said class:

class AutoStatic
{
private:
 void * operator new (size_t) {return 0;}
 void operator delete (void*) {}
public:
 //..
};

How does it work? When you allocate an object of this class on the free-store, the compiler uses the overridden versions of new and delete, not the global ones. Because these operators are inaccessible, the compiler rejects this code:

int main()
{
 AutoStatic auto1;                //OK
 static AutoStatic local_static2; //OK
 AutoStatic * p = new AutoStatic; //error: 'AutoStatic::operator new(unsigned int)' is not accessible
 delete p;                        //error: 'AutoStatic::operator delete(void *)' is not accessible
}

There's still a catch here. AutoStatic doesn't override the array new and delete. Therefore, it's still possible to allocate arrays of this class on the free-store. To fix this loophole, override new[] and delete[] as well:

class AutoStatic
{
private:
 //array new and delete
 void * operator new [] (size_t) {return 0;}
 void operator delete [] (void*) {}

 //scalar new and delete
 void * operator new (size_t) {return 0;}
 void operator delete (void*) {}

 //..
};

You probably noticed that I included dummy definitions of the overriding new and delete. This is needed in some implementations that call these operators implicitly from constructors and destructors. These dummy definitions are therefore necessary to ensure portability. However, if your compiler and linker are content without these definitions, feel free to omit them. Class Hierarchies In a class hierarchy, a derived class uses the overridden new and delete that were declared in its base class, unless it declares its own overriding versions of these operators. Therefore, free-store allocation of classes derived from AutoStatic is an error, unless these classes provide accessible overrides of new and delete. Conclusions It's still possible to construct an AutoStatic object on the free-store by (mis)using placement new. However, the techniques shown here aren't meant to be bulletproof. Rather, the aim is to protect your code from innocent human errors while drawing clients' attention to a specific allocation policy. It's advisable to document such restrictions in the class declaration, too. Books

C++ Common Knowledge, by Stephen Dewhurst, discusses techniques for restricting object allocation to specific storage types in item 36. This item is only one of many useful techniques discussed in his book.
Imperfect C++: Practical Solutions for Real-Life Programming, by Matthew Wilson, discusses a related topic, namely enforcing compile-time constraints, as well as many memory management techniques.

Back in my school days, I developed a technique for doing my homework assignments only when I had to -- the day before I had to turn them in. Sometimes, I would even do my assignments on the very day of submission! What is considered a highly-reproached attitude in school proves to be a useful technique in software design, called "lazy evaluation." Lazy evaluation means deferring a certain operation (object initialization, a function call, allocating a resource, etc.) until it's truly needed.

Rationale Lazy evaluation has a few manifestations in software engineering. Copy-on-write, reference-counting, and singleton all rely on the deference of a time- or resource-consuming operation until there's no escape from it. The simplest form of lazy evaluation consists of localizing the declarations of variables. Other forms include late binding of a pointer to an object, and accessing an object via an intermediary function. The main advantage of lazy evaluation is that you avoid unnecessary overhead when possible. Often, the decision whether an object is necessary can only be made at runtime. If, for example, an application allows users to change the default language, there's no point in loading the foreign language strings before the user has actually selected a different language (as do some poorly designed applications, without naming names). Similarly, if a user opens a text document without modifying it, there's no point in saving the unmodified document every ten minutes. However, performance isn't the only reason for adopting lazy evaluation. In some applications, it can simplify the program's structure by localizing the conditional operation to the relevant code section. For example, a media player doesn't need to load all of its codecs at startup. It's better to load them on demand, according to the media file that the player is currently playing. This way, the program is easier to maintain and debug. Implementation Let's see some applications of this technique. One classic example is the declaration of loop variables. In K&R C and C89, you are forced to declare i before the if statement:

//C89 or poor style C++
int i=0;
if (some_condition)
 for (; i< MAX; i++) //assuming that no one has tampered with i
 {
  //..do something
 }
else
 //no loop here

C++, in a stark deviation from ANSI C, permits declarations of local variables almost anywhere in a block. C99 adopted this feature from C++. Both C++ and C99 allow you to rewrite the previous code listing as follows:

if (some_condition)
 for (int i=0; i< MAX; i++)
 {
  //i is local to the for-loop
 }
else
 //no loop here

The loop executes only when a certain condition is met. Therefore, it makes sense to declare i only in the scope of that loop. Deferring the declaration of i has two advantages:

Name localization. The scope of this variable is restricted to the loop's body, so it doesn't clash with other identifiers in the enclosing scope; nor is it possible to tamper with it outside the loop.

Performance. No memory is allocated for this variable and its initialization is elided if the condition is false.

Admittedly, for a built-in type, this overhead is negligible. However, replace i with a string object -- or better yet, a matrix object -- and witness the performance impact! Late Initialization In more realistic cases, you need to defer the construction of an object. Yet unlike with local variables, a definition of an object also entails its initialization, so you can't defer it. To overcome this restriction, you should either move the definition to an inner scope in which that object is unconditionally needed, or use some form of indirection. Pointers are the most common form of indirection. Instead of defining a value object, define a pointer initialized to 0. Only when the object in question is needed do you allocate it dynamically and bind it to the pointer:

string * p=0;
if (string_needed)
{
 if (!p)
  p=new string;
 //.. use the string via p
}
return p;

This style of programming eliminates the overhead of the string initialization at the expense of complicating the code. It also introduces new security risks such as forgetting to delete the string before the program terminates. For this reason, you want to use smart pointers instead of bare pointers:

std::tr1::shared_ptr<std::string> p;
if (string_needed)
{
 if (!p)
  p.reset(new std::string);
 //.. use the string via p
}
return p;

If this usage pattern looks familiar, it's no coincidence. The singleton pattern is based on similar principles, except that it wraps the conditionally-created resource in a function call. Pay as You Go Many years of C and Pascal programming taught us to declare everything in advance, in exactly one place. This practice had several pedagogical benefits, such as forcing programmers to do some sort of design before actually writing the code, but modern programming languages have to cope with different challenges -- those that 1970s programming languages didn't have to deal with. Often, deferring the declaration (or at least the initialization) of an object can improve your code's performance, modularity, and readability.

Cache and Carry Premature optimization (PO) is evil, we can all agree. Alas, PO comes in many disguises that lure software designers.

One incarnation of PO is called caching. Caching means storing the result of a complex or time-consuming operation in an accessible location (an object, local file, CPU register). That value, known as the cached value, is then used instead of computing the result anew every time. Caching is indispensable in many applications. For instance, offline browsing, file searching, and database management systems are a few examples of caching. However, in many cases, the benefits of caching are offset by its adverse effect on performance and design complications. Don't Trust Your Instincts! Programmers' intuitions regarding hot spots and performance bottlenecks are pretty poor. In most cases, they simply guess where the bottleneck might be instead of profiling their code using a professional tool. Caching is also subjected to such gut feelings and hunches. In one project in which I took part, we had to design an application that harvested certain fields from a database, going through millions of records every night. Some fields were stored directly within each record (ID, name, etc.) whereas other fields (called derived fields) were computed on demand (zip code, the number of members in the family). Our team leader decided to avoid the recurring daily computation of the derived fields by caching them. Instead of accessing the records in the primary database, we created a secondary database containing the cached data. Our application was supposed to access the secondary database instead of the primary database, thus saving precious processing time. Many of us felt uncomfortable with this approach. Although we weren't trained as DBAs, we were familiar with the first rule of data modeling: Don't reduplicate data. Yet, reduplicating data is exactly what we were supposed to do. Generally speaking, every caching operation is an instance of data reduplication. Caching Overhead While caching avoids the overhead of recurrent calculations of the same value, it incurs its own overhead. To maintain the secondary database, we needed to design a derivational mechanism for detecting changes in the primary database and updating the secondary database accordingly. We didn't know in advance which records were changed in the primary database, so we had to go through the entire primary database on every daily pass. Only once this cumbersome process was over could we actually process the data. What we learned the hard way wasn't exactly the latest news; every book about data modeling from the 1960s tells you that data reduplication is bad. Alas, our team leader didn't let the facts get in the way. Caching and Data Corruption The serious overhead of keeping the secondary database in sync with the primary database notwithstanding, we faced a more serious problem: data rot. While every change in the primary database was immediately effective (you could see the changes once the transaction had been committed), our secondary database was up to 24 hours behind. This meant that a small percentage of the data therein was always out of sync. Our customers weren't keen on this solution, to put it mildly. They wanted 100% fresh and valid data. In short, they forced us to throw away the expensive and complicated caching mechanism we'd designed and access the primary database directly instead. Lessons This true story can teach us a few general guidelines about caching.

Always make caching optional. In most cases, this is easier said than done, but you must always provide a means for bypassing the caching mechanism, no matter how expensive this process may be in terms of performance. Without this option, you will never be able to verify that caching is actually worth the trouble. Furthermore, the non-caching data retrieval mechanism is necessary for verifying the accuracy of the cached data, at least during debugging and testing.
Heed customers' requirements. Offline browsing is an exemplary caching mechanism. In some cases, such as Web sites that aren't updated often, it works great. However, financial applications, news and traffic reports are bad candidates for offline browsing, unless you're interested in last week's weather and stock rates. Therefore, when you design a caching mechanism, check whether context sensitivity is an issue; can the mechanism differentiate between static data (e.g., a Web page of Greek words and their English translations) and fast-changing data such as traffic reports? Remember: customers are willing to put up with slower data as long as it's accurate.

Cache on Delivery Caching incurs two major challenges: the overhead of computing the result in advance, and ensuring that the cached value is up-to-date. In large scale applications, these challenges can be quite daunting. For instance, how do you tell that a cached derived field is in sync with the primary database if that field isn't present in the primary database? You simply have to compute that field anew and compare it with the cache's value! If that is the case, you don't need caching in the first place. Automatic storage management is one of the prominent advantages of STL containers over other options, such as built-in arrays. However, there are still cases when a container's storage policy requires "manual intervention." Such intervention may be necessary when you want to shrink the container's capacity or reset it completely. While STL doesn't define explicit member functions for these operations, the common practice is to use the self-swapping idiom for these purposes. Let's see how. Size and Capacity In STL parlance, size and capacity are not the same thing. A container's size is the number of elements currently stored in it. The container's capacity reflects the number of elements that the container may store without reallocating. The capacity can be the same as the size or larger, but never smaller than the size. To observe these values, use the member functions capacity() and size(). capacity() reports the number of elements that the container can hold without requiring reallocation. size() returns the number of elements currently stored in the container. The expression

container.capacity() - container.size();

is the number of available "free slots" that can be filled with new elements without reallocation. Capacity Creep Services, daemons, and ordinary applications that remain alive for long periods often exhibit performance fluctuations due to the fact that containers are quick to grow but refuse to shrink. Consider a mail server that represents incoming messages as a vector of message objects before dispatching them to the clients. At peak times, the vector contains thousands of messages, causing its capacity to grow accordingly. However, during off-peak hours, the vector contains only a few messages. The problem is that the vector's capacity remains high, even if it's not used for long periods. The precise memory management policy depends on the container's allocator; however, in practice, most STL implementations I'm aware of adhere to a similar pattern, whereby the capacity only increases, but doesn't shrink. If the peak times are frequent and long lasting, retaining a high capacity value is the recommended policy. Otherwise, you may need to "trim" the container occasionally, ensuring that its capacity is in sync with its size. Trimming requires two steps. Before I demonstrate them, there's a crucial aspect of containers' assignment and copying semantics I would like to discuss. When you copy a container, the target's capacity is the same as its size. For example, if you have a vector whose capacity and size are 500 and 1 respectively, a copy of this vector will have the same size as the source, but its capacity will be identical to its size. The following program demonstrates this behavior:

#include <iostream>
#include <vector>
using namespace std;
int main()
{
 vector<int> vi;
 vi.reserve(500); //capacity is 500
 vi.push_back(5); //size is 1

 cout << "capacity: " << vi.capacity() << " size: " << vi.size() << endl;
 vector<int> vi2(vi); //copy: vi2's capacity equals its size
 cout << "capacity: " << vi2.capacity() << " size: " << vi2.size() << endl;
}
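The text refers to the self-swapping idiom for trimming and resetting a container; here is a minimal sketch of that idiom (my own, not the article's listing):

//sketch: the "swap trick" for trimming and resetting a vector
#include <vector>

void trim(std::vector<int> &v)
{
 std::vector<int>(v).swap(v); //the temporary copy's capacity equals its size; swap hands v that storage
}

void reset(std::vector<int> &v)
{
 std::vector<int>().swap(v);  //swap with an empty temporary to release all of v's storage
}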