04 - abstract classes

The animal-cat-dog example in the previous lesson demonstrated how virtual functions work and their primary purpose - implementing different behavior that can be invoked through the same interface.

Constructing objects of the base type alone (animal) doesn't make much sense. We can make the animal constructor protected so that derived classes can call it but code outside the class hierarchy cannot (this is actually done sometimes). This would still allow inheriting but effectively block creation of objects of the base animal type.
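A minimal sketch of the protected constructor idea, reusing the animal and cat names from the previous lesson. The placeholder string and the make_cat_sound helper are assumptions made only for this illustration.

```cpp
#include <string>

class animal
{
public:
	// placeholder body that exists only to satisfy the compiler
	virtual std::string sound() const { return "???"; }

protected:
	// only animal itself and classes derived from it can call this
	animal() = default;
};

class cat: public animal
{
public:
	// the implicitly generated cat constructor may call animal::animal()
	std::string sound() const override { return "meow"; }
};

// hypothetical helper, present only to exercise the classes above
std::string make_cat_sound()
{
	// animal a; // error: animal::animal() is protected in this context
	cat c;       // fine: the cat constructor calls the protected base constructor
	const animal& ref = c;
	return ref.sound(); // virtual function call
}
```

Note that the implicitly generated copy constructor of animal stays public in this sketch, so copying a cat into an animal would still compile unless the copy constructor gets the same treatment.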

But there is another problem: what to do with the implementation of the virtual function in the base type? Functions at the top level of an inheritance hierarchy often cannot have any meaningful body. But the code must compile.

In the case of animal, the function simply returned a dummy string to satisfy the compiler. But what if in a hypothetical base class the return type of a virtual function had no default constructor? What if there was no way to return a sensible special value?

The problem could be dealt with in a generic way by using exceptions or functions that perform some halt/exit but there is a much better solution through a dedicated language feature.

Pure virtual functions

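A pure virtual function is a virtual function declared with = 0. It does not need a body, and it has to be overridden somewhere down the hierarchy before objects can be created.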

A class with at least 1 pure virtual function is an abstract class. Objects of abstract types can not be created (pointers and references to abstract types are still allowed).

#include <iostream>
#include <string>

class animal
{
public:
	// for historical reasons, there is no "pure" or "abstract" keyword in C++
	// instead, pure virtual functions are denoted with "= 0"
	virtual std::string sound() const = 0;
};

class cat: public animal
{
public:
	std::string sound() const override { return "meow"; }
};

class dog: public animal
{
public:
	std::string sound() const override { return "whoof"; }
};

void print_sound(const animal& a)
{
	std::cout << a.sound() << "\n"; // virtual function call
}

int main()
{
	cat c;
	dog d;
	print_sound(c);
	print_sound(d);
}

You can attempt to modify this example just like the one in the previous lesson (change the print_sound function to take its argument by value) but this time, instead of causing undesirable object slicing, the program will not compile. Since objects of abstract types cannot exist, passing them by value (which requires creation, copy or slicing) is also impossible.

It's worth noting that in C++ you can still define bodies of pure virtual functions (the body has to be outside the class - there is no syntax support for both = 0 and {} in one place). The class will remain abstract, but the function can be called explicitly with a qualified name (such as animal::sound()) - usually inside bodies of the same virtual function in derived classes.
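A minimal sketch of a pure virtual function that also has a body, based on the animal example above. The "..." string and the cat_sound helper are assumptions made only for this illustration.

```cpp
#include <string>

class animal
{
public:
	virtual std::string sound() const = 0; // still pure: animal remains abstract
};

// the body of a pure virtual function has to be defined outside the class
std::string animal::sound() const
{
	return "...";
}

class cat: public animal
{
public:
	std::string sound() const override
	{
		// qualified name: a non-virtual call to the base implementation
		return animal::sound() + " meow";
	}
};

// hypothetical helper, present only to exercise the classes above
std::string cat_sound()
{
	cat c;
	const animal& ref = c;
	return ref.sound(); // virtual call, ends up in cat::sound
}
```

With these definitions cat_sound() should return "... meow".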

There is no requirement for implementing all pure virtual functions in derived classes. A derived class can implement only some of them, making itself also an abstract type. Even more, any further derived class can "repurify" a function by redeclaring a non-pure virtual function as pure virtual, which makes that class abstract again (though IIRC this feature is not present in many languages and I don't know any example where the feature is useful).
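The rules above can be sketched with a small hierarchy. All class names here (shape, quadrilateral, square, custom_square, labeled_square) and the describe function are hypothetical and exist only for this illustration.

```cpp
#include <string>

class shape
{
public:
	virtual std::string name() const = 0;
	virtual int corners() const = 0;
};

// implements only one of the two functions: still an abstract class
class quadrilateral: public shape
{
public:
	int corners() const override { return 4; }
};

// nothing pure is left: objects of this type can be created
class square: public quadrilateral
{
public:
	std::string name() const override { return "square"; }
};

// "repurifying": name() becomes pure again, so this class is abstract
class custom_square: public square
{
public:
	std::string name() const override = 0;
};

// implements name() once more, so it is a concrete type again
class labeled_square: public custom_square
{
public:
	std::string name() const override { return "labeled square"; }
};

std::string describe(const shape& s)
{
	return s.name() + ": " + std::to_string(s.corners());
}
```

Here describe(square{}) should give "square: 4", while an attempt to create an object of type quadrilateral or custom_square would not compile.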

Virtual functions vs other features

Polymorphism can be a lot of fun, but due to its dynamic nature some specific language features should not be combined with virtual functions or be combined in special ways to avoid creating unwanted surprises.

Default arguments

Function default arguments are evaluated at the point of each call. This means that for something like void f(int n = g()), every time the function is called with no explicit argument, g will be called to supply it (as if the call was f(g())). Default arguments are almost always literals or other simple expressions that produce temporary objects, but the problem still remains: default arguments are not inherited by overriders and are selected based on the static type used for the call, which means they need to be respecified at each level of inheritance.

Respecifying default arguments at every level of inheritance is nothing more than code duplication. And we know that it's one of the worst things in programming. Even worse is the fact that if there is a mistake in the derived class, a different evaluation will happen depending on whether the function is called in the context of the base or some derived class.
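A minimal sketch of such a mistake. The class names base and derived, the greet function and both helper functions are hypothetical. The default argument is chosen from the static type used for the call, while the function body is chosen from the dynamic type.

```cpp
#include <string>

class base
{
public:
	virtual std::string greet(const std::string& name = "base default") const
	{
		return "base: " + name;
	}
};

class derived: public base
{
public:
	// the default argument had to be respecified and (by mistake) it differs
	std::string greet(const std::string& name = "derived default") const override
	{
		return "derived: " + name;
	}
};

// static type is base: uses the default argument declared in base
std::string call_through_base(const base& b) { return b.greet(); }

// static type is derived: uses the default argument declared in derived
std::string call_through_derived(const derived& d) { return d.greet(); }
```

For the same object of type derived, call_through_base should return "derived: base default" while call_through_derived should return "derived: derived default".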

The problem can be solved in a very clean way: just use overloading! Specifically, write additional non-virtual overloads that supply the default arguments, and write them only in the base class. Below is an example that just came to my mind:

#include <optional>

/**
 * @brief base class interface representing an arbitrary timer
 */
class timer
{
public:
	/**
	 * @brief start the timer
	 * @param precision number of ticks per second
	 * @param delay delay of the start in the unit of precision
	 * @return true if timer started, false if already running
	 * @details precision may be limited to max_precision()
	 * @sa max_precision
	 */
	virtual bool start(long precision, long delay) = 0;
	bool start(long precision) { return start(precision, 0); }
	bool start() { return start(max_precision()); }

	/**
	 * @brief stop the timer
	 * @return amount of passed time,
	 * empty optional if the timer wasn't running
	 */
	virtual std::optional<long> stop() = 0;

	/**
	 * @brief check whether the timer is running
	 * @return true if running, false otherwise
	 */
	virtual bool is_running() const = 0;

	/**
	 * @brief get maximum precision supported by the timer
	 * @return maximum precision supported
	 */
	virtual long max_precision() const = 0;
};

Overloading is actually more powerful than default arguments: in this specific example you can observe that the default value of precision comes from another virtual function! Another benefit is that any derived class needs to write only one override (plus using timer::start; to avoid hiding base class overloads if they are called directly in the context of some derived type).
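A sketch of how a derived class could look. The timer interface is repeated in condensed form to keep the block self-contained. The fake_timer class, its public used_precision and used_delay members, the value 1000 and the helper function are all assumptions made for this illustration - no real time is measured.

```cpp
#include <optional>

// condensed copy of the timer interface (documentation comments omitted)
class timer
{
public:
	virtual bool start(long precision, long delay) = 0;
	bool start(long precision) { return start(precision, 0); }
	bool start() { return start(max_precision()); }

	virtual std::optional<long> stop() = 0;
	virtual bool is_running() const = 0;
	virtual long max_precision() const = 0;
};

// hypothetical implementation that only records how it was started
class fake_timer: public timer
{
public:
	using timer::start; // keeps start(long) and start() visible on fake_timer

	bool start(long precision, long delay) override
	{
		if (running)
			return false;

		running = true;
		// precision is limited to max_precision()
		used_precision = precision < max_precision() ? precision : max_precision();
		used_delay = delay;
		return true;
	}

	std::optional<long> stop() override
	{
		if (!running)
			return std::nullopt;

		running = false;
		return 0L; // this sketch does not measure anything
	}

	bool is_running() const override { return running; }
	long max_precision() const override { return 1000; }

	long used_precision = 0;
	long used_delay = 0;

private:
	bool running = false;
};

// hypothetical helper, present only to exercise the classes above
long precision_after_default_start()
{
	fake_timer t;
	t.start(); // would not compile without the using-declaration
	return t.used_precision;
}
```

The call t.start() goes through both non-virtual overloads and ends in fake_timer::start(1000, 0), so the helper should return 1000.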

Overloading

The same function can have multiple virtual overloads but this is generally a bad design because effectively it forms multiple chains of virtual functions that just happen to use the same name. Having to override multiple functions that differ very little signifies that the interface (base class) wasn't designed properly. And bad interfaces attract (and sometimes even force) suboptimal implementations.

In cases where there is a need for multiple, different inputs it would be much better to stick to the same approach as with default arguments: design only 1 virtual function and multiple non-virtual overloads that convert input data to match the one expected by the virtual overload.

Supporting only 1 input type may seem limiting, but it's much better to have an unchangeable set of input-converting non-virtual functions than to expect derived classes to additionally implement their own conversion (a great place for subtle bugs caused by differences in behavior).
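A minimal sketch of this approach: one virtual function plus a non-virtual overload that converts its input. The logger and memory_logger classes and the log_some_values helper are hypothetical names used only for this illustration.

```cpp
#include <string>
#include <vector>

class logger
{
public:
	// the only virtual function: derived classes implement just this one
	virtual void write(const std::string& text) = 0;

	// non-virtual overload: converts the input and forwards to the virtual one
	void write(int value) { write(std::to_string(value)); }
};

// hypothetical implementation that stores everything in memory
class memory_logger: public logger
{
public:
	using logger::write; // avoid hiding the converting overload

	void write(const std::string& text) override { lines.push_back(text); }

	std::vector<std::string> lines;
};

// hypothetical helper, present only to exercise the classes above
std::vector<std::string> log_some_values()
{
	memory_logger log;
	log.write("text"); // calls the override directly
	log.write(42);     // converted in the base class, then forwarded
	return log.lines;
}
```

Every implementation gets the same int-to-text conversion for free, because that conversion lives in the base class and cannot be overridden.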

Operator overloading

Operators which are defined as member functions can be virtual. They work just like any other function - the only difference is that they have a special name and offer special syntax.
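A minimal sketch of a virtual member operator. The int_function, doubler and negator classes and the apply function are hypothetical names used only for this illustration.

```cpp
class int_function
{
public:
	// a pure virtual operator, declared like any other virtual function
	virtual int operator()(int x) const = 0;
};

class doubler: public int_function
{
public:
	int operator()(int x) const override { return 2 * x; }
};

class negator: public int_function
{
public:
	int operator()(int x) const override { return -x; }
};

int apply(const int_function& f, int x)
{
	return f(x); // virtual call, written with the operator syntax
}
```

For example, apply(doubler{}, 21) should return 42.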

What to do when an operator should or must be implemented as a non-member? Just call a virtual function inside it (sometimes this might require creating a virtual function just for the purpose of implementing the operator):

std::ostream& operator<<(std::ostream& os, const animal& a)
{
	return os << a.sound();
}

Stream insertion/extraction is very different from other binary operators though - it's not commutative and only one of its operands belongs to the polymorphic hierarchy. For something like a + b (where there are 2 objects from the same type hierarchy):

implementing it as a.func(b) will call implementation based on the dynamic type of a
implementing it as b.func(a) will call implementation based on the dynamic type of b

If such a thing happens, using operator overloading was probably a bad decision (polymorphic classes rarely overload operators). If the implementation of the operation requires knowledge of the dynamic types of both operands, the visitor design pattern should be used instead.
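The asymmetry between a.func(b) and b.func(a) can be observed in a sketch like the one below. The matrix, dense and sparse classes are hypothetical, and the add function only reports which implementation was selected.

```cpp
#include <string>

class matrix
{
public:
	virtual std::string add(const matrix& other) const = 0;
};

class dense: public matrix
{
public:
	// only *this is known to be dense, the other operand is "some matrix"
	std::string add(const matrix& /* other */) const override { return "dense::add"; }
};

class sparse: public matrix
{
public:
	std::string add(const matrix& /* other */) const override { return "sparse::add"; }
};

// non-member operator that forwards to a virtual function of the left operand
std::string operator+(const matrix& a, const matrix& b)
{
	return a.add(b);
}
```

With these definitions dense{} + sparse{} selects dense::add while sparse{} + dense{} selects sparse::add. Neither implementation knows the dynamic type of the other operand.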

Constructors

Virtual functions can be called in constructors (and destructors - these are covered in another tutorial, in the chapter about RAII), but there is a limitation. During construction the object may be only partially initialized (the dynamic type might be a type derived from the one whose constructor is currently running), so virtual calls in constructors are resolved only down to the level of the current class (an analogous but reversed thing happens in destructors - the object is partially destroyed). To illustrate:

#include <iostream>

class A
{
public:
	virtual void f() { std::cout << "A::f\n"; }
};

class B: public A
{
public:
	B() { f(); }

	void f() override { std::cout << "B::f\n"; }
};

class C: public B
{
public:
	void f() override { std::cout << "C::f\n"; }
};

int main()
{
	C c;
}

B::f

Inside the body of B::B, it's unknown whether the constructor is run to initialize an object of type B or as a part of initialization of an object whose type inherits from B. Since the constructor cannot assume what the actual (dynamic) type of the constructed object is (and even if it could, that part is still uninitialized), the virtual call considers overriders only down to the level of class B.

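What if there are no overriders at the level of the currently running constructor (that is, the function remains pure virtual)? Well, nothing good: a virtual call to a pure virtual function from a constructor or destructor is undefined behavior.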

Reminder: UB also includes situations such as "doesn't compile" and "doesn't link". Some pure virtual calls might be caught by the linker (missing symbol definition). Some might crash in a very friendly way - I have seen GCC providing implementations for pure virtual functions so that if they happen to be called through UB, the body of the function prints an explanatory message and kills the program (this is much better than manually searching for the cause of an unknown crash).

Core Guidelines C.82 recommends avoiding virtual function calls in constructors and destructors. If initialization of the object requires such things (which is not always a bad design), use the named constructor approach, as described in classes / static methods - write a static function that creates the object, calls the necessary virtual functions and then returns it.
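A minimal sketch of the named constructor approach. The component and button classes and their init, make_name and create functions are hypothetical names chosen for this illustration - the referenced lesson may structure it differently.

```cpp
#include <string>

class component
{
public:
	const std::string& name() const { return m_name; }

protected:
	component() = default; // no virtual calls during construction

	// meant to be called once the most derived object is fully constructed
	void init() { m_name = make_name(); } // virtual call

private:
	virtual std::string make_name() const = 0;

	std::string m_name;
};

class button: public component
{
public:
	// named constructor: create the object, run the virtual calls, return it
	static button create()
	{
		button b; // construction has finished: the dynamic type is button
		b.init(); // the virtual call reaches button::make_name
		return b;
	}

private:
	button() = default; // forces creation through create()

	std::string make_name() const override { return "button"; }
};
```

A call such as button::create().name() should give "button". The same make_name call placed directly in the component constructor would hit the pure virtual function instead.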