02 - declarations

Before any entity is used (formally, ODR-used) it must be defined. This was already noticeable with variables (their definition specifies a type) and a similar mechanism exists for functions. So far, all functions in the example programs were defined before the main function, because the main function used them.

However, a function definition is not required at the point of calling it. A declaration is enough. It is possible to declare a function and define its body later:

#include <iostream>

// note 1: declarations end with ;
// note 2: declarations do not have to specify parameter names, only types
void greet();
int test(int, int);

int main()
{
	// ok, functions were declared
	greet();
	std::cout << test(2, 3) << "\n";
}

int test(int x, int y)
{
	return x * y;
}

void greet()
{
	std::cout << "hello, world\n";
}

Although function parameter names can be skipped in the declaration, it's still recommended to write them. The function declaration is the place most readers will check for documentation comments, and parameters with descriptive names are valuable information. Parameter names in the declaration do not have to be the same as in the definition (only names in the definition affect the function body) but in pretty much every case they are written identically for consistency.

As long as the declaration appears before the function is called, the function definition can be placed later in the code. In fact, it can also be in a separate file!

Is this the reason why #includes are needed to access standard library functions?

Yes. These files (called headers) contain all the declarations (and definitions) necessary to use the standard library, which can be implemented elsewhere. Note that header/source files are not strictly declaration/definition separation - the mechanism is a bit different (more like "interface/implementation") and explained in the preprocessor chapter.

ODR

Function declarations and function definitions are one of the primary examples of ODR (One Definition Rule) in practice.

ODR is extremely important, so to make sure you understand it, let's go over each point.

Every C++ entity must be declared at least once. Every definition is also a declaration.

This should be self-explanatory. A single definition is enough to act as both. These points were already used in examples in previous lessons where other functions were defined before main function.

Each declaration must be equivalent.

Because declarations may repeat (in practice they almost never do), there is a risk that duplicated code drifts apart. To prevent bugs, the compiler verifies that duplicate declarations are equivalent.

Every C++ entity can be defined at most once.

The rule specifies at most once, not exactly once because some entities can be declared and left undefined. As long as they are not ODR-used, a declaration is enough.

Every C++ entity must be defined before it's ODR-used.

An ODR-use is a use of the entity which requires its definition. What requires a definition and what requires only a declaration is very case-specific. At the point of a function call only a declaration has to be visible. Obviously the final program must have the function defined somewhere: if you only declare it, call it and try to build the program, the build will fail (typically with a linker error about an undefined reference).

ODR is extremely important when separating code into multiple files. The C and C++ build process (including the concept of translation units) requires ODR to be satisfied. More on multi-file projects later in the preprocessor chapter.

In practice, the vast majority of ODR violations are caught by the compiler or the linker. Where a violation formally results in undefined behavior (which means anything can happen), that also includes the possibility of build errors.

Advantages of declarations

Most code needs only declarations to work with, so definitions can be compiled separately. The main benefits of this are:

ability to organize code much more freely (e.g. putting function definition in a separate file)
ability to have cross dependencies within the code

The second point is very important. As the program gets more complex, sooner or later you might find cycles: function A can call B, which can call C, which can call A. In some problems cycles of dependencies indicate flawed design, but in others they are a necessity. A prime example of the latter is mathematical and programming expressions: each can nest more subexpressions inside.

Below is an example of 2 cross-dependent functions which cannot be defined without declaring something first. However you reorder the program, at least 1 function needs to be declared first.

#include <iostream>

void bar(int x);

void foo(int x)
{
	if (x == 0)
		return;

	std::cout << "foo: " << x << "\n";
	bar(x - 1); // OK: bar declared
}

void bar(int x)
{
	if (x == 0)
		return;

	std::cout << "bar: " << x << "\n";
	foo(x - 1); // OK: foo declared (through foo definition)
}

int main()
{
	foo(5);
	std::cout << "\n";
	foo(4);
	std::cout << "\n";
	bar(3);
	std::cout << "\n";
}

Necessary declarations are often called forward declarations.

What would happen if one of these functions was called with a negative number?

Then both functions would end up calling each other endlessly. Potential outcomes are:

The value reaches the end of the signed integer range, which is undefined behavior (only unsigned numbers have well-defined overflow).
The function call stack exhausts stack memory space, which causes a stack overflow, which is also undefined behavior.

`(void)` declarations

History time. Initially, in C, there was no mechanism of function declarations. Code which called a function was implicitly declaring it, assuming that such a function exists and has an int return type. If a function with such a name was not found in the compiled code (possibly originating from a different file), it was a linker error. If a function with such a name did exist, the linker would connect the machine code of its call and definition, without checking whether the provided arguments match the function definition. At runtime, the function could start evaluating its parameters and if they did not match, it would perform improper read/write operations resulting in memory corruption.

It was a big problem that a mistake as simple as a mismatched number and/or types of arguments could result in something as bad as undefined behavior. Function declarations were added, but initially they weren't as detailed as today: they only stated the function name and return type.

// before C89 this was all that was possible
// formal name: non-prototype function declaration

// return type: void
// name: f
// parameters: UNSPECIFIED
void f();
// return type: void
// name: g
// parameters: UNSPECIFIED
void g();

// new syntax added in C89

// return type: void
// name: f
// parameters (1): int
void f(int x);
// return type: void
// name: g
// parameters (0)
void g(void);

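// before C89: both function calls are well-formed (they compile)
// but will cause undefined behavior at runtime
// C89: arguments are checked against the prototype at compile time
f(3.14); // accepted: 3.14 is implicitly converted to int (compilers may warn)
g(3.14); // ill-formed: compile time error (too many arguments)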

The (void) syntax does not specify an argument of type void (you cannot have objects of this type). It's a special syntax to differentiate it from the old non-prototype function declaration syntax. Without this rule, both new and old syntax would look the same for functions taking 0 parameters, which would break existing code by changing its meaning.

In other words, since C89 functions can be properly declared (with parameter types) but due to backwards compatibility and the fact that () already had a meaning, (void) is needed for functions taking 0 parameters.

In C++, there is no such problem. C++ has no non-prototype function declarations, () works as expected. (void) is supported only for compatibility with C code imported to C++.

// C
void g();     // unspecified number of parameters (before C23)
void g(void); // 0 parameters

// C++
void g();     // 0 parameters
void g(void); // 0 parameters, just ugly

Summing it up, writing (void) in C++ is a mistake. It comes from misunderstanding how function declaration syntax evolved in C and how it works in C++.

Exercise

Take the example of cross-dependent functions and swap their order so that a different function has to be declared first.