02 - instantiation

Anywhere a type or value appears it can be substituted with a template parameter. The types may even be incomplete if the operations performed on them allow it. The only restriction is that the code formed by the template substitution mechanism is itself valid.

template <typename T>
T* f1(T* p) { return p; }

template <typename T>
T f2(T*) {}

template <typename T>
T& f3(T* p) { return *p; }

int main()
{
	void* p = nullptr;
	f1(p);    // ok: returns void*
	f2(p);    // ok: returns void
	// f3(p); // error: type void& is invalid and *p is invalid

	int x = 1;
	f1(&x); // ok: returns int*
	f3(&x); // ok: returns int&

	// will compile, but has undefined behavior
	// (no return statement in function returning int)
	// f2(&x);
}

As you can see, f2 has return type T and no return statement. If the template parameter is void, the function is valid. f3 looks like it could work on any pointer, yet because forming a reference to void is not allowed, instantiating this function template with void fails to compile.

The example leads to an interesting conclusion:

The act of using a template with specific template parameters is called template instantiation.
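
To make the term more concrete, here is a minimal sketch. The names widget, use_count, pass_through and demo_counts are hypothetical and exist only for this illustration. It shows that f1 can be instantiated with an incomplete type (only a pointer is copied) and that every distinct set of template parameters produces a separate function with its own local static variable.

```cpp
#include <type_traits>

struct widget; // hypothetical type: declared but never defined (incomplete)

template <typename T>
T* f1(T* p) { return p; }

// Hypothetical helper: a function-local static exists once per instantiation.
template <typename T>
int& use_count()
{
	static int n = 0;
	return n;
}

// Instantiates f1<widget>; copying a pointer does not need a complete type.
widget* pass_through(widget* p) { return f1(p); }

static_assert(std::is_same<decltype(f1(static_cast<int*>(nullptr))), int*>::value,
	"f1<int> returns int*");

int demo_counts()
{
	++use_count<int>();
	++use_count<int>();
	++use_count<double>();
	// use_count<int> and use_count<double> are two unrelated functions
	return use_count<int>() * 10 + use_count<double>();
}
```

On the first call, demo_counts() combines the two counters (2 and 1) into 21, which would not be possible if both instantiations shared one variable.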

2-phase translation

Because of instantiation, the code of each template is analyzed twice:

First, when the definition is encountered. The compiler cannot compile the code yet, but it checks the syntax and any code that does not depend on template parameters.
Second, when the template is instantiated. Template-parameter-dependent code is now turned into non-template code and compiled.

The first step happens once for every template entity in the given translation unit. The second step (instantiation) can happen multiple times. Each instantiation can result in different compiler errors.
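
As a rough illustration of one definition being checked again for each instantiation, consider the sketch below. The names length_of and demo_lengths are made up for this example. The same template compiles for types that have a size() member and would fail, in the second phase only, for a type that does not.

```cpp
#include <cstddef>
#include <string>
#include <vector>

template <typename T>
std::size_t length_of(const T& t)
{
	return t.size(); // depends on T: checked only when instantiated
}

std::size_t demo_lengths()
{
	std::size_t a = length_of(std::string("abc"));    // instantiates length_of<std::string>
	std::size_t b = length_of(std::vector<int>{1, 2}); // instantiates length_of<std::vector<int>>
	// length_of(42); // error in the second phase: int has no member size()
	return a + b;
}
```

The definition itself passes the first phase regardless of which calls appear later. Only the commented-out instantiation would be rejected.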

The 2-phase nature is best demonstrated with another example. The following function template fails with every possible instantiation (no type has size zero and sizeof(void) is ill-formed), but the first phase succeeds:

template <typename T>
void f(T t)
{
	static_assert(sizeof(T) == 0);
	nonexistent_function(t); // also dependent on T
}

A program will compile successfully if this function template is never used. No instantiation - no errors.

Why is the non-existent function not causing any errors? How is the template parameter relevant to it? Searching for a function declaration/definition doesn't depend on the type of the argument, right?

It does. Not on the template argument itself, but on the namespaces associated with the argument's type. ADL (argument-dependent lookup) means that an unqualified function call is looked up not only in the enclosing scopes but also in the namespaces of its arguments' types. Because the type of t is unknown until instantiation, the lookup has to wait until then. This rarely-thought-of mechanism is the reason why overloaded operators work: they are functions that are often defined in the namespaces of the types they support.
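
A small sketch of this lookup, assuming a hypothetical namespace lib with a type thing and a function describe (none of these come from a real library). The function is declared after the template, yet the call resolves at instantiation because ADL searches the namespace of the argument's type.

```cpp
template <typename T>
int call_describe(T t)
{
	// No describe is visible here; the call depends on T,
	// so lookup is completed at instantiation through ADL.
	return describe(t);
}

namespace lib // hypothetical namespace
{
	struct thing {};
	int describe(thing) { return 7; }
}

int demo_adl()
{
	// call_describe(1); // error: int has no associated namespace, nothing is found
	return call_describe(lib::thing{});
}
```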

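What if the compiler is smart enough to figure out that it's impossible to instantiate this function template successfully?

It is not required to be. The language specification dictates that constructs dependent on template parameters (in any semantic way possible) are fully analyzed only in the second (instantiation) phase. Formally, a template for which no valid instantiation can exist is ill-formed with no diagnostic required, so a compiler is permitted to reject it early, but in practice compilers stay silent until the template is used. Detecting such cases in more complex situations would also be computationally expensive.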

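You can make this example fail in the first phase by giving it an error that does not depend on template parameters, for example a call such as nonexistent_function(42). Such a statement is caught in the first phase, before any instantiation. static_assert(false) traditionally behaved the same way, but compilers that implement the C++23 rule change (CWG 2518) evaluate it only upon instantiation.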
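
The difference between dependent and non-dependent conditions is often exploited on purpose. The sketch below shows a commonly used idiom, not something the language provides: dependent_false and size_class are hypothetical names, and the code assumes C++17 for if constexpr. Because the asserted condition mentions T, the assertion is checked only if that branch is actually instantiated.

```cpp
#include <type_traits>

// Hypothetical helper: always false, but formally dependent on T.
template <typename T>
struct dependent_false : std::false_type {};

template <typename T>
int size_class(T)
{
	if constexpr (std::is_integral_v<T>)
		return 1;
	else if constexpr (std::is_floating_point_v<T>)
		return 2;
	else
		// Reached only when instantiated with an unsupported type.
		static_assert(dependent_false<T>::value, "unsupported type");
}
```

Calling size_class(1) or size_class(1.5) compiles, while something like size_class("text") would trigger the assertion during instantiation.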

When working with templates, you will typically encounter compiler errors coming from the second (instantiation) phase. If the error comes from this phase, GCC and Clang include the string "in instantiation of", together with the template parameters, in the diagnostic message.

When receiving instantiation errors, you need to think about which of these might be incorrect:

the template definition
the template parameter (this specific instantiation)

In a typical situation, the mistake is in how the template is being used (usually a wrong type parameter). More rarely, you may have written a template that accidentally limits its possible uses, and it is the template definition that requires changes. Subtle changes in the definition (e.g. T() vs T{}) may alter the meaning for some types and have no effect for others.
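
The T() vs T{} remark can be sketched as follows. make_paren and make_brace are hypothetical helpers written for this illustration. For std::pair the two forms behave identically, while for std::vector<int> the braces select the initializer-list constructor and produce a different object.

```cpp
#include <utility>
#include <vector>

template <typename T>
T make_paren(int a, int b) { return T(a, b); }

template <typename T>
T make_brace(int a, int b) { return T{a, b}; }

// make_paren<std::vector<int>>(3, 1) -> three elements, each equal to 1
// make_brace<std::vector<int>>(3, 1) -> two elements: 3 and 1
// make_paren<std::pair<int, int>>(3, 1) and make_brace<...>(3, 1) -> same pair
```

Neither template is wrong on its own. Which one is appropriate depends on the types it is expected to be instantiated with.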

Declarations

Just like any non-template entity, templates can be forward declared too:

template <typename T>
const T& min(const T& x, const T& y);

With templates it's done rarely, because the template definition must be available in every translation unit that instantiates it, and any specializations must be declared before the first use that would instantiate them.

The reason is simple: templates are ODR-used upon instantiation. The compiler cannot enter the second phase if it knows only the declaration of the entity. The primary use of declarations is resolving dependencies between templates themselves. A good example would be a set of function templates that call each other recursively.
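
A minimal sketch of such mutual recursion, using the hypothetical function templates is_even and is_odd and assuming non-negative arguments. The declaration of is_odd lets is_even refer to it before its definition appears. For built-in types such as int there is no associated namespace, so ADL could not find it otherwise.

```cpp
// Declaration only: is_even needs to know that is_odd exists.
template <typename T>
bool is_odd(T n);

template <typename T>
bool is_even(T n) { return n == 0 ? true : is_odd(n - 1); }

template <typename T>
bool is_odd(T n) { return n == 0 ? false : is_even(n - 1); }
```

Both definitions are visible by the time either template is instantiated, so a call such as is_even(10) works as expected.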

Because of the requirement that a definition must be visible at the point of instantiation and the fact that templates, like inline functions, may be defined in multiple translation units, they are typically put into header files.

Some library projects list all template declarations first and then write or include their definitions at the bottom of the file. This separates interface/implementation similarly to ordinary non-template code (2 separate files) while still technically delivering all necessary information to the compiler within a single header. It's a lot more code to write but it improves readability if the headers are a primary source of documentation and users of the library are expected to read comments within the source code. Having definitions separately reduces the noise while reading documentation comments.
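
One possible shape of such a header is sketched below. The namespace mylib, the include guard name and the section comments are all hypothetical. Real projects differ in the details, for example by moving the lower part into a separate file that the header includes at the end.

```cpp
#ifndef MYLIB_MIN_HPP // hypothetical include guard
#define MYLIB_MIN_HPP

namespace mylib // hypothetical library namespace
{

// ---- interface: declarations with documentation comments ----

/// Returns a reference to the smaller of x and y (x if neither is smaller).
template <typename T>
const T& min(const T& x, const T& y);

}

// ---- implementation: could live in a separate file included here ----

namespace mylib
{

template <typename T>
const T& min(const T& x, const T& y)
{
	return y < x ? y : x;
}

}

#endif
```

A reader interested only in the interface can stop at the first section, while the compiler still sees the full definition in the same header.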