03 - std::array

By the time you are reading this you should understand:

  • how arrays work (size and indexing)

  • how to pass and use arrays in functions

  • array limitations (no easy way to compare and copy)

The arrays presented so far are often called "raw arrays", "built-in arrays" or "C arrays". This is because they are one of the fundamental parts of the C language, which C++ inherited, including its limitations. Since the majority of the rules regarding arrays cannot be changed (too much code breakage), C++11 added an alternative.

Originally developed as boost::array, std::array is a wrapper type built on top of C arrays. Its core definition looks roughly like this:

namespace std {

template <typename T, size_t N>
struct array
{
	T arr[N];

	constexpr const T& operator[](size_t n) const { return arr[n]; }
	constexpr       T& operator[](size_t n)       { return arr[n]; }

	constexpr size_t size() const noexcept { return N; }

	// [...] other functions and support for operators such as =, ==, !=
};

}

The type syntax of this standard library type is different from C arrays because here the stored type and array size are specified as template parameters. All typical operations have been defined to be very intuitive by reusing existing operators. The specific feature in play here is operator overloading, which allows defining the meaning of operators for user-defined types so that they can be used just like built-in types. The feature is explained in its own chapter.

#include <array>
#include <iostream>

int main()
{
	std::array<int, 5> arr1 = {1, 2, 3, 4, 5};

	// std::array objects can be initialized and assigned from other objects
	auto arr2 = arr1;

	// comparison checks all elements
	std::cout << (arr1 == arr2) << "\n"; // 1

	// indexing works the same way
	arr2[0] = 2;
	std::cout << (arr1 == arr2) << "\n"; // 0

	// std::array offers additional functionalities:

	// member functions
	std::cout << arr2.size() << "\n"; // 5

	// ranged loops (known as for-each loops in other languages)
	// syntax sugar when compared to writing logic on an i variable
	for (int value : arr2)
		std::cout << value << " ";
}

The main benefits of using std::array are:

  • type safety - unlike C arrays, this type does not decay

  • support for common operations such as comparison and copying - overloaded operators are defined as functions which perform these tasks

  • support for range-based loops and iterators

  • additional member functions such as .size()
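As a small illustration of the copying point above, the sketch below passes a std::array by value and returns one from a function, neither of which is possible with a C array. The function name doubled is made up for this example.

```cpp
#include <array>

// Hypothetical helper (not part of the standard library).
// The parameter is taken by value, so the caller's array is copied, not decayed.
std::array<int, 3> doubled(std::array<int, 3> values)
{
	for (int& v : values)
		v *= 2;

	return values; // returning an array by value is not possible with C arrays
}
```

Calling doubled(arr) leaves arr untouched, because the function works on its own copy.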

Range-based loops

C++11 introduced syntax sugar that was already common in the programming world. Better known as for-each loops (range-based loops is strictly C++ terminology), it allows the simplest loops to be written in a shortened form.

The syntax is:

// since C++11
for (range-declaration : range-expression)
	loop-statement
// since C++20
for (init-statement range-declaration : range-expression)
	loop-statement

This is strictly syntax sugar: it doesn't rely on any particular magic feature of arrays. Range-based loops are simply rewritten by the compiler into traditional for-loops.

Details:
{ // C++11 and C++14
	auto&& __range = range-expression;
	for (auto __begin = begin_expr, __end = end_expr; __begin != __end; ++__begin) {
		range-declaration = *__begin;
		loop-statement
	}
}
{ // C++17 and later (begin_expr and end_expr can be different types)
	init-statement // since C++20, if provided
	auto&& __range = range-expression;
	auto __begin = begin_expr;
	auto __end = end_expr;
	for (; __begin != __end; ++__begin) {
		range-declaration = *__begin;
		loop-statement
	}
}
  • if range-expression is a C-array:

    • begin_expr is __range

    • end_expr is __range + __array_size where __array_size is the size of the array

  • else if range-expression is a class type that has members named begin and end:

    • begin_expr is __range.begin()

    • end_expr is __range.end()

  • else:

    • begin_expr is begin(__range)

    • end_expr is end(__range)

Don't worry if you don't get all this code - the whole feature exists so that you don't have to know all the details.

In other words, the variables used in the loop are initialized to:

  • memory address range if the type is a C-array

  • result of begin() and end() if the type has such member functions

  • result of free (non-member) begin and end functions otherwise; the functions are called with the range as their argument and are found by ADL - this variant allows writing helper functions to iterate over foreign types (usually from an external library) when the type cannot be modified (it's not your code)
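The third variant can be sketched as follows. Everything here is hypothetical: legacy::int_buffer stands in for a type from an external library, and the begin and end helpers are our own additions placed in the same namespace so that ADL can find them.

```cpp
#include <cstddef>

// Pretend this type comes from an external library and cannot be edited.
// It has no begin()/end() members, so a range-based loop would not compile on its own.
namespace legacy {

struct int_buffer
{
	int values[4];
	std::size_t count; // how many elements of values are in use
};

}

// Our own helpers, added in the same namespace as the type so that ADL finds them.
namespace legacy {

inline const int* begin(const int_buffer& b) { return b.values; }
inline const int* end(const int_buffer& b) { return b.values + b.count; }

}

int sum(const legacy::int_buffer& buffer)
{
	int total = 0;
	for (int value : buffer) // expands to begin(__range) and end(__range)
		total += value;

	return total;
}
```

Only the elements in use are visited, because end returns a pointer count elements past the start, not the end of the whole inner array.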

Some examples to demonstrate:

double c_arr[5] = {0.1, 0.2, 0.3, 0.4, 0.5};

for (double& d : c_arr)
	d *= 2;
// expanded
{
	auto&& __range = c_arr; // auto = double(&)[5] (reference to C-array)
	auto __begin = c_arr;   // auto = double* (decays array to pointer)
	auto __end = c_arr + 5; // auto = double* (also decays and shifts the pointer to point further in memory)
	for (; __begin != __end; ++__begin) {
		double& d = *__begin; // accesses memory pointed by __begin
		d *= 2;
	}
}

std::array<double, 5> std_arr = {0.1, 0.2, 0.3, 0.4, 0.5};

for (double& d : std_arr)
	d *= 2;
// expanded
{
	auto&& __range = std_arr;       // auto = std::array<double, 5>&
	auto __begin = std_arr.begin(); // auto = std::array<double, 5>::iterator
	auto __end = std_arr.end();     // auto = std::array<double, 5>::iterator
	for (; __begin != __end; ++__begin) {
		double& d = *__begin;       // iterators overload * to imitate pointers
		d *= 2;
	}
}

Can I loop backward using this syntax?

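Not directly. Before C++20 the shortest way is to use reverse iterators, obtained from rbegin() and rend(). There is no syntax sugar for these, so you have to write the loop manually. Since C++20, a range adaptor such as std::views::reverse (from the <ranges> header) can be applied in the range-expression instead.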
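A manually written backward loop with reverse iterators could look like the sketch below. The function reversed_copy is a made-up name used only for demonstration.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: returns a copy of the input with elements in reverse order.
template <typename T, std::size_t N>
std::array<T, N> reversed_copy(const std::array<T, N>& input)
{
	std::array<T, N> output{};
	std::size_t i = 0;

	// rbegin() refers to the last element; ++ moves towards the first one
	for (auto it = input.rbegin(); it != input.rend(); ++it)
		output[i++] = *it;

	return output;
}
```

The loop has the same shape as the expanded range-based loop, only with rbegin() and rend() in place of begin() and end().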

Range-based loops shorten code and eliminate possible errors caused by various mistakes with i and similar variables.

Array size

For C arrays I have mentioned that they must have a positive size (with the special case of 0 allowed by compiler extensions). std::array can have size 0 and will work just as expected:

  • .size() will return 0

  • .begin() will be == to .end()

  • any loop will terminate immediately (no iterations would be made)
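The behaviour listed above can be checked with a short sketch. The helper count_iterations is hypothetical; it only counts how many times the loop body runs.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: counts how many times a range-based loop body is executed.
template <std::size_t N>
std::size_t count_iterations(const std::array<int, N>& arr)
{
	std::size_t iterations = 0;
	for (int value : arr) {
		(void)value; // the element itself is not needed here
		++iterations;
	}

	return iterations;
}
```

For a std::array<int, 0> object the function returns 0, since begin() and end() compare equal and the loop condition fails on the first check.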

How is this possible if std::array contains a C-array inside? Are they implemented with compiler extensions?

No. They are implemented using template specialization, which allows a separate definition to be provided for specific parameters. If the size parameter is 0, the definition is different. The main purpose of this specialization is to make std::array work consistently for any size parameter, even though size 0 has almost no practical value (but someone writing templates can accidentally create such arrays without easily realizing it).

How about negative size?

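The std::array template parameter for size has type std::size_t, so a negative size is not possible. A negative constant written directly as the template argument is rejected by the compiler, because converting it to an unsigned type would be a narrowing conversion, which is not allowed in template arguments. In contexts where a signed-to-unsigned conversion is allowed, a negative value turns into a huge one because of how the conversion works (modulo 2^N arithmetic, where N is the number of bits in the unsigned type).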

Passing std::array

std::array does not decay, so you can write functions which accept it as a parameter, but this is quite limiting in another way: the function will accept arrays of only one specific size (template parameters are a part of the type information).
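The limitation can be seen in the sketch below. Both function names are made up. The second function is a template, which is one way around the fixed size, at the cost of a separate function being generated for every size used; it is shown only for contrast.

```cpp
#include <array>
#include <cstddef>

// Hypothetical function: accepts only std::array<int, 3>.
// Passing std::array<int, 5> is a compile error because it is a different type.
int sum3(const std::array<int, 3>& values)
{
	int total = 0;
	for (int v : values)
		total += v;

	return total;
}

// Hypothetical template: N is deduced from the argument,
// so every array size gets its own instantiation.
template <std::size_t N>
int sum_any(const std::array<int, N>& values)
{
	int total = 0;
	for (int v : values)
		total += v;

	return total;
}
```

Neither version accepts a C-array or any other container, which is why a more general interface is preferable.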

Thus, it's recommended to still use:

  • pointer + size: (const T*, std::size_t)

  • (C++20) (std::span<T>), which is essentially a struct containing a pointer and a size

Functions with such parameters will work for:

  • C-arrays (T[])

  • std::array

  • std::vector

  • any other container (not necessarily from standard library) that has contiguous storage

How do you pass std::array into a function? How do you turn it into const T*?

f(arr.data(), arr.size()). The approach is the same for any container that follows standard library conventions. Different containers implement different data structures in memory, so not every function is offered by every container, but if a function of specific name is present, you can expect it to have the same semantics.

The size function is offered by pretty much every container (though elements may be laid out in memory very differently). The data function is offered by types which implement contiguous storage (only one block of stack or dynamically allocated memory) - most notably string types, std::array and std::vector.

By using "pointer + size" (or C++20 span) interfaces, you allow your functions to support a variety of containers without forcing external code to use any particular data structure implementation. What the function should care about is not any particular implementation but just contiguous storage.
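A minimal sketch of such an interface is shown below. The names total and total_of_all_containers are hypothetical; the second function exists only to show the three kinds of call sites.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical function with a "pointer + size" interface.
// It only requires that the elements are stored contiguously.
int total(const int* data, std::size_t size)
{
	int result = 0;
	for (std::size_t i = 0; i < size; ++i)
		result += data[i];

	return result;
}

// The same function used with three different kinds of storage.
int total_of_all_containers()
{
	int c_arr[3] = {1, 2, 3};
	std::array<int, 3> std_arr = {4, 5, 6};
	std::vector<int> vec = {7, 8, 9};

	return total(c_arr, 3)                         // C-array decays to a pointer
		+ total(std_arr.data(), std_arr.size())    // std::array
		+ total(vec.data(), vec.size());           // std::vector
}
```

The function total never learns which container the memory belongs to, so adding support for another contiguous container requires no changes to it.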

Exercise

Take the code from the previous exercise and rewrite the C-arrays to std::array.