Differences between C and C++

Below is a (hopefully complete) list of differences between C and C++.

I do not list features that exist in only one language, only things that exist in both but have a different meaning.

Keywords

  • C++ has no restrict keyword. There have been attempts to add it, but its semantics are already complicated in C; in C++, given the complexity of the language, incorrect use could very easily lead to optimizations that introduce bugs. Every major compiler offers a __restrict extension, though.

  • In C++17 and later, register has no meaning; the keyword remains reserved.

  • C++11 changed the meaning of auto from a storage class specifier to a type specifier.

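    • In C, auto is a storage class specifier that requests automatic storage duration. This is already the default for block-scope variables, so the keyword is practically never written. (C23 additionally lets auto infer the type of an object from its initializer.)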

    • In C++11 and later, auto is the keyword for type inference, like var and let in other languages.
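
A minimal sketch of the C++11 meaning, assuming nothing beyond the standard library. The function name sum_lengths is made up for illustration.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical helper: adds up the lengths of all strings.
std::size_t sum_lengths(const std::vector<std::string>& words)
{
	auto total = std::size_t{0};   // deduced as std::size_t
	for (const auto& w : words)    // w is deduced as const std::string&
		total += w.size();

	static_assert(std::is_same<decltype(total), std::size_t>::value,
		"auto takes the type of its initializer");
	return total;
}
```

The same declarations compiled as C before C23 would treat auto as a storage class specifier and reject them for lacking a type.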

Character literals

In C, character literals like 'a' have type int. In C++ they have type char. Both languages support character literal prefixes, and a prefix that is allowed in both languages yields the same type in both.
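
The difference is observable through sizeof and overload resolution. The following is a small sketch of the C++ side; the overload set which is hypothetical and exists only to show which type the literal has.

```cpp
#include <type_traits>

// In C++, 'a' has type char, so it occupies exactly one byte.
// In C, sizeof('a') equals sizeof(int) instead.
static_assert(std::is_same<decltype('a'), char>::value, "'a' is a char in C++");
static_assert(sizeof('a') == 1, "sizeof(char) is 1 by definition");

// Prefixed literals have their own types.
static_assert(std::is_same<decltype(L'a'), wchar_t>::value, "wide literal");
static_assert(std::is_same<decltype(u'a'), char16_t>::value, "UTF-16 literal");
static_assert(std::is_same<decltype(U'a'), char32_t>::value, "UTF-32 literal");

// Hypothetical overload set: a plain literal selects the char overload.
inline int which(char) { return 1; }
inline int which(int)  { return 2; }
```

Calling which('a') selects the char overload, while which(+'a') selects the int overload because unary plus promotes its operand to int.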

String literals

C allows initializing a non-const char pointer from a string literal:

char* str = "abc"; // valid C, invalid C++

Attempting to modify such a string is undefined behaviour.

Arrays

Both languages allow initializing a non-const character array with a string literal:

char arr[] = "abc";

There is no undefined behaviour when modifying such an array - the array is simply a modifiable copy of the literal.

In C, a character array whose size is one less than the size of the string literal may be initialized from that literal; the resulting array is not null-terminated. This is not allowed in C++.

// C: ok, arr = { 'a', 'b', 'c' }
// C++: error: initializer (const char[4]) too long
char arr[3] = "abc";

In C++, references and pointers to arrays of unknown bound can be formed, but before C++20 they cannot be initialized or assigned from arrays and pointers to arrays of known bound. (C++20 permits that conversion.)

In C, pointers to arrays of unknown bound are compatible with pointers to arrays of known bound and are thus convertible and assignable in both directions.

extern int a1[];
int (&r1)[] = a1;  // ok (C++ only code)
int (*p1)[] = &a1; // ok in C, ok in C++
int (*q)[2] = &a1; // ok in C, error in C++

int a2[] = {1, 2, 3};
int (&r2)[] = a2;  // error before C++20 (C++ only code)
int (*p2)[] = &a2; // ok in C, error in C++ before C++20

Loops

  • In C, the scope of the loop statement is nested within the scope of the init-statement.

  • In C++, the init-statement and the loop statement share one and the same scope.

for (int i = 0; i < n; ++i)
{
	// C: well-defined (name shadowing)
	// C++: ill-formed (redefinition of i)
	long i = 1;
}

Type definitions and usage

C requires every struct, union, and enum type name to be prefixed with the keyword that describes what it is.

struct foo { int x; };
void func1(struct foo* f);

enum some_enum { e1, e2 };
void func2(enum some_enum e);

It's possible to create an alias to avoid this requirement. A strong convention is to use exactly the same name:

struct foo { int x; };
typedef struct foo foo;
void func1(foo* f);

enum some_enum { e1, e2 };
typedef enum some_enum some_enum;
void func2(some_enum e);

A lot of code combines the type definition and an alias into one statement:

typedef struct { int x; } foo;
typedef enum { e1, e2 } some_enum;

All of the above is allowed in C++ (for backwards compatibility) but not required.

A corner case where the keyword is still required is a name clash between a type and a function:

#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cerrno>

int file_size(const char* path, off_t& size)
{
	struct stat statbuf; // "struct" to use the type
	if (stat(path, &statbuf) != 0) // use the function
		return errno;

	size = statbuf.st_size;
	return 0;
}

Obviously using the same name for a type and a function is bad practice.

Empty types

C does not allow empty struct or union types (some compilers accept them as an extension).

struct empty {}; // invalid C, valid C++

Empty types in C++ are commonly used in tag dispatching and other tricks that leverage strong typing - usually found in templates. Empty types are also subject to the empty base optimization.
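
Both uses can be sketched in a few lines. The example relies on the standard iterator category tags, which are themselves empty types; the names uses_random_access, empty_base and holder are invented for this illustration. The size check reflects what the mainstream compilers do for such simple types.

```cpp
#include <iterator>
#include <list>
#include <vector>

// Tag dispatching: the unnamed tag parameter carries no data,
// it only selects an overload at compile time.
template <typename It>
bool uses_random_access(It, std::random_access_iterator_tag) { return true; }

template <typename It>
bool uses_random_access(It, std::input_iterator_tag) { return false; }

// Hypothetical entry point: builds the (empty) tag object and dispatches on it.
template <typename It>
bool uses_random_access(It it)
{
	return uses_random_access(it,
		typename std::iterator_traits<It>::iterator_category{});
}

// Empty base optimization: the empty base adds nothing to the derived size.
struct empty_base {};
struct holder : empty_base { int value; };
static_assert(sizeof(empty_base) >= 1, "a complete object always has a size");
static_assert(sizeof(holder) == sizeof(int), "the empty base takes no space");
```

A std::vector iterator selects the first overload; a std::list iterator falls back to the second one, because its bidirectional tag only converts to the input iterator tag.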

Empty parameter lists

In C before C23, a function declaration with nothing between the parentheses declares a function with an unspecified number of arguments (known as a function without a prototype). Calling such a function with arguments that do not match the function definition results in undefined behavior. In C++ there is no such problem: empty parentheses always mean zero parameters. C23 adopted the C++ meaning.

// C  : declaration of a function with an unspecified number and types of arguments (before C23)
// C++: declaration of a function that takes 0 arguments
void func();

// C  : declaration of a function that takes 0 arguments
// C++: declaration of a function that takes 0 arguments, just ugly (kept for backwards compatibility)
void func(void);

Missing return

In both languages it is valid to have a function with a non-void return type that does not return a value on some control flow paths.

int func(int a, int b, int n)
{
	if (n > 0)
		return a / b;
	if (n < 0)
		return b / a;
}

However:

  • In C it is UB to read the value returned from such a function if it reached the non-return path.

  • In C++ it is UB to merely reach the non-return path when executing the function (the stricter requirement is an effect of return value optimization, which C does not have).

Writing such functions is obviously discouraged in both languages; all major compilers generate a warning.

Addresses of standard library functions

C explicitly allows taking the addresses of standard library functions (with exceptions).

C++ explicitly disallows taking the addresses of standard library functions (with exceptions). One of the reasons is that C++ allows or requires multiple overloads for many functions, many of which can be implemented through templates and can change with standard library updates. Workaround: write a wrapper around the standard library function and use the address of the wrapper.
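
The workaround could look roughly like this. The names to_upper and shout are hypothetical; only std::toupper and std::transform come from the standard library.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical wrapper with one fixed signature.
// Its address may be taken freely, unlike that of std::toupper.
inline char to_upper(char c)
{
	// std::toupper expects a value representable as unsigned char (or EOF).
	return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
}

// Hypothetical user of the wrapper: upper-cases a copy of the text.
inline std::string shout(std::string text)
{
	// &to_upper is always fine; &std::toupper is not guaranteed to work.
	std::transform(text.begin(), text.end(), text.begin(), &to_upper);
	return text;
}
```

A lambda that forwards to the library function serves the same purpose when the wrapper is only needed in one place.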

Main function

C++ forbids any use of the main function. This includes calling it and taking its address. C has no such restriction; main may even be called recursively.

Unions

C allows unions for type punning.

C++ has constructors and destructors, which complicate the situation. A union only allows reading the member that was most recently written (the active member); any other access is undefined behaviour.

// valid C, UB in C++
union {
	int n;
	char bytes[4];
} packet;

packet.n = 1;
if (packet.bytes[0] == 1) // accessing other member
{
	// ...
}

All major C++ compilers document that such behavior is not UB in their implementation and permit it for type punning (there are other, standard-compliant ways to do it, though). The C++ committee is aware that this part of the standard is a grey area; from what I know, there is some work under way to permit such code if the union members are trivial types.
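
One of the standard-compliant alternatives is to copy the bytes with std::memcpy. This is a sketch only; the helper name first_byte is made up, and it mirrors the union example by looking at the first byte of an int.

```cpp
#include <cstring>

// Hypothetical helper: returns the first byte of the object representation
// of n without reading an inactive union member.
inline unsigned char first_byte(int n)
{
	unsigned char byte;
	std::memcpy(&byte, &n, sizeof byte); // copying raw bytes is well-defined
	return byte;
}
```

Which byte comes first depends on the endianness of the platform, exactly as in the union version. When source and destination have the same size, C++20 also offers std::bit_cast from the header bit.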

Aliasing

  • In both languages any (potentially cv-qualified) void* may alias.

  • In C, (potentially cv-qualified) signed char*, unsigned char* and plain char* may alias.

  • In C++, only (potentially cv-qualified) unsigned char* and plain char* may alias.
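
A short sketch of what the aliasing permission allows in practice: inspecting the bytes of an arbitrary object through unsigned char*, which is valid in both languages. The helper nonzero_bytes is hypothetical.

```cpp
#include <cstddef>

// Hypothetical helper: counts the nonzero bytes in an object's representation.
template <typename T>
std::size_t nonzero_bytes(const T& object)
{
	// unsigned char* may alias an object of any type.
	const unsigned char* p = reinterpret_cast<const unsigned char*>(&object);
	std::size_t count = 0;
	for (std::size_t i = 0; i < sizeof(T); ++i)
	{
		if (p[i] != 0)
			++count;
	}
	return count;
}
```

Doing the same through a pointer to an unrelated type, for example reading an int through a float*, is what the strict aliasing rule forbids.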

Want to know more? Read the article about strict aliasing TOWRITE.

Linkage rules

Names in the global scope that are const and not extern have external linkage in C but internal linkage in C++.

const int n = 1;
// C  : n can     be referred to from other translation units
// C++: n can not be referred to from other translation units (requires extern)

Expressions

In C++, the conditional operator has the same precedence as the assignment operators, and prefix ++ and --, as well as the assignment operators, have no restrictions on their operands.

In C, the ternary conditional operator has higher precedence than assignment operators. Therefore, the expression e = a < d ? a++ : a = d, which is parsed in C++ as e = ((a < d) ? (a++) : (a = d)), will fail to compile in C due to grammatical or semantic constraints in C.
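
The expression from the text can be tried out directly in C++. The wrapper function pick is hypothetical and only exists to make the two branches observable.

```cpp
// Hypothetical function wrapping the expression from the text.
// C++ parses it as e = ((a < d) ? (a++) : (a = d)).
inline int pick(int& a, int d)
{
	int e = 0;
	e = a < d ? a++ : a = d;
	return e;
}
```

If a is less than d, e receives the old value of a and a is incremented; otherwise d is assigned to a and e receives that value. A C compiler rejects the same line.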