Preprocessor Directives and the Build Process

Learn the fundamentals of the C++ build process, including the roles of the preprocessor, compiler, and linker.
This lesson is part of the course:

Professional C++

Comprehensive course covering advanced concepts, and how to use them on large-scale projects.

Ryan McCombe

This lesson is a quick introductory tour of header files and the build process within C++. It is not intended for those who are entirely new to programming. Rather, the people who may find it useful include:

  • those who have completed our introductory course, but want a quick review
  • those who are already familiar with programming in another language, but are new to C++
  • those who have used C++ in the past, but would benefit from a refresher

It summarises several lessons from our introductory course. Those looking for more thorough explanations or additional context should consider completing Chapter 7 of that course.

Previous Course

Intro to Programming with C++

Starting from the fundamentals, become a C++ software engineer, step by step.


The Build Process

In C++, building code is broken into three steps:

  • The preprocessor executes all our preprocessor directives, such as #include, to generate translation units
  • The compiler translates each of these translation units into collections of machine code called object files
  • The linker connects the object files together, into a single coherent package (eg, an executable file, or a library)

When we declare a function without defining it we are, in effect, telling the compiler that we’re going to define the function elsewhere. We’re telling the compiler that the linker will take care of it.

So, as long as our function is defined in one of the translation units that are sent to the compiler, the linker will be able to find that definition and complete the build process successfully.
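As a minimal sketch of this idea, the following keeps everything in one file so it can be compiled on its own. The function names Add and AddThree are hypothetical; in a real project, the definition of Add would often live in a different source file.

```cpp
// Declaration: tells the compiler that Add exists and what its signature is
int Add(int x, int y);

// This function can call Add even though the compiler
// has not yet seen the definition
int AddThree(int x, int y, int z) {
  return Add(Add(x, y), z);
}

// Definition: this could equally be in a separate source file,
// in which case the linker would connect the call to it
int Add(int x, int y) {
  return x + y;
}
```

If we removed the definition of Add, each file would still compile, but the linker would report an error about an unresolved symbol.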

We previously mentioned functions can be declared and defined in different places. This separation is most typically done to split our code into header files and source files.

Header Files

Separating declarations from definitions is most common when dealing with class member functions.

It is conventional for large classes to be split into two files - a "header" file to store the declarations, and a "source" file to store the definitions. Header files typically have the .h extension, whilst source files often use .cpp:

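// Character.h
class Character {
public:
  int Health{100};
  // Declaration only - the definition is in Character.cpp
  void TakeDamage(int Damage);
};

// Character.cpp
#include "Character.h"

// The Character:: prefix states which class this function belongs to
void Character::TakeDamage(int Damage) {
  Health -= Damage;
}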

This is a convention not widely adopted outside of C++, so it’s worth giving a slightly longer explanation.

One benefit of this approach is that it gives us a quick way to get an overview of a class. We only need to look at the header file to get a good sense of what it does.

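However, more importantly, it allows us to adopt a convention where we only need to #include header files rather than the full implementations. This reduces how much code the compiler has to process for each file and, when an implementation changes, typically only its own source file needs to be recompiled.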
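To illustrate why the declarations alone are enough for other files, here is a rough single-file sketch. The comments mark where each part would live in a real project. The GetHealth function, the HealthAfterHit function, and the Combat.cpp file name are assumptions made for this example.

```cpp
// --- What would live in Character.h: declarations only ---
class Character {
public:
  int GetHealth() const;
  void TakeDamage(int Damage);

private:
  int Health{100};
};

// --- What would live in Character.cpp: the definitions ---
int Character::GetHealth() const {
  return Health;
}

void Character::TakeDamage(int Damage) {
  Health -= Damage;
}

// --- What would live in another file, such as a hypothetical Combat.cpp ---
// This code only needs the declarations from the header to compile
int HealthAfterHit(int Damage) {
  Character Player;
  Player.TakeDamage(Damage);
  return Player.GetHealth();
}
```

In the split version, Combat.cpp would contain only #include "Character.h" and HealthAfterHit. The linker would later connect its calls to the definitions compiled from Character.cpp.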

Preprocessor Directives

The way we communicate with the preprocessor is through preprocessor directives. We already covered the #include directive, but there are a few more.

The #define directive

The #define directive lets us define a block of text, which we can then reuse across our file. The thing we define is typically called a macro. To distinguish macros from regular variables, we typically give them UPPERCASE names, but this is not required.

Note that preprocessor directives are not C++ statements, so we do not need to include semicolons.

In the following code, the preprocessor will replace INTEGER_TYPE with int:

#define INTEGER_TYPE int

INTEGER_TYPE MyVariable{5};

As such, what the compiler sees will simply be:

int MyVariable{5};
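Because the replacement is purely textual, a macro does not behave like a variable holding a value. The following sketch shows one consequence, using two hypothetical macros, BASE_DAMAGE and SAFE_BASE_DAMAGE, invented for this example:

```cpp
// Hypothetical macros for illustration only
#define BASE_DAMAGE 10 + 5
#define SAFE_BASE_DAMAGE (10 + 5)

int UnexpectedResult() {
  // After preprocessing, this line is: return 10 + 5 * 2;
  // Multiplication happens first, so the result is 20
  return BASE_DAMAGE * 2;
}

int ExpectedResult() {
  // After preprocessing, this line is: return (10 + 5) * 2;
  // The parentheses are part of the replaced text, so the result is 30
  return SAFE_BASE_DAMAGE * 2;
}
```

Wrapping a macro's replacement text in parentheses is a common way to avoid this kind of surprise.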

There is a more advanced form of this, which allows us to pass arguments to our macros, effectively turning them into functions. This is something we cover later in this course.

Conditional Inclusion - #if, #ifdef, #endif and friends

The preprocessor includes support for conditional logic. This allows us to control which parts of our code get included or excluded from the compilation process.

#include <iostream>
#define DAY 7

int main() {
#if DAY >= 6
  std::cout << "It's the weekend";
#endif
}

Typically, our conditional inclusion is based on whether or not we have defined a macro at all. This is used to generate different versions of our software. For example, we’re likely to have a "developer" or "debug" version of our software.

This version will include extra code to help us visualize what is going on during the development process, but we want it to be removed from what we ship to our users.

We can set this up by wrapping any such code in a directive that checks if a specific macro is defined:

#include <iostream>

int main() {
#if defined(DEBUG_MODE)
  std::cout << "Logging extra diagnostics";
#endif
}

Our build tools will give us the option to inject a preprocessor definition across our entire project, thereby letting us create these alternative builds in a simple, automated way.
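As a rough sketch of how this might look, the function below never defines DEBUG_MODE itself. The assumption is that the build tool supplies it, for example through the -D command line option supported by GCC and Clang, or /D on MSVC. The GetWindowTitle function is a hypothetical name used only for this example.

```cpp
#include <string>

// DEBUG_MODE is not defined anywhere in this file. A debug build might
// be created with a command such as: g++ -DDEBUG_MODE main.cpp
std::string GetWindowTitle() {
  std::string Title{"My Game"};
#if defined(DEBUG_MODE)
  // This line only exists in builds where DEBUG_MODE was defined
  Title += " [debug build]";
#endif
  return Title;
}
```

Built without the definition, the function returns "My Game". Built with it, the same source code returns "My Game [debug build]".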

The #if defined(SOMETHING) pattern is so common, there is a shortened form - #ifdef:

#ifdef DEBUG_MODE
 // ...
#endif

Conditional Inclusion (#if, #ifdef, etc) vs Runtime Conditionals (if, else, etc)

Note that conditional inclusion is not equivalent to standard conditional logic, such as an if statement. Conditional inclusion is resolved by the preprocessor before our code is compiled, whilst if statements are evaluated at run time. This has two implications:

  • Conditional inclusion has no performance cost at run time
  • Conditional inclusion entirely removes code from our build. This means our program is smaller, and those features cannot be discovered or re-enabled by users reverse-engineering our software
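The difference can be sketched as follows. The names RuntimeBonus, PreprocessorBonus, HYPOTHETICAL_FEATURE and FunctionThatDoesNotExist are all invented for this example, and HYPOTHETICAL_FEATURE is assumed to be undefined.

```cpp
// Runtime conditional: both branches are part of the program, and the
// check is typically performed each time the function is called
int RuntimeBonus(bool IsWeekend) {
  if (IsWeekend) {
    return 10;
  }
  return 0;
}

// Conditional inclusion: HYPOTHETICAL_FEATURE is not defined, so the
// preprocessor removes the guarded line before the compiler sees it.
// That is why calling a function that does not exist causes no error.
int PreprocessorBonus() {
#ifdef HYPOTHETICAL_FEATURE
  return FunctionThatDoesNotExist();
#endif
  return 0;
}
```

If the same call to FunctionThatDoesNotExist were placed inside a regular if statement, the code would fail to compile, even if the condition could never be true.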

We also have some additional preprocessor directives that can help us build more advanced conditional logic:

  • #else - The basic equivalent of an else statement
  • #elif - The preprocessor equivalent of an else-if statement
  • #ifndef - "if not defined" - for example, #ifndef SOMETHING is equivalent to #if !defined(SOMETHING)

The following uses these directives in a more complex example, which includes nested conditionals:

#include <iostream>
#define DAY 7

int main() {
#ifndef DAY
  std::cout << "I don't know what day it is";
#elif DAY >= 6
  std::cout << "It's the weekend";
  #if DAY == 6
    std::cout << " (Saturday)";
  #else
    std::cout << " (Sunday)";
  #endif
#else
  std::cout << "It's a weekday";
#endif
}

It's the weekend (Sunday)

Two additional directives were added in the C++23 language specification, and may not yet be available in all compilers:

  • #elifdef - "else if defined" - for example, #elifdef SOMETHING is equivalent to #elif defined(SOMETHING)
  • #elifndef - "else if not defined" - for example, #elifndef SOMETHING is equivalent to #elif !defined(SOMETHING)
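Until those directives are available everywhere, the longer form works in any compiler. The sketch below uses hypothetical PLATFORM_WINDOWS and PLATFORM_LINUX macros, and defines one of them in the file purely to make the example self-contained:

```cpp
#include <string>

// Hypothetical macro - in practice, the build tools would usually define this
#define PLATFORM_LINUX

std::string GetPlatformName() {
#ifdef PLATFORM_WINDOWS
  return "Windows";
#elif defined(PLATFORM_LINUX)  // C++23 shorthand: #elifdef PLATFORM_LINUX
  return "Linux";
#else
  return "Unknown";
#endif
}
```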

Header Guards and the #pragma once directive

When we use the #include directive, code from one file effectively gets copied and pasted into another. This can cause problems when a file gets included multiple times. This usually happens indirectly - for example:

  • if File A includes File B and File C
  • and File B includes File C

Then File A ends up including File C twice: once directly, and once indirectly through File B.

We prevent this using header guards, which are a form of conditional inclusion:

#ifndef MYCLASS_H
#define MYCLASS_H

class MyClass {
  // ...
};

#endif

The first time the above file gets included, the MYCLASS_H macro is not defined within the preprocessor, so the entire contents of the file are copied across.

That includes the #define directive so, after the first inclusion, MYCLASS_H is defined. If the file is included a second time, the #ifndef directive detects the definition, so everything until #endif (effectively the whole file) gets excluded.

Note that the guard macro needs a name that is not used anywhere else in our code. If we named the macro MyClass, the preprocessor would also replace the name of the class with nothing, which is not what we want.
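We can simulate a double inclusion within a single file by writing out the guarded contents twice, which is effectively what the preprocessor produces when a header is included twice. This is only a sketch: the guard name MYCLASS_H and the Value member are assumptions made for the example.

```cpp
// First "inclusion": MYCLASS_H is not yet defined, so this block is kept
#ifndef MYCLASS_H
#define MYCLASS_H

class MyClass {
public:
  int Value{42};
};

#endif

// Second "inclusion": MYCLASS_H is now defined, so this block is skipped
#ifndef MYCLASS_H
#define MYCLASS_H

class MyClass {
public:
  int Value{42};
};

#endif
```

Without the guards, the compiler would see two definitions of MyClass in the same translation unit and report a redefinition error.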

This pattern is so common that a dedicated directive is available to handle the use case - the #pragma once directive. It is not part of the C++ standard, but it is very widely supported by compilers:

#pragma once

class MyClass {
  // ...
};

This is Weird

At this point, it seems like the preprocessor is pretty much a programming language in its own right.

If the concept of a programming language within a programming language seems slightly off, you’re not alone in thinking that. It is a long-term goal of C++ to reduce the relevance of the preprocessor, by covering its use cases within the base C++ language.

For example, the C++17 spec introduced if constexpr statements. An if constexpr statement is a variant of the basic if statement that discards branches of code at compile time, making it an alternative to the preprocessor’s conditional inclusion features in many cases.
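As a brief preview, here is a minimal sketch of how if constexpr might stand in for an #ifdef check. The IsDebugBuild constant and GetStartupMessage function are hypothetical names used only for this example, and the code requires C++17 or later.

```cpp
#include <string>

// Hypothetical compile-time flag, standing in for a DEBUG_MODE macro
constexpr bool IsDebugBuild{false};

std::string GetStartupMessage() {
  std::string Message{"Starting"};
  // The condition is evaluated at compile time, and
  // the branch is discarded when it is false
  if constexpr (IsDebugBuild) {
    Message += " with extra diagnostics";
  }
  return Message;
}
```

One difference from conditional inclusion is that the code inside the discarded branch is still seen by the compiler. Outside of templates, it must be valid C++ even though it is never used.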

C++20 introduced modules, which allow us to organize our code into smaller parts without requiring any #include directives to bring it back together.

We cover both of these later in this course. However, the adoption of new C++ features tends to be slower and more cautious than what we might be accustomed to from other languages.

It can take years for compilers to support new language features, and decades for them to become widely adopted within real-world projects. So, we need to be comfortable with the preprocessor for quite some time yet.

Summary

The C++ build process involves three main steps: preprocessing, compiling, and linking. Preprocessor directives, header files, and source files play crucial roles in organizing and manipulating code before compilation. Key takeaways:

  • The preprocessor executes directives and generates translation units
  • The compiler translates translation units into object files
  • The linker connects object files into an executable or library
  • Header files (.h) typically contain class and function declarations
  • Source files (.cpp) contain the corresponding definitions
  • Key preprocessor directives: #include, #define, conditional inclusion (#if, #ifdef, etc.), and #pragma once

Copyright © 2024 - All Rights Reserved