When we write code, we often think about memory as a simple sequence of bytes. However, modern processors work with memory in larger chunks for efficiency. Two key concepts drive this behavior: cache lines and memory pages.
Cache lines, typically 64 bytes, are the smallest unit of data that can be transferred between the CPU cache and main memory. Similarly, memory pages are the smallest unit managed by the operating system's virtual memory system.
For optimal performance, we typically want our data to be aligned to minimise the frequency with which a single value crosses one of these boundaries. An example of a boundary cross might be a 4-byte integer whose first two bytes are at the end of one cache line, and whose last two bytes are at the start of the next.
The boundary between our two cache lines might look like the following, where X represents the integer we’re interested in, and A and B represent other arbitrary variables:
Line 1 | Line 2
A A X X | X X B B
Our systems can typically handle this. They can perform multiple reads to grab both blocks of memory, then take the appropriate bytes from each and combine them to reconstruct our integer X. However, this comes at a performance cost. Instead, we want to align our data to maximise the chances that it is stored entirely within the same cache line or page, eliminating the need for this additional processing.
Aligning data means we simply add additional bytes in strategic positions within the memory layout of our objects. These bytes, which contain no useful data and exist only to push subsequent bytes into later memory addresses, are called padding.
We could align our previous structure by adding 2 bytes of padding after A, thereby pushing X entirely onto the next line, where it can be accessed in a single read operation.
We’ll represent padding by underscores, _, and the boundary between our cache lines would now look like this:
Line 1 | Line 2
A A _ _ | X X X X B B
Let’s see an example where our compiler will likely intervene, adding some padding to achieve a specific alignment:
#include <iostream>

struct MyStruct {
  char A; // 1 byte
  int B; // 4 bytes
};

int main() {
  std::cout << sizeof(MyStruct) << " bytes";
}
Given instances of MyStruct require 1 byte for the char and 4 for the int, we might expect the overall size to be 5 bytes. However, in most scenarios, 3 bytes of padding are added to objects of this type, bringing their total size to 8:
8 bytes
This additional padding is added to ensure the B integer is placed at its natural alignment - that is, a memory address divisible by 4.
As such, we can imagine the memory layout of an instance of MyStruct looking like the following, where we have 1 byte assigned to storing the char called A, followed by 3 bytes of padding, and finally 4 bytes assigned to the int value B:
A _ _ _ B B B B
Natural alignment refers to placing data at memory addresses that are multiples of the data type's size - 32-bit integers are typically aligned to 4-byte boundaries, 64-bit doubles to 8-byte boundaries, and so on.
This alignment strategy comes from the CPU's memory access patterns: modern processors are designed to read data most efficiently when it's placed at these aligned addresses.
This typically allows them to fetch the entire value in a single operation, rather than the more expensive process of multiple memory reads and then reconstructing the required value by combining them.
Some IDEs include tools that help us visualize the memory layout of our classes and structs. Below, we show Visual Studio’s implementation, which lists all of the data members of our struct.
To help us understand the memory layout of our type, each field also includes:
Documentation on using Visual Studio’s memory layout viewer is available here.
Let’s see another example, where we simply reorder the A and B members within our struct definition:
#include <iostream>

struct MyStruct {
  int B; // 4 bytes
  char A; // 1 byte
};

int main() {
  std::cout << sizeof(MyStruct) << " bytes";
}
Perhaps surprisingly, the compiler adds 3 bytes of padding here too:
8 bytes
In this case, the padding is added to the end of our memory layout. It looks like this:
B B B B A _ _ _
The primary reason for this padding is to deal with the common scenario where multiple instances of our objects are stored contiguously in memory, such as in a std::vector<MyStruct>.
In that context, the memory layout of two objects in an array looks like this:
B B B B A _ _ _ B B B B A _ _ _
The additional padding was added to maintain alignment in scenarios like this. The B integer in the first object is correctly aligned to byte offset 0, whilst the B in the second object is correctly aligned to byte offset 8, and so on.
We can order the members of our type to make more efficient use of memory, reducing the amount of padding the compiler requires to maintain alignment.
For example, let’s consider the following struct:
#include <iostream>

struct MyStruct {
  char A; // 1 byte
  int B; // 4 bytes
  char C; // 1 byte
};

int main() {
  std::cout << sizeof(MyStruct) << " bytes";
}
Objects of this type only contain 6 bytes of useful data. However, to correctly align the integer B (including for the array context), 6 additional bytes of padding are required, taking its size to 12:
12 bytes
The memory layout of an instance of this struct looks like this:
A _ _ _ B B B B C _ _ _
By reordering our members, we can pack memory more efficiently. The following version of MyStruct contains all the same data, but only requires 8 bytes of storage:
#include <iostream>

struct MyStruct {
  int B; // 4 bytes
  char A; // 1 byte
  char C; // 1 byte
};

int main() {
  std::cout << sizeof(MyStruct) << " bytes";
}
8 bytes
This is more efficient because only 2 bytes of padding are required to align the integer for use in arrays:
B B B B A C _ _
The way in which our compiler adds padding is typically configurable. The default settings are almost always preferred but, in rare situations, we may need to modify them.
For example, in memory-constrained environments, it may be desirable to remove padding completely. This will reduce the memory demands of our program, but it can also degrade performance and, on hardware that does not handle misaligned access well, may cause unexpected behavior, so we should proceed with caution here.
One way to control our packing settings is through the #pragma pack directive. In the following program, the compiler does not add any padding, reducing the size of MyStruct objects to 6 bytes:
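#include <iostream>

// Apply the packing after our includes, so that it does not
// also change the layout of types declared in library headers
#pragma pack(push, 1)
struct MyStruct {
  char A; // 1 byte
  int B; // 4 bytes
  char C; // 1 byte
};
#pragma pack(pop) // Restore the previous setting

int main() {
  std::cout << sizeof(MyStruct) << " bytes";
}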
6 bytes
A B B B B C
Documentation for the #pragma pack directive as implemented by the MSVC compiler used by Visual Studio is available on Microsoft's official site.
In addition to configuring padding through our compiler settings, we can add padding on a case-by-case basis. Again, this is rarely necessary, but it has some use cases which we’ll cover later.
One way to configure padding is through the alignas() specifier. This allows us to explicitly set the alignment of a class or struct, or a data member within that class or struct.
Below, we align MyStruct instances to 16 bytes, adding 4 additional bytes of padding relative to the size this type would have with the compiler’s default alignment (12 bytes):
#include <iostream>

struct alignas(16) MyStruct {
  char A; // 1 byte
  int B; // 4 bytes
  char C; // 1 byte
};

int main() {
  std::cout << sizeof(MyStruct) << " bytes";
}
16 bytes
The memory layout of these objects will be:
A _ _ _ B B B B C _ _ _ _ _ _ _
We can also totally customise the memory layout of objects simply by adding additional, unused members to act as padding:
#include <iostream>

struct MyStruct {
  char A; // 1 byte
  char padB[7]; // 7 bytes
  int B; // 4 bytes
  char padC[3]; // 3 bytes
  char C; // 1 byte
};

int main() {
  std::cout << sizeof(MyStruct) << " bytes";
}
16 bytes
The memory layout of these objects will be:
A _ _ _ _ _ _ _ B B B B _ _ _ C
We won’t need to go this low-level with the objects we’re creating in this course. However, it’s important to know that it’s an option, and that the objects we’re working with may have customised their memory layout. As such, we should be careful with making any assumptions when serializing or copying complex objects.
For example, SDL intervenes in the layout of the pixel data associated with an SDL_Surface. From the compiler’s perspective, this data is just another array of numbers. But in context, SDL knows this contiguous block of memory actually represents a two-dimensional image - a grid of pixel colors.
With this additional context in mind, SDL intervenes in the memory layout, adding additional padding in ways that make grid-based operations (like reading a rectangular area of pixels) as efficient as possible.
As we might expect, these padding and alignment behaviors have implications when it comes to serializing our objects. If we’re not mindful that these "gaps" exist between our variables, our serialization and deserialization code can contain serious bugs and result in data loss.
Below, we attempt to serialize MyStruct without being aware that padding is added between A and B. We assume, therefore, that writing 5 bytes will capture all of the data:
#include <SDL.h>
#include <iostream>

struct MyStruct {
  char A;
  int B;
};

int main(int argc, char** argv) {
  SDL_RWops* rw{
    SDL_RWFromFile("example.bin", "wb")};
  if (!rw) {
    std::cerr << "Failed to open file: "
      << SDL_GetError();
    return 1;
  }

  MyStruct Serialized{'A', 42};

  // Assume MyStruct is 5 bytes
  SDL_RWwrite(rw, &Serialized, 1, 5);
  SDL_RWclose(rw);

  std::cout << "Serialized: A = "
    << Serialized.A
    << ", B = " << Serialized.B;

  return 0;
}
Serialized: A = A, B = 42
If we later read this file using the same assumptions, we’ll see our B integer doesn’t have the correct value:
#include <SDL.h>
#include <iostream>

struct MyStruct {
  char A;
  int B;
};

int main(int argc, char** argv) {
  SDL_RWops* rw{
    SDL_RWFromFile("example.bin", "rb")};
  if (!rw) {
    std::cerr << "Failed to open file: "
      << SDL_GetError();
    return 1;
  }

  MyStruct Deserialized;

  // Same incorrect assumption: MyStruct is 5 bytes
  SDL_RWread(rw, &Deserialized, 1, 5);
  SDL_RWclose(rw);

  std::cout << "Deserialized: A = "
    << Deserialized.A
    << ", B = " << Deserialized.B;

  return 0;
}
Deserialized: A = A, B = -859045846
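With the typical layout we saw earlier, the 5 bytes we wrote were A, the 3 bytes of padding that follow it, and only the first byte of B. The remaining 3 bytes of B never reached the file, so after reading, they hold whatever happened to be in memory, and the exact value we see will vary.
To solve this problem, we need to approach the serialization of class and struct instances differently.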
The standard way of serializing and deserializing objects in a way that respects alignment across a variety of platforms is to handle their data members as individual values.
Rather than serializing a MyStruct object in a single operation, we’d serialize each of its variables individually. In large programs, this is typically done by adding dedicated serialization and deserialization methods to our class or struct:
// MyStruct.h
#pragma once
#include <iostream>
#include <string>
#include <SDL.h>

class MyStruct {
public:
  char A;
  int B;

  void Save(const std::string& path) const {
    SDL_RWops* Handle{SDL_RWFromFile(
      path.c_str(), "wb")};
    if (!Handle) {
      std::cout << "Error opening file: "
        << SDL_GetError();
      return;
    }

    // Write each member individually, skipping any padding
    SDL_RWwrite(Handle, &A, sizeof(char), 1);
    SDL_RWwrite(Handle, &B, sizeof(int), 1);
    SDL_RWclose(Handle);
  }

  void Load(const std::string& path) {
    SDL_RWops* Handle{SDL_RWFromFile(
      path.c_str(), "rb")};
    if (!Handle) {
      std::cout << "Error opening file: "
        << SDL_GetError();
      return;
    }

    // Read the members back in the order they were written
    SDL_RWread(Handle, &A, sizeof(char), 1);
    SDL_RWread(Handle, &B, sizeof(int), 1);
    SDL_RWclose(Handle);
  }
};
Elsewhere in our program, we can now instruct MyStruct instances to save their state to a file using the Save() method, or load their state from a file using the Load() method:
#include <iostream>
#include "MyStruct.h"

int main(int argc, char** argv) {
  MyStruct MyObject{'A', 42};
  MyObject.Save("example.bin");

  std::cout << "Serialized: A = "
    << MyObject.A << ", B = " << MyObject.B;

  MyObject.Load("example.bin");

  std::cout << "\nDeserialized: A = "
    << MyObject.A << ", B = " << MyObject.B;

  return 0;
}
Serialized: A = A, B = 42
Deserialized: A = A, B = 42
In this lesson, we've seen how memory alignment affects our C++ programs and why padding is necessary. Understanding these concepts helps us write more efficient code and avoid common pitfalls when working with data serialization.