std::vector
std::vector
Inevitably, we will want to store a group of related objects. That might be a collection of characters in a party, a collection of buttons on a UI, or countless other use cases.
Let's imagine we have these objects, which we want to store and manage as a single collection:
class Character {};
Character Frodo;
Character Gandalf;
Character Gimli;
Character Legolas;
Character Aragorn;
Programmers have invented a huge range of options for storing collections of objects. These containers are broadly called data structures, and which data structure we should use depends on our specific requirements. We’ll cover a much wider range of data structures in the next course. Here, we’ll introduce the most common choice - the dynamic array.
Under the hood, arrays are contiguous blocks of memory, big enough to store multiple objects in sequence.
Arrays broadly fall into two categories - static arrays, and dynamic arrays.
With static arrays, we need to know at compile time how much space we need. For example, if we create a static array for holding 5 characters, and we need to add a 6th, we’re out of luck.
Dynamic arrays can resize themselves at run time. If we try to add a 6th character to a dynamic array that can only hold 5, the dynamic array will do a load of memory management and related tasks to make itself bigger. That all happens behind the scenes - from our perspective, it just works.
In this lesson, we're working with dynamic, resizable arrays. In the next course, we cover static arrays.
There are hundreds of implementations of arrays in C++ that we can use, and we can even create our own once we learn more advanced topics. In this lesson, we’re using the standard library’s implementation of dynamic arrays, which is called std::vector
.
The C++ standard library using the "vector" name for their implementation of dynamic arrays is unfortunate. When we’re working on projects that involve maths and physics, such as a video games, the constructs we use to represent things like positions, movement and forces are also called vectors.
These vectors are not related to arrays at all - the confusing naming conflict is just something we need to adapt to.
std::vector
To use std::vector
, we need to #include <vector>
. We can then declare a std::vector
by giving it a name, and specifying the type of objects it will contain within angled brackets: <
and >
. The following example shows how we can declare a std::vector
called MyVector
that stores int
objects:
#include <vector>
std::vector<int> MyVector;
When we see chevrons - <
and >
- in our code, we’re typically using an advanced C++ feature called templates. Templates allow classes and functions to be written without specifying all of the data types that the class or function will be using.
In this case, the standard library authors created a container called std::vector
that can store a collection of any type of object. As developers using std::vector
, we just need to provide the type we want to store between the chevrons. This type we provide is referred to as a template argument.
When we pass int
as the template argument - that is, std::vector<int>
- we create an array for storing int
values. But we can replace int
with any type:
std::vector<float> FloatCollection;
std::vector<bool> BoolCollection;
Templates are particularly powerful as they allow classes and functions to be compatible with types that aren’t known or don’t exist at the time the template was written.
For example, the authors of std::vector
couldn’t know we would want to store a collection of SomeCustomType
objects, but their code supports it anyway:
struct SomeCustomType{
// ...
};
std::vector<SomeCustomType> MyVector;
We cover templates in much more detail, including how to create our own, in the advanced course.
We can provide a std::vector
with some initial values:
#include <vector>
std::vector<int> MyVector { 1, 2, 3, 4, 5 };
When we are initializing the values at the same time we declare the array, we can optionally remove the template argument. The compiler can infer what type the vector will be storing, based on the type of objects we provided for its initial values:
#include <vector>
std::vector MyVector { 1, 2, 3, 4, 5 };
To do this, the compiler is using Class Template Argument Deduction (CTAD), which we’ll cover more in our advanced course.
We can access the members of our array with the subscript operator, which uses []
syntax. For example, to access an object within MyVector
, we’d use MyVector[x]
, where x
is the index of the element we want to access.
The index of an element within an array is simply its position within the array. However, in most programming contexts, indexing is zero-based, meaning we start counting from 0
. This means the first element of the vector is at index 0
, the second element is at index 1
, and so on.
For example, to access the elements of our array, we would do this:
#include <vector>
std::vector MyVector { 1, 2, 3, 4, 5 };
int FirstElement { MyVector[0] };
int SecondElement { MyVector[1] };
int ThirdElement { MyVector[2] };
int LastElement { MyVector[4] };
As with all values, the index can be derived from any expression that evaluates to an integer:
#include <iostream>
#include <vector>
int CalculateIndex() { return 1 + 1; }
int main() {
std::vector MyVector{
"First", "Second", "Third"};
// Log out the element at index 0
std::cout << MyVector[3 - 3] << '\n';
// Log out the element at index 1
int Index{1};
std::cout << MyVector[Index] << '\n';
// Log out the element at index 2
std::cout << MyVector[CalculateIndex()];
}
This code logs out elements at indices 0
, 1
, and 2
in order:
First
Second
Third
The size of an array refers to the number of elements it currently contains. This is also sometimes called the length of the array.
The size()
function returns the current size of our std::vector
:
#include <iostream>
#include <vector>
int main() {
std::vector MyVector{
"First", "Second", "Third"};
std::cout << "Size: " << MyVector.size();
}
Size: 3
Note that because indexing is zero-based, the last element of a vector is at an index of one less than its size()
. For an array of size 3
, the last element is at index 2
.
We can get the last element by applying this arithmetic within the subscript operator, or by using the back()
function:
#include <iostream>
#include <vector>
int main() {
std::vector MyVector{
"First", "Second", "Third"};
std::cout << "Last Element: "
<< MyVector[MyVector.size() - 1];
// Equivalently:
std::cout << "\nLast Element: "
<< MyVector.back();
}
Last Element: Third
Last Element: Third
Once our array is created, we’ll often want to add more items to it. Typically, we want to add items to the end of arrays, as this has performance benefits over adding them to the start. We’ll cover the performance characteristics of containers in more detail in the next course.
std::vector
and many other data structures provide two ways of adding objects:
Creating an object in place has performance benefits over creating it elsewhere and then moving it. Therefore, where possible, we should prefer emplacing over pushing.
With std::vector
, we’re adding items to the back of the container, so the respective functions are called push_back()
and emplace_back()
push_back()
If an object has already been constructed, and we want to add it to the back of our array, we can use push_back()
:
#include <iostream>
#include <vector>
class Character {
public:
std::string Name;
};
int main() {
Character Legolas{"Legolas"};
std::vector<Character> MyVector;
MyVector.push_back(Legolas);
std::cout << "First character: "
<< MyVector[0].Name;
}
First character: Legolas
emplace_back()
If the object we want to add does not already exist, we can construct it right inside of the array, using emplace_back()
.
The arguments we call emplace_back()
with will be passed to the constructor of the object type the vector stores:
#include <iostream>
#include <vector>
class Character {
public:
std::string Name;
};
int main() {
std::vector<Character> MyVector;
MyVector.emplace_back("Legolas");
std::cout << "First character: "
<< MyVector[0].Name;
}
erase()
The std::vector
type also offers an erase()
method to remove objects from our array. The erase method requires we provide an iterator, which is slightly more advanced than a simple index.
We cover iterators in detail in the advanced course, but for now, note that we can get an iterator corresponding to a std::vector
index using MyVector.begin() + x
, where x
is the index we’re interested in.
So, for example, to erase the second item (the item at index 1
) from our array we’d use:
MyVector.erase(MyVector.begin() + 1);
The size()
of our array will be reduced by 1
, and all of the objects that were after the element we erased get moved by one position to the left to fill in the gap:
#include <iostream>
#include <vector>
class Character {
public:
std::string Name;
};
int main() {
std::vector<Character> MyVector;
MyVector.emplace_back("Aragorn");
MyVector.emplace_back("Gandalf");
MyVector.emplace_back("Legolas");
std::cout << "Second character is: "
<< MyVector[1].Name;
MyVector.erase(MyVector.begin() + 1);
std::cout << "\nSecond character is now: "
<< MyVector[1].Name;
}
Second character is: Gandalf
Second character is now: Legolas
We can modify objects in a std::vector
in the usual ways. We can replace the object at a given index using the assignment operator, =
:
#include <iostream>
#include <vector>
int main() {
std::vector MyVector{1, 2, 3};
std::cout << "Second number: "
<< MyVector[1];
MyVector[1] = 42;
std::cout << "\nSecond number: "
<< MyVector[1];
}
Second number: 2
Second number: 42
We can also call functions (including operators) on our objects as needed:
#include <iostream>
#include <vector>
class Character {
public:
Character(std::string Name) : mName{Name} {}
std::string GetName() { return mName; };
void SetName(std::string Name){
mName = Name;
}
private:
std::string mName;
};
int main() {
std::vector<Character> MyVector;
MyVector.emplace_back("Roderick");
std::cout << "First character: "
<< MyVector[0].GetName();
MyVector[0].SetName("Anna");
std::cout << "\nFirst character: "
<< MyVector[0].GetName();
}
First character: Roderick
First character: Anna
Our above example uses vectors with values, but we can store pointers and references in arrays too:
#include <vector>
class Character {};
// A collection of characters
std::vector<Character>;
// A collection of character pointers
std::vector<Character*>;
// A collection of const character references
std::vector<const Character&>;
Arrays can also store other arrays. This creates "multidimensional arrays". For example, a 3x3 grid could be represented as an array of 3 rows, each row being an array of 3 items:
#include <vector>
#include <iostream>
std::vector<std::vector<int>> MyGrid {{
{ 1, 2, 3 },
{ 4, 5, 6 },
{ 7, 8, 9 }
}};
int main() {
std::cout
<< "Top Left: " << MyGrid[0][0];
std::cout << "\nBottom Right: "
// Alternatively: MyGrid.back().back()
<< MyGrid[2][2];
}
Top Left: 1
Bottom Right: 9
Remember, we can add a using
statement to alias complex types:
using Grid = std::vector<std::vector<int>>;
Grid MyGrid {{
{ 1, 2, 3 },
{ 4, 5, 6 },
{ 7, 8, 9 }
}};
We can treat a collection of objects like any other value. Below, we pass a collection to a function:
#include <iostream>
#include <vector>
using Grid = std::vector<std::vector<int>>;
void LogTopLeft(Grid GridToLog) {
std::cout << "Top Left: " << GridToLog[0][0];
}
int main() {
Grid MyGrid {{
{ 1, 2, 3 },
{ 4, 5, 6 },
{ 7, 8, 9 }
}};
LogTopLeft(MyGrid);
}
As with any other parameter, arrays are passed by value by default. The performance implications are particularly important here, as copying a collection of objects is inherently more expensive than copying a single object.
We can pass by reference in the usual way, with or without const
:
#include <iostream>
#include <vector>
using Grid = std::vector<std::vector<int>>;
void LogTopLeft(const Grid& GridToLog) {
std::cout << "Top Left: " << GridToLog[0][0];
}
void SetTopLeft(Grid& GridToLog, int Value) {
GridToLog[0][0] = Value;
}
int main() {
Grid MyGrid {{
{ 1, 2, 3 },
{ 4, 5, 6 },
{ 7, 8, 9 }
}};
SetTopLeft(MyGrid, 42);
LogTopLeft(MyGrid);
}
Top Left: 42
We can also generate and use pointers in the normal way. The following code replicates the previous example, using pointers instead of references:
#include <iostream>
#include <vector>
using Grid = std::vector<std::vector<int>>;
void LogTopLeft(const Grid* GridToLog) {
std::cout << "Top Left: "
<< (*GridToLog)[0][0];
}
void SetTopLeft(Grid* GridToLog, int Value) {
(*GridToLog)[0][0] = Value;
}
int main() {
Grid MyGrid {{
{ 1, 2, 3 },
{ 4, 5, 6 },
{ 7, 8, 9 }
}};
SetTopLeft(&MyGrid, 42);
LogTopLeft(&MyGrid);
}
Top Left: 42
A common task we have when working with arrays is to perform some action on every object in the collection. We can do this with a for
loop:
#include <iostream>
#include <vector>
int main() {
std::vector MyVector{1, 2, 3};
for (int i{0}; i < MyVector.size(); ++i) {
std::cout << MyVector[i] << ", ";
}
}
1, 2, 3,
size_t
typeThere is an issue with using int
values as the index of vectors: modern computers have enough memory to store a lot of objects in an array. This means that the size()
of the array, or the index used to reference a specific position, can be a larger number than the maximum value storable in an int
.
To deal with this problem, we have the size_t
data type. size_t
is guaranteed to be large enough to match the largest possible size of the array.
Because of this, it is the recommended way of storing array sizes and indices:
#include <iostream>
#include <vector>
int main() {
std::vector MyVector{1, 2, 3};
for (size_t i{0}; i < MyVector.size(); ++i) {
std::cout << MyVector[i] << ", ";
}
}
1, 2, 3,
Often, we usually don’t need to work with indices at all - we just want to iterate over everything in the array. For those scenarios, we can use this syntax:
#include <iostream>
#include <vector>
int main() {
std::vector MyVector{1, 2, 3};
for (int Number : MyVector) {
std::cout << Number << ", ";
}
}
1, 2, 3,
This is known as a range-based for loop. We cover these in more detail in the next course, including how to make our custom types compatible with this syntax.
Similar to functions, range-based for loops can receive their parameters by reference or value, with value being the default. This means each item in the collection is copied into the body of the loop. We should consider passing by reference instead, particularly if the type we’re using is expensive to copy:
#include <iostream>
#include <vector>
int main() {
std::vector MyVector{1, 2, 3};
for (int& Number : MyVector) {
std::cout << Number << ", ";
}
}
1, 2, 3,
Just like when we’re passing by reference into a function if the loop body isn’t going to modify that reference, we should consider marking it as const
:
#include <iostream>
#include <vector>
int main() {
std::vector MyVector{1, 2, 3};
for (const int& Number : MyVector) {
std::cout << Number << ", ";
}
}
1, 2, 3,
std::vector
and other arrays keep our objects in contiguous blocks of memory on our system. As we push or emplace objects into our array, it may run out of space at its current memory location.
size()
and capacity()
We can see how much of our capacity we’re currently using by comparing what is returned from the size()
and capacity()
methods:
size()
returns the number of elements currently in the array.capacity()
returns the number of elements the array can hold in its current memory location.Below, we see the std::vector
initially has a capacity of 5
, with all 5
spaces being taken. When we add a 6th item, the size()
increases to 6
, whilst the capacity()
is now 7
:
#include <iostream>
#include <vector>
int main() {
std::vector MyVector{1, 2, 3, 4, 5};
std::cout << "Capacity: " << MyVector.size()
<< "/" << MyVector.capacity();
MyVector.emplace_back(6);
std::cout << "\nCapacity: " << MyVector.size()
<< "/" << MyVector.capacity();
}
Capacity: 5/5
Capacity: 6/7
When we add elements beyond the current capacity, the std::vector
will do some work behind the scenes to let our array grow. It will:
In our previous example, this new location has enough space to store 7
objects. We’re currently only storing 6
, so we have room to add one more before everything needs to be moved to a larger memory block again.
Moving an array to a new location has a performance cost, so if we’re able to reduce unnecessary movements, we should consider it.
If we know approximately how many objects our std::vector
is likely to need to store, we can directly reserve()
enough capacity for them. Below, we ask our std::vector
to move to a location with room for 100
objects:
#include <iostream>
#include <vector>
int main() {
std::vector MyVector{1, 2, 3, 4, 5};
MyVector.reserve(100);
std::cout << "Capacity: " << MyVector.size()
<< "/" << MyVector.capacity();
}
Now, it has plenty of room to grow before it needs to move again:
Capacity: 5/100
Naturally, a std::vector
with a capacity for 100
objects will consume more system memory than one with a capacity of only 5
.
But if we think it’s eventually going to grow to 100
objects anyway, we may as well just reserve that space from the start and eliminate all the expensive moving.
To summarise our considerations:
reserve()
can help avoid unnecessary reallocations.In this lesson, we've explored the essentials of dynamic arrays, with a particular focus on std::vector
. You've learned how to create, access, modify, and efficiently manage collections of objects, laying a strong foundation for more advanced programming concepts.
std::vector
.std::vector
, including using push_back()
and emplace_back()
methods.std::vector
, and the implications of passing vectors to functions.std::vector
.size_t
in managing vector sizes.std::vector
manages its capacity, and how we can interact with that mechanism using capacity()
and reserve()
.In the next lesson, we introduce the main design pattern used to manage memory in complex applications. We also introduce smart pointers, the main mechanism used to implement this pattern. We cover:
std::make_unique()
: How to create unique pointers using std::make_unique()
,std::vector
Explore the fundamentals of dynamic arrays with an introduction to std::vector
Become a software engineer with C++. Starting from the basics, we guide you step by step along the way