The load factor of a hash table is the ratio of the number of elements stored in the hash table to the size of the underlying array (number of buckets). It is a measure of how full the hash table is and can have a significant impact on the performance of the hash table operations.
Load Factor = Number of Elements / Number of Buckets
For example, if a hash table has 10 elements and an array size of 20, the load factor would be 0.5 (10 / 20).
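In C++, std::unordered_map exposes this ratio directly through its size(), bucket_count(), and load_factor() members. A minimal sketch (the exact bucket count, and therefore the printed load factor, is implementation-defined):

#include <iostream>
#include <unordered_map>

int main() {
  std::unordered_map<int, int> table;
  for (int i = 0; i < 10; i++) {
    table[i] = i * i;
  }

  // load_factor() is simply size() / bucket_count()
  std::cout << "Elements:    " << table.size() << '\n';
  std::cout << "Buckets:     " << table.bucket_count() << '\n';
  std::cout << "Load factor: " << table.load_factor() << '\n';
}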
The load factor affects the performance of a hash table in the following ways:
Collision Resolution: As the load factor rises, more elements are forced to share buckets, so collisions become more frequent and the chains (or probe sequences) that must be searched grow longer. This slows down insertion, lookup, and deletion.
Memory Usage: A lower load factor means more buckets relative to the number of elements. This reduces collisions, but the extra, mostly empty buckets consume memory.
Rehashing: When the load factor exceeds the container's threshold, the hash table allocates a larger bucket array and rehashes every element into it. This is an expensive operation, though its cost is amortized across many insertions (see the sketch after this list).
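The rehashing behavior can be observed directly with std::unordered_map, whose default max_load_factor() is 1.0. A rough sketch; the exact bucket counts and the points at which growth happens depend on the standard library implementation:

#include <iostream>
#include <unordered_map>

int main() {
  std::unordered_map<int, int> table;
  std::size_t buckets = table.bucket_count();

  for (int i = 0; i < 100; i++) {
    table[i] = i;

    // Report whenever an insertion triggered a rehash
    if (table.bucket_count() != buckets) {
      buckets = table.bucket_count();
      std::cout << "Rehashed at " << table.size()
                << " elements; buckets now " << buckets
                << ", load factor " << table.load_factor() << '\n';
    }
  }
}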
Here's an example to illustrate the effect of load factor on performance:
#include <chrono>
#include <iostream>
#include <unordered_map>

int main() {
  using namespace std::chrono;

  // Low load factor: roughly one bucket per element
  std::unordered_map<int, int> map1(1000);

  // High load factor: only 100 buckets, with max_load_factor raised
  // so the table does not automatically rehash as it fills up
  std::unordered_map<int, int> map2(100);
  map2.max_load_factor(10.0f);

  // Insert the same elements into both hash tables
  for (int i = 0; i < 1000; i++) {
    map1[i] = i;
    map2[i] = i;
  }

  // Measure lookup time for map1
  auto start1 = high_resolution_clock::now();
  for (int i = 0; i < 10000; i++) {
    map1.find(i);
  }
  auto end1 = high_resolution_clock::now();
  auto duration1 = duration_cast<microseconds>(end1 - start1);

  // Measure lookup time for map2
  auto start2 = high_resolution_clock::now();
  for (int i = 0; i < 10000; i++) {
    map2.find(i);
  }
  auto end2 = high_resolution_clock::now();
  auto duration2 = duration_cast<microseconds>(end2 - start2);

  std::cout << "Lookup time with low load factor: "
            << duration1.count() << " microseconds\n";
  std::cout << "Lookup time with high load factor: "
            << duration2.count() << " microseconds\n";
}
In this example, we create two hash tables with different load factors: map1 with a low load factor (roughly one bucket per element) and map2 with a high load factor (only 100 buckets, with its max_load_factor raised so it never rehashes). We insert the same number of elements into both hash tables and measure the lookup time for each.
The output might resemble:
Lookup time with low load factor: 15 microseconds
Lookup time with high load factor: 28 microseconds
The actual values will vary depending on your system, but the general trend is that lookups are slower for the hash table with a high load factor, because more collisions force longer chains (or probe sequences) to be traversed on each lookup.
In practice, it's important to choose an appropriate load factor that balances performance and memory usage. A common rule of thumb is to keep the load factor below 0.75 to maintain good performance. When the load factor exceeds this threshold, the hash table is typically resized to a larger capacity to reduce collisions and improve performance.
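With std::unordered_map, both the threshold and the bucket count can be tuned explicitly. A brief sketch of the relevant calls (the element count of 1000 and the bucket count of 4096 are arbitrary values chosen for illustration):

#include <unordered_map>

int main() {
  std::unordered_map<int, int> table;

  // Rehash once the load factor would exceed 0.75 instead of the default 1.0
  table.max_load_factor(0.75f);

  // Reserve enough buckets for 1000 elements up front,
  // avoiding repeated rehashes during insertion
  table.reserve(1000);

  for (int i = 0; i < 1000; i++) {
    table[i] = i;
  }

  // Force at least 4096 buckets, lowering the current load factor
  table.rehash(4096);
}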