Hashing and std::hash

Hash Table Load Factor

What is the load factor of a hash table, and how does it affect performance?

Abstract art representing computer programming

The load factor of a hash table is the ratio of the number of elements stored in the hash table to the size of the underlying array (number of buckets). It is a measure of how full the hash table is and can have a significant impact on the performance of the hash table operations.

Load Factor = Number of Elements / Number of Buckets

For example, if a hash table has 10 elements and an array size of 20, the load factor would be 0.5 (10 / 20).

The load factor affects the performance of a hash table in the following ways:

Collision Resolution:

  • As the load factor increases, the probability of collisions also increases.
  • With more collisions, the collision resolution mechanism (e.g., separate chaining or open addressing) has to work harder to resolve them.
  • This leads to longer chains or more probing steps, which can degrade the performance of lookup, insertion, and deletion operations.

Memory Usage:

  • A lower load factor means that the hash table has more empty buckets, resulting in wasted memory space.
  • On the other hand, a higher load factor may lead to more collisions but utilizes memory more efficiently.

Rehashing:

  • When the load factor exceeds a certain threshold (e.g., 0.75), the hash table may need to be resized to maintain its performance.
  • Resizing involves creating a new larger array and rehashing all the elements to redistribute them.
  • Rehashing is an expensive operation that can impact performance, especially if it occurs frequently.

Here's an example to illustrate the effect of load factor on performance:

#include <chrono>
#include <iostream>
#include <unordered_map>

int main() {
  using namespace std::chrono;

  // Create hash tables with different load factors
  // Low load factor
  std::unordered_map<int, int> map1(1000);

  // High load factor
  std::unordered_map<int, int> map2(100);
  
  // Insert elements into the hash tables
  for (int i = 0; i < 1000; i++) {
    map1[i] = i;
    map2[i] = i;
  }

  // Measure lookup time for map1
  auto start1 = high_resolution_clock::now();
  for (int i = 0; i < 10000; i++) {
    map1.find(i);
  }
  auto end1 = high_resolution_clock::now();
  auto duration1 =
    duration_cast<microseconds>(end1 - start1);

  // Measure lookup time for map2
  auto start2 = high_resolution_clock::now();
  for (int i = 0; i < 10000; i++) {
    map2.find(i);
  }
  auto end2 = high_resolution_clock::now();
  auto duration2 =
      duration_cast<microseconds>(end2 - start2);

  std::cout << "Lookup time with low load factor: "
    << duration1.count() << " microseconds\n";
  std::cout << "Lookup time with high load factor: "
    << duration2.count() << " microseconds\n";
}

In this example, we create two hash tables with different load factors - map1 with a low load factor and map2 with a high load factor. We insert the same number of elements into both hash tables and measure the lookup time for each.

The output might resemble:

Lookup time with low load factor: 15 microseconds
Lookup time with high load factor: 28 microseconds

The actual values may vary depending on your system, but the general trend is that the lookup time is higher for the hash table with a high load factor due to more collisions and longer collision resolution.

In practice, it's important to choose an appropriate load factor that balances performance and memory usage. A common rule of thumb is to keep the load factor below 0.75 to maintain good performance. When the load factor exceeds this threshold, the hash table is typically resized to a larger capacity to reduce collisions and improve performance.

Answers to questions are automatically generated and may not have been reviewed.

A computer programmer
Part of the course:

Professional C++

Comprehensive course covering advanced concepts, and how to use them on large-scale projects.

Free, unlimited access

This course includes:

  • 125 Lessons
  • 550+ Code Samples
  • 96% Positive Reviews
  • Regularly Updated
  • Help and FAQ
Free, Unlimited Access

Professional C++

Comprehensive course covering advanced concepts, and how to use them on large-scale projects.

Screenshot from Warhammer: Total War
Screenshot from Tomb Raider
Screenshot from Jedi: Fallen Order
Contact|Privacy Policy|Terms of Use
Copyright © 2024 - All Rights Reserved