Commit 7492059

[Entry] C++ unordered map: bucket_size() (#7671)
* [Edit] Python: Python CLI arguments * Update command-line-arguments.md * [Entry] C++ unordered map: bucket_size() * Apply suggestion from @avdhoottt * Apply suggestion from @avdhoottt ---------
1 parent 6f6568f commit 7492059

1 file changed: +198 -0 lines changed
Lines changed: 198 additions & 0 deletions
@@ -0,0 +1,198 @@
---
Title: 'bucket_size()'
Description: 'Returns the number of elements in a specific bucket of an unordered map.'
Subjects:
  - 'Computer Science'
  - 'Web Development'
Tags:
  - 'Arrays'
  - 'Data Structures'
  - 'Hash Maps'
  - 'Methods'
CatalogContent:
  - 'learn-c-plus-plus'
  - 'paths/computer-science'
---

The **`.bucket_size()`** method returns the number of elements stored in a specific bucket of an `unordered_map`. An `unordered_map` is implemented as a hash table in which each element is placed in a bucket chosen from its key's hash value. Inspecting bucket sizes shows how evenly elements are distributed, which helps when diagnosing collisions and tuning lookup performance.

## Syntax

```pseudo
unordered_map.bucket_size(n)
```

**Parameters:**

- `n`: The index of the bucket to query, of the unsigned integral type `size_type`. It must be less than the value returned by `.bucket_count()`; passing an out-of-range index results in undefined behavior.

**Return value:**

The `.bucket_size()` method returns the number of elements in bucket `n` as a value of type `size_type`.

## Example 1: Basic Bucket Size Check

This example demonstrates how to use `.bucket_size()` to check the number of elements in each bucket of an `unordered_map`:

```cpp
#include <cstddef>
#include <iostream>
#include <unordered_map>
#include <string>

int main() {
  // Create an unordered_map with string keys and integer values
  std::unordered_map<std::string, int> ages = {
    {"Alice", 25},
    {"Bob", 30},
    {"Charlie", 35},
    {"Diana", 28}
  };

  // Get the total number of buckets (size_type is std::size_t here)
  std::size_t total_buckets = ages.bucket_count();
  std::cout << "Total buckets: " << total_buckets << "\n\n";

  // Display the number of elements in each bucket
  for (std::size_t i = 0; i < total_buckets; i++) {
    std::cout << "Bucket " << i << " has " << ages.bucket_size(i) << " elements\n";
  }

  return 0;
}
```

This example results in the following output:

```shell
Total buckets: 5

Bucket 0 has 1 elements
Bucket 1 has 1 elements
Bucket 2 has 2 elements
Bucket 3 has 0 elements
Bucket 4 has 0 elements
```

The output shows how elements are distributed across the buckets. Some buckets may be empty while others contain one or more elements, depending on how the hash function distributes the keys.

> **Note:** The bucket count and the placement of elements are implementation-dependent, so the output may vary across compilers and standard library implementations.

## Example 2: Analyzing Load Distribution

This example shows how to use `.bucket_size()` to analyze the load distribution in an `unordered_map` storing product inventory data, which helps identify potential performance issues:

```cpp
#include <cstddef>
#include <iostream>
#include <unordered_map>
#include <string>

int main() {
  // Create an inventory map with product IDs and quantities
  std::unordered_map<std::string, int> inventory = {
    {"PROD001", 150},
    {"PROD002", 200},
    {"PROD003", 75},
    {"PROD004", 300},
    {"PROD005", 125},
    {"PROD006", 50},
    {"PROD007", 180},
    {"PROD008", 90}
  };

  std::size_t total_buckets = inventory.bucket_count();
  std::size_t max_bucket_size = 0;
  std::size_t empty_buckets = 0;

  // Analyze bucket distribution
  for (std::size_t i = 0; i < total_buckets; i++) {
    std::size_t current_size = inventory.bucket_size(i);

    if (current_size > max_bucket_size) {
      max_bucket_size = current_size;
    }

    if (current_size == 0) {
      empty_buckets++;
    }
  }

  // Display distribution statistics
  std::cout << "Total buckets: " << total_buckets << "\n";
  std::cout << "Empty buckets: " << empty_buckets << "\n";
  std::cout << "Maximum elements in a bucket: " << max_bucket_size << "\n";
  std::cout << "Load factor: " << inventory.load_factor() << "\n";

  return 0;
}
```

This example results in the following output:

```shell
Total buckets: 11
Empty buckets: 5
Maximum elements in a bucket: 2
Load factor: 0.727273
```

This analysis shows how well the hash function distributes elements. A maximum bucket size that is large relative to the load factor suggests that many keys land in the same bucket, which slows lookups for those keys. The exact numbers are implementation-dependent.

## Codebyte Example: Identifying Collision Hotspots

This example uses `.bucket_size()` to find buckets that hold more than one element, meaning that several session IDs in a user authentication system map to the same bucket:

```codebyte/cpp
#include <cstddef>
#include <iostream>
#include <unordered_map>
#include <string>

int main() {
  // Create a map storing user sessions with session IDs
  std::unordered_map<std::string, std::string> sessions = {
    {"session_a1b2", "user_101"},
    {"session_c3d4", "user_102"},
    {"session_e5f6", "user_103"},
    {"session_g7h8", "user_104"},
    {"session_i9j0", "user_105"},
    {"session_k1l2", "user_106"}
  };

  std::size_t total_buckets = sessions.bucket_count();

  std::cout << "Buckets with collisions (multiple elements):\n";

  // Find and report buckets with more than one element
  for (std::size_t i = 0; i < total_buckets; i++) {
    std::size_t size = sessions.bucket_size(i);

    if (size > 1) {
      std::cout << "Bucket " << i << " has " << size << " elements (collision detected)\n";

      // Display which sessions are in this bucket using local iterators
      std::cout << "  Sessions in this bucket: ";
      for (auto it = sessions.begin(i); it != sessions.end(i); ++it) {
        std::cout << it->first << " ";
      }
      std::cout << "\n";
    }
  }

  return 0;
}
```

This helps identify potential performance bottlenecks, since keys that share a bucket require additional comparisons during lookup operations.

## Frequently Asked Questions

### 1. What is a bucket in `unordered_map`?

A bucket is a slot in the internal hash table where elements are stored based on their key's hash value. Buckets are numbered from 0 to `bucket_count() - 1`, and each one can hold zero, one, or multiple elements.

### 2. What is an `unordered_map` in C++, and how is its size measured?

An `unordered_map` is a hash table container that stores key-value pairs. The `.size()` method returns the total number of elements, while `.bucket_count()` returns the number of buckets. The load factor is `.size()` divided by `.bucket_count()`.

### 3. What does `.bucket_size()` return for an `unordered_set`?

The `.bucket_size(n)` method in `unordered_set` works the same way as in `unordered_map`: it returns the number of elements in bucket `n`, which helps analyze distribution and collisions.
