Commit c32d11e

Create Hash-table.md
1 parent ba6d06f commit c32d11e

1 file changed

+131
-0
lines changed

dsa/intermediate/Hash-table.md

Lines changed: 131 additions & 0 deletions
@@ -0,0 +1,131 @@
# Hash Table

A hash table (or hash map) is a fundamental data structure in computer science that stores key-value pairs and supports fast lookup, insertion, and deletion by key.

## Overview
A hash table is a collection of key-value pairs in which each key maps to exactly one value. A hash function transforms the key into an index into an underlying array, and that index is used to locate the associated value without scanning the other entries.

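The key-to-index step described above can be sketched as follows. This is a minimal illustration, assuming string keys and `std::hash`; the helper name `bucket_index` is hypothetical and not part of the original text.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Map a key to a bucket index in [0, bucket_count).
// bucket_index is a hypothetical helper name used only for illustration.
std::size_t bucket_index(const std::string& key, std::size_t bucket_count) {
    assert(bucket_count > 0);  // modulo by zero is undefined
    const std::size_t hash_code = std::hash<std::string>{}(key);  // key -> hash code
    return hash_code % bucket_count;                              // hash code -> index
}
```

Calling `bucket_index("apple", 8)` always returns the same value in the range 0 to 7 within one run of the program. The exact value depends on the standard library's `std::hash` implementation.
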
## Key Concepts
1. *Hash Function:* This function takes the key as input and returns an integer (the hash code). The hash code is then reduced to an array index, typically by taking it modulo the number of buckets. A good hash function distributes keys uniformly across the array.
2. *Buckets:* Each slot in the array is referred to as a bucket. With chaining, keys that hash to the same index are stored in the same bucket.
3. *Collision Handling:* A collision occurs when two or more distinct keys hash to the same index. Common collision handling techniques include:
   - Chaining: Each bucket points to a linked list of the entries that map to that index.
   - Open Addressing: All entries are stored in the array itself. When a collision occurs, the algorithm searches for the next open bucket using a probing sequence (linear probing, quadratic probing, or double hashing).

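The three probing sequences named above differ only in how the i-th probe position is computed. The sketch below is one possible formulation: the function names are hypothetical, the quadratic variant uses triangular-number offsets (one of several common choices), and the double-hashing step size is derived from an assumed second hash value `h2`.

```cpp
#include <cassert>
#include <cstddef>

// i-th probe position (i = 0, 1, 2, ...) for a key whose hash code is h,
// in a table of m buckets.

// Linear probing: h, h+1, h+2, ...
std::size_t linear_probe(std::size_t h, std::size_t i, std::size_t m) {
    assert(m > 0);
    return (h % m + i % m) % m;
}

// Quadratic probing with triangular-number offsets: h, h+1, h+3, h+6, ...
// This sequence visits every bucket when m is a power of two.
std::size_t quadratic_probe(std::size_t h, std::size_t i, std::size_t m) {
    assert(m > 0);
    return (h % m + (i * (i + 1) / 2) % m) % m;
}

// Double hashing: the step size comes from a second hash value h2.
// The step is in [1, m-1]; it reaches every bucket when it is coprime
// with m, which always holds if m is prime.
std::size_t double_hash_probe(std::size_t h, std::size_t h2,
                              std::size_t i, std::size_t m) {
    assert(m > 1);
    const std::size_t step = 1 + h2 % (m - 1);
    return (h % m + (i * step) % m) % m;
}
```

Linear probing is the simplest but tends to form long runs of occupied buckets. The other two spread colliding keys over different probe paths at the cost of slightly more arithmetic.
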
## Operations
**Insertion:**

1. Compute the hash code of the key using the hash function.
2. Map the hash code to an index in the array.
3. Store the key-value pair in the corresponding bucket. If the key is already present, overwrite its value; otherwise resolve any collision with the table's collision handling technique.

**Search:**

1. Compute the hash code of the key.
2. Map the hash code to an index.
3. Check the corresponding bucket. If using chaining, search the linked list. If using open addressing, follow the probing sequence until the key is found or an empty bucket is reached.

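**Deletion:**

1. Compute the hash code of the key.
2. Map the hash code to an index.
3. Locate the key in the bucket and remove the key-value pair. With chaining, this removes one node from the linked list. With open addressing, the bucket cannot simply be emptied, because an empty bucket would end the probing sequence for keys stored after it; mark the bucket as deleted (a tombstone) or re-insert the entries that follow it.
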
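The three operations can be sketched for open addressing with linear probing, where deletion needs the most care. This is a minimal sketch under stated assumptions, not a production design: the class name `ProbingTable` is hypothetical, the capacity is fixed (no resizing), values are `int`, and `std::optional` requires C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Fixed-capacity open-addressing table with linear probing and tombstones.
class ProbingTable {
    enum class State { Empty, Occupied, Deleted };
    struct Slot {
        State state = State::Empty;
        std::string key;
        int value = 0;
    };
    std::vector<Slot> slots_;

    std::size_t home(const std::string& key) const {
        return std::hash<std::string>{}(key) % slots_.size();
    }

    // Returns the index of the slot holding key, or slots_.size() if absent.
    std::size_t find_slot(const std::string& key) const {
        std::size_t idx = home(key);
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            const Slot& s = slots_[idx];
            if (s.state == State::Empty) break;  // end of this probe sequence
            if (s.state == State::Occupied && s.key == key) return idx;
            idx = (idx + 1) % slots_.size();     // tombstones are skipped
        }
        return slots_.size();
    }

public:
    explicit ProbingTable(std::size_t capacity) : slots_(capacity) {
        assert(capacity > 0);
    }

    // Inserts or overwrites; returns false if every slot is occupied.
    bool insert(const std::string& key, int value) {
        const std::size_t existing = find_slot(key);
        if (existing != slots_.size()) {
            slots_[existing].value = value;
            return true;
        }
        std::size_t idx = home(key);
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            Slot& s = slots_[idx];
            if (s.state != State::Occupied) {  // empty slot or tombstone: reuse it
                s.state = State::Occupied;
                s.key = key;
                s.value = value;
                return true;
            }
            idx = (idx + 1) % slots_.size();
        }
        return false;
    }

    std::optional<int> search(const std::string& key) const {
        const std::size_t idx = find_slot(key);
        if (idx == slots_.size()) return std::nullopt;
        return slots_[idx].value;
    }

    // Marks the slot as Deleted instead of Empty so that later keys in the
    // same probe sequence remain reachable.
    bool remove(const std::string& key) {
        const std::size_t idx = find_slot(key);
        if (idx == slots_.size()) return false;
        slots_[idx].state = State::Deleted;
        slots_[idx].key.clear();
        return true;
    }
};
```

After `remove("a")`, a call to `search("a")` returns an empty optional, while keys that were inserted after `"a"` and probed past its slot are still found. Tombstones accumulate over time, so a real table would typically clear them during a rehash.
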
## Complexity
Average Time Complexity: O(1) for insert, delete, and search operations in a well-implemented hash table with a good hash function and a load factor that avoids excessive collisions. When the table resizes itself, the O(1) bound for insertion is amortized over many insertions.

Worst-Case Time Complexity: O(n) for insert, delete, and search operations if all keys hash to the same index. This is highly unlikely for typical inputs with a good hash function, but keys chosen adversarially against a fixed hash function can trigger it.

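The worst case can be reproduced on purpose by giving `std::unordered_map` a hash function that returns the same code for every key. The sketch below is illustrative only; `ConstantHash` and `longest_bucket` are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Deliberately bad hash function: every key gets the same hash code.
struct ConstantHash {
    std::size_t operator()(const std::string&) const { return 0; }
};

// Inserts n distinct keys and returns the size of the fullest bucket.
std::size_t longest_bucket(std::size_t n) {
    std::unordered_map<std::string, int, ConstantHash> m;
    for (std::size_t i = 0; i < n; ++i) {
        m.emplace("key" + std::to_string(i), static_cast<int>(i));
    }
    std::size_t longest = 0;
    for (std::size_t b = 0; b < m.bucket_count(); ++b) {
        if (m.bucket_size(b) > longest) longest = m.bucket_size(b);
    }
    return longest;
}
```

Because keys with equal hash codes always share a bucket, `longest_bucket(100)` returns 100: every lookup then walks a single chain of all n entries, which is the O(n) case.
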
## Load Factor
The load factor (α) is the ratio of the number of entries to the number of buckets in the hash table. A high load factor leads to more collisions and slower operations. A common practice is to resize the hash table (rehash) when the load factor exceeds a chosen threshold.

Load Factor (α) = Number of entries / Number of buckets

For example, 12 entries stored in 16 buckets give α = 12 / 16 = 0.75.

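The same ratio can be checked against `std::unordered_map`, which exposes `size()`, `bucket_count()`, `load_factor()`, and `max_load_factor()`. The helper name `manual_load_factor` below is hypothetical and only recomputes the formula by hand.

```cpp
#include <cstddef>
#include <unordered_map>

// Load factor computed by hand: number of entries / number of buckets.
double manual_load_factor(const std::unordered_map<int, int>& m) {
    if (m.bucket_count() == 0) return 0.0;  // guard against division by zero
    return static_cast<double>(m.size()) /
           static_cast<double>(m.bucket_count());
}
```

For any map `m`, `manual_load_factor(m)` should agree with `m.load_factor()` up to floating-point rounding, and the container keeps `m.load_factor()` at or below `m.max_load_factor()` by growing its bucket array as entries are added.
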
**Rehashing**

When the load factor exceeds a threshold, the hash table is resized (usually doubled) and all existing keys are rehashed into the new array, because each key's index depends on the number of buckets. A single rehash costs O(n), but doubling the size keeps the amortized cost of insertion at O(1).

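A minimal sketch of rehashing in a chained table follows. The class name `ResizingTable`, the initial size of 4 buckets, and the 0.75 threshold are assumptions chosen for illustration, not values taken from the text above.

```cpp
#include <cstddef>
#include <functional>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Chained hash table that doubles its bucket array when the load factor
// exceeds an assumed threshold of 0.75.
class ResizingTable {
    using Entry = std::pair<std::string, int>;
    std::vector<std::list<Entry>> buckets_;
    std::size_t count_ = 0;
    static constexpr double kMaxLoad = 0.75;  // assumed threshold

    static std::size_t index_for(const std::string& key, std::size_t n) {
        return std::hash<std::string>{}(key) % n;
    }

    // Move every entry into a new array; indices are recomputed because
    // they depend on the number of buckets.
    void rehash(std::size_t new_bucket_count) {
        std::vector<std::list<Entry>> fresh(new_bucket_count);
        for (auto& chain : buckets_) {
            for (auto& entry : chain) {
                const std::size_t idx = index_for(entry.first, new_bucket_count);
                fresh[idx].push_back(std::move(entry));
            }
        }
        buckets_ = std::move(fresh);
    }

public:
    ResizingTable() : buckets_(4) {}

    void insert(const std::string& key, int value) {
        auto& chain = buckets_[index_for(key, buckets_.size())];
        for (auto& entry : chain) {
            if (entry.first == key) {  // existing key: overwrite
                entry.second = value;
                return;
            }
        }
        chain.emplace_back(key, value);
        ++count_;
        if (load_factor() > kMaxLoad) rehash(buckets_.size() * 2);
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const int* find(const std::string& key) const {
        const auto& chain = buckets_[index_for(key, buckets_.size())];
        for (const auto& entry : chain) {
            if (entry.first == key) return &entry.second;
        }
        return nullptr;
    }

    double load_factor() const {
        return static_cast<double>(count_) /
               static_cast<double>(buckets_.size());
    }
    std::size_t bucket_count() const { return buckets_.size(); }
    std::size_t size() const { return count_; }
};
```

Inserting 100 distinct keys grows the table from 4 to 256 buckets in six doublings, and the load factor never stays above 0.75 after an insertion completes.
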
## Practical Considerations
*Choosing a Hash Function:* A good hash function minimizes collisions and distributes keys uniformly. Common approaches include the division (remainder) method, the multiplication method, and universal hashing.

*Dynamic Resizing:* To keep operations efficient, hash tables are dynamically resized (rehashed) when the load factor becomes too high.

*Memory Usage:* A sparsely populated hash table wastes memory on empty buckets, while a densely populated one suffers more collisions. The load factor threshold sets the balance between memory usage and performance.

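Two of the hash function families mentioned above can be sketched for 64-bit integer keys. The function names are hypothetical, and the multiplication method is shown in its Fibonacci-hashing form, which assumes a table of 2^bits buckets and uses the widely used constant 0x9E3779B97F4A7C15 (roughly 2^64 divided by the golden ratio).

```cpp
#include <cassert>
#include <cstdint>

// Division (remainder) method: index = key mod m.
// A prime m that is not close to a power of two is a common choice.
std::uint64_t division_hash(std::uint64_t key, std::uint64_t m) {
    assert(m > 0);
    return key % m;
}

// Multiplication method for a table of 2^bits buckets: multiply by an odd
// constant (wrapping modulo 2^64) and keep the top `bits` bits.
std::uint64_t multiplication_hash(std::uint64_t key, unsigned bits) {
    assert(bits >= 1 && bits <= 63);
    const std::uint64_t kGolden = 0x9E3779B97F4A7C15ULL;
    return (key * kGolden) >> (64 - bits);
}
```

The division method is simple but sensitive to the choice of m. The multiplication method works with power-of-two table sizes, so the final reduction is a shift rather than a division.
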
## Applications

1. Database Indexing: Hash indexes allow quick retrieval of records by exact key match.
2. Caches: Hash tables map a cache key to its cached entry, so a lookup does not scan the whole cache.
3. Symbol Tables in Compilers: Used for fast lookup of identifiers (variables, functions, etc.) during compilation.
4. Sets: Hash tables can be used to implement set data structures for efficient membership checking.

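As a small illustration of the set use case, the sketch below uses `std::unordered_set` to detect a repeated value in one pass. The function name `has_duplicate` is hypothetical.

```cpp
#include <unordered_set>
#include <vector>

// Returns true if any value occurs more than once.
bool has_duplicate(const std::vector<int>& values) {
    std::unordered_set<int> seen;
    for (int v : values) {
        // insert() returns a pair; its bool is false if v was already present.
        if (!seen.insert(v).second) return true;
    }
    return false;
}
```

Each membership check is O(1) on average, so the whole scan is O(n) on average instead of the O(n^2) of comparing every pair.
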
## Implementation in C++

```cpp
#include <cstddef>
#include <functional>  // std::hash
#include <iostream>
#include <list>
#include <string>
#include <utility>     // std::pair
#include <vector>

// Fixed-size hash table from string keys to string values, using chaining.
class HashTable {
private:
    std::size_t size;
    // One linked list of key-value pairs per bucket.
    std::vector<std::list<std::pair<std::string, std::string>>> table;

    // Reduce the key's hash code to a bucket index in [0, size).
    std::size_t hashFunction(const std::string& key) const {
        std::hash<std::string> hashFunc;
        return hashFunc(key) % size;
    }

public:
    // Use at least one bucket so the modulo in hashFunction is always defined.
    explicit HashTable(std::size_t s) : size(s > 0 ? s : 1), table(size) {}

    // Insert a new pair, or overwrite the value if the key already exists.
    void insert(const std::string& key, const std::string& value) {
        std::size_t index = hashFunction(key);
        for (auto& kvp : table[index]) {
            if (kvp.first == key) {
                kvp.second = value;
                return;
            }
        }
        table[index].emplace_back(key, value);
    }

    // Return the value for key, or the sentinel "Not Found" if it is absent.
    // Note: a stored value equal to "Not Found" cannot be told apart from a miss.
    std::string search(const std::string& key) const {
        std::size_t index = hashFunction(key);
        for (const auto& kvp : table[index]) {
            if (kvp.first == key) {
                return kvp.second;
            }
        }
        return "Not Found";
    }

    // Remove the pair for key; return true if something was removed.
    bool remove(const std::string& key) {
        std::size_t index = hashFunction(key);
        for (auto it = table[index].begin(); it != table[index].end(); ++it) {
            if (it->first == key) {
                table[index].erase(it);
                return true;
            }
        }
        return false;
    }
};

int main() {
    HashTable ht(10);
    ht.insert("key1", "value1");
    ht.insert("key2", "value2");

    std::cout << "Search key1: " << ht.search("key1") << std::endl; // Prints: Search key1: value1
    std::cout << "Search key2: " << ht.search("key2") << std::endl; // Prints: Search key2: value2

    ht.remove("key1");
    std::cout << "Search key1: " << ht.search("key1") << std::endl; // Prints: Search key1: Not Found

    return 0;
}
```
