Commit 8fc24d8

hashing added
1 parent ba65c9c commit 8fc24d8

1 file changed: +318 -0 lines changed

dsa/Algorithms/hashing.md

Lines changed: 318 additions & 0 deletions
@@ -0,0 +1,318 @@
---
id: hashing
title: Hashing
sidebar_label: Hashing
tags: [python, java, c++, javascript, programming, algorithms, data structures, tutorial, in-depth]
description: In this tutorial, we will learn about Hash Tables, their uses, how they work, and hashing in general with detailed explanations and examples.
---

# Hash Tables

Hash tables are a fundamental data structure in computer science used for fast data retrieval. This tutorial covers the basics of hash tables, their uses, how they work, and the concept of hashing in general.

## Introduction to Hash Tables

A hash table, also known as a hash map, is a data structure that stores key-value pairs. It supports insertion, deletion, and lookup in $O(1)$ time on average.

## Uses of Hash Tables

Hash tables are widely used because they are efficient and versatile. Some common uses include:

- **Databases**: Implementing indexes to speed up data retrieval.
- **Caching**: Storing recently accessed data to quickly serve future requests.
- **Dictionaries**: Implementing associative arrays or dictionaries, where each key is associated with a value.
- **Symbol Tables**: Managing variable names in interpreters and compilers.
- **Sets**: Implementing sets, which allow for fast membership testing.

## Working of Hash Tables

A hash table works by mapping keys to indices in an array. This mapping is done by a hash function, which takes a key and returns an integer. That integer is reduced to a valid array index (typically with a modulo operation), and the corresponding value is stored at that index.

### Components of a Hash Table

1. **Hash Function**: Converts keys into valid array indices.
2. **Buckets**: Array slots where key-value pairs are stored.
3. **Collision Resolution**: Strategy to handle cases where multiple keys map to the same index.

![hashing](https://khalilstemmler.com/img/blog/data-structures/hash-tables/hash-table.png)

### Hash Function

The hash function is crucial for the performance of a hash table. It should be fast to compute and should distribute keys uniformly across the array. A common hash function for integer keys is `h(key) = key % N`, where `N` is the size of the array.

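As a small illustration of `h(key) = key % N`, the sketch below maps a handful of integer keys to bucket indices. The table size `N = 7`, the sample keys, and the name `division_hash` are assumptions chosen for the example, not part of any library.

```python
def division_hash(key: int, n: int) -> int:
    """Map an integer key to a bucket index in range(n)."""
    return key % n


N = 7  # hypothetical table size
keys = [10, 22, 31, 4, 15]  # hypothetical sample keys

indices = {key: division_hash(key, N) for key in keys}
print(indices)  # {10: 3, 22: 1, 31: 3, 4: 4, 15: 1}
```

Keys 10 and 31 both land in bucket 3, and keys 22 and 15 both land in bucket 1. These are collisions, which the table has to resolve.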
### Collision Resolution

Collisions occur when multiple keys hash to the same index. There are several strategies to resolve collisions:

- **Chaining**: Store multiple key-value pairs in a list at each index.
- **Open Addressing**: Find another index within the array using methods like linear probing, quadratic probing, or double hashing.

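The implementations later in this tutorial use chaining. For comparison, here is a minimal sketch of open addressing with linear probing. The class name `LinearProbingTable` and its methods are hypothetical, and the sketch deliberately omits deletion and resizing, so it raises an error once every slot is occupied.

```python
class LinearProbingTable:
    """Open-addressing hash table sketch using linear probing."""

    def __init__(self, capacity: int = 8):
        self.capacity = capacity
        # Each slot holds either None (empty) or a (key, value) tuple.
        self.slots = [None] * capacity

    def _probe(self, key):
        """Yield slot indices starting at the home slot, wrapping around once."""
        start = hash(key) % self.capacity
        for step in range(self.capacity):
            yield (start + step) % self.capacity

    def insert(self, key, value):
        for index in self._probe(key):
            slot = self.slots[index]
            if slot is None or slot[0] == key:
                self.slots[index] = (key, value)
                return
        raise RuntimeError("table is full")

    def lookup(self, key):
        for index in self._probe(key):
            slot = self.slots[index]
            if slot is None:
                return None  # an empty slot ends the probe sequence
            if slot[0] == key:
                return slot[1]
        return None


table = LinearProbingTable(8)
table.insert(1, "a")   # home slot 1
table.insert(9, "b")   # 9 % 8 == 1 is taken, so it moves to slot 2
table.insert(17, "c")  # slots 1 and 2 are taken, so it moves to slot 3
print(table.lookup(9))  # b
```

Because colliding keys spill into neighboring slots, lookups must keep probing until they find the key or reach an empty slot.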
## Hashing in General

Hashing is the process of converting input data (keys) of arbitrary size into a fixed-size value, usually an integer. In a hash table, that integer serves as an index for data storage and retrieval. Hashing is also widely used beyond hash tables, in fields such as cryptography and databases.

### Properties of a Good Hash Function

1. **Deterministic**: The hash function must consistently return the same hash value for the same input.
2. **Uniform Distribution**: It should distribute hash values uniformly across the hash table to minimize collisions.
3. **Efficient Computation**: The hash function should be quick to compute.
4. **Minimally Correlated**: Similar inputs should not produce similar hash values, so that related keys do not cluster in the same or neighboring buckets.

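To make the uniform-distribution property concrete, the sketch below counts how many keys fall into each bucket under `h(key) = key % n`. The helper name `bucket_counts` and the sample keys (100 multiples of 10) are assumptions chosen to show how a patterned key set interacts with the table size.

```python
from collections import Counter


def bucket_counts(keys, n):
    """Count how many keys land in each bucket under h(key) = key % n."""
    return Counter(key % n for key in keys)


keys = range(0, 1000, 10)  # 100 keys, all multiples of 10

print(bucket_counts(keys, 10))                # Counter({0: 100})
print(max(bucket_counts(keys, 11).values()))  # 10
```

With 10 buckets, every key collides in bucket 0. With 11 buckets, the same keys spread across all buckets and the fullest bucket holds only 10 of them.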
### Common Hash Functions

- **Division Method**: `h(key) = key % N`, where `N` is the size of the table.
- **Multiplication Method**: `h(key) = floor(N * ((key * A) % 1))`, where `A` is a constant with `0 < A < 1` and `(key * A) % 1` is the fractional part of `key * A`.
- **Universal Hashing**: Uses a class of hash functions and selects one at random to minimize the chance of collisions.

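The multiplication method and universal hashing can be sketched as follows. The constant `A = (sqrt(5) - 1) / 2` is one commonly suggested choice rather than a requirement, and the family `((a * key + b) % p) % n` with a prime `p` is one standard universal family for integer keys. The function names and the default prime `2_147_483_647` are assumptions made for this example; `p` must be larger than any key you hash.

```python
import math
import random

A = (math.sqrt(5) - 1) / 2  # about 0.618, a commonly suggested constant


def multiplication_hash(key: int, n: int) -> int:
    """h(key) = floor(n * frac(key * A))."""
    return math.floor(n * ((key * A) % 1))


def make_universal_hash(n: int, p: int = 2_147_483_647, rng=random):
    """Randomly pick h(key) = ((a * key + b) % p) % n from the family.

    p is assumed to be a prime larger than every key.
    """
    a = rng.randrange(1, p)  # a in [1, p - 1]
    b = rng.randrange(0, p)  # b in [0, p - 1]
    return lambda key: ((a * key + b) % p) % n


print(multiplication_hash(123456, 16384))  # 67

h = make_universal_hash(10, rng=random.Random(0))  # seeded only for repeatability
print(all(0 <= h(key) < 10 for key in range(1000)))  # True
```

A table that uses universal hashing picks `a` and `b` once when it is created and keeps them, so the function stays deterministic for the lifetime of that table.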
## Implementing Hash Tables

### Python Implementation

```python
class HashTable:
    """Hash table that resolves collisions by chaining."""

    def __init__(self, size):
        self.size = size
        # One bucket (list) per slot; each bucket holds (key, value) tuples.
        self.table = [[] for _ in range(size)]

    def hash_function(self, key):
        return hash(key) % self.size

    def insert(self, key, value):
        index = self.hash_function(key)
        for i, kv in enumerate(self.table[index]):
            if kv[0] == key:
                # Key already present: overwrite its value.
                self.table[index][i] = (key, value)
                return
        self.table[index].append((key, value))

    def lookup(self, key):
        index = self.hash_function(key)
        for kv in self.table[index]:
            if kv[0] == key:
                return kv[1]
        return None

    def delete(self, key):
        index = self.hash_function(key)
        for i, kv in enumerate(self.table[index]):
            if kv[0] == key:
                del self.table[index][i]
                return


# Example usage
ht = HashTable(10)
ht.insert("apple", 1)
ht.insert("banana", 2)
print(ht.lookup("apple"))  # Output: 1
ht.delete("apple")
print(ht.lookup("apple"))  # Output: None
```

### Java Implementation

```java
import java.util.LinkedList;

class HashTable<K, V> {
    // Static nested class, so its type parameters do not shadow the outer K and V.
    private static class Entry<K, V> {
        final K key;
        V value;

        Entry(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    private final LinkedList<Entry<K, V>>[] table;
    private final int size;

    @SuppressWarnings("unchecked")
    public HashTable(int size) {
        this.size = size;
        table = new LinkedList[size];
        for (int i = 0; i < size; i++) {
            table[i] = new LinkedList<>();
        }
    }

    private int hashFunction(K key) {
        // hashCode() may be negative; floorMod always yields an index in [0, size).
        return Math.floorMod(key.hashCode(), size);
    }

    public void insert(K key, V value) {
        int index = hashFunction(key);
        for (Entry<K, V> entry : table[index]) {
            if (entry.key.equals(key)) {
                entry.value = value;
                return;
            }
        }
        table[index].add(new Entry<>(key, value));
    }

    public V lookup(K key) {
        int index = hashFunction(key);
        for (Entry<K, V> entry : table[index]) {
            if (entry.key.equals(key)) {
                return entry.value;
            }
        }
        return null;
    }

    public void delete(K key) {
        int index = hashFunction(key);
        table[index].removeIf(entry -> entry.key.equals(key));
    }

    public static void main(String[] args) {
        HashTable<String, Integer> ht = new HashTable<>(10);
        ht.insert("apple", 1);
        ht.insert("banana", 2);
        System.out.println(ht.lookup("apple")); // Output: 1
        ht.delete("apple");
        System.out.println(ht.lookup("apple")); // Output: null
    }
}
```

### C++ Implementation

```cpp
#include <iostream>
#include <list>
#include <vector>
#include <string>

using namespace std;

class HashTable {
private:
    int size;
    vector<list<pair<string, int>>> table;

    int hashFunction(const string &key) {
        return hash<string>{}(key) % size;
    }

public:
    HashTable(int size) : size(size), table(size) {}

    void insert(const string &key, int value) {
        int index = hashFunction(key);
        for (auto &kv : table[index]) {
            if (kv.first == key) {
                kv.second = value;
                return;
            }
        }
        table[index].emplace_back(key, value);
    }

    int lookup(const string &key) {
        int index = hashFunction(key);
        for (const auto &kv : table[index]) {
            if (kv.first == key) {
                return kv.second;
            }
        }
        return -1; // Return -1 if key not found
    }

    void delete_key(const string &key) {
        int index = hashFunction(key);
        table[index].remove_if([&key](const pair<string, int> &kv) { return kv.first == key; });
    }
};

int main() {
    HashTable ht(10);
    ht.insert("apple", 1);
    ht.insert("banana", 2);
    cout << ht.lookup("apple") << endl; // Output: 1
    ht.delete_key("apple");
    cout << ht.lookup("apple") << endl; // Output: -1
    return 0;
}
```

### JavaScript Implementation

```javascript
class HashTable {
  constructor(size) {
    this.size = size;
    this.table = new Array(size).fill(null).map(() => []);
  }

  // Sums character codes, so this version only supports string keys.
  hashFunction(key) {
    let hash = 0;
    for (const char of key) {
      hash += char.charCodeAt(0);
    }
    return hash % this.size;
  }

  insert(key, value) {
    const index = this.hashFunction(key);
    for (const pair of this.table[index]) {
      if (pair[0] === key) {
        // Update the stored pair itself; assigning to a destructured
        // variable would leave the table unchanged.
        pair[1] = value;
        return;
      }
    }
    this.table[index].push([key, value]);
  }

  lookup(key) {
    const index = this.hashFunction(key);
    for (const [k, v] of this.table[index]) {
      if (k === key) {
        return v;
      }
    }
    return null;
  }

  delete(key) {
    const index = this.hashFunction(key);
    this.table[index] = this.table[index].filter(([k]) => k !== key);
  }
}

const ht = new HashTable(10);
ht.insert("apple", 1);
ht.insert("banana", 2);
console.log(ht.lookup("apple")); // Output: 1
ht.delete("apple");
console.log(ht.lookup("apple")); // Output: null
```

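## Time Complexity Analysis

- **Insertion**: $O(1)$ on average, $O(n)$ in the worst case
- **Deletion**: $O(1)$ on average, $O(n)$ in the worst case
- **Lookup**: $O(1)$ on average, $O(n)$ in the worst case

The worst case occurs when many keys collide in the same bucket, so an operation has to scan a chain whose length grows with the number of stored pairs.

## Space Complexity Analysis

- **Space Complexity**: $O(n)$, where $n$ is the number of key-value pairs
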
## Advanced Topics

### Dynamic Resizing

When the load factor (number of elements divided by the size of the table) exceeds a threshold, the hash table can be resized to maintain performance. This involves creating a new, larger table and rehashing all existing elements, because each element's index depends on the table size.

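A minimal sketch of dynamic resizing on top of chaining is shown here. The class name `ResizingHashTable`, the starting capacity of 4, the threshold of 0.75, and the doubling policy are all assumptions for illustration; real implementations choose their own values.

```python
class ResizingHashTable:
    """Chained hash table sketch that doubles its bucket array when it gets too full."""

    def __init__(self, capacity: int = 4, max_load: float = 0.75):
        self.capacity = capacity
        self.max_load = max_load  # hypothetical load-factor threshold
        self.count = 0
        self.buckets = [[] for _ in range(capacity)]

    def _index(self, key, capacity):
        return hash(key) % capacity

    def insert(self, key, value):
        bucket = self.buckets[self._index(key, self.capacity)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite; the element count is unchanged
                return
        bucket.append((key, value))
        self.count += 1
        if self.count / self.capacity > self.max_load:
            self._resize(self.capacity * 2)

    def _resize(self, new_capacity):
        # Every pair is rehashed because its index depends on the capacity.
        new_buckets = [[] for _ in range(new_capacity)]
        for bucket in self.buckets:
            for key, value in bucket:
                new_buckets[self._index(key, new_capacity)].append((key, value))
        self.buckets = new_buckets
        self.capacity = new_capacity

    def lookup(self, key):
        for k, v in self.buckets[self._index(key, self.capacity)]:
            if k == key:
                return v
        return None


table = ResizingHashTable()
for i in range(10):
    table.insert(f"key{i}", i)
print(table.capacity)        # 16
print(table.lookup("key7"))  # 7
```

Starting from 4 buckets, the table grows to 8 on the fourth insert and to 16 on the seventh, which keeps the load factor at or below 0.75 afterwards.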
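### Cryptographic Hash Functions

In cryptography, hash functions are used to secure data, for example to verify integrity. These functions produce a fixed-size hash value for any input and are designed to be one-way: it should be computationally infeasible to recover an input from its hash or to find two inputs with the same hash.
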
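As a quick illustration of the fixed-size property, the sketch below uses SHA-256 from Python's standard `hashlib` module. The choice of SHA-256 and the sample inputs are assumptions for the example; the tutorial itself does not name a specific algorithm.

```python
import hashlib

short_digest = hashlib.sha256(b"hello").hexdigest()
long_digest = hashlib.sha256(b"hello" * 10_000).hexdigest()

# Both digests are 64 hex characters (256 bits), regardless of input length.
print(len(short_digest), len(long_digest))  # 64 64

# A one-character change in the input produces a different digest.
print(hashlib.sha256(b"hellp").hexdigest() == short_digest)  # False
```

Unlike the table-oriented hash functions above, a cryptographic hash is far too slow and too large to be a good default for bucket indexing; it trades speed for resistance to deliberate attacks.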
### Bloom Filters

A Bloom filter is a space-efficient probabilistic data structure used to test whether an element is a member of a set. It uses multiple hash functions to map each element to positions in a bit array. A lookup can return a false positive, but never a false negative.

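The idea can be sketched in a few lines. The class name `BloomFilter`, the method `might_contain`, and the default sizes are hypothetical, and deriving the multiple hash functions by prefixing a seed before hashing with SHA-256 is a simplification chosen for this example, not a tuned design.

```python
import hashlib


class BloomFilter:
    """Bloom filter sketch backed by a list of booleans."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, item: str):
        """Yield one bit position per hash function (seeded SHA-256 here)."""
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter()
bf.add("apple")
bf.add("banana")
print(bf.might_contain("apple"))  # True
```

An item that was added always reports `True`. An item that was never added usually reports `False`, but it can report `True` if other items happened to set all of its bit positions.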
## Conclusion

In this tutorial, we covered the fundamentals of hash tables, their uses, how they work, and the concept of hashing in general. We also provided implementations in Python, Java, C++, and JavaScript. Hash tables are a powerful and efficient data structure that plays a crucial role in many applications. Understanding how they work and how to implement them will help you solve problems efficiently.
