-
Notifications
You must be signed in to change notification settings - Fork 292
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
1 parent
190199e
commit db5b75a
Showing
1 changed file
with
53 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,53 @@ | ||
# Robin Hood Hashing Algorithm | ||
|
||
## Description | ||
|
||
Robin Hood Hashing is a variant of open addressing for hash tables, designed to minimize probe lengths and handle clusters more effectively. When inserting an element, if an existing element has a longer probe distance, they swap places to ensure that the “poorest” element (the one with the longest probe distance) gets closer to its initial hash location. This balancing strategy improves lookup times by reducing the maximum probe length and balancing load distribution in the hash table. | ||
|
||
## Key Features | ||
|
||
- **Efficient Collision Resolution**: Uses open addressing with probe length comparison to determine the optimal position for each element. | ||
- **Load Balancing**: Minimizes clustering by enforcing an even distribution of elements. | ||
- **Dynamic Swapping**: Ensures each element achieves the shortest possible probe distance. | ||
|
||
## Problem Definition | ||
|
||
Given a set of key-value pairs, the goal of Robin Hood Hashing is to create a hash table that: | ||
- Minimizes the maximum probe distance for any element. | ||
- Balances the distribution of elements across the table, especially at high load factors. | ||
|
||
- **Input**: A set of keys and values to be inserted into the hash table. | ||
- **Output**: A hash table with balanced probe lengths to ensure efficient access and minimized clustering. | ||
|
||
## Algorithm Review | ||
|
||
### 1. Hash Function | ||
- Compute the initial index for each key using `key % TABLE_SIZE`. | ||
|
||
### 2. Insertion | ||
- Insert each key-value pair at its computed index. | ||
- If a collision occurs and an existing element has a shorter probe distance than the new element, swap them. | ||
- Repeat this until the element with the highest probe distance is inserted at a suitable location. | ||
|
||
### 3. Search | ||
- Calculate the hash index and probe through the table. | ||
- If an element with a shorter probe distance than expected is found, the search is terminated early. | ||
|
||
### 4. Deletion | ||
- Mark elements as deleted rather than physically removing them to retain the probe distance logic for neighboring elements. | ||
|
||
## Time Complexity | ||
|
||
- **Insertion**: Average O(1) with probe length adjustments. | ||
- **Search**: Average O(1) by leveraging minimized probe lengths. | ||
|
||
## Applications | ||
|
||
Robin Hood Hashing is beneficial for scenarios involving: | ||
- **High Load Factor Hash Tables**: Maintains efficiency in dense hash tables. | ||
- **Data Deduplication**: Handles duplicate elements efficiently without excessive clustering. | ||
- **Real-Time Applications**: Minimizes lookup and insertion times, ideal for high-throughput systems. | ||
|
||
## Conclusion | ||
|
||
Robin Hood Hashing optimizes hash table performance by balancing probe lengths, improving access times, and reducing clustering. This makes it a valuable technique for efficient data storage and retrieval in applications requiring high load factors and balanced hash distribution. |