Glossary Term
Extendible hashing
Introduction and Overview of Extendible Hashing
- Extendible Hashing is a fast access method for dynamic files.
- It is a hash-based indexing technique.
- It allows for efficient insertion and retrieval of data.
- The hash function returns a string of bits.
- The first i bits of each string index the directory (the hash table), where i is the directory's global depth, so the directory has 2^i entries.
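As a rough illustration of the indexing described above, the sketch below extracts a directory index from a hash value. The 8-bit hash width, the sample value, and both function names are assumptions made for this example. The second function shows the variant that indexes with the least significant bits instead of the first bits.

```python
HASH_BITS = 8  # assumed fixed hash width, chosen only for illustration


def directory_index_msb(h, i):
    """Index with the first (most significant) i bits of an 8-bit hash."""
    return h >> (HASH_BITS - i)


def directory_index_lsb(h, i):
    """Index with the last (least significant) i bits of the hash."""
    return h & ((1 << i) - 1)


h = 0b10110001  # hypothetical hash value
print(directory_index_msb(h, 3))  # first three bits are 101
print(directory_index_lsb(h, 3))  # last three bits are 001
```

Either convention works as long as lookup, insertion, and splitting all use the same one.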
Key Insertion, Bucket Overflow, and Directory Organization
- A key is inserted into the bucket that the directory entry for its hashed value points to.
- If a bucket becomes full, it needs to be split.
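- Whether the split also requires a larger directory depends on the bucket's local depth relative to the directory's global depth.
- If the local depth is equal to the global depth, only one directory entry points to the bucket, so the directory must be doubled before the bucket can be split.
- If the local depth is less than the global depth, more than one directory entry points to the bucket, so it can be split without doubling the directory.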
- The directory contains pointers to buckets.
- The global depth is the number of hash bits the directory uses to select an entry.
- The local depth is the number of hash bits shared by all keys in a bucket; it never exceeds the global depth.
- When a bucket is split, its local depth is incremented, and the newly significant bit decides which of the two resulting buckets each entry goes to.
- The directory is doubled only when the overflowing bucket's local depth equals the global depth, not on every overflow.
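The split rules above can be sketched as two small helpers. This is a minimal sketch, not a full implementation: the function names are hypothetical, keys are represented directly by their hash values, and indexing by least significant bits is assumed.

```python
def plan_split(local_depth, global_depth):
    """Return (must_double, new_local_depth, new_global_depth) for a full bucket."""
    must_double = local_depth == global_depth
    new_global_depth = global_depth + 1 if must_double else global_depth
    return must_double, local_depth + 1, new_global_depth


def redistribute(hashes, new_local_depth):
    """Partition hash values by the bit that became significant after the split."""
    bit = 1 << (new_local_depth - 1)
    stay = [h for h in hashes if not h & bit]
    move = [h for h in hashes if h & bit]
    return stay, move


# Local depth equals global depth: the directory must double first.
print(plan_split(2, 2))
# Local depth below global depth: the bucket splits on its own.
print(plan_split(1, 2))
# Entries are separated by the second-lowest bit when the new local depth is 2.
print(redistribute([0b000, 0b100, 0b010, 0b110], 2))
```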
Example Implementation in Python
- The extendible hashing algorithm can be implemented in Python.
- The code indexes the directory with the least significant bits of the hash, so the directory can be doubled by appending a copy of itself.
- The directory is represented as a list of pages.
- Each page has a map of key-value pairs and a local depth.
- The get_page() function retrieves the page based on the hashed key.
- The put() function inserts a key-value pair into the appropriate page.
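The description above could be realized roughly as follows. This is a sketch under stated assumptions, not the original code: PAGE_SZ and its value, the full() helper, and the get() method are additions for this example, and a page is allowed to hold one entry more than PAGE_SZ for a moment before it is split.

```python
PAGE_SZ = 4  # assumed bucket capacity; the text does not specify one


class Page:
    """A bucket: a map of key-value pairs plus a local depth."""

    def __init__(self, local_depth=0):
        self.map = {}
        self.local_depth = local_depth

    def full(self):
        return len(self.map) >= PAGE_SZ


class ExtendibleHashing:
    def __init__(self):
        self.global_depth = 0
        self.directory = [Page()]  # always 2**global_depth slots

    def get_page(self, key):
        # Select a slot with the global_depth least significant hash bits.
        return self.directory[hash(key) & ((1 << self.global_depth) - 1)]

    def get(self, key):
        return self.get_page(key).map.get(key)

    def put(self, key, value):
        page = self.get_page(key)
        was_full = page.full()
        page.map[key] = value
        if not was_full:
            return
        if page.local_depth == self.global_depth:
            # Appending a copy keeps every slot pointing at the same page.
            self.directory *= 2
            self.global_depth += 1
        high_bit = 1 << page.local_depth  # bit that becomes significant
        p0 = Page(page.local_depth + 1)
        p1 = Page(page.local_depth + 1)
        for k, v in page.map.items():
            (p1 if hash(k) & high_bit else p0).map[k] = v
        # Repoint every slot that referenced the old page.
        start = hash(key) & (high_bit - 1)
        for i in range(start, len(self.directory), high_bit):
            self.directory[i] = p1 if i & high_bit else p0


table = ExtendibleHashing()
for n in range(100):
    table.put(n, str(n))
print(table.get(42), table.global_depth, len(table.directory))
```

Because a page is split at most once per insertion, put() always terminates, even when several keys share the same hash value; the cost is that such a page can stay over capacity.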
Advantages and Limitations of Extendible Hashing
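- A lookup needs at most two accesses: one to the directory and one to the bucket.
- The file grows one bucket at a time; a split rehashes only the entries of the overflowing bucket, never the whole file.
- Keys are distributed evenly across buckets, provided the hash function spreads them uniformly.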
- However, the depth cannot exceed the number of bits in the hash value (in the example implementation, the bit size of an integer).
- Beyond that depth, doubling the directory or splitting a bucket no longer allows entries to be rehashed to different buckets.
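The depth limitation can be seen in a small sketch: two keys end up in different buckets only once the depth reaches a bit in which their hashes differ. The function name and the 64-bit limit are assumptions made for this illustration, and indexing by least significant bits is assumed.

```python
def first_separating_depth(h1, h2, max_depth=64):
    """Return the smallest depth at which two hashes select different slots.

    Returns None if no depth up to max_depth separates them.
    """
    for depth in range(1, max_depth + 1):
        mask = (1 << depth) - 1
        if (h1 & mask) != (h2 & mask):
            return depth
    return None


# These hashes differ only in their fourth-lowest bit.
print(first_separating_depth(0b0101, 0b1101))
# Identical hashes are never separated, however often the bucket is split.
print(first_separating_depth(0b0101, 0b0101))
```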
Performance and Comparison with Other Hashing Techniques
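- Extendible hashing provides efficient search, insert, and delete operations.
- Because overfull buckets are split as the file grows, records stay evenly distributed as long as the hash function is uniform.
- Most operations touch only the directory and a single bucket, which keeps the number of disk accesses low.
- Lookup cost does not grow with the number of records, so performance remains stable even for large files.
- The directory holds only pointers, so its space overhead is usually small, but it doubles with each increase in global depth and can grow large when many keys share the same hash bits.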
- Extendible hashing is more flexible than static hashing, which fixes the number of buckets in advance.
- It adapts to changes in the number of records without reorganizing the whole file.
- Compared to linear hashing, extendible hashing splits exactly the bucket that overflowed, at the cost of maintaining a directory, which linear hashing does not need.
- Extendible hashing performs well in scenarios with frequent updates.
- Other hashing techniques may be more suitable for specific use cases.