Extendible hashing

Introduction and Overview of Extendible Hashing
– Extendible Hashing is a fast access method for dynamic files.
– It is a hash-based indexing technique.
– It allows for efficient insertion and retrieval of data: a lookup typically needs one directory access and one bucket access.
– The hash function returns a string of bits.
– The first i bits of each hash value are used as an index into the directory (the hash table of bucket pointers), and i grows as the file grows.
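
As a minimal sketch of the indexing step described above, the following Python takes the first (most significant) i bits of a hash value as the directory index. The 32-bit hash width and the names HASH_BITS, hash_bits, and directory_index are assumptions made for illustration; they do not come from any particular implementation.

```python
HASH_BITS = 32  # assumed width of the hash value, chosen only for illustration


def hash_bits(key) -> int:
    """Return a non-negative hash of key truncated to HASH_BITS bits."""
    return hash(key) & ((1 << HASH_BITS) - 1)


def directory_index(h: int, i: int) -> int:
    """Use the first (most significant) i bits of h as the directory index."""
    if not 0 <= i <= HASH_BITS:
        raise ValueError("i must be between 0 and HASH_BITS")
    return h >> (HASH_BITS - i)


# A hash whose leading bits are 1011...
h = 0b1011 << 28
print(directory_index(h, 2))  # leading two bits: 0b10
print(directory_index(h, 4))  # leading four bits: 0b1011
```

With i = 0 the directory has a single entry, so every key maps to index 0.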

Key Insertion, Bucket Overflow, and Directory Organization
– Keys are inserted into the bucket selected by the leading bits of their hashed values.
– The directory contains pointers to buckets; several directory entries may point to the same bucket.
– The global depth is the number of hash bits the directory uses as its index, so a directory with global depth d has 2^d entries.
– The local depth of a bucket is the number of hash bits that all keys in that bucket have in common; it never exceeds the global depth.
– If a bucket becomes full, it needs to be split.
– If the local depth of the full bucket is equal to the global depth, the directory must be doubled before the bucket is split.
– If the local depth is less than the global depth, the bucket can be split without doubling the directory, because at least two directory entries already point to it.
– After a bucket split, the local depth is incremented, and the newly significant bit is used to redistribute the entries between the two buckets.
– The directory is therefore doubled only when the overflowing bucket's local depth equals the global depth, not on every overflow.
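
The split rule above can be sketched in a few lines of Python. This is an illustrative sketch, not a reference implementation: the function names plan_split and double_directory are hypothetical, and the doubling shown assumes a directory indexed by the first (most significant) bits, so each old slot becomes two adjacent slots.

```python
def plan_split(local_depth: int, global_depth: int) -> str:
    """Decide what a full bucket requires, following the rule above."""
    if local_depth > global_depth:
        raise ValueError("local depth can never exceed global depth")
    if local_depth == global_depth:
        return "double directory, then split"
    return "split only"


def double_directory(directory: list) -> list:
    """Double a directory that is indexed by the most significant bits.

    Old slot k becomes new slots 2k and 2k + 1, and both still point
    at the same bucket until that bucket is actually split.
    """
    return [bucket for bucket in directory for _ in range(2)]


print(plan_split(2, 2))               # double directory, then split
print(plan_split(1, 2))               # split only
print(double_directory(["A", "B"]))   # ['A', 'A', 'B', 'B']
```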

Example Implementation in Python
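– The extendible hashing algorithm can be implemented compactly in Python.
– The code indexes the directory with the least significant bits of the hash rather than the first i bits, so the directory can be doubled by appending a copy of itself.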
– The directory is represented as a list of pages.
– Each page has a map of key-value pairs and a local depth.
– The get_page() function retrieves the page based on the hashed key.
– The put() function inserts a key-value pair into the appropriate page.
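
The page does not include the code itself, so the following is a minimal sketch of what such an implementation could look like, reconstructed from the description above. PAGE_SIZE, the class names, and the _split helper are assumptions for illustration. One simplification to be aware of: a page is split once per overflowing insert, so a page may briefly hold more than PAGE_SIZE entries if the split does not separate them.

```python
PAGE_SIZE = 4  # assumed bucket capacity, chosen small for illustration


class Page:
    """A bucket: a map of key-value pairs plus its local depth."""

    def __init__(self, local_depth=0):
        self.map = {}
        self.local_depth = local_depth

    def is_full(self):
        return len(self.map) >= PAGE_SIZE


class ExtendibleHash:
    def __init__(self):
        self.global_depth = 0
        self.directory = [Page()]  # 2 ** global_depth entries

    def get_page(self, key):
        # Index with the global_depth least significant bits of the hash.
        mask = (1 << self.global_depth) - 1
        return self.directory[hash(key) & mask]

    def get(self, key):
        return self.get_page(key).map.get(key)

    def put(self, key, value):
        page = self.get_page(key)
        overflow = page.is_full() and key not in page.map
        page.map[key] = value
        if overflow:
            self._split(page, key)

    def _split(self, page, key):
        if page.local_depth == self.global_depth:
            # Least significant bits: doubling is a plain copy.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        high_bit = 1 << page.local_depth  # the newly significant bit
        low = Page(page.local_depth + 1)
        high = Page(page.local_depth + 1)
        for k, v in page.map.items():
            target = high if hash(k) & high_bit else low
            target.map[k] = v
        # Repoint every directory slot that referred to the old page.
        start = hash(key) & (high_bit - 1)
        for i in range(start, len(self.directory), high_bit):
            self.directory[i] = high if i & high_bit else low


table = ExtendibleHash()
for n in range(100):
    table.put(n, n * n)
print(table.get(7))  # 49
```

Using the least significant bits keeps the doubling step trivial: slot j and slot j + old_size agree on all the old index bits, so both can start out pointing at the same page.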

Advantages and Limitations of Extendible Hashing
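– Extendible Hashing keeps insertion and retrieval efficient as the file grows, because only the overflowing bucket is rehashed.
– It handles dynamic file sizes effectively.
– It distributes keys evenly across buckets, provided the hash function spreads them uniformly.
– However, the depth cannot exceed the number of bits in the hash value (for example, the bit size of an integer).
– Doubling the directory or splitting a bucket does not always move entries to different buckets: keys whose hashes agree on every bit in use land in the same bucket again, and keys with identical hashes can never be separated.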
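
The last limitation can be made concrete with a small check. This is a sketch under the assumption that buckets are split on the least significant bits; the function name split_separates is hypothetical.

```python
def split_separates(hashes, local_depth: int) -> bool:
    """Return True if splitting on bit number local_depth (counted from the
    least significant bit) sends the hashes to two different buckets."""
    bit = 1 << local_depth
    sides = {bool(h & bit) for h in hashes}
    return len(sides) == 2


hashes = [0b0000, 0b0100, 0b1100]
print(split_separates(hashes, 0))     # False: every hash has bit 0 clear
print(split_separates(hashes, 1))     # False: every hash has bit 1 clear
print(split_separates(hashes, 2))     # True: bit 2 differs
print(split_separates([7, 7, 7], 5))  # False: identical hashes never separate
```

In the first two cases a split leaves one of the new buckets empty and the other still full, so further splits (and possibly further directory doublings) are needed before the overflow is resolved.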

Performance and Comparison with Other Hashing Techniques
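– Extendible hashing provides efficient search, insert, and delete operations.
– Records are distributed evenly across buckets as long as the hash function is uniform.
– The number of disk accesses is minimized for most operations: a lookup reads one directory entry and one bucket.
– The performance remains stable even with a large number of records, since growth is handled one bucket at a time.
– The space overhead of the directory is relatively small when hashes are uniform, but the directory doubles each time it grows and can become large when hashes are skewed.
– Extendible hashing is more flexible than static hashing, which fixes the number of buckets in advance.
– It handles dynamic changes in the number of records effectively.
– Compared to linear hashing, extendible hashing splits exactly the bucket that overflowed, at the cost of maintaining a directory, which linear hashing does not need.
– Extendible hashing performs well in scenarios with frequent updates.
– Other hashing techniques may be more suitable for specific use cases, such as static hashing when the number of records is known in advance.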

Extendible hashing (Wikipedia)

Extendible hashing is a type of hash system which treats a hash as a bit string and uses a trie for bucket lookup. Because of the hierarchical nature of the system, re-hashing is an incremental operation (done one bucket at a time, as needed). This means that time-sensitive applications are less affected by table growth than they would be with standard full-table rehashes.

Extendible hashing was described by Ronald Fagin in 1979. Practically all modern filesystems use either extendible hashing or B-trees. In particular, the Global File System, ZFS, and the SpadFS filesystem use extendible hashing.
