Glossary Term
Distributed hash table
History and Properties of Distributed Hash Tables
- DHT research was motivated by peer-to-peer systems like Freenet, Gnutella, BitTorrent, and Napster.
- Napster used a central index server, making it vulnerable to attacks and lawsuits.
- Gnutella used a query flooding model, which was less efficient than Napster.
- Freenet employed key-based routing, clustering similar files on the same set of nodes.
- DHTs combined the decentralization of Freenet and Gnutella with the efficiency of Napster.
- Autonomy and decentralization are key characteristics of DHTs.
- DHTs should be fault-tolerant, handling continuous node arrivals, departures, and failures.
- Scalability is important, allowing efficient operation with thousands or millions of nodes.
- DHTs minimize coordination between nodes, reducing the work required for membership changes.
- Some DHT designs focus on security and anonymity, although less common than in other P2P systems.
Structure and Keyspace Partitioning in Distributed Hash Tables
- DHTs consist of an abstract keyspace, such as a set of 160-bit strings.
- Keyspace partitioning schemes distribute ownership of the keyspace among nodes.
- Overlay networks connect the nodes and enable key lookup.
- To store a file, its filename is hashed to generate a key, which is sent to a participating node.
- Retrieving a file involves hashing the filename to find the node responsible for the key.
- Most DHTs use consistent hashing or rendezvous hashing to map keys to nodes.
- Consistent hashing and rendezvous hashing minimize remapping of keys during node changes.
- In consistent hashing, the circular keyspace is divided into contiguous segments owned by nodes.
- Rendezvous hashing assigns keys based on the highest random weight.
- Both algorithms ensure that adding or removing a node affects only adjacent keys.
Applications of Distributed Hash Tables
- DHTs serve as infrastructure for various services like anycast, distributed file systems, and domain name services.
- Notable networks using DHTs include BitTorrent's distributed tracker, the Kad network, and Freenet.
- DHT technology is adopted by BitTorrent and the Coral Content Distribution Network.
- DHTs enable peer-to-peer file sharing and content distribution systems.
- DHTs can be used for instant messaging, multicast, and building search engines like YaCy.
Overlay Networks and Algorithms in Distributed Hash Tables
- Each node maintains a set of links to other nodes (neighbors or routing table).
- These links form the overlay network.
- Nodes pick neighbors based on the network's topology.
- All DHT topologies ensure that for any key k, each node either owns k or has a link to a node closer to k.
- Key-based routing is used to forward messages to the owner of any key k.
- Aside from routing, algorithms exploit the overlay network structure for message dissemination.
- Structella implements flooding and random walks on a Pastry overlay.
- DQ-DHT implements a dynamic querying search algorithm over a Chord network.
- These algorithms are used for overlay multicast, range queries, and collecting statistics.
- They leverage the structure of the overlay network to optimize message distribution.
Security and Techniques in Distributed Hash Tables
- DHTs are inherently more resilient against hostile attackers compared to centralized systems.
- Byzantine fault tolerance can defend against the Sybil attack in DHTs.
- Whanau is a DHT designed to be resistant to Sybil attacks.
- Incorporating social trust relationships into the system design can circumvent the Sybil attack.
- Effective defenses against Sybil attacks are an ongoing research topic in security conferences.
- DHT security techniques surveyed by Urdaneta, Pierre, and van Steen.
- Novel architectures for P2P applications: the Continuous-Discrete Approach.
- Dipsea: A Modular Distributed Hash Table.
- Structured overlay for heterogeneous environments.
- Self-Chord: A Bio-Inspired P2P Framework for Self-Organizing Distributed Systems.