Skip to main content
Glossary Term

Trie

History, etymology, and pronunciation - Trie was first abstractly described by Axel Thue in 1912 - René de la Briandais described tries in a computer context in 1959 - Edward Fredkin independently described tries in 1960 and coined the term 'trie' - Some authors pronounce it as /ˈtriː/ (tree), while others pronounce it as /ˈtraɪ/ (try) - The term 'trie' comes from the middle syllable of 'retrieval' Overview and Operations - Tries are a form of string-indexed look-up data structure - They are used to store a dictionary list of words - Tries allow for efficient generation of completion lists - They are particularly useful for string-searching algorithms like predictive text and spell checking - Tries can be seen as tree-shaped deterministic finite automata - Tries support insertion, deletion, and lookup of string keys - Nodes in a trie contain links to other suffix child nodes or null - Each node has a parent except for the root node - The number of links in each node is equal to the number of characters in the alphabet - Nodes in the trie do not store their associated key, but their position defines the key Searching and Insertion - Searching in a trie is guided by the characters in the search string key - Each node in the trie contains a link to each possible character in the string - Following the string within the trie leads to the associated value for the key - A null link during the search indicates the key does not exist in the trie - The search operation in a standard trie takes O(dm) time, where m is the size of the string key and d is the alphabet size - Insertion into a trie involves using the characters in the string key as indexes to the children array - Each node in the trie corresponds to one call of the radix sorting routine - If a null link is encountered during insertion, a new node is created - The value of the terminal node is assigned to the input value - If the terminal node already had a value, it is replaced with the new value Replacing other data structures and Implementation strategies - A trie can replace a hash table - Searching for a node in a trie has a complexity of O(m), while an imperfect hash function may have a worst-case lookup speed of O(N) - Tries do not require a hash function and do not have collisions - Tries can sort string keys using alphabetical ordering - Tries are less efficient than hash tables when data is accessed on a secondary storage device - Tries can be represented using a vector of pointers or a singly linked list for each node - Techniques like alphabet reduction can reduce space complexity by using a smaller alphabet - Storing a vector of ASCII pointers as a bitmap reduces the size of nodes - Different implementation strategies have trade-offs between memory use and speed - Memory space can be reduced at the expense of running time Bitwise tries and Compressed tries - Bitwise tries address the space requirement issue in naive pointer vector implementations - Each character in the string key set is represented using individual bits - Vectorized CPU instructions are used to find the first set bit in a fixed-length key input - Bitwise tries are cache-local and highly parallelizable - They perform well on out-of-order execution CPUs - Radix tree, or compressed trie, is a space-optimized variant of a trie - Nodes with only one child get merged with their parents, reducing space and time metrics - Compressed tries work best when the trie remains static and keys are sparse - Packing the trie can further optimize space usage - Patricia trees are a binary encoding of string keys in a compressed binary trie Note: The content did not provide enough information to create a comprehensive fifth group.