Trie (Prefix tree)
https://leetcode.com/problems/implement-trie-prefix-tree/solutions/127843/implement-trie-prefix-tree/
Overview
Trie (pronounced “try”) or prefix tree is a data structure used for retrieval of a key in a dataset of strings.
Use cases
Autocomplete
Google autocomplete suggest in action.
Spell checker
// Spell checker used in word processor.
IP routing (longest prefix matching)
Longest prefix matching algorithm uses Tries in Internet Protocol (IP) routing to select an entry from a forwarding table.
T9 predictive text
T9, which stands for Text on 9 keys, was used on phones to input texts during the late 1990s.
Solving word games
Tries is used to solve word games like Boggle efficiently by pruning the search space.
Comparison with Hash Map
There are data structures, like balanced trees and hash tables, which give us the possibility to search for a word in a dataset of strings. Although hash table has $O(1)$ time complexity for looking for a key, it is not efficient in the following operations:
- Finding all keys with a common prefix.
- Enumerating a dataset of strings in lexicographical order.
Moreover, hash tables increase in size, there are lots of hash collisions and the search time complexity could deteriorate to $O(n)$, where $n$ is the number of keys inserted. Trie could use less space when storing many keys with the same prefix.
In this case, using a trie has only $O(m)$ time complexity, where $m$ is the key length.
Trie node structure
Trie is a rooted tree and its nodes have the following fields:
- Maximum of $R$ links to its children, where each link corresponds to one of $R$ character values from a dataset alphabet (e.g., 26 letters of the lowercase English alphabet).
- Boolean field which specifies whether the node corresponds to the end of the key, or is just a key prefix.
Representation of a key “leet” in trie.
Insertion
We insert a key by searching into the trie. We start from the root and search a link corresponding to the first key character.
- If the link exists, then we move down following the link and continue searching for the next key character.
- Else, we create a new node and link it with the parent node, matching the current key character. Repeat this step until we reach the end of the key and mark it as the end node.
Insertion of keys into a trie.
Insertion - Complexity analysis
$m$ is the key length.
- Time complexity: $O(m)$
In each iteration, we either examine or create a node in the trie till we reach the end of the key.
- Space complexity: $O(m)$
In the worst case, newly inserted key doesn’t share a prefix with the keys already inserted and we have to add $m$ new nodes.
Search
Each key is represented as a path from the root to an internal node or leaf. We start from the root and examine for a link corresponding to the key character.
- If the link exists, we move down following the link and continue searching for the next key character.
- If the link doesn’t exist, and we are at the end of the key, and the node is marked as the end, then the key is present in the trie.
- Else, the key is not present in the trie.
Search for “leet” in trie.
Search - Complexity analysis
- Time complexity: $O(m)$.
- Space complexity: $O(1)$.