Data Structures the Fun Way - 6. Tries and Adaptive Data Structures
Data structures optimized by additionally considering the characteristics of strings in binary search trees
🔖 6.1 Binary Search Trees Made of Strings
Limitations of Binary Search Trees
Binary search trees can store anything that can be sorted in order.
At first glance, binary search trees seem to provide a simple and efficient mechanism for searching string data.
However, you need to pay attention to the cost of each comparison.
In the worst case, the cost of string comparison operations is proportional to the length of the string itself.
Cost and Limitations of String Comparison
Sequential comparison of strings in binary search trees is more expensive than comparison of two numbers.
In many languages, there's also the point that there aren't many characters that can be included at each position of a string.
🔖 6.2 Tries
What is a Trie?
It's a tree-based data structure that divides strings into different subtrees based on prefixes.
Instead of dividing data into two sets at each node, tries divide tree branches based on prefixes so far.
It emerged to improve file searching on computers with slow memory access speeds.
Characteristics of Tries
Each node in a trie represents the prefix of the string compared so far.
Like binary search trees, tries also start from the root node, and each node can have two or more children.
Tries perform dramatic multiple branching at all nodes.
Unlike binary search trees, not all nodes in tries represent items.
Useful information such as the prefix of the current node can be stored in trie nodes.
Searching in Tries
It's efficient because you don't need to compare the entire string or the front part of the prefix.
When implementing in code, you can track by passing an index representing the position of the character to check at the current step to the recursive function.
Using a Trie wrapper can simplify by hiding the reference to the root node and the counter needed for the recursive function.
Unlike binary search trees, the number of comparisons in successful trie searches is independent of the number of strings stored in the trie.
Trie Search Code
function TrieNodeSearch(current, target, index) {
if (index === target.length) {
// ①
if (current.isEntry) {
return current;
} else {
return null;
}
}
const nextLetter = target[index]; // ②
const nextIndex = LetterToIndex(nextLetter); // ③
const nextChild = current.children[nextIndex]; // ④
if (nextChild === null) {
return null;
} else {
return TrieNodeSearch(nextChild, target, index + 1);
}
}
// LetterToIndex is a function that converts a specific character (nextLetter) to an index number
// current represents the current node, target is the string to search, and index is the position of the string currently being searched
Adding and Removing Nodes
Tries are also dynamic data structures that accurately represent data changes, like binary search trees.
Unlike insertion in binary search trees, multiple nodes can be added while inserting one item.
🔖 6.3 Why Tries are Important
Benefits of Tries
Tries become more efficient as more strings are added and the number of strings with common prefixes increases.
Since string comparison itself is costly, you can gain great benefits by checking and comparing based on common prefixes.
Examples of Trie Usage
Data structures for tracking structural labels such as serial numbers, model numbers, etc., made of short alphabets and numbers
Even with tens of billions of products, storage space can be saved by utilizing prefixes of serial numbers.
The key point is that when optimizing operation costs, you can better utilize the structure of strings.
Last updated