Data Structures the Fun Way - 9. Spatial Trees

Further improving nearest neighbor search based on tree-based data structures and the concept of spatial partitioning

🔖 9.1 Quad Trees

Limitations of Grids

  • Using many grid boxes requires considerable memory and searching many boxes

  • If grids are divided coarsely (?), many points enter boxes, resulting in similar outcomes to simple linear scans

  • For this, it's good to dynamically partition space according to data -> uniform quad trees

Uniform Quad Trees

  • Introduces the branching structure of trees into grids, where each node of the tree represents each region of space

  • Creating quad trees like art by recursively partitioning allocated space into smaller and smaller sub-regions

  • Adding points to nodes is done by traversing the tree to find the position of new points.

Removing Points

  • Similar to inserting points but follows a more complex process.

  • Like grids, you can add arbitrarily close points or insert duplicate points.

  • Wrapper functions check if points are within tree boundaries, then call recursive removal functions

Searching in Uniform Quad Trees

  • In uniform quad trees, search starts from the root node.

  • And first check if points closer than the current candidate can be included.

  • Like the example in the book, it's like asking a lazy neighbor where to throw the ball to get it over to our yard in the shortest distance.

🔖 k-d Trees

Structure of k-d Trees

  • k-d trees are built based on concepts similar to quad trees.

  • It's a spatial data structure that combines the spatial partitioning of quad trees with the characteristic of binary search trees branching into two children, taking advantage of both data structures.

  • k-d trees select one dimension and partition data based on that dimension.

  • Due to their flexible structure, more care is needed when handling spatial boundaries of nodes.

  • Due to complexity, each k-d tree node stores a considerable amount of information.

Strict Spatial Boundaries

  • Using strict boundaries allows k-d trees to be better applied to data.

  • Since the regions obtained through strict boundaries are much smaller, pruning can be done more powerfully.

  • Therefore, the likelihood that the minimum distance between the target point and boundary becomes larger increases.

Creating k-d Trees

  • Goes through a recursive process similar to binary search tree creation but with differences.

  • Initially starts with all data points, selects the dimension and value to partition, and divides data points into two subsets.

  • To avoid infinite recursion when there are duplicate data, use one of the last two tests.

  • The main difference is that k-d trees select only one dimension to partition at each level.

Operations of k-d Trees

  • Basic operations are insertion, removal, and search of points, using the same methods as quad trees.

  • All operations start from the top and move to appropriate branches using partition values.

  • The main difference is that instead of choosing which quadrant to search, you choose one of two children.

🔖 9.3 Why Quad Trees and k-d Trees are Important

Key Summary

  • Using quad trees allows adjusting grid resolution according to the density of data points in local regions by partitioning multiple dimensions at once.

  • k-d trees are concepts that borrow the core idea of binary trees to solve nearest neighbor search problems.

  • By using the method of selecting one dimension to partition at each node, it solves the problem of branching factor becoming large in trees representing multidimensional data.

  • Also, k-d trees can respond more flexibly to data structures as they adapt to the data structure and provide more pruning.

Last updated