Source: Geometric Approximation Algorithms

We are interested in approximating distance between points . For some sets , we write as all unordered pairs of points. Additionally, define

A pair decomposition of is a set of pairs such that

  1. For every ,
  2. For every ,

A pair is well-separated if . A well-separated pair decomposition (WSPD) is a set of pairs where each and is well-separated.

Quadtree Construction

Given a point set , construct a compressed quadtree with root . For some node , write as the corresponding hypercube. Define and as the set of points in . For two nodes , we write .

To compute the -WSPD of , call .

  1. If and , return
  2. If , swap and
  3. If , return
  4. Return where are children of

Lemma 1

Info

For some pair computed by , we have , where is the parent node of in .

A moment’s thought shows that and . At some point, we call through a sequence of arguments . WLOG assume that and . Take to be the last index where . Note is a descendent of and the larger cell is always refined first, thus . By symmetry this argument holds for .

Lemma 2

Info

Let be a cell of a grid of with diameter . For , there are at cells of distance at most from

Every cell with distance at most from is contained in some hypercube with side length . This hypercube has volume , while has volume . Thus by packing we can fit at most cells in this hypercube, bounding the number of candidate cells.

Lemma 3

Info

For the WSPD generated by , for any pair and , , we have

Note that , since and .

Correctness

Performing structural induction on , it is clear that this will produce a pair of subsets which covers every pair in . Thus covers every pair of .

For some output pair , if both are singletons it is clear that and are well separated. Otherwise , so . Therefore and this is a valid pair decomposition.

For every output pair , by construction we have . Thus we have a -WSPD.

Complexity

The quadtree can be constructed in . The remainder of the algorithm is linear in output size, thus it suffices to bound the number of pairs returned.

For some pair generated by , suppose this was called by , where is the parent of . Charge the cost of computing to . For all such in , any pair was considered but not added, thus , so . By Lemma 1, , thus we case on equality.

Case 1 . By Lemma 2, there are at most candidates for . Since has at most children, this charge occurs at most times.

Case 2 . Note , thus by identical argument, there are candidates for . This node also has at most children, so we have at most charges.

Case 3 . Consider the grid of cells about , and let be the cell containing . Note that , so . Similarly, there are candidates of . Since we only consider immediate children of , there are at most charges of this type.

Assuming is constant, we have charges per , and at most pairs generated, as has nodes. Additionally, the algorithm terminates in time.