Progressive Geometric Algorithms

Progressive algorithms are algorithms that, on the way to computing a complete solution to the problem at hand, output intermediate solutions that approximate the complete solution increasingly well. We present a framework for analyzing such algorithms, and develop efficient progressive algorithms for two geometric problems: computing the convex hull of a planar point set, and finding popular places in a set of trajectories.


INTRODUCTION
Traditional algorithms compute the complete solution to a problem without allowing for interaction or giving intermediate results. This was the most natural approach when interaction with computing devices was cumbersome, and in many applications it may still be the desired approach. Modern applications, however, often involve large data sets that need to be explored and analyzed interactively. In this paper we therefore study geometric algorithms that do not focus solely on computing the complete solution, but that produce meaningful partial solutions early on. This way the user can decide whether it is necessary to continue the computation and, perhaps, in which way the computation should proceed. This is useful in an exploratory setting with a human user, but also in a traditional setting where another algorithm uses the given algorithm as a subroutine.
We are interested in algorithms that report, on their way to computing a complete solution, partial solutions that approximate the complete solution increasingly well. We will call such algorithms progressive algorithms, and we will use the following framework. We have an error function that assigns to every partial solution S a non-negative value err(S). A partial solution S with err(S) = 0 is called a complete solution. A progressive algorithm is an algorithm that works in rounds, reporting partial solutions with smaller error guarantees in each round, until in the last round a complete solution is reported. The error guarantees are specified by a convergence function fconv : {1, . . . , k} → R≥0, where k is the number of rounds. More precisely, a k-round progressive algorithm with convergence function fconv with respect to error function err is an algorithm that outputs, after the r-th round (for 1 ≤ r ≤ k), a partial solution Sr with err(Sr) ≤ fconv(r). A more extensive discussion of our framework is given in Section 2.

* Q.W. Bouts and K. Buchin are supported by the Netherlands Organisation for Scientific Research (NWO) under project no. 639.023.208 and 612.001.207, respectively. SoCG'14, June 8-11, 2014, Kyoto, Japan. Copyright is held by the owner/author(s). Publication rights licensed to ACM. ACM 978-1-4503-2594-3/14/06.
Progressive algorithms are closely related to anytime algorithms, which were already studied in the AI and real-time systems communities in the nineties; see for example the early survey by Zilberstein [12]. Intuitively, anytime algorithms are algorithms that can be interrupted at any time to report an intermediate solution; the quality of the intermediate solution should improve as time progresses. Heuristics like simulated annealing and genetic algorithms are typical techniques to obtain anytime algorithms. Several notions of quality have been proposed for the intermediate solutions reported by anytime algorithms [12], including the certainty with which the reported solution is correct, the accuracy of the solution, and its specificity. Our progressive algorithms are similar to anytime algorithms with accuracy as the quality measure, where the accuracy is quantified by the error function. The main difference is that we focus on provable guarantees on the running time and the error. Indeed, the convergence function used in our framework is closely related to the performance profile used to analyze anytime algorithms, but in the anytime-algorithms literature the performance profile is typically determined experimentally while we require a mathematical worst-case analysis. Another difference is that we require a solution with zero error to be reported after a predetermined number of iterations. Thus progressive algorithms combine the ideas behind anytime algorithms with the rigorous analysis from algorithms research.

Our results.
In Section 2 we present our theoretical framework for progressive algorithms and how to analyze them, and we establish some basic results about progressive algorithms: we show how to turn progressive algorithms whose running time is good in an amortized sense into progressive algorithms with a good running time for each individual round, and we show how to combine (1 + ε)-approximation algorithms with exact algorithms to obtain efficient progressive algorithms. We also present a simple but optimal progressive sorting algorithm, which we use to discuss some limitations of our framework.
In Section 3 we turn our attention to computing the convex hull CH(P) of a set P of n points in the plane. As partial solutions we allow convex polygons C ⊆ CH(P), and we define the error of a partial solution C based on the relative widths of C and CH(P): we define err_width(C) := max_{0≤θ<π} (1 − width_θ(C)/width_θ(CH(P))), where width_θ(C) denotes the width of C in direction θ. We describe an efficient progressive convex-hull algorithm that uses a divide-and-conquer approach. The algorithm runs in at most k := log n rounds, each taking O(n) time, and after the r-th round it reports a convex polygon Cr ⊆ CH(P) with error O(1/2^{2r}). A similar result can be obtained using coresets [1], by applying our technique to convert approximation algorithms into progressive algorithms.
In Section 4 we study the computation of popular places (sometimes also called convergence patterns [8,9]) in trajectory data [3]. We are given a collection T := {T1, . . . , Tm} of trajectories describing the movement of various entities, and we want to find the places visited by many entities. We assume each trajectory is specified by a finite sequence of points, which are the locations of the entity at certain time instances. (Often timestamps are also given for the locations, but this is not relevant for our problem.) Following Benkert et al. [3], we define a κ-popular place, for a given parameter κ, to be a unit square that is visited by at least κ entities. In the discrete setting a popular place is thus a unit square containing input points from at least κ trajectories. The goal is now to compute the region Aκ(T) ⊂ R^2 such that a unit square σ is popular if and only if its center lies in Aκ(T). Benkert et al. give an output-sensitive algorithm that computes Aκ(T) in O((n + |Aκ(T)|) log n) time, where n is the total number of input points and |Aκ(T)| denotes the combinatorial complexity of Aκ(T). It follows from our results that |Aκ(T)| = O(nκ), which is tight in the worst case. We give a progressive algorithm running in k := log n rounds, each taking O(nκ) time, and with convergence function O(1/2^r), where the error of a partial solution S ⊆ Aκ(T) is defined as the size of the largest square σ* ⊆ Aκ(T) that is disjoint from the partial solution.

THE FRAMEWORK
Our theoretical framework for progressive algorithms is as follows. Let D be the data set that forms the input to the problem. The set D defines a set S(D) of valid partial solutions, and we are given an error function err : S(D) → R≥0 that assigns to every partial solution a non-negative value. A complete solution is a partial solution S with err(S) = 0.
A progressive algorithm is an algorithm that works in rounds, reporting partial solutions with smaller error guarantees in each round, until a complete solution is reported. The guarantees are specified by a convergence function fconv : {1, . . . , k} → R≥0, where k is the number of rounds and 0 = fconv(k) ≤ fconv(k−1) ≤ · · · ≤ fconv(1). A k-round progressive algorithm with convergence function fconv (with respect to error function err) is now defined as an algorithm that outputs after the r-th round, for r = 1, . . . , k, a partial solution Sr with err(Sr) ≤ fconv(r).
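As a concrete illustration of this definition, the guarantee err(Sr) ≤ fconv(r) can be checked mechanically. The sketch below is ours, not from the paper: the toy "progressive algorithm" simply sums a list prefix by prefix, with err(S) = |S − total| and fconv(r) the total absolute value of the not-yet-added elements (so that fconv is non-increasing and fconv(k) = 0).

```python
def check_progressive(solutions, err, fconv):
    """Verify the defining guarantee of a k-round progressive algorithm:
    err(S_r) <= fconv(r) for r = 1..k, and fconv(k) = 0 (last solution complete)."""
    k = len(solutions)
    assert fconv(k) == 0
    for r, s in enumerate(solutions, start=1):
        assert err(s) <= fconv(r), (r, err(s), fconv(r))
    return True

# Toy progressive algorithm: S_r is the sum of the first r elements.
xs = [3, -1, 4, 1, 5]
total = sum(xs)
solutions = [sum(xs[:r]) for r in range(1, len(xs) + 1)]
fconv = lambda r: sum(abs(x) for x in xs[r:])   # upper bound on the remaining error
err = lambda s: abs(s - total)
check_progressive(solutions, err, fconv)
```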
One performance measure for progressive algorithms is given by the convergence function, which specifies how rapidly the error decreases. Another important measure is the running time, which we can quantify in two ways.
• Let Tr(n) be the worst-case running time for round r. We can use Tmax(n) := max_r Tr(n), the maximum time per round, as efficiency measure. Using little time for every round ensures that the user never has to wait very long for the next partial solution to become available, and it makes the algorithm predictable.

• Let T̄r(n) be the worst-case total running time for rounds 1, . . . , r. We can use T*max(n) := max_r T̄r(n)/r, the maximum amortized time per round, as efficiency measure. This gives the possibility to have shorter rounds in the beginning (and, hence, report the first partial solutions very quickly), while later rounds can take more time.
Ideally T̄k(n), the total time needed for all k rounds, is equal to the running time of the best known non-progressive algorithm for the problem at hand.
Trivially, any progressive algorithm with convergence rate fconv(r) that runs in k rounds and needs Tmax(n) time per round is also a progressive algorithm with convergence rate fconv(r) and T*max(n) ≤ Tmax(n). Conversely, we can convert any algorithm with a good amortized time to an algorithm with a good maximum time per round, as shown next.
Theorem 1. If there exists a progressive algorithm with convergence function fconv(r) that runs in k rounds and whose maximum amortized time per round is T*max(n), then there also exists a progressive algorithm with convergence function fconv(r) that runs in at most k rounds and whose maximum time per round is O(T*max(n)).

Proof. Consider a progressive algorithm Alg whose maximum amortized time per round is T*max(n). Start running Alg, but do not report the solutions immediately when they become available. Instead, maintain the latest partial solution computed by Alg, and after each time step r · T*max(n), for r = 1, 2, . . . , k, report the currently maintained solution. Since T̄r(n) ≤ r · T*max(n) for any r, and fconv is non-increasing, the solution reported by the new algorithm after round r is at least as good as the solution reported by the original algorithm after round r. Hence, fconv(r) is also a valid convergence function for the new algorithm.
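The buffering construction in the proof can be sketched by modeling Alg abstractly as a list of (cost, solution) pairs; the function name and toy costs below are ours, purely for illustration.

```python
def per_round(rounds, t_star):
    """Theorem 1 sketch: 'rounds' lists (cost_r, S_r) for an algorithm whose
    amortized time per round is t_star (so cost_1 + ... + cost_r <= r * t_star).
    Buffer the solutions and emit the latest one after each t_star time units;
    the r-th emitted solution is then at least as recent as S_r."""
    out, elapsed, latest, i = [], 0.0, None, 0
    for r in range(1, len(rounds) + 1):
        budget = r * t_star          # wall-clock time available up to emission r
        while i < len(rounds) and elapsed + rounds[i][0] <= budget:
            elapsed += rounds[i][0]  # finish every original round that fits the budget
            latest = rounds[i][1]
            i += 1
        out.append(latest)
    return out

# Toy costs: cumulative times are 1, 2, 6, 8, so T*_max = max(1/1, 2/2, 6/3, 8/4) = 2.
emitted = per_round([(1, "S1"), (1, "S2"), (4, "S3"), (2, "S4")], t_star=2)
```

Note how the third emission is "S3", not "S4": the wrapper never emits a solution older than the one the amortized guarantee promises for that round.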
A desirable property of progressive algorithms is that they are monotone. For example, we would like that the error is guaranteed to decrease (or at least not increase) from one round to the next. Note that this is not implied by the fact that the convergence function is non-increasing, as the convergence function only gives an upper bound on the error. In many cases we would prefer to have monotonicity not only on the error, but on the actual solution. For example, for a progressive convex-hull algorithm we would prefer to have Cr ⊆ Cr+1, where Cr denotes the convex polygon reported after round r. Thus the goal is to develop fast progressive algorithms with good convergence rate, which are ideally also monotone.
Relation to approximation algorithms.
Since progressive algorithms report solutions that approximate the complete solution increasingly well, there is a clear connection to approximation algorithms and, in particular, PTASs. Suppose we have an approximation algorithm Alg(ε) that computes a solution with error ε. Then we can obtain a progressive algorithm by running Alg(ε) with increasingly smaller values of ε, each time starting from scratch. This may seem very inefficient, and it is somewhat against the spirit of progressive algorithms. (Furthermore, it does not automatically give a monotone algorithm.) Depending on the running time of the approximation algorithm, however, one can often choose a sequence of values for ε that gives a good running time per round, as shown in the next theorem. A similar technique was also used by Russell and Zilberstein [11] to convert so-called contract algorithms to anytime algorithms.
Theorem 2. Suppose we have an approximation algorithm that can compute, for any given ε > 0, a solution with error ε in time Tapx(ε, n), where Tapx(ε, n) is at least linear in 1/ε. Suppose furthermore that we have an algorithm that computes the exact solution in time Texact(n). Then for any k > 1 there is a k-round progressive algorithm with convergence function fconv(r) = 1/2^r and maximum time per round O(Tapx(1/2^{k−1}, n) + Texact(n)/k).

Proof. In the first k−1 rounds we run the approximation algorithm, using ε = 1/2^r in round r. In round k we run the exact algorithm. This way we obtain a progressive algorithm with fconv(r) = 1/2^r. For r < k we have T̄r(n) = Σ_{i=1}^{r} Tapx(1/2^i, n) = O(Tapx(1/2^r, n)), because Tapx(ε, n) is at least linear in 1/ε and the sum is therefore dominated by its last term. Moreover, T̄k(n) = T̄k−1(n) + Texact(n) = O(Tapx(1/2^{k−1}, n) + Texact(n)). Using once more that Tapx(ε, n) is at least linear in 1/ε, we see that T̄r(n)/r = O(Tapx(1/2^{k−1}, n) + Texact(n)/k) for all r. The result now follows from Theorem 1.
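The ε-schedule used in the proof is easy to sketch. Here `apx` and `exact` are stand-ins for the two assumed algorithms; the toy instance (ours, not from the paper) lets apx(ε) truncate a target value down to a multiple of ε, so its error lies in [0, ε).

```python
import math

def progressive_from_apx(apx, exact, k):
    """Theorem 2 schedule: rounds 1..k-1 run apx with eps = 1/2^r, round k runs exact."""
    return [apx(1.0 / 2 ** r) for r in range(1, k)] + [exact()]

# Toy instance: the exact answer is x; apx(eps) rounds x down to a multiple of eps,
# so err(apx(eps)) < eps, matching the convergence function fconv(r) = 1/2^r.
x = 0.7853981633974483
sols = progressive_from_apx(lambda eps: math.floor(x / eps) * eps, lambda: x, k=6)
```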
A simple example: a progressive sorting algorithm.
Let X be a set of n distinct numbers that we want to sort. First we have to define our partial solutions and error function. A natural partial solution for sorting is a permutation of the numbers in X. The error of a permutation π(X) can be defined in several ways. We will use the maximum displacement of any element x ∈ X with respect to its correct position in the sorted order. Thus, if rank(x) denotes the rank of x in X, and pos(π, x) denotes the position of x in permutation π(X), then err_pos(π) = max_{x∈X} |rank(x) − pos(π, x)|.
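For concreteness, err_pos can be computed as a direct transcription of this definition (the function name is ours):

```python
def err_pos(xs, perm):
    """Maximum displacement: max over x of |rank(x) - pos(perm, x)|."""
    rank = {x: i for i, x in enumerate(sorted(xs))}
    return max(abs(rank[x] - i) for i, x in enumerate(perm))
```

For instance, err_pos([10, 20, 30, 40], [20, 10, 40, 30]) is 1: every element sits one position away from its sorted position.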
Note that err_pos(π) = 0 when π is a complete solution (that is, when π(X) is sorted), as required from an error function. Our error measure is similar to Spearman's footrule metric [5], except that we use the maximum of the displacements rather than the sum. Standard sorting algorithms such as QuickSort or MergeSort do not have a good convergence rate: both algorithms already spend O(n log n) time on their first recursive call on half of the elements, and after this call the error can still be Ω(n). However, it is straightforward to adapt QuickSort (with linear-time median finding to select the pivot) to obtain a good (in fact, optimal) convergence rate. To this end, consider the recursion tree of QuickSort. In a standard implementation the recursion tree is traversed in a depth-first manner. We change this to a breadth-first traversal. In other words, in each round r we have 2^r bins of size at most n/2^r, where all elements in the i-th bin are smaller than those in the (i+1)-th bin, and we partition each bin around its median. Giesen et al. [6] studied this algorithm for Spearman's footrule metric, and proved that it is optimal: any algorithm (even a randomized one) needs to do Ω(nt) comparisons to generate a permutation whose error in Spearman's footrule metric is guaranteed to be at most n^2/t. (In fact, they even obtained tight bounds on the exact number of comparisons.) This immediately implies that the breadth-first version of QuickSort is also an optimal progressive algorithm. The following theorem is essentially an interpretation of the results of Giesen et al. in the framework of progressive algorithms.
Theorem 3. There is a progressive algorithm for sorting a set of n numbers that consists of k := log n rounds, each taking O(n) time, and with convergence function fconv(r) = n/2^r − 1 with respect to the error function err_pos(π) defined above. This convergence rate is asymptotically optimal: any comparison-based algorithm computing a permutation with error at most n/2^r − 1 needs to do Ω(rn) comparisons in the worst case.
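A compact sketch of the breadth-first QuickSort variant behind Theorem 3 follows. It is a sketch only: each bin's median is found by sorting the bin, standing in for linear-time selection, so the O(n)-per-round time bound is not met here; only the convergence behavior is illustrated.

```python
import random

def progressive_sort(xs):
    """Breadth-first QuickSort: yields one permutation per round."""
    bins = [list(xs)]
    while any(len(b) > 1 for b in bins):
        nxt = []
        for b in bins:
            if len(b) <= 1:
                nxt.append(b)
                continue
            med = sorted(b)[(len(b) - 1) // 2]     # bin median (the pivot)
            nxt.append([x for x in b if x < med])  # left sub-bin
            nxt.append([med])                      # pivot is now at its final position
            nxt.append([x for x in b if x > med])  # right sub-bin
        bins = nxt
        yield [x for b in bins for x in b]         # the round's permutation

def max_displacement(xs, perm):
    rank = {x: i for i, x in enumerate(sorted(xs))}
    return max(abs(rank[x] - i) for i, x in enumerate(perm))

random.seed(1)
xs = random.sample(range(1000), 16)
perms = list(progressive_sort(xs))
```

After round r the largest bin has at most n/2^r elements, so every displacement is below n/2^r, and the last round yields the sorted permutation.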
One way to define monotonicity for the sorting problem is to require that, as the subsequent permutations are generated by the algorithm, the displacement |pos(π, x) − rank(x)| of each element x is non-increasing. The QuickSort variant sketched above is not monotone in this sense. It would be interesting to develop a monotone progressive sorting algorithm with the same optimal convergence rate.

Discussion.
Theorem 3 states that the QuickSort variant we sketched has convergence function fconv(r) = n/2^r − 1, that each round takes O(n) time, and that any comparison-based algorithm computing a permutation with error at most n/2^r − 1 needs Ω(rn) comparisons in the worst case. Nevertheless it is possible, for any constant c, to get an algorithm that runs in Θ(log n) rounds, uses O(n) time per round, and has convergence function fconv(r) = n/2^{cr} − 1. To this end, simply redefine a round to consist of c rounds of the original algorithm. As long as c is a constant, the running time per round is still O(n). The total number of rounds decreases by a factor c. Intuitively, the new algorithm should be considered worse than the original one, because we would like to get as many intermediate solutions as possible, as long as this does not increase the total time. However, it seems hard to formalize this in our framework: the fact that the time per round is given asymptotically makes it possible to play tricks like combining multiple rounds into a single round in order to obtain a seemingly better convergence rate. By also adding a number of "dummy rounds" at the end, we would get an algorithm with the same number of rounds as the original algorithm (that is, the algorithm before we combined multiple rounds into single rounds) while the error at any given round is smaller. Thus, when the convergence rate is exponential, our framework cannot really distinguish between different constant factors in the exponent.

CONVEX HULLS
Let P := {p1, . . . , pn} be a set of n points in the plane. We want to develop a progressive algorithm to compute CH(P), the convex hull of P. We define a partial solution to be a convex polygon C ⊆ CH(P); our algorithm actually uses polygons C that are the convex hull of a subset of P. Our error function is a measure that is also used when determining the quality of coresets [1] for the extent measure. For a convex polygon C and some 0 ≤ θ < π, let width_θ(C) denote the width of C in direction θ, that is, width_θ(C) is the distance between the two tangent lines of C making an angle θ + π/2 with the positive x-axis. The error function we consider is err_width(C) := max_{0≤θ<π} (1 − width_θ(C)/width_θ(CH(P))).
We describe a progressive algorithm that gives a good convergence rate with respect to this error function. It combines ideas from the construction of coresets with the QuickHull algorithm [2]. The algorithm works in k rounds, for some k ≤ log n, and produces a sequence C1, C2, . . . , Ck of convex polygons such that C1 ⊂ · · · ⊂ Ck = CH(P). Each convex polygon Cr is the convex hull of a subset Pr ⊂ P such that all points in Pr are extreme, that is, vertices of the final convex hull. A first idea is to let Pr consist of the extreme points in the directions i · (2π/2^r), for i = 0, 1, . . . , 2^r − 1. (We identify the direction indicated by a vector d with the counter-clockwise angle that d makes with the positive x-axis.) One can show that err_width(Cr) = O(1/2^r) when using these extreme points. However, the number of rounds is not necessarily logarithmic with this scheme, and as a result the total running time is not guaranteed to be O(n log n). Moreover, the convergence rate is suboptimal. Next we describe how to overcome these problems.
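This "first idea" can be sketched numerically. The names below are ours; err_width is approximated by sampling directions rather than computed exactly, and the test points lie on a circle so that every input point is a hull vertex. Since the direction sets D(2^r) are nested, the extreme-point sets are nested too, so the sampled error can only decrease from round to round.

```python
import math

def width(points, theta):
    """Directional width: extent of the projections onto direction theta."""
    ux, uy = math.cos(theta), math.sin(theta)
    proj = [x * ux + y * uy for (x, y) in points]
    return max(proj) - min(proj)

def err_width(C, P, samples=720):
    """Sampled approximation of max over theta of 1 - width(C)/width(CH(P))."""
    return max(1.0 - width(C, t * math.pi / samples) / width(P, t * math.pi / samples)
               for t in range(samples))

def extreme_points(P, m):
    """Extreme points of P in the m equally spaced directions i * (2*pi/m)."""
    out = []
    for i in range(m):
        th = 2.0 * math.pi * i / m
        p = max(P, key=lambda q: q[0] * math.cos(th) + q[1] * math.sin(th))
        if p not in out:
            out.append(p)
    return out

# Points on a circle: the hull uses all of them, so convergence is clearly visible.
P = [(math.cos(2 * math.pi * i / 256), math.sin(2 * math.pi * i / 256)) for i in range(256)]
errs = [err_width(extreme_points(P, 2 ** r), P) for r in range(2, 7)]
```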
We start with a preprocessing step (which is considered to be part of the first round of our progressive algorithm) that is also used in the construction of coresets. It ensures that the width of CH(P) is roughly the same in all directions. More precisely, we will make sure that CH(P) is α-fat, for some constant 0 < α < 1, in the following sense: there are concentric circles Cin and Cout such that Cin ⊂ CH(P) ⊂ Cout and radius(Cin) = α · radius(Cout).
Lemma 4. For any set P of n points in the plane that are not all collinear, there is an affine transformation τ such that (i) τ(P) is α-fat, and (ii) for any Q ⊆ P we have err_width(CH(τ(Q))) = Θ(err_width(CH(Q))). Moreover, such a transformation, together with the circles Cin and Cout as defined above, can be computed in O(n) time.
From now on we assume that Cin and Cout are centered at the origin. Recall that the intermediate solution we compute in round r will be the convex hull of a set Pr ⊆ P. Next we specify which points Pr contains. More precisely, we specify which points we include in Pr to guarantee that the error of CH(Pr) is small. The set Pr will also contain some additional points to guarantee that our progressive algorithm runs in a logarithmic number of rounds, as explained later.
For an integer m, let D(m) := {i · (2π/m) : 0 ≤ i < m} be a set of m equally-spaced directions. For a direction θ, let p(θ) ∈ P denote the point that is extreme in direction θ. Furthermore, let ρ(θ) denote the ray with direction θ emanating from the origin, and let e(θ) denote the edge of CH(P) that is intersected by ρ(θ). To simplify the description we assume that both p(θ) and e(θ) are unique; it is easy to adapt the solution to the case where there are multiple extreme points in a given direction, or ρ(θ) passes through a vertex. Now define P(m) := {p(θ) : θ ∈ D(m)} ∪ {endpoints of e(θ) : θ ∈ D(m)}. The following lemma is similar to a lemma by Hershberger and Suri [7], although the set P(m) is defined differently. For the purpose of giving a progressive algorithm our definition of P(m) has the advantage that P(m) can be easily extended as m increases.

Lemma 5. If CH(P) is α-fat for a constant α, then err_width(CH(P(m))) = O(1/m^2).

Proof. Consider any direction θ. Let p := p(θ) be the extreme point of P in direction θ and let p̄ be the extreme point of P(m) in direction θ. Let O denote the origin, which is also the center of the circles Cin and Cout such that Cin ⊂ CH(P) ⊂ Cout and radius(Cin) = α · radius(Cout). Let ℓp, ℓp̄, and ℓO denote the lines through p, p̄, and O, respectively, that are orthogonal to the direction θ. To prove the lemma it suffices to prove that dist(ℓp, ℓp̄) = O(1/m^2) · dist(ℓp, ℓO), since dist(ℓp, ℓO) ≤ radius(Cout) and width_θ(CH(P)) ≥ 2 · radius(Cin) = 2α · radius(Cout). Let θi := i · (2π/m), for 0 ≤ i < m, denote the directions in D(m). Define i* such that θi* is the smallest direction greater than or equal to θ. Furthermore, define j such that p(θ) lies in cone(θj, θj+1), the cone defined by the rays ρ(θj) and ρ(θj+1). We distinguish two cases.
Assume without loss of generality that p(θi*) lies clockwise of ρ(θj+1). Let e(θj+1) be the edge of CH(P) that is intersected by ρ(θj+1); see Fig. 2. Define q to be the endpoint of e(θj+1) that lies in cone(θj, θj+1) and, as before, define ℓq to be the line through q orthogonal to the direction θ. Then the angle that ℓq makes with qp̄ is at most |θi* − θ|. This implies that, with our new definition of q, we can now follow the proof for Case (i).
We now describe our progressive convex-hull algorithm. As mentioned, the partial solution Cr reported after round r is the convex hull of a subset Pr ⊆ P. We will ensure that P(2^{r+1}) ⊆ Pr, so err_width(Cr) = O(1/2^{2r}) by Lemma 5. Our algorithm is similar to QuickHull [2], except that we compute the vertices added in each round differently, so that we get fast convergence.
For r = 1 we use P1 = P(4). To compute P(4) we need to find, for θ = 0, π/2, π, 3π/2, the extreme points in direction θ, which can trivially be done in O(n) time. We also need to find the edges of CH(P) intersected by the rays ρ(θ). The latter can be done in O(n) time by linear programming [10], where the points in P induce constraints. For example, if ρ(θ) is the positive y-axis, then we need to find the line with minimum y-intercept that has no points from P above it. We then compute C1 = CH(P1) in O(1) time.
In a generic round r of the algorithm we are given the polygon Cr−1 and for each edge e of Cr−1 a set P (e) defined as follows. Consider the rays from the origin through the vertices of Cr−1. These rays partition the plane into cones, each corresponding to a unique edge of Cr−1. We denote the cone corresponding to an edge e of Cr−1 by cone(e). Now P (e) contains the points from P that lie in cone(e) but outside Cr−1. These are the points that may still appear as vertices of CH(P ) in between the endpoints of e. (Note that the sets P (e) for the edges e of C1 can easily be found in O(n) time.) We now proceed as follows.
1. Determine for each direction θ ∈ D(2^{r+1}) \ D(2^r), that is, for each new direction we need to handle, the edge e of Cr−1 such that p(θ) ∈ cone(e). This can be done in O(2^r) = O(n) time in total by "walking around" Cr−1. Note that for each edge e such that P(e) is non-empty, there can be at most one direction θe such that p(θe) ∈ cone(e), because in between any two directions in D(2^{r+1}) \ D(2^r) there is a direction in D(2^r).

Theorem 6. There is a progressive algorithm to compute the convex hull of a set of n points in the plane that runs in at most k := log n rounds, each taking O(n) time, with convergence function fconv(r) = O(1/2^{2r}) with respect to the error function err_width(C) defined above. The algorithm is monotone in the sense that, if Cr denotes the intermediate solution reported after round r, we have Cr ⊆ Cr+1 for all 1 ≤ r < k.
Proof. The bound on the convergence rate follows from Lemma 5 and the fact that P(2^{r+1}) ⊆ Pr. The time spent in each round r is O(n + Σ_e |P(e)|), where the sum is taken over all edges e of Cr−1. Since Σ_e |P(e)| < n, the time per round is O(n). The "median ray" selected in Step 3 above makes sure that |P(e)| < n/2^r for any edge e of Cr, so the total number of rounds is at most log n.

Discussion.
(1) In progressive divide-and-conquer algorithms we need to do the divide step in such a way that the error is kept under control, but we also need the subproblems to decrease rapidly in size to achieve a fast running time. In the convex-hull algorithm above, the error was controlled by making sure that P(2^{r+1}) ⊆ Pr, while the size of the subproblems was controlled using the "median ray" found in Step 3. For our progressive sorting algorithm, the median could play both roles, as the error function is closely related to the size of the subproblems.
(2) Our convex-hull algorithm is based on Lemma 5, which gives us a set P(m) with error O(1/m^2). Thus, after round r we get an error of O(1/2^{2r}). With less effort, namely by only putting the extreme points in directions in D(m) into P(m), we can obtain an error of O(1/m), giving us an error of O(1/2^r) after round r. This seems to give a worse result, but we can use the trick already mentioned earlier: we can merge pairs of rounds into a single round to again obtain an error of O(1/2^{2r}). By merging even more rounds, we could even obtain a seemingly better error, namely O(1/2^{cr}), for any constant c.
(3) An alternative way to obtain a progressive convex-hull algorithm is to use our general technique to convert approximation algorithms into progressive algorithms. To this end we use the coreset algorithm of Chan [4], who showed that for any set of n points in the plane and any ε > 0, one can compute an ε-coreset of size O(√(1/ε)) in time O(n + 1/ε). If we compute the convex hull of the coreset in time O(√(1/ε) log(1/ε)), we get a convex polygon with error O(ε). By setting k := log n and applying Theorem 2 (note that an optimal planar convex-hull algorithm gives Texact(n) = O(n log n)) we get a k-round progressive algorithm using O(n) time per round and with convergence function fconv(r) = O(1/2^r). Again, the convergence rate can be boosted by merging rounds.

COMPUTING POPULAR PLACES FROM TRAJECTORY DATA
Let T := {T1, T2, . . . , Tm} be a set of trajectories in the plane, each specified by a finite sequence of points, and let κ be a parameter with 2 ≤ κ ≤ m. With a slight abuse of notation we also use T to refer to the union of the point sets specifying the trajectories (where each point is labeled with its trajectory), and we set n := |T|. A unit square is κ-popular if it contains points from at least κ trajectories. We want to develop a progressive algorithm to compute the κ-popular region Aκ(T), that is, the set of all points q ∈ R^2 such that σq, the axis-aligned unit square centered at q, is κ-popular; see Fig. 3 for an illustration.
As partial solutions we allow subsets S ⊆ Aκ(T). Let B be a smallest enclosing square of T extended by half a unit on each side, so that Aκ(T) ⊂ B. We now define err_pop(S), the error of a partial solution S, as size(σ*)/size(B), where σ* is the largest square contained in Aκ(T) with S ∩ σ* = ∅, and size(·) denotes the side length of a square.

Figure 3: Popular region for κ = 2 defined by three given trajectories.
In the following we will use Σ(Ti) := {σp : p ∈ Ti} to denote the set of unit squares (or intervals in the 1D case) centered at the points of Ti, and similarly we define Σ(T) := {σp : p ∈ T}. Note that a point q ∈ R^2 is κ-popular if and only if it is contained in squares σp coming from at least κ sets Σ(Ti).
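This observation yields a simple brute-force popularity test, useful as a correctness oracle for the sweep algorithms that follow (a sketch; the function name is ours, and q ∈ σp is tested directly as |px − qx| ≤ 1/2 and |py − qy| ≤ 1/2):

```python
def is_popular(q, trajs, kappa):
    """Brute-force test whether the axis-aligned unit square centered at q
    contains points from at least kappa distinct trajectories; equivalently,
    whether q lies in squares sigma_p from at least kappa sets Sigma(T_i)."""
    qx, qy = q
    hit = sum(
        1 for pts in trajs
        if any(abs(px - qx) <= 0.5 and abs(py - qy) <= 0.5 for (px, py) in pts)
    )
    return hit >= kappa

trajs = [[(0.0, 0.0)], [(0.6, 0.0)], [(5.0, 5.0)]]
```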

Computing 1D popular regions.
As a warm-up exercise, we study the 1D version of the problem. In this case the region Aκ(T) can be computed non-progressively in O(n log n) time: sort the endpoints of the intervals in Σ(T), and then sweep from left to right while maintaining an array M[1..m] such that M[i] is equal to the number of intervals from Σ(Ti) containing the sweep point. To turn this into a progressive algorithm we adapt the progressive QuickSort algorithm from Section 2, as follows.
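Before turning to the progressive version, the non-progressive 1D sweep can be sketched as follows (names are ours; unit intervals are treated as closed, and instead of exposing the array M we directly report the maximal intervals where at least κ trajectories cover the sweep point):

```python
from collections import defaultdict

def popular_region_1d(trajs, kappa):
    """Non-progressive 1D sweep: each point p spans [p - 0.5, p + 0.5]."""
    events = []  # (x, kind, trajectory): kind 0 = interval opens, 1 = closes
    for i, pts in enumerate(trajs):
        for p in pts:
            events.append((p - 0.5, 0, i))
            events.append((p + 0.5, 1, i))
    events.sort()             # at equal x, openings (kind 0) precede closings
    count = defaultdict(int)  # role of M[i]: intervals of T_i covering the sweep point
    active = 0                # trajectories with count > 0
    region, start = [], None
    for x, kind, i in events:
        if kind == 0:
            if count[i] == 0:
                active += 1
            count[i] += 1
        else:
            count[i] -= 1
            if count[i] == 0:
                active -= 1
        if active >= kappa and start is None:
            start = x         # popular region begins
        elif active < kappa and start is not None:
            region.append((start, x))
            start = None      # popular region ends
    return region

# Two 1D trajectories with one point each: the 2-popular region is [-0.1, 0.5].
region = popular_region_1d([[0.0], [0.4]], kappa=2)
```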
Recall that in each round of the progressive QuickSort algorithm we have a permutation of the input points. In this permutation, all points that have already been chosen as pivot are at their correct position, while the other points are positioned correctly with respect to the pivots-they are in the right bin. We then pick the median of each bin as pivot and partition the bin into two smaller bins. We adapt this algorithm as follows.
First of all, we pick two pivots in each bin B: besides the median of the points in B we also pick the point (xmin(B) + xmax(B))/2, where [xmin(B), xmax(B)] is the interval corresponding to B. This second pivot, which we call the geometric pivot, is not an endpoint of an interval in Σ(T); it is used to control the error. We then partition B into three smaller bins, based on the two pivots.
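The two-pivot bin split can be sketched as follows (a sketch with names of our choosing; the median is found by sorting, standing in for linear-time selection, and, as in the text, we assume no input point coincides with the geometric pivot):

```python
def split_bin(points, lo, hi):
    """Split one bin using two pivots: the bin's median point and the
    geometric pivot (lo + hi) / 2. Returns the three sub-bins and the median,
    which is now at its final position and becomes a 'safe stop'."""
    med = sorted(points)[len(points) // 2]  # median point of the bin
    geo = (lo + hi) / 2.0                   # geometric pivot: midpoint of the bin's interval
    a, b = min(med, geo), max(med, geo)
    left   = [p for p in points if p < a]
    middle = [p for p in points if a < p < b]   # med itself is excluded: it is placed
    right  = [p for p in points if p > b]
    return left, middle, right, med
```

The median bounds the size of the sub-bins (giving the logarithmic number of rounds), while the geometric pivot halves the interval length of the bin, which is what bounds err_pop.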
Second, we compute the partial solution Sr that we report in the current round r. We do this by sweeping, as in the non-progressive algorithm: while we sweep we maintain for each trajectory the number of intervals in Σ(T) containing the sweep point. The order in which we encounter the endpoints of the intervals in Σ(T) is not completely left-to-right, as the points within a bin are not sorted yet. However, the pivots are at their correct position, so whenever we sweep over a pivot we can correctly determine whether it is κ-popular. We therefore call these points safe stops, and we report the set of all safe stops that are κ-popular as our partial solution Sr. Note that err_pop(Sr) is bounded by the maximum length of the interval corresponding to a bin, and note that in the last round we can easily compute the complete solution by sweeping. This leads to the following theorem.
Theorem 7. Let T := {T1, . . . , Tm} be a set of 1D trajectories with n points in total, and let κ be a parameter. Then there exists a progressive algorithm to compute the κ-popular region that runs in at most k := log n rounds, each taking O(n) time, with convergence function fconv(r) = 1/2^r with respect to the error function err_pop(S) defined above. The algorithm is monotone in the sense that, if Sr denotes the solution reported after round r, we have Sr ⊆ Sr+1 for all 1 ≤ r < k.

Computing 2D popular regions within a unit square.
We now return to the 2D problem. We start by describing a progressive algorithm that computes the part of Aκ(T ) within a given unit square U . Later we show how to solve the general problem, using this algorithm as a subroutine. We assume without loss of generality that each σp ∈ Σ(T ) intersects U . To simplify the presentation we also make the following non-degeneracy assumption: each σp has exactly one corner in the interior of U , and no two corners have the same x-or y-coordinate.
An algorithm for sorted data.
Assume we are given two lists storing the corners inside U of the squares from Σ(T ): a list Lx sorted on x-coordinate, and a list Ly sorted on y-coordinate. We can then compute Aκ(T ) in O(nκ) time with a plane-sweep algorithm, as follows.
The sweep line ℓ is horizontal and moves downward, halting at every corner of a square σ_p ∈ Σ(T) inside U. Let Σ_bl(T_i) denote the set of squares in Σ(T_i) whose bottom-left corner is inside U. Define Σ_br(T_i), Σ_tl(T_i), and Σ_tr(T_i) similarly, but then for the bottom-right, top-left, and top-right corners. Next we describe the status information we maintain during the sweep for Σ_bl(T) := ∪_i Σ_bl(T_i); the status information for Σ_br(T), Σ_tl(T), and Σ_tr(T) is similar.
Consider a fixed T_i, and consider all squares in Σ_bl(T_i) that intersect the sweep line ℓ. We call the leftmost of these squares the active square (from the squares in Σ_bl(T_i)). Clearly we do not need to consider non-active squares when computing the popular region along ℓ. During the sweep we maintain the following information for Σ_bl(T); see also the accompanying figure. This gives us a list with the relevant squares from all four types, sorted according to the intersections of their boundaries with ℓ. We scan this list (this is similar to the scan for the 1D problem) to obtain the popular region along ℓ. (We should actually report the popular region in between the sweep line at its current and previous position, not just the popular region along ℓ, but we omit the details.) It remains to describe how we initialize the status information, and how we maintain it as we sweep. We focus on the information for Σ_bl(T). Initialization in time O(nκ) is straightforward, so we focus on the maintenance.
Suppose the sweep line halts at the bottom-left corner of a square σ_p ∈ Σ_bl(T_i). We only need to do something when σ_p is relevant, that is, when Rel_bl[i] = j ≠ nil. We then start moving the pointer Ptr_bl[j] to the right, until we either encounter a square σ_p' ∈ Σ_bl(T_i), in which case we are done, or Ptr_bl[j] becomes equal to Ptr_bl[j + 1]. In the latter case we set Ptr_bl[j] := Ptr_bl[j + 1], and we start walking with Ptr_bl[j + 1]. This continues until we find a square σ_p' ∈ Σ_bl(T_i) and we are done, or until we start walking with Ptr_bl[κ]. From then on we also need to check for each square σ_p' we encounter whether the set Σ_bl(T_i') it comes from already has a relevant square. If not, then we let Ptr_bl[κ] point to σ_p'. Updating the array Rel_bl during these operations is a matter of simple bookkeeping.
During the maintenance, the pointers in Ptr_bl[1..κ] only move to the right. Hence, the total time for moving pointers is bounded by O(nκ). The time needed to report the solution whenever the sweep line halts is O(κ), so in total this also takes O(nκ).
This finishes the description of how we maintain the information for Σ_bl(T). Maintaining the information for Σ_br(T) is symmetrical. For Σ_tl(T) and Σ_tr(T) the situation is slightly different, since squares start (rather than cease) to intersect the sweep line when their corner is encountered, but this actually makes things easier. Overall we thus spend O(nκ) time, as claimed.
A progressive algorithm for unsorted data.
We use the same basic idea as in the 1D case, namely to combine our progressive QuickSort algorithm with the algorithm for sorted data. Thus we sort the corners inside U of the squares from Σ(T) both on x-coordinate and on y-coordinate, using progressive QuickSort. As in the 1D case, we not only use the medians of the bins as pivots, but also geometric pivots. Thus after each round of the progressive QuickSort we have two partially sorted lists, L_x and L_y, which contain not only (corners of) the squares in Σ(T) but also certain geometric pivots. We now essentially run the algorithm described above using the lists L_x and L_y, pretending that they are correctly sorted. Next we describe the modifications that are needed, and we analyze the error in the partial solutions we report.
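The round-by-round structure of progressive QuickSort can be sketched as follows. This is a simplified, hedged sketch: it uses only median pivots (the geometric pivots described above are omitted for brevity), assumes distinct values, and represents the partially sorted list as a sequence of fixed pivots ("safe stops") interleaved with unsorted bins.

```python
import statistics

def progressive_quicksort(values):
    """Generator yielding, after each round, the current sequence of
    segments: ('pivot', v) entries are at their final sorted position,
    ('bin', list) entries are still unsorted.  Each round partitions every
    bin at its median, so at most ceil(log2 n) rounds are needed."""
    segments = [('bin', list(values))]
    while any(kind == 'bin' for kind, _ in segments):
        nxt = []
        for kind, data in segments:
            if kind == 'pivot':
                nxt.append((kind, data))
            elif len(data) <= 1:
                nxt += [('pivot', v) for v in data]
            else:
                m = statistics.median_low(data)
                nxt.append(('bin', [v for v in data if v < m]))
                nxt.append(('pivot', m))
                nxt.append(('bin', [v for v in data if v > m]))
        # Drop empty bins; keep all pivots.
        segments = [s for s in nxt if s[0] == 'pivot' or s[1]]
        yield segments
```

After round r the pivots read off in segment order are exactly the safe stops: elements whose sorted position is already correct, which is what the sweep above relies on.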
First consider the fact that L_y is only partially sorted. Recall that our algorithm is a sweep-line algorithm, where at each event we update the status information and we compute the popular region along the sweep line using a 1D scan. The former step (maintaining the status information at each event) is still done in the same way as before. The latter step (the 1D scan along the sweep line) is now only performed at the safe stops, that is, at the points that have been used as pivots and, hence, are correctly positioned in L_y. This means that we have correct knowledge of which squares intersect the sweep line ℓ. Now consider the 1D scan that we do at a safe stop in L_y. Since the ordering of the squares along ℓ is not completely correct, some relevant squares might be considered irrelevant by our algorithm, and vice versa. We call the squares that have been identified as relevant based on the partially sorted list L_x quasi-relevant. Note that (the x-coordinates of the corners of) quasi-relevant squares need not be safe stops, and vice versa. Our new 1D scan will halt at the quasi-relevant squares (more precisely, at points where a vertical edge of such a square intersects the sweep line ℓ) and at safe stops from the list L_x. Halting at the safe stops is necessary to keep the error under control. Unfortunately the number of safe stops grows in every round and can become much larger than the O(κ) quasi-relevant squares. We overcome this by only halting at safe stops that are neighbors of (the x-coordinates of) quasi-relevant squares. In other words, if we imagine the safe stops to partition the sweep line into bins, then we only stop at those safe stops that are an endpoint of a bin containing at least one quasi-relevant square. Finding these bins can be done before the sweep starts, in O(n) time in total for all squares. Now, when we do the 1D scan we maintain the popularity of the points encountered during the scan, as usual.
When we arrive at a safe stop and its popularity is at least κ, we report it as part of our partial solution. There are also other locations on the sweep line of which the popularity is known. In particular, if there is no quasi-relevant square in between two safe stops then the popularity is the same for all points in the interval. Thus, if we stop at two consecutive safe stops that are both popular, then we report the entire interval between these safe stops. (This is necessary because we are only stopping at safe stops surrounding quasi-relevant squares. Thus the distance between consecutive points where the scan halts could become large because we are skipping certain safe stops, which could lead to a large error.) The correctness of this approach is based on the following lemma.
Lemma 8. Suppose the sweep line halts at a safe stop in L_y. Let p_l and p_r be consecutive safe stops in L_x. Then for each of the four corner types, the number of relevant squares whose corner in U has x-coordinate in the range [p_l, p_r) is equal to the number of quasi-relevant squares whose corner in U has x-coordinate in [p_l, p_r). Furthermore, if the number of relevant squares with corners left of p_r is strictly smaller than κ, then these numbers are also equal per trajectory.
Proof. We prove the lemma for the set Σ_bl(T) of squares with their bottom-left corner in U. For the rest of this proof the term "corner" always refers to the bottom-left corner. Note that since the sweep line only halts at safe stops in L_y, the information about which squares intersect ℓ is correct.
Recall that a square σ_p ∈ Σ_bl(T_i) is active if it is the leftmost square of Σ_bl(T_i) intersecting the sweep line. We call σ_p quasi-active if it is the leftmost square of Σ_bl(T_i) intersecting the sweep line according to the (partially incorrect) ordering in L_x. Now number the intervals between consecutive safe stops along ℓ as I_1, I_2, . . .. We will prove that the lemma holds for all these intervals I_j by induction on j. For simplicity we assume we have an empty interval I_0 for which the lemma trivially holds. Now consider an interval I_j := [p_l, p_r) and assume the lemma holds for all intervals I_j' with j' < j. We have three cases.
Case 1: there are strictly fewer than κ relevant squares whose corner lies to the left of p_r.
By induction the number of relevant squares to the left of p_l is equal to the number of quasi-relevant squares to the left of p_l, and this in fact holds per trajectory. That is, a trajectory T_i has a relevant square to the left of p_l if and only if it has a quasi-relevant square to the left of p_l. Now consider a trajectory T_i with an active square in I_j. By definition it cannot have a relevant square to the left of p_l and, as we saw, this implies it does not have a quasi-relevant square to the left of p_l, which implies that T_i must have a quasi-active square in I_j. The converse is also true, so the number of active and quasi-active squares inside I_j must be equal per trajectory. Since the number of relevant squares to the left of p_r is strictly less than κ, these active and quasi-active squares must actually be relevant and quasi-relevant, respectively.
Case 2: there are κ relevant squares whose corner lies to the left of p_r, but strictly fewer than κ relevant squares to the left of p_l.
Using the same argument as in Case 1 we derive that, if trajectory T_i has an active square in I_j, then T_i has a quasi-active square in I_j. Let z be the number of relevant squares to the left of I_j. Then the κ − z leftmost active squares in I_j are relevant. Similarly, the κ − z quasi-active squares in I_j that are leftmost according to the list L_x are quasi-relevant. These need not come from the same trajectories as the relevant squares, but their number is the same.

Note that the lemma indeed proves the correctness of our approach: it implies that the popularity computed at safe stops is correct, and it proves that the popularity of an interval between two (not necessarily consecutive) safe stops, without quasi-relevant squares in between them, is indeed constant throughout the interval.
It remains to discuss the error. Since the popularity of intervals without quasi-relevant squares in them is determined correctly, we only miss popular subintervals of an interval between safe stops surrounding one or more quasi-relevant squares. But such safe stops are actually consecutive safe stops in L_x (no safe stops have been skipped in between), so after round r the length of such intervals is at most 1/2^r. Since the distance between consecutive safe stops in L_y is 1/2^r as well, the error in round r is bounded by this quantity, leading to the following lemma.
Lemma 9. The 2D popular-places problem within a unit square U can be solved with a progressive algorithm that runs in at most k := ⌈log n⌉ rounds, each taking O(nκ) time, with convergence function f_conv(r) = 1/2^r with respect to the error function err_pop(S) defined above.
Computing 2D popular regions: the general case.
We showed how to progressively compute the popular region within a unit square U. Computing the entire popular region can be done using the algorithm above as a subroutine. To this end, consider the integer grid and let U := {U_1, . . . , U_z} be the set of grid cells intersecting at least one square from Σ(T). The set U, and for each U_i ∈ U the squares intersecting U_i, can be computed in O(n) time using geometric hashing. (This requires the floor function, which can be avoided using an approach based on a progressive computation of a kd-tree-like structure; see Section 4.1.) We now run the algorithm described above on each of the cells U_i in parallel. Each round takes O(Σ_i n_i κ) = O(nκ) time, where n_i denotes the number of squares intersecting U_i, giving the following result.
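The geometric-hashing step can be sketched as follows. As an illustrative assumption (not spelled out in this excerpt), we take each σ_p to be the unit square centered at a trajectory point p; the function name and representation are our own.

```python
import math
from collections import defaultdict

def hash_squares_to_grid(centers):
    """Geometric-hashing sketch: bucket each unit square, given by its
    center (cx, cy), into every integer grid cell it intersects, using the
    floor function.  A unit square in general position meets at most four
    cells, so the total bucket size is O(n)."""
    cells = defaultdict(list)
    for cx, cy in centers:
        x0, x1 = cx - 0.5, cx + 0.5
        y0, y1 = cy - 0.5, cy + 0.5
        for gx in range(math.floor(x0), math.floor(x1) + 1):
            for gy in range(math.floor(y0), math.floor(y1) + 1):
                cells[(gx, gy)].append((cx, cy))
    return cells
```

Each bucket (gx, gy) then plays the role of one unit square U_i, and the per-cell algorithm is run on the squares stored in that bucket.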
Theorem 10. Let T := {T_1, . . . , T_m} be a set of 2D trajectories with n points in total, let B be the extended bounding square of T, and let κ be a parameter. Then there exists a progressive algorithm to compute the κ-popular region that runs in at most k := ⌈log n⌉ rounds, each taking O(nκ) time, with convergence function f_conv(r) = 1/(size(B) · 2^r) with respect to the error function err_pop(S) defined above. The algorithm is monotone in the sense that, if S_r denotes the solution reported after round r, we have S_r ⊆ S_{r+1} for all 1 ≤ r < k. The algorithm requires the use of the floor function.

An approach based on kd-trees
Above we described a progressive algorithm for a special case of the popular-places problem, where we want to find the solution within a unit square U . We then gave a simple grid-based approach that reduces the general problem to this special case. The grid-based approach needs the floor function, however. Here we describe an alternative that does not need this.
Recall that B is a bounding square of the set T (which contains the points from all trajectories), extended to each side by half a unit. We recursively construct a partitioning of B into rectangles, similar to a kd-tree, as follows.
We perform the recursive partitioning in rounds, and in a breadth-first manner. Thus, in each round we are given a set R of rectangles (initially R contains only B), which we partition as explained next. Let P be the set of corners of the squares in Σ(T). For each rectangle R we have a list L_P(R) containing the points in R ∩ P, and a list L_Σ(R) of all squares σ_p ∈ Σ(T) that intersect R. We now split R using the following splitting rules.
• When the maximum side length of R is at most 1, or R does not intersect any edge of a square in L_Σ(R), then R is a final region of the partitioning and the recursion ends.
• When the maximum side length of R is greater than 1, and L_P(R) = ∅ but R does intersect an edge of a square in L_Σ(R), then some square edges fully cross R. Note that we can only have crossing edges parallel to the shorter side of R. We now split R into subrectangles with a line through the median of the crossing edges and with a line through the midpoint of the longer side.
• Otherwise, the maximum side length of R is greater than 1 and L_P(R) ≠ ∅. We partition R into two or four equal-sized subrectangles by lines orthogonal to the sides of R. More precisely, if the shorter side has length at most 1 we only split orthogonal to the longer side; otherwise we split orthogonal to both the shorter and the longer side. We then split the subrectangle with the largest number of points, call it R', into two rectangles by a line orthogonal to the longer side of R' and through the median point of R', assuming this side has length more than 1. (If the longer side of R' has length at most 1, then R' is a final rectangle.)

Note that the lists for the at most five subrectangles resulting from splitting R can easily be determined from the lists for R in linear time. At the end of the round, we check for each vertex of every rectangle R in the current partition whether it is popular (this can be done by inspecting all squares in L_Σ(R)) and we report the set of all popular vertices as a partial solution.
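The breadth-first, round-by-round flavor of this partitioning can be sketched as follows. This is a heavily simplified illustration under our own assumptions: it keeps only the median splits and ignores the unit-size and crossing-edge rules, and it assumes the points have pairwise distinct x- and y-coordinates (as in the non-degeneracy assumption above).

```python
import statistics

def kd_rounds(points):
    """Simplified sketch of breadth-first median partitioning: each round
    splits every cell with more than one point at the (lower) median of its
    points along the round's axis, so cell sizes roughly halve each round
    and at most ceil(log2 n) rounds are needed.  Returns the round count."""
    cells, rounds = [list(points)], 0
    while any(len(c) > 1 for c in cells):
        axis, nxt = rounds % 2, []   # alternate x- and y-axis splits
        for c in cells:
            if len(c) <= 1:
                nxt.append(c)
                continue
            m = statistics.median_low(p[axis] for p in c)
            nxt.append([p for p in c if p[axis] <= m])
            nxt.append([p for p in c if p[axis] > m])
        cells, rounds = nxt, rounds + 1
    return rounds
```

The full splitting rules above additionally bound the rounds by ⌈log(size(B))⌉ and guarantee that every square intersects only O(1) cells per round, which the sketch does not attempt to capture.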
Lemma 11. Each round in the process described above takes O(n) time, and the total number of rounds is at most min(⌈log n⌉, ⌈log(size(B))⌉).
Proof. Let R_r denote the set of rectangles at the beginning of round r. Consider a square σ_p ∈ Σ(T). Because of our splitting rules (in particular, because we always split orthogonal to a side of length greater than 1) the square σ_p intersects O(1) rectangles from R_r. This implies that the total size of all lists L_P(R) and L_Σ(R) is O(n), so a round takes O(n) time.
The ⌈log n⌉ bound on the number of rounds follows from the fact that any non-final rectangle is split through a median point or edge, and that the recursion ends when no edge crosses a cell. The ⌈log(size(B))⌉ bound follows from the fact that any non-final rectangle is split through the midpoints of its sides that have length greater than 1, and the fact that the recursion stops when a rectangle drops below unit size.
We now analyze the error of the reported partial solutions.
Lemma 12. The partial solution reported after round r has error at most 1/2^r.
Proof. Any rectangle R that is already a final rectangle of the partitioning either has maximum edge length at most 1, or it does not intersect any square from Σ(T). In the former case the maximum size of a square centered inside A_κ(T) ∩ R and not containing any reported vertex is at most 1, so the error is at most 1/size(B) ≤ 1/2^r because r ≤ ⌈log(size(B))⌉. In the latter case A_κ(T) ∩ R = ∅. Any non-final rectangle must have side length at most size(B)/2^r, so the error is at most 1/2^r as well.
After the recursive partitioning has ended, we apply the special-case algorithm to all rectangles R with L_Σ(R) ≠ ∅. This is possible due to the following lemma, which immediately follows from our stopping rule.
Lemma 13. After the final round, any rectangle R with L_Σ(R) ≠ ∅ has at most unit size in both dimensions.
This leads to the following theorem. Note that the convergence function is slightly different from the one in Theorem 10, because the grid-approach allowed us to reduce the error to 1/size(B) in one step.
Theorem 14. Let T := {T_1, . . . , T_m} be a set of 2D trajectories with n points in total, let B be the extended bounding square of T, and let κ be a parameter. Then there exists a progressive algorithm to compute the κ-popular region that runs in at most k := ⌈log n⌉ rounds, each taking O(nκ) time, with convergence function f_conv(r) = 1/2^r with respect to the error function err_pop(S) defined above. The algorithm is monotone in the sense that, if S_r denotes the solution reported after round r, we have S_r ⊆ S_{r+1} for all 1 ≤ r < k.

CONCLUDING REMARKS
We presented a theoretical framework for analyzing so-called progressive algorithms: algorithms working in rounds and producing partial solutions with smaller errors in each round, until the complete solution is reported. We investigated the framework by developing and analyzing progressive algorithms for sorting, convex hulls, and popular places. This revealed some interesting connections. For instance, our algorithm for popular places is a plane-sweep algorithm based on our progressive sorting algorithm.
In our framework the convergence rate can be artificially boosted by merging multiple rounds into single rounds. Hence, the framework can distinguish between algorithms with different polynomial convergence rates (for example, O(1/r) versus O(1/r^2)) and between polynomial and exponential convergence, but it cannot distinguish between different exponential convergence rates (for example, O(1/2^r) and O(1/2^{2r}) cannot be distinguished).
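The boosting observation can be made concrete with a tiny sketch (illustrative functions only, not from the paper): merging every c original rounds into one reported round turns a convergence function f into r ↦ f(c·r), which changes the base of an exponential rate but leaves a polynomial rate polynomial of the same degree.

```python
def merge_rounds(fconv, c):
    """Report only every c-th round: the merged algorithm's round r
    corresponds to original round c * r, so its convergence function
    is r -> fconv(c * r)."""
    return lambda r: fconv(c * r)

# Merging two rounds of an O(1/2^r) algorithm gives O(1/2^(2r)) ...
boosted = merge_rounds(lambda r: 1 / 2 ** r, 2)
# ... but merging two rounds of an O(1/r) algorithm gives 1/(2r), still O(1/r).
still_linear = merge_rounds(lambda r: 1 / r, 2)
```

This is why the framework treats O(1/2^r) and O(1/2^{2r}) as equivalent, while O(1/r) and O(1/r^2) remain distinguishable.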
Our framework is currently limited to static data sets. In some applications, however, it is natural to consider variants where the data is not fully available at the start of the computation, but more and more data becomes available as time progresses. It may be interesting to study how to apply our framework in such a streaming scenario.
As for the specific problems we studied, an interesting open problem is to develop a progressive variant of the output-sensitive algorithm for computing popular places of Benkert et al. [3].