Today I want to talk about the second solution to the task of finding m closest points out of n total. This solution makes use of a priority queue.
Priority queue is a queue which is intended to optimize operations of getting a min/max element. In java.util, there’s a PriorityQueue class which has the following characteristics:
O(log(n)) time for the enqueing and dequeing methods (offer, poll, remove, add);
linear time for the remove and contains;
constant time for the retrieval methods (peek, element, size).
Let’s define our own classes:
public static class Point { private int x; private int y; public Point(int x, int y) { this.x = x; this.y = y; } /** * For our purpose, square distance is good enough * It allows to avoid the costly square root * * @param p Point to which we want the distance * @return Distance between current and p */ public int getDistanceTo(Point p) { int dx = this.x - p.x; int dy = this.y - p.y; return dx * dx + dy * dy; } @Override public String toString() { return "(" + x + ", " + y + ')'; } } public static class PointDistance implements Comparable { private Point point; private int distance; public PointDistance(Point point, int distance) { this.point = point; this.distance = distance; } @Override public int compareTo(PointDistance o) { // in java, PriorityQueue always has the least (min) element on top // and we ned max on top, so we reverse the comparator return distance == o.distance ? 0 : (distance < o.distance ? 1 : -1); } }
Note that out PointDistance wrapper should implement a Comparable interface, and also that we reversed the results of its compareTo method (because we want to have maximum distance on top of the queue, and by default there would be a minimum).
Below is the main flow:
// you should check for m being in 0 to n limits int m = Math.min(Integer.parseInt(args[0]), points.size()); // say our base point A is always the first Point a = points.get(0); // fill in the queue PriorityQueue q = new PriorityQueue(); // iterate through all other points except a // this loop takes O(n * log(n)) time, as PriorityQueue insert cost is log(n) for (int i = 1; i < points.size(); i++) { Point current = points.get(i); PointDistance max = q.peek(); int currentDistance = current.getDistanceTo(a); // we add current item to queue if the queue is not yet full, or if its distance is less if (q.size() < m) { q.add(new PointDistance(current, currentDistance)); } else { if (currentDistance < max.distance) { q.poll(); q.add(new PointDistance(current, currentDistance)); } } } // in the end, our queue will contain the m closest for (PointDistance pointDistance : q) { System.out.println(pointDistance.point); }
So basically, we go through all the points, and if the queue size is less than m, or if the queue top element’s distance to a is more than the current calculated distance, then we put the current point into the queue. If the queue size is already m, then we take out the top element before inserting a new one.
In the end, we should be left with m points that are closest to a.
The complexity of this algorithm is O(n * log(m)), as we go through all n points but only enqueue m of them. The peek operation is constant time.
Can we do better? Will try to find out in the next post.