Maximum Points on a Line

Given n points on a 2D plane, find the maximum number of points that lie on the same straight line.

Input: [[1, 1], [2, 2], [3, 3]]
Output: 3
"""
Explanation:
^
|
|        o
|     o
|  o  
+------------->
0  1  2  3  4
"""

Input: [[1, 1], [3, 2], [5, 3], [4, 1], [2, 3], [1, 4]]
Output: 4
"""
Explanation:
^
|
|  o
|     o        o
|        o
|  o        o
+------------------->
0  1  2  3  4  5  6
"""

Problems that deal with geometry can be intimidating, because they often force us to evaluate a physical space that can expand in multiple directions.

For this question, the key here is draw on some elementary / secondary school basics. A line is really just composed of two components: A fixed point (i.e. an x-y coordinate) and a gradient. We can define a gradient as such:

Gradient = collections.namedtuple('Gradient', ('dy', 'dx'))

Notice that we’re maintaining a tuple instead of computing the value of dy / dx. This is because floating point division operations can yield unpredictable results.

Considering Pairwise Gradients from a Fixed Point

Now let’s consider the following approach: Inspect some point points[i] and consider all the lines it forms with other points (points[j1], points[j2], …). Since our reference point remains the same, the only factor that changes is the gradient. We keep track of the number of times we’ve encountered each gradient.

Two identical gradients, for instance, tell us that two lines extending from points[i] are essentially the same, and that three points (i.e. point i and two other points) must fall on this common line.

"""
 ^
 |
 |  2
 |     1        o
 |        
 |  o        R
 +------------------->
 0  1  2  3  4  5  6
"""
# Points R, 1, and 2 lie on the same line

We can repeat this examination of pairwise gradients for all other reference points (points[i1], points[i2], …).

for i in range(n):

    # Track number of occurrences of each gradient
    lines = collections.defaultdict(int)
    local_max = 1

    for j in range(i+1, n):
        dy = points[j][1] - points[i][1]
        dx = points[j][0] - points[i][0]
        gcd = math.gcd(dy, dx)
        gradient = Gradient(dy / gcd, dx / gcd)

        # Update max number of points on a line
        lines[gradient] += 1
        local_max = max(local_max, lines[gradient])

dy and dx are reduced to their simplest form for standardisation –– we need to ensure that values like 4/3 and 8/6 are treated as the same gradient.

Edge Cases for Pairwise Gradients

We’ve yet to consider some edge cases:

What if either dy or dx were zero? This would mean that the gradients would be zero or infinite. For these cases, we can save the gradients as Gradient(0, 1) and Gradient(1, 0).
What if both dy and dx were zero? This would mean that we have an overlapping point. Overlapping points will contribute to all lines extending from the reference point, so we’ll just maintain an additional overlap counter.
How do we account for negative dy and dx values? Since -2/3 is the same as 2/-3, we standardise the gradients by shifting any negative signs from dx to dy.

for i in range(n):

    lines = collections.defaultdict(int)
    # Track the number of points that overlap with
    # the reference point -- including itself
    overlaps = 1
    local_max = 1

    for j in range(i+1, n):
        dy = points[j][1] - points[i][1]
        dx = points[j][0] - points[i][0]

        # `points[j]` is an overlapping point
        if dy == 0 and dx == 0:
            overlaps += 1
            local_max += 1
            
        else:
            gradient = None
            if dy == 0:
                gradient = Gradient(0, 1)
            elif dx == 0:
                gradient = Gradient(1, 0)
            else:
                # Standardise `(dy, dx)` signs by shifting
                # negative signs on `dx` to `dy` 
                if dx < 0:
                    dy, dx = -dy, -dx
                gcd = math.gcd(dy, dx)
                gradient = Gradient(dy / gcd, dx / gcd)

            lines[gradient] += 1
            local_max = max(local_max, lines[gradient])

Now, to arrive at the complete solution, we just need to make sure that local_max is being used to update a global_max which we’ll return.

def max_points(points):

    n = len(points)
    global_max = 0

    for i in range(n):
        # Compute max points connected in a line
        # ...

        global_max = max(global_max, local_max)
    
    return global_max

Given n points, our algorithm runs in O(n) space and O(n^2) time.

With that, we’ve solved the Maximum Points On a Line problem ⚔️⚔️.

Full Solution

def max_points(points):

    n = len(points)
    global_max = 0

    for i in range(n):

        lines = collections.defaultdict(int)
        overlaps = 1
        local_max = 1

        for j in range(i+1, n):
            dy = points[j][1] - points[i][1]
            dx = points[j][0] - points[i][0]

            if dy == 0 and dx == 0:
                overlaps += 1
                local_max += 1
                
            else:
                gradient = None
                if dy == 0:
                    gradient = Gradient(0, 1)
                elif dx == 0:
                    gradient = Gradient(1, 0)
                else:
                    if dx < 0:
                        dy, dx = -dy, -dx
                    gcd = math.gcd(dy, dx)
                    gradient = Gradient(dy / gcd, dx / gcd)

                lines[gradient] += 1
                local_max = max(local_max, lines[gradient] + overlaps)

        global_max = max(global_max, local_max)

    return global_max

Published 5 Feb 2020

Thanks for reading! If you gained something from reading this article, feel free to tweet about it or share your thoughts in the comments section below 💬.