From EPI:
Given an array of numbers
A
and a keyk
, find the length of the longest subarray whose sum is less than or equal tok
.Input: A = [431, -15, 639, 342, -14, 565, -924, 635, 167, -70] k = 184 Output: 4 # sum(A[3:7]) <= 184
Our last encounter with a longest subarray / subsequence-type problem was in Longest Non-contiguous, Non-decreasing Subsequence. In that problem, we used a sorted collection to keep track of (and update) multiple candidate sequences as we moved through the list of numbers.
This time, however, it’s less likely that we’ll require a sorted collection. We’re trying to find a contiguous collection of elements, so it’s probable that whatever subarray-specific information we need can be pre-determined in linear time.
Computing Prefix Sums
In a previous article, we explored how prefix sums can be useful in allowing us to quickly derive subarray sums.
def find_longest_subarray_less_equal_k(A, k):
if not A:
return 0
n = len(A)
prefix_sums = [A[0]]
for i in range(1, n):
prefix_sums.append(prefix_sums[-1] + A[i])
def get_psum(i):
# Handle the case where i == -1
# (i.e. sum of empty prefix)
return 0 if i == -1 else prefix_sums[i]
# ...
We can compute the sum of a subarray A[i:j+1]
by subtracting get_psum(i-1)
from get_psum(j)
.
However, even with a prefix_sums
array, we can’t solve this question efficiently. Running through every possible (i, j)
will still take O(n^2)
time. Is it possible for us to precompute more information so that, for every (i, j)
, we can quickly determine whether we can discount subsequent j
values (i.e. j+1
, j+2
… n-1
) and move on to the next i
?
Computing Minimum Prefix Sums
Let’s take a look at the following input:
"""
k: 30
Index: 0 1 2 3 4 5
Values: [10, 15, 20, -30, 25, 25]
Prefix Sums: [10, 25, 45, 15, 40, 45]
"""
Let’s suppose we’re currently considering (0, 1)
as a candidate for the longest subarray. We see that A[0:2]
abides by the sum constraint (since 25 <= k
), but it’s not clear to us if we should:
- Option 1: Evaluate
(0, 2)
(i.e.A[0:3]
) in our next iteration - Option 2: Evaluate
(1, 1)
(i.e.A[1:2]
) in our next iteration
If you look ahead at the prefix sums that come after prefix_sums[1]
, you’ll notice that some of them are smaller than prefix_sums[1]
.
This is important. It tells us that, beyond j = 1
, there are still subarrays A[0:j+1]
that continue to abide by the sum constraint. Thus, we should pursue Option 1 and further explore subsequent indices of j
while keeping i = 0
as the starting index.
This act of looking ahead is a linear time operation, but we can reduce it to constant time by defining a min_prefix_sums
array. Specifically, min_prefix_sums[i]
tells us the minimum prefix sum seen in prefix_sums[i:]
:
min_prefix_sums = [prefix_sums[-1]]
for i in range(len(prefix_sums)-2, -1, -1):
min_prefix_sums.append(
min(min_prefix_sums[-1], prefix_sums[i])
)
min_prefix_sums = min_prefix_sums[::-1]
In this light, it doesn’t actually matter if the exact subarray A[i:j+1]
satisfies the sum constraint! With min_prefix_sums
, we’re able to deduce whether or not there exists some value l
such that l >= j
and A[i:l+1]
does indeed satisfy the sum constraint.
To proceed with above example: We can continue to increment j
and evaluate j = 2
and j = 3
, since we know that their minimum prefix sums are less than k = 30
:
"""
k: 30
Index: 0 1 2 3 4 5
Values: [10, 15, 20, -30, 25, 25]
Prefix Sums: [10, 25, 45, 15, 40, 45]
Min Prefix Sums: [10, 15, 15, 15, 40, 45]
"""
Once j == 4
(i.e. A[0:5]
), we’ll find that our sum constraint cannot be fulfilled any further, because the minimum prefix sum from j = 4
and beyond is 40
. At this stage, we’ll want to increment i
instead, because we’ve exhausted all the possible candidate subarrays for i = 0
. Our next iteration would thus be for (1, 4)
(i.e. A[1:5]
).
From Intuition to Code
Let’s spell out the logic described above:
def find_longest_subarray_less_equal_k(A, k):
# Get prefix sums and min_prefix_sums
# ...
longest = 0
i = j = 0
while i < len(A) and j < len(A):
# If sum constraint is satisfied,
# update return value
if min_prefix_sums[j] - get_psum(i-1) <= k:
longest = max(longest, j - (i-1))
j += 1
else:
i += 1
return longest
It’s worth highlighting here that instead of subtracting get_psum(i-1)
from get_psum(j)
, we’re subtracting it from min_prefix_sums[j]
. To reiterate, this is because we’re only concerned with the minimum prefix sum from j
and beyond, rather than the actual prefix sum at j
.
Given an array of n
numbers, our algorithm runs in O(n)
space and O(n)
time.
With that, we’ve solved the Longest Subarray with a Sum Constraint problem 🧱🛹🧱.
def find_longest_subarray_less_equal_k(A, k):
# Get prefix sums
prefix_sums = [A[0]]
for i in range(1, len(A)):
prefix_sums.append(prefix_sums[-1] + A[i])
def get_psum(i):
return 0 if i == -1 else prefix_sums[i]
# Get min prefix sums
min_prefix_sums = [prefix_sums[-1]]
for i in range(len(prefix_sums)-2, -1, -1):
min_prefix_sums.append(
min(min_prefix_sums[-1], get_psum(i))
)
min_prefix_sums = min_prefix_sums[::-1]
# Traverse through A
longest = 0
i = j = 0
while i < len(A) and j < len(A):
if min_prefix_sums[j] - get_psum(i-1) <= k:
longest = max(longest, j - (i-1))
j += 1
else:
i += 1
return longest