The Josephus Problem is a well-established problem in computer science.
From Wikipedia:
People are standing in a circle waiting to be executed. Counting begins at a specified point in the circle and proceeds around the circle in a specified direction. After a specified number of people are skipped, the next person is executed.
The procedure is repeated with the remaining people, starting with the next person, going in the same direction and skipping the same number of people, until only one person remains, and is freed.
Framed as a coding challenge, you can imagine n
to be the number of people and k
to be the number of skips to the next executed person.
To use an example: Given n = 5
and k = 2
, the first execution would be at position 2
and the second one would be at 4
:
[1, 💀, 3, 💀, 5]
One’s goal would be to find the position of the last person standing.
There are plenty of solid explanations online for this problem. In particular, I recommend checking out this Stack Overflow response.
Recurrence Relation
Most of the solutions you’ll find describe a recurrence relation in two parts:
# Recurrence relation: 1) Base case
josephus(1, k) = 1
This part is pretty straightforward. If n = 1
(i.e. there’s only one person in the circle), that same person, standing at position 1
, is the sole survivor.
# Recurrence relation: 2) General case
josephus(n, k) = (josephus(n-1, k) + k - 1) % n + 1
This second part, on the other hand, is less obvious. Thus, I’d like to provide my own interpretation of the solution.
Part I: Indexing
The defined positions are usually 1-indexed rather 0-indexed. Given n = 5
, you can imagine the labels in the circle to be:
[1, 2, 3, 4, 5]
However, 1-indexing can be sometimes confusing, so let’s modify this to be 0-indexed instead.
[0, 1, 2, 3, 4]
We can easily account for this minor adjustment in our algorithm:
def josephus(n, k):
def josephus_helper(n, k):
# Get the 0-indexed position of the survivor
return josephus_helper(n, k) + 1
We’ll eventually take whatever josephus_helper(..)
returns and add 1
to it.
Part II: Substructure
Given n = 5
and k = 2
, the first person to be executed would be Person 1:
[0, 💀, 2, 3, 4]
This leaves us with n - 1
(i.e. 4
) people.
[0, 2, 3, 4]
Given that 1) the people are standing in a circle and 2) the skips begin from the front of the list, we can also view this list as:
[2, 3, 4, 0]
Notice that [2, 3, 4, 0]
is somewhat analagous to [0, 1, 2, 3]
:
[0, 1, 2, 3] # i
[2, 3, 4, 0] # (i + 2) % 5
In particular, their execution sequences are identical:
[0, 💀, 2, 3] # i
[2, 💀, 4, 0] # (i + 2) % 5
This is critical! The position of the survivor in [2, 3, 4, 0]
will also be the position of the survivor in [0, 1, 2, 3]
–– the only difference is the index value, which needs to be adjusted ((index + 2) % 5
).
This adjustment can be generalised to (index + k) % n
.
Thus, to find the survivor in [0, 1, 2, 3, 4]
, we need to find the survivor in [2, 3, 4, 0]
, which would just be the adjusted value of the survivor in [0, 1, 2, 3]
.
This gives us our recurrence relation:
# n = 5: [0, 1, 2, 3, 4]
def josephus_helper(n, k):
# ...
# n = 4: [0, 1, 2, 3]
return (josephus_helper(n-1, k) + k) % n
Part III: Base Case
We’ve yet to handle the base case for this recursive algorithm we’ve defined. Recall that we previously saw the expression josephus(1, k) = 1
.
Thus far, we’ve only worked with 0-indexed positions, so we’ll preserve our mental model and define josephus_helper(1, k)
to be 0
. In other words, the survivor, in a circle of one, is the first (and only) person.
def josephus_helper(n, k):
if n == 1:
return 0
return (josephus_helper(n-1, k) + k) % n
Part IV: Full Solution
Now that we’ve accounted for both the general and the base case, we have an acceptable solution that works for 0-indexed positions.
def josephus(n, k):
def josephus_helper(n, k):
if n == 1:
return 0
return (josephus_helper(n-1, k) + k) % n
return josephus_helper(n, k) + 1
We’ll throw a +1
back into our algorithm (as seen above) so that we can return a 1-indexed
position.
It goes without saying that our solution runs in O(n)
time –– the size of the input is decremented by 1
for every recursive call.
Bonus
Consider the case where k == 2
. At this value, our runtime complexity shrinks to O(log n)
! Let’s see why.
Suppose we have an even number of elements in the (circular) list. There’s a pretty clear attrition pattern:
# `a -> b` denotes that `a` is eventually trimmed to `b`
[0, 1, 2, 3, 4, 6] -> [0, 2, 4]
[0, 1, 2, 3] -> [0, 2]
[0, 2] -> [0]
As you might tell, the list sizes are consistently halved.
[0, 1, 2, 3]
is trimmed down to [0, 2]
. [0, 2]
is simply [0, 1]
with every element multiplied by two. Thus, we can deduce that josephus(4)
is really just josephus(2) * 2
.
def josephus2(n):
def josephus2_helper(n):
if n == 1:
return 0
elif n % 2 == 0:
next_n = n // 2
return josephus2_helper(next_n) * 2
else: # n % 2 == 1
# Handle odd case
return josephus2_helper(n) + 1
Now what if we had an odd number of elements?
[0, 1, 2, 3, 4, 5, 6] -> [6, 0, 2, 4]
[0, 1, 2, 3, 4] -> [4, 0, 2]
[0, 1, 2] -> [2, 0]
The pattern here is less obvious. [0, 1, 2, 3, 4]
reduces to [4, 0, 2]
, but what is [4, 0, 2]
? How do we describe it as a subproblem?
[4, 0, 2]
isn’t exactly half of[0, 1, 2, 3, 4]
, it’s more akin to half + 1.[4, 0, 2]
is equivalent to[2, 0, 1]
with every element multiplied by two.[2, 0, 1]
is a left-shifted version of[0, 1, 2]
.
If we put these three assertions together, we’ll arrive at the code for handling odd-length lists:
def josephus2(n):
def josephus2_helper(n):
if n == 1:
return 0
elif n % 2 == 0:
next_n = n // 2
return josephus2_helper(next_n) * 2
else: # n % 2 == 1
next_n = n // 2 + 1
s = josephus2_helper(next_n)
if s == 0:
s = next_n
return (s - 1) * 2 # -1 to shift left
return josephus2_helper(n) + 1
Our runtime complexity, in this particular case of k == 2
, is O(log n)
instead of O(n)
due to the consistent halving at each step.
Concluding Thoughts
Initially, when I encountered this problem, I used a circular linked list to represent the circle of individuals. Though my algorithm produced the correct output, its time complexity approximated O(nk)
, which was less than ideal.
This exercise has encouraged me to approach algorithmic problems more abstractly. Instead of turning to fancy data structures immediately, we should first see if we can use a couple of simple operations (or variables) to capture the information we’ve acquired from processing one part of our input.