How to Implement Floyd-Warshall Algorithm for All-Pairs Shortest Paths

Initialization of the distance matrix

Welcome back! Before we can calculate the shortest paths between every pair of cities, we need to set the stage. Think of this as opening a blank notebook before solving a math problem. In our case, the notebook is the Distance Matrix.

Visualizing the Setup

Click the buttons below to see how we initialize the matrix step-by-step.

Infinity (No Path) Self-Distance (0) Edge Weight

The Intuition: Starting from Scratch

Before we consider any intermediate stops (where $k=0$), we only know about direct roads.

If there is a direct road from City $A$ to City $B$, the distance is the weight of that road. If there is no direct road, we assume the distance is Infinity ($\infty$) because, for now, we can't get there.

⚠️ The Common Trap

It is incredibly easy to forget to set dist[i][i] = 0.

If you leave the diagonal as Infinity, your algorithm will think it costs infinite energy to "stay put." This breaks the math later because Infinity + Anything is still Infinity, preventing valid paths from being discovered.

Advanced: The "Infinity" Problem

In programming, we can't use true mathematical infinity with integers. We use a very large number (like $10^9$).

Be careful: If you pick a number that is too large (like the maximum integer value), adding two "infinities" together (e.g., INF + INF) might cause an overflow, wrapping around to a negative number and ruining your results. Always pick a safe large number!

initialization.py

# Key initialization lines
# We use 10^9 as a safe "Infinity" that won't overflow easily
INF = 10**9 

# Create an n x n matrix filled with INF
dist = [[INF] * n for _ in range(n)]

# CRITICAL: Distance to self is always 0
for i in range(n):
    dist[i][i] = 0 

# Update with actual direct edges from input
for u, v, w in edges:
    dist[u][v] = w  # Directed edge weight

Recurrence Relation and the Update Rule

Welcome back! Now that our notebook (the matrix) is initialized, it's time to do the actual work. This is the heart of the Floyd-Warshall algorithm. We need to decide: Is there a shortcut?

Visualizing the "Relaxation" Step

We are currently considering Vertex 1 as our potential stopover (Pivot).
Click the button to see if going through Vertex 1 improves the path from Vertex 0 to Vertex 2.

Ready to calculate...

The Core Logic: "Is it shorter?"

Imagine you want to go from City $i$ to City $j$. You know the direct distance (or the best path found so far).

Now, we introduce a new potential stopover city: $k$. We ask a simple question:

"Is going $i \to k \to j$ faster than my current best path $i \to j$?"

Mathematically, we check:
new_dist = dist[i][k] + dist[k][j]

If new_dist is smaller than the current dist[i][j], we update the matrix! This is called Relaxation.

⚠️ The Critical Trap: Loop Order

The order of your loops matters immensely. The loop for Vertex $k$ must be the outermost loop.

Why? Because $k$ represents the set of allowed intermediate cities. If you put $k$ inside, you might accidentally use a city as a stopover before you've officially "allowed" it, leading to incorrect paths.

floyd_warshall.py

# The Triple Loop: k MUST be outer
for k in range(n):          # 1. Pick the pivot/stopover
    for i in range(n):        # 2. Pick the start
        for j in range(n):      # 3. Pick the destination
            
            # The Update Rule (Relaxation)
            if dist[i][k] + dist[k][j] < dist[i][j]:
                dist[i][j] = dist[i][k] + dist[k][j]

Implementation Steps: Building the Algorithm

Welcome back! We have the math, we have the intuition. Now, let's translate that into working code. The Floyd-Warshall algorithm is famously concise—often fitting on a single screen. But getting the details right is where the magic happens.

The 3-Step Blueprint

Converting the recurrence relation D_k[i][j] into code is straightforward if you follow this order:

Initialize: Build the distance matrix. Set diagonals to 0 and non-edges to Infinity.
Iterate: Run the triple loop. Remember: Vertex $k$ must be the outermost loop.
Verify: After the loops finish, check the diagonal. If dist[i][i] < 0, you have a negative cycle.

⚠️ The "Off-by-One" Trap

Math textbooks often use 1-based indexing (vertices $1$ to $n$), but programming languages usually use 0-based indexing (vertices $0$ to $n-1$).

The Mistake: Writing for k in range(1, n+1) when your array is size n. This will cause an IndexOutOfBoundsException or skip the first vertex entirely.

The Fix: Stick to range(n) for 0-indexed arrays.

Visualizing the "Allowed Intermediates"

Why must k be the outer loop? Because k represents the set of allowed stopovers.
Click the buttons to see how the "Allowed Set" grows with each iteration of k.

Waiting for loop start...

Ready to run algorithm...

floyd_warshall_impl.py

# Correct 0-indexed Implementation
def floyd_warshall(n, edges):
    # 1. Initialize Matrix
    INF = 10**9
    dist = [[INF] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0
    
    # 2. Run Triple Loop (k is outer!)
    for k in range(n):          # Pivot
        for i in range(n):        # Start
            for j in range(n):      # End
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    
    # 3. Check Negative Cycles
    for i in range(n):
        if dist[i][i] < 0:
            return "Negative Cycle Detected"
    return dist

Advanced: Memory Optimization & Cache

You might wonder: "Can I optimize the memory?"

The classic algorithm already uses O(n²) space (a single matrix). There is no need for a 3D array dist[k][i][j] because the recurrence for step k only needs data from step k-1. We update the matrix in-place.

Performance Tip: While the order k, i, j is the safest for correctness, some competitive programmers swap the inner loops to k, j, i or i, k, j to improve CPU cache locality (accessing memory sequentially). However, be very careful: changing the order can sometimes break the logic if you aren't using the "in-place" update rule correctly. Stick to k, i, j unless you are optimizing for massive graphs.

Complexity Analysis and Practical Considerations

Welcome back! We've built the engine, but now we need to check the fuel gauge. How "expensive" is this algorithm? Is it fast enough for your project? Let's break down the math behind the magic.

The Cubic Reality: O(n³)

The beauty of Floyd-Warshall is its simplicity. We have three nested loops. If our graph has $n$ vertices, the total number of iterations is exactly:

n × n × n = n³

Inside the loop, we do a tiny bit of work: one addition and one comparison. This is constant time, or O(1). So, the total time complexity is O(n³).

When is it Acceptable?

"Cubic" sounds scary, but modern computers are fast. A standard CPU can perform roughly 100 million to 1 billion operations per second.

n = 100: 1 million ops. Instant.
n = 500: 125 million ops. Takes ~0.1 - 0.5 seconds. (The Sweet Spot)
n = 1000: 1 billion ops. Takes ~1 - 5 seconds. (Feasible)
n = 2000: 8 billion ops. Takes minutes. (Too Slow)

⚠️ The "Dense Graph" Advantage

You might wonder: "Why not just run Dijkstra $n$ times?"

For dense graphs (where every city is connected to almost every other city), running Dijkstra $n$ times is actually slower in practice.

Why? Floyd-Warshall has excellent cache locality. It reads and writes to a single, small 2D array sequentially. Dijkstra jumps around in memory. For small-to-medium graphs ($n \le 500$), Floyd-Warshall is often the fastest choice despite the O(n³) label.

Visualizing the Explosion

See how the number of operations skyrockets as we add just a few more vertices.

Note: These are theoretical operation counts. Real speed depends on CPU speed and memory access patterns.

Advanced: Can we make it faster with GPUs?

Since the inner loops (iterating over $i$ and $j$) are independent, we can run them in parallel! This makes Floyd-Warshall a candidate for GPU acceleration.

The Bottleneck: The outer loop ($k$) must remain sequential. Each step $k$ depends on the results of step $k-1$. You can't skip ahead.

Parallel Strategy

✔ Thread Block: Assign one thread to each cell $(i, j)$ of the matrix.
✔ Synchronization: Force all threads to wait until the current $k$-iteration is finished before moving to $k+1$.
⚠ Memory: Reading the same row/column for every thread can cause "bank conflicts" on the GPU memory.

complexity_check.py

# A quick way to estimate if FW is feasible
def check_feasibility(n):
    ops = n ** 3
    # Assume 10^8 ops per second
    seconds = ops / 1e8
    
    print(f"Vertices: {n}")
    print(f"Operations: {ops:,}")
    print(f"Est. Time: {seconds:.4f} seconds")
    
    if seconds > 10:
        print("⚠️ Warning: Consider Johnson's Algorithm or Dijkstra!")
    else:
        print("✅ Floyd-Warshall is a safe choice.")

# Try it out!
check_feasibility(500) # Fast
check_feasibility(2000) # Slow

Common Pitfalls: The Negative Cycle Trap

Welcome back! We've covered the mechanics and the complexity. But in the real world, things can get messy. The most dangerous pitfall in Floyd-Warshall is the Negative Cycle.

If your graph contains a loop where the total weight is negative, the "shortest path" becomes undefined (you can loop forever to reduce the cost). Let's see how to spot this and handle it correctly.

The Trap: "It just works?"

Many beginners run Floyd-Warshall and assume the resulting matrix dist[i][j] contains the correct shortest paths.

Reality Check: If a negative cycle exists, the algorithm will detect it (usually by seeing a negative value on the diagonal, dist[i][i] < 0), but it will not automatically fix the rest of the matrix.

⚠️ The Misconception

You might think: "I'll just check the diagonal. If it's negative, I stop."

The Problem: If you need to know the shortest path between two specific nodes outside the cycle, a simple check isn't enough. You need to know if that specific path touches the negative cycle.

Advanced: Propagating the "Infinity"

To get the correct answer, we must perform a second pass after the main algorithm. If a vertex k is part of a negative cycle, then any path that goes through k should also have a cost of negative infinity.

This is the logic we will visualize:
"If I can reach a negative cycle, and I can reach my destination from that cycle, then my path is infinitely cheap."

Visualizing the Propagation

Here we have a graph with a negative cycle (0 → 1 → 2 → 0).
The standard algorithm finds a path from 0 to 1, but it misses that the cost is technically -∞.

Click "Propagate Negative Cycle" to see how the algorithm fixes the matrix.

Standard Algorithm Finished. Diagonal check: Negative Cycle Found!

negative_cycle_fix.py

# The Standard FW Loop (Assume this ran already)
# ... (k, i, j loops) ...

# STEP 2: Propagate Negative Cycles
for k in range(n):
    # If k is part of a negative cycle...
    if dist[k][k] < 0:
        for i in range(n):
            for j in range(n):
                # ...and if i can reach k, and k can reach j...
                if dist[i][k] != INF and dist[k][j] != INF:
                    # ...then the path i->j is also infinitely cheap!
                    dist[i][j] = -INF

# Now dist[i][j] correctly reflects -Infinity for affected paths