@codeflash-ai codeflash-ai bot commented Nov 23, 2025

📄 61% (0.61x) speedup for Graph.topologicalSort in code_to_optimize/topological_sort.py

⏱️ Runtime: 14.2 milliseconds → 8.79 milliseconds (best of 11 runs)

📝 Explanation and details

The optimization reduces the cost of building the topological sort stack, the core bottleneck, from O(N²) to O(N).

Key Change: Replaced stack.insert(0, v) with stack.append(v) followed by a single stack.reverse().

Why This is Faster:

  • stack.insert(0, v) has O(N) complexity because it shifts all existing elements one position right for each insertion
  • With N vertices, this creates O(N²) total complexity just for stack operations
  • stack.append(v) has O(1) amortized complexity, making stack operations O(N) total
  • A single stack.reverse() at the end is O(N), maintaining the correct topological order
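
As a minimal sketch of the append-then-reverse pattern described above: `SketchGraph` and `topological_sort` are hypothetical stand-ins, not the code under review, and the sketch returns only the ordering (the real method also returns a `sort_id`, which is omitted here).

```python
from collections import defaultdict


class SketchGraph:
    """Hypothetical stand-in for the Graph class; not the code under review."""

    def __init__(self, vertices):
        self.V = vertices
        self.graph = defaultdict(list)  # adjacency lists keyed by vertex

    def _visit(self, v, visited, stack):
        visited[v] = True
        for w in self.graph[v]:
            if not visited[w]:
                self._visit(w, visited, stack)
        # Post-order: v is recorded only after everything reachable from it.
        stack.append(v)  # O(1) amortized, instead of stack.insert(0, v)

    def topological_sort(self):
        visited = [False] * self.V
        stack = []
        for i in range(self.V):
            if not visited[i]:
                self._visit(i, visited, stack)
        stack.reverse()  # single O(N) pass restores the order insert(0, v) built
        return stack


# Diamond: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3
g = SketchGraph(4)
g.graph[0].extend([1, 2])
g.graph[1].append(3)
g.graph[2].append(3)
print(g.topological_sort())  # [0, 2, 1, 3]
```

Reversing the post-order list once yields the same sequence that repeated front insertion would have produced, so the output is unchanged while the per-vertex list shifting disappears.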

Performance Impact by Test Case:

  • Small graphs (≤5 nodes): Changes of a few percent in either direction (about 3% slower to 9% faster), likely within measurement noise
  • Large sparse graphs: Dramatic 75-85% speedups (e.g., 1000 disconnected nodes: 441μs → 247μs)
  • Branching graphs: Significant 33% improvement (500-node graph with up to two outgoing edges per node: 218μs → 163μs)

Minor Optimization: Changed visited[i] == False to not visited[i] for slightly more Pythonic and marginally faster boolean checks.

The optimization scales well with graph size: because it removes the quadratic list shifting, the speedup grows as graphs get larger (about 62% on the largest existing unit test, versus single-digit percentages on small graphs). This makes the implementation better suited to real-world applications with substantial graph sizes.
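
A rough way to see how the gap grows with graph size is to count element moves under a simplified cost model. The helper names below are illustrative only, and the counts are not measurements: front insertion runs as a fast low-level shift in practice, which may be why the measured speedups are far smaller than the raw operation-count ratio.

```python
def front_insert_shifts(n):
    """Elements shifted when n items are each inserted at index 0 of a growing list."""
    # The k-th insertion (k = 0..n-1) moves the k elements already present.
    return sum(range(n))  # equals n * (n - 1) // 2


def append_reverse_steps(n):
    """Approximate step count for n appends plus one in-place reverse."""
    # n appends (O(1) amortized each) plus about n // 2 swaps for the reverse.
    return n + n // 2


for n in (10, 100, 1000):
    print(n, front_insert_shifts(n), append_reverse_steps(n))
# 10 45 15
# 100 4950 150
# 1000 499500 1500
```

Under this model the ratio between the two counts grows roughly linearly with N, which is polynomial growth rather than exponential.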

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 8 Passed |
| 🌀 Generated Regression Tests | 78 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 2 Passed |
| 📊 Tests Coverage | 100.0% |
⚙️ Existing Unit Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
| --- | --- | --- | --- |
| test_topological_sort.py::test_topological_sort | 11.2μs | 10.5μs | 6.34%✅ |
| test_topological_sort.py::test_topological_sort_2 | 32.7μs | 25.7μs | 27.4%✅ |
| test_topological_sort.py::test_topological_sort_3 | 11.8ms | 7.28ms | 62.0%✅ |
🌀 Generated Regression Tests and Runtime
# imports
from code_to_optimize.topological_sort import Graph

# unit tests


def is_topological_order(graph, order):
    """Helper function to check if 'order' is a valid topological ordering of the graph."""
    position = {node: idx for idx, node in enumerate(order)}
    for u in range(graph.V):
        for v in graph.graph[u]:
            # Every edge u -> v requires u to appear before v
            if position[u] >= position[v]:
                return False
    return True


def test_empty_graph():
    # Edge Case: Empty graph (0 vertices)
    g = Graph(0)
    result, sort_id = g.topologicalSort()  # 8.29μs -> 7.75μs (6.98% faster)


def test_single_node_graph():
    # Edge Case: Single node, no edges
    g = Graph(1)
    result, sort_id = g.topologicalSort()  # 8.75μs -> 8.75μs (0.000% faster)


def test_disconnected_nodes():
    # Edge Case: Multiple nodes, no edges
    g = Graph(3)
    result, sort_id = g.topologicalSort()  # 9.21μs -> 9.12μs (0.910% faster)


def test_simple_chain():
    # Basic Case: 0 -> 1 -> 2
    g = Graph(3)
    g.graph[0].append(1)
    g.graph[1].append(2)
    result, sort_id = g.topologicalSort()  # 9.00μs -> 8.88μs (1.41% faster)


def test_simple_branch():
    # Basic Case: 0 -> 1, 0 -> 2
    g = Graph(3)
    g.graph[0].append(1)
    g.graph[0].append(2)
    result, sort_id = g.topologicalSort()  # 8.83μs -> 8.88μs (0.473% slower)
    is_topological_order(g, result)


def test_diamond_shape():
    # 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3
    g = Graph(4)
    g.graph[0].append(1)
    g.graph[0].append(2)
    g.graph[1].append(3)
    g.graph[2].append(3)
    result, sort_id = g.topologicalSort()  # 9.33μs -> 8.96μs (4.19% faster)
    is_topological_order(g, result)


def test_multiple_components():
    # 0 -> 1, 2 -> 3
    g = Graph(4)
    g.graph[0].append(1)
    g.graph[2].append(3)
    result, sort_id = g.topologicalSort()  # 9.42μs -> 8.88μs (6.11% faster)
    is_topological_order(g, result)


def test_cycle_detection_not_supported():
    # Edge Case: Graph with a cycle, should still return a result (function does not detect cycles)
    # 0 -> 1 -> 2 -> 0
    g = Graph(3)
    g.graph[0].append(1)
    g.graph[1].append(2)
    g.graph[2].append(0)
    result, sort_id = g.topologicalSort()  # 9.29μs -> 8.50μs (9.31% faster)


def test_multiple_edges():
    # Basic Case: 0 -> 1, 0 -> 2, 1 -> 2
    g = Graph(3)
    g.graph[0].append(1)
    g.graph[0].append(2)
    g.graph[1].append(2)
    result, sort_id = g.topologicalSort()  # 8.67μs -> 8.62μs (0.487% faster)
    is_topological_order(g, result)


def test_large_linear_graph():
    # Large Scale: Linear chain of 1000 nodes
    N = 1000
    g = Graph(N)
    for i in range(N - 1):
        g.graph[i].append(i + 1)
    result, sort_id = g.topologicalSort()


def test_large_branching_graph():
    # Large Scale: 500 nodes, each node i has edges to i+1 and i+2 if possible
    N = 500
    g = Graph(N)
    for i in range(N):
        if i + 1 < N:
            g.graph[i].append(i + 1)
        if i + 2 < N:
            g.graph[i].append(i + 2)
    result, sort_id = g.topologicalSort()  # 218μs -> 163μs (33.2% faster)
    # Check the topological property for a few evenly spaced nodes
    for i in range(0, N - 2, N // 10):
        assert result.index(i) < result.index(i + 1)
    is_topological_order(g, result)


def test_large_disconnected_graph():
    # Large Scale: 1000 nodes, no edges
    N = 1000
    g = Graph(N)
    result, sort_id = g.topologicalSort()  # 441μs -> 247μs (78.4% faster)


def test_sort_id_uniqueness():
    # Edge Case: Ensure sort_id is unique per call
    g = Graph(3)
    ids = set()
    for _ in range(5):
        _, sort_id = g.topologicalSort()  # 34.3μs -> 33.2μs (3.26% faster)
        ids.add(sort_id)


def test_graph_with_isolated_and_connected_nodes():
    # Edge Case: Some nodes are isolated, some are connected
    # 0 -> 1, 2 (isolated), 3 -> 4
    g = Graph(5)
    g.graph[0].append(1)
    g.graph[3].append(4)
    result, sort_id = g.topologicalSort()  # 9.58μs -> 9.29μs (3.14% faster)
    is_topological_order(g, result)


def test_graph_with_self_loops():
    # Edge Case: Node with a self-loop (should not affect DFS in this implementation)
    g = Graph(3)
    g.graph[0].append(0)
    g.graph[1].append(2)
    result, sort_id = g.topologicalSort()  # 8.79μs -> 8.62μs (1.94% faster)
    is_topological_order(g, result)


def test_graph_with_multiple_edges_between_same_nodes():
    # Edge Case: Multiple edges between same nodes
    g = Graph(3)
    g.graph[0].append(1)
    g.graph[0].append(1)
    g.graph[1].append(2)
    g.graph[1].append(2)
    result, sort_id = g.topologicalSort()  # 8.71μs -> 8.54μs (1.96% faster)
    is_topological_order(g, result)


def test_graph_with_reverse_edges():
    # Edge Case: 0 -> 1, 1 -> 0 (cycle)
    g = Graph(2)
    g.graph[0].append(1)
    g.graph[1].append(0)
    result, sort_id = g.topologicalSort()  # 8.42μs -> 8.25μs (2.01% faster)


def test_graph_with_non_integer_vertices():
    # Edge Case: Vertices are integers, but graph could be used with other types
    # This implementation only supports integer vertices [0, V-1]
    g = Graph(2)
    g.graph[0].append(1)
    result, sort_id = g.topologicalSort()  # 8.42μs -> 8.00μs (5.20% faster)


def test_graph_with_no_edges():
    # Edge Case: Graph with vertices but no edges
    g = Graph(5)
    result, sort_id = g.topologicalSort()  # 9.25μs -> 9.25μs (0.000% faster)


def test_graph_with_duplicate_edges():
    # Edge Case: Graph with duplicate edges between nodes
    g = Graph(3)
    g.graph[0].append(1)
    g.graph[0].append(1)
    g.graph[1].append(2)
    g.graph[1].append(2)
    result, sort_id = g.topologicalSort()  # 8.71μs -> 8.67μs (0.473% faster)
    is_topological_order(g, result)


# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import uuid

# imports
import pytest

from code_to_optimize.topological_sort import Graph

# unit tests


def is_valid_topo_sort(order, graph_dict):
    """Helper function to check if order is a valid topological sort for the given graph_dict."""
    pos = {node: idx for idx, node in enumerate(order)}
    for u in graph_dict:
        for v in graph_dict[u]:
            # u must come before v
            if pos[u] >= pos[v]:
                return False
    return True


# -----------------------
# 1. BASIC TEST CASES
# -----------------------


def test_single_node():
    """Graph with a single node should return that node."""
    g = Graph(1)
    result, sort_id = g.topologicalSort()  # 8.21μs -> 8.04μs (2.06% faster)
    uuid.UUID(sort_id)  # should not raise


def test_two_nodes_one_edge():
    """Simple graph: 0 -> 1"""
    g = Graph(2)
    g.graph[0].append(1)
    result, sort_id = g.topologicalSort()  # 8.54μs -> 8.58μs (0.478% slower)
    uuid.UUID(sort_id)


def test_three_nodes_chain():
    """Graph: 0 -> 1 -> 2"""
    g = Graph(3)
    g.graph[0].append(1)
    g.graph[1].append(2)
    result, sort_id = g.topologicalSort()  # 8.92μs -> 8.46μs (5.41% faster)
    uuid.UUID(sort_id)


def test_three_nodes_branch():
    """Graph: 0 -> 1, 0 -> 2"""
    g = Graph(3)
    g.graph[0].append(1)
    g.graph[0].append(2)
    result, sort_id = g.topologicalSort()  # 8.83μs -> 8.58μs (2.91% faster)
    uuid.UUID(sort_id)


def test_disconnected_graph():
    """Graph: 0->1, 2 (disconnected)"""
    g = Graph(3)
    g.graph[0].append(1)
    # 2 is disconnected
    result, sort_id = g.topologicalSort()  # 8.96μs -> 8.67μs (3.36% faster)
    uuid.UUID(sort_id)


# -----------------------
# 2. EDGE TEST CASES
# -----------------------


def test_empty_graph():
    """Graph with 0 nodes should return empty result."""
    g = Graph(0)
    result, sort_id = g.topologicalSort()  # 7.50μs -> 7.50μs (0.000% faster)
    uuid.UUID(sort_id)


def test_no_edges_multiple_nodes():
    """Graph with multiple nodes and no edges."""
    g = Graph(4)
    result, sort_id = g.topologicalSort()  # 9.12μs -> 8.88μs (2.82% faster)
    uuid.UUID(sort_id)


def test_multiple_components():
    """Graph: 0->1, 2->3"""
    g = Graph(4)
    g.graph[0].append(1)
    g.graph[2].append(3)
    result, sort_id = g.topologicalSort()  # 9.00μs -> 9.12μs (1.37% slower)
    uuid.UUID(sort_id)


def test_cycle_detection_not_supported():
    """Graph with a cycle; function does not detect cycles, so output may be incorrect."""
    g = Graph(3)
    g.graph[0].append(1)
    g.graph[1].append(2)
    g.graph[2].append(0)  # cycle
    # The function does NOT check for cycles, so it will recurse infinitely or crash.
    # We expect a RecursionError.
    with pytest.raises(RecursionError):
        g.topologicalSort()


def test_large_sparse_graph():
    """Graph with 100 nodes, only one edge: 0->99."""
    g = Graph(100)
    g.graph[0].append(99)
    result, sort_id = g.topologicalSort()  # 41.3μs -> 36.2μs (14.0% faster)
    uuid.UUID(sort_id)


def test_duplicate_edges():
    """Graph with duplicate edges: 0->1, 0->1"""
    g = Graph(2)
    g.graph[0].append(1)
    g.graph[0].append(1)
    result, sort_id = g.topologicalSort()  # 9.29μs -> 9.62μs (3.46% slower)
    uuid.UUID(sort_id)


def test_self_loop():
    """Graph with a self-loop: 0->0. Should cause RecursionError."""
    g = Graph(1)
    g.graph[0].append(0)
    with pytest.raises(RecursionError):
        g.topologicalSort()


def test_non_integer_vertices():
    """Graph with vertices labeled as integers, but edges to nonexistent vertices."""
    g = Graph(2)
    g.graph[0].append(2)  # 2 does not exist
    # Should raise IndexError in visited[v]
    with pytest.raises(IndexError):
        g.topologicalSort()  # 9.29μs -> 9.25μs (0.443% faster)


# -----------------------
# 3. LARGE SCALE TEST CASES
# -----------------------


def test_large_linear_chain():
    """Large graph: 0->1->2->...->999"""
    N = 1000
    g = Graph(N)
    for i in range(N - 1):
        g.graph[i].append(i + 1)
    result, sort_id = g.topologicalSort()
    uuid.UUID(sort_id)


def test_large_wide_graph():
    """Large graph: 0->i for i in 1..999"""
    N = 1000
    g = Graph(N)
    for i in range(1, N):
        g.graph[0].append(i)
    result, sort_id = g.topologicalSort()  # 501μs -> 273μs (83.5% faster)
    uuid.UUID(sort_id)


def test_large_disconnected_graph():
    """Large graph: 10 components, each a chain of 100 nodes."""
    N = 1000
    g = Graph(N)
    for c in range(10):
        start = c * 100
        for i in range(start, start + 99):
            g.graph[i].append(i + 1)
    result, sort_id = g.topologicalSort()  # 436μs -> 235μs (85.2% faster)
    uuid.UUID(sort_id)
    # Each chain must be in order
    for c in range(10):
        chain = [i for i in range(c * 100, (c + 1) * 100)]
        indices = [result.index(i) for i in chain]
        assert indices == sorted(indices)


def test_large_sparse_random_edges():
    """Large graph: 1000 nodes, 10 fixed edges."""
    N = 1000
    g = Graph(N)
    # Add 10 edges: 0->10, 1->20, ..., 9->100
    for i in range(10):
        g.graph[i].append((i + 1) * 10)
    result, sort_id = g.topologicalSort()  # 438μs -> 250μs (75.0% faster)
    uuid.UUID(sort_id)
    # Each i must come before (i+1)*10
    for i in range(10):
        assert result.index(i) < result.index((i + 1) * 10)


# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
from code_to_optimize.topological_sort import Graph


def test_Graph_topologicalSort():
    Graph.topologicalSort(Graph(1))
🔎 Concolic Coverage Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
| --- | --- | --- | --- |
| codeflash_concolic_7f254fbi/tmpuqq470t0/test_concolic_coverage.py::test_Graph_topologicalSort | 8.33μs | 8.38μs | -0.501%⚠️ |

To edit these changes, run `git checkout codeflash/optimize-Graph.topologicalSort-micckrt6` and push.

@codeflash-ai codeflash-ai bot requested a review from aseembits93 November 23, 2025 23:27
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Nov 23, 2025
@codeflash-ai codeflash-ai bot deleted the codeflash/optimize-Graph.topologicalSort-micckrt6 branch November 24, 2025 00:38