@codeflash-ai codeflash-ai bot commented Oct 28, 2025

📄 61% (0.61x) speedup for _use_gl in src/bokeh/embed/bundle.py

⏱️ Runtime : 1.12 milliseconds → 699 microseconds (best of 49 runs)

📝 Explanation and details

The optimization achieves a 60% speedup by eliminating two key performance bottlenecks:

1. Generator Expression Overhead Removal
The original _any() function used any(query(x) for x in objs), which creates a generator object and resumes it once per element. The optimized version replaces this with a direct for loop that returns as soon as a match is found. Both forms short-circuit on the first match; the gain comes from avoiding generator creation and the per-element generator resumption.
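
A minimal sketch of the two iteration styles described above, assuming a helper shaped like `_any()`. The names `any_with_generator`, `any_with_loop`, and `count_calls` are hypothetical and used only for illustration; this is not Bokeh's source. It counts predicate calls to suggest that both forms return the same result and stop at the first match.

```python
from typing import Callable, Iterable


def any_with_generator(objs: Iterable[object], query: Callable[[object], bool]) -> bool:
    # Builds a generator object; any() pulls one item at a time from it.
    return any(query(x) for x in objs)


def any_with_loop(objs: Iterable[object], query: Callable[[object], bool]) -> bool:
    # Plain loop: no generator object, returns on the first match.
    for x in objs:
        if query(x):
            return True
    return False


def count_calls(finder, items):
    # Count how many times the predicate runs, to make short-circuiting visible.
    calls = 0

    def is_target(x):
        nonlocal calls
        calls += 1
        return x == "webgl"

    return finder(items, is_target), calls


backends = ["canvas", "webgl", "svg", "canvas"]
print(count_calls(any_with_generator, backends))
print(count_calls(any_with_loop, backends))
```

Both calls should report the same result and the same number of predicate invocations, so any runtime difference between the two styles is overhead rather than extra work on the elements.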

2. Function Call Chain Elimination
The original _use_gl() called _any() once, which in turn invoked a lambda once per object. The optimized version inlines this logic into a single loop, eliminating:

  • The _any() function call overhead
  • Creation of the lambda on every _use_gl() call and its invocation for each object
  • The intermediate function call stack
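
The inlining described above can be sketched as follows. This is an illustration under stated assumptions, not the code from the PR: `FakePlot`, `use_gl_original`, and `use_gl_inlined` are hypothetical names, and `FakePlot` is a stand-in rather than Bokeh's `Plot` model.

```python
class FakePlot:
    """Hypothetical stand-in for a plot model; not bokeh.models.Plot."""

    def __init__(self, output_backend: str = "canvas") -> None:
        self.output_backend = output_backend


def _any(objs, query):
    # Helper in the shape the explanation attributes to the original code.
    return any(query(x) for x in objs)


def use_gl_original(all_objs) -> bool:
    # One _any() call, plus one lambda call per object.
    return _any(
        all_objs,
        lambda obj: isinstance(obj, FakePlot) and obj.output_backend == "webgl",
    )


def use_gl_inlined(all_objs) -> bool:
    # Same check written as a single loop with no helper and no lambda.
    for obj in all_objs:
        if isinstance(obj, FakePlot) and obj.output_backend == "webgl":
            return True
    return False


samples = [
    set(),
    {FakePlot("canvas"), FakePlot("svg")},
    {FakePlot("canvas"), FakePlot("webgl")},
    {object(), FakePlot("webgl")},
]
for objs in samples:
    # The two versions are expected to agree on every input.
    assert use_gl_original(objs) == use_gl_inlined(objs)
```

The inlined form trades a reusable helper for fewer Python-level calls, which is the kind of saving the timings in this report attribute to the change.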

Performance Characteristics by Test Case:

  • Small sets (1-10 objects): 22-47% faster due to reduced function call overhead
  • Large sets (500-1000 objects): 67-78% faster, where the cumulative effect of avoiding lambda calls per object becomes significant
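  • Early termination cases: when a WebGL plot is found early in iteration, the loop returns immediately without processing the remaining objects. The original any() call also stopped at the first match, so the saving in these cases is the fixed call overhead rather than skipped work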
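
The percentages in this report appear consistent with speedup computed as (original − optimized) / optimized. That formula is inferred from the numbers shown here, not stated by Codeflash, and `speedup_percent` is a hypothetical helper. A small sketch using timings copied from the report:

```python
def speedup_percent(original_us: float, optimized_us: float) -> float:
    # Relative gain measured against the optimized runtime.
    return (original_us - optimized_us) / optimized_us * 100


# (original, optimized) runtimes in microseconds, taken from the report.
timings = {
    "overall": (1120.0, 699.0),
    "test_with_gl": (29.7, 19.4),
    "large_set_all_non_webgl": (45.7, 26.3),
}
for name, (orig, opt) in timings.items():
    print(f"{name}: {speedup_percent(orig, opt):.1f}%")
```

Small differences from the reported figures are expected, since the displayed timings are rounded.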

The optimization maintains identical behavior while reducing computational overhead through direct iteration and eliminating unnecessary abstraction layers.

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 19 Passed |
| 🌀 Generated Regression Tests | 51 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

⚙️ Existing Unit Tests and Runtime

| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| unit/bokeh/embed/test_bundle.py::Test__use_gl.test_with_gl | 29.7μs | 19.4μs | 53.0%✅ |
| unit/bokeh/embed/test_bundle.py::Test__use_gl.test_without_gl | 107μs | 89.1μs | 21.1%✅ |

🌀 Generated Regression Tests and Runtime
from __future__ import annotations

from typing import Callable

# imports
import pytest  # used for our unit tests
from bokeh.embed.bundle import _use_gl


# Minimal HasProps base class for testing
class HasProps:
    pass

# Minimal Plot class for testing, inherits from HasProps
class Plot(HasProps):
    def __init__(self, output_backend=None):
        self.output_backend = output_backend

# unit tests

# --- Basic Test Cases ---

def test_empty_set_returns_false():
    # Test with an empty set
    codeflash_output = _use_gl(set()) # 4.52μs -> 3.66μs (23.7% faster)

def test_single_plot_with_webgl_returns_true():
    # Test with a single Plot with output_backend='webgl'
    p = Plot(output_backend="webgl")
    codeflash_output = _use_gl({p}) # 4.23μs -> 3.45μs (22.8% faster)

def test_single_plot_with_non_webgl_returns_false():
    # Test with a single Plot with output_backend='canvas'
    p = Plot(output_backend="canvas")
    codeflash_output = _use_gl({p}) # 4.20μs -> 3.04μs (38.4% faster)

def test_single_non_plot_object_returns_false():
    # Test with a single object that is not a Plot
    class Dummy(HasProps):
        pass
    d = Dummy()
    codeflash_output = _use_gl({d}) # 4.13μs -> 3.27μs (26.3% faster)

def test_multiple_plots_one_with_webgl_returns_true():
    # Test with multiple Plot objects, one with output_backend='webgl'
    p1 = Plot(output_backend="canvas")
    p2 = Plot(output_backend="webgl")
    p3 = Plot(output_backend="svg")
    codeflash_output = _use_gl({p1, p2, p3}) # 4.49μs -> 3.21μs (39.5% faster)

def test_multiple_plots_none_with_webgl_returns_false():
    # Test with multiple Plot objects, none with output_backend='webgl'
    p1 = Plot(output_backend="canvas")
    p2 = Plot(output_backend="svg")
    codeflash_output = _use_gl({p1, p2}) # 4.10μs -> 3.04μs (34.6% faster)

def test_mixed_objects_with_plot_webgl_returns_true():
    # Test with a mix of Plot and non-Plot objects, one Plot with 'webgl'
    class Dummy(HasProps): pass
    d1 = Dummy()
    d2 = Dummy()
    p = Plot(output_backend="webgl")
    codeflash_output = _use_gl({d1, d2, p}) # 4.57μs -> 3.35μs (36.5% faster)

def test_mixed_objects_without_plot_webgl_returns_false():
    # Test with a mix of Plot and non-Plot objects, no Plot with 'webgl'
    class Dummy(HasProps): pass
    d1 = Dummy()
    d2 = Dummy()
    p = Plot(output_backend="svg")
    codeflash_output = _use_gl({d1, d2, p}) # 4.54μs -> 3.33μs (36.5% faster)

# --- Edge Test Cases ---

def test_plot_with_output_backend_none_returns_false():
    # Test Plot with output_backend=None
    p = Plot(output_backend=None)
    codeflash_output = _use_gl({p}) # 4.07μs -> 2.93μs (39.1% faster)

def test_plot_with_output_backend_empty_string_returns_false():
    # Test Plot with output_backend=''
    p = Plot(output_backend="")
    codeflash_output = _use_gl({p}) # 3.90μs -> 2.88μs (35.5% faster)

def test_plot_with_output_backend_case_sensitive():
    # Test Plot with output_backend='WebGL' (should be case-sensitive)
    p = Plot(output_backend="WebGL")
    codeflash_output = _use_gl({p}) # 3.77μs -> 2.83μs (33.0% faster)

def test_plot_with_output_backend_whitespace_returns_false():
    # Test Plot with output_backend=' webgl' (leading space)
    p = Plot(output_backend=" webgl")
    codeflash_output = _use_gl({p}) # 3.77μs -> 2.79μs (35.2% faster)

def test_plot_with_output_backend_webgl_among_other_types():
    # Test with several Plots, some with similar output_backend values
    p1 = Plot(output_backend="webgl")
    p2 = Plot(output_backend="webgl ")
    p3 = Plot(output_backend="webgl")
    codeflash_output = _use_gl({p1, p2, p3}) # 4.21μs -> 3.08μs (36.7% faster)

def test_non_plot_with_output_backend_webgl_returns_false():
    # Test a non-Plot object with output_backend='webgl'
    class Dummy(HasProps):
        def __init__(self):
            self.output_backend = "webgl"
    d = Dummy()
    codeflash_output = _use_gl({d}) # 4.13μs -> 3.19μs (29.4% faster)

def test_plot_with_output_backend_integer_returns_false():
    # Test Plot with output_backend as an integer
    p = Plot(output_backend=123)
    codeflash_output = _use_gl({p}) # 3.83μs -> 2.84μs (35.0% faster)

def test_plot_with_output_backend_list_returns_false():
    # Test Plot with output_backend as a list
    p = Plot(output_backend=["webgl"])
    codeflash_output = _use_gl({p}) # 4.00μs -> 2.75μs (45.2% faster)

def test_plot_with_output_backend_boolean_returns_false():
    # Test Plot with output_backend as a boolean
    p = Plot(output_backend=True)
    codeflash_output = _use_gl({p}) # 3.89μs -> 2.81μs (38.4% faster)

def test_set_with_duplicate_plot_objects():
    # Test set with duplicate Plot objects (should not affect result)
    p = Plot(output_backend="webgl")
    objs = {p, p, p}
    codeflash_output = _use_gl(objs) # 3.62μs -> 2.66μs (35.9% faster)

def test_set_with_duplicate_non_webgl_plot_objects():
    # Test set with duplicate Plot objects, none with 'webgl'
    p = Plot(output_backend="canvas")
    objs = {p, p}
    codeflash_output = _use_gl(objs) # 3.70μs -> 2.72μs (35.8% faster)

# --- Large Scale Test Cases ---

def test_large_set_all_non_webgl_returns_false():
    # Test with a large set of Plot objects, none with 'webgl'
    plots = {Plot(output_backend="canvas") for _ in range(500)}
    codeflash_output = _use_gl(plots) # 45.7μs -> 26.3μs (73.7% faster)

def test_large_set_one_webgl_returns_true():
    # Test with a large set of Plot objects, one with 'webgl'
    plots = {Plot(output_backend="canvas") for _ in range(499)}
    p_webgl = Plot(output_backend="webgl")
    plots.add(p_webgl)
    codeflash_output = _use_gl(plots) # 45.1μs -> 26.9μs (67.8% faster)

def test_large_set_mixed_types_returns_true():
    # Test with a large set of mixed HasProps, only one Plot with 'webgl'
    class Dummy(HasProps): pass
    dummies = {Dummy() for _ in range(400)}
    plots = {Plot(output_backend="canvas") for _ in range(599)}
    p_webgl = Plot(output_backend="webgl")
    all_objs = dummies | plots | {p_webgl}
    codeflash_output = _use_gl(all_objs) # 86.9μs -> 48.8μs (77.9% faster)

def test_large_set_mixed_types_returns_false():
    # Test with a large set of mixed HasProps, no Plot with 'webgl'
    class Dummy(HasProps): pass
    dummies = {Dummy() for _ in range(400)}
    plots = {Plot(output_backend="canvas") for _ in range(599)}
    all_objs = dummies | plots
    codeflash_output = _use_gl(all_objs) # 87.2μs -> 49.9μs (74.6% faster)

def test_large_set_all_webgl_returns_true():
    # Test with a large set of Plot objects, all with 'webgl'
    plots = {Plot(output_backend="webgl") for _ in range(999)}
    codeflash_output = _use_gl(plots) # 87.4μs -> 50.1μs (74.4% faster)

def test_large_set_with_edge_values():
    # Test with a large set including edge cases (None, int, etc.)
    class Dummy(HasProps): pass
    plots = {Plot(output_backend="webgl") for _ in range(1)}
    plots |= {Plot(output_backend=None) for _ in range(500)}
    plots |= {Plot(output_backend="canvas") for _ in range(498)}
    plots |= {Plot(output_backend=123)}
    plots |= {Plot(output_backend=["webgl"])}
    dummies = {Dummy() for _ in range(10)}
    all_objs = plots | dummies
    codeflash_output = _use_gl(all_objs) # 89.1μs -> 50.0μs (78.2% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from typing import Callable, Set

# imports
import pytest
from bokeh.embed.bundle import _use_gl


# Minimal HasProps base class for testing
class HasProps:
    pass

# Minimal Plot class for testing, inheriting from HasProps
class Plot(HasProps):
    def __init__(self, output_backend=None):
        self.output_backend = output_backend

# ------------------- UNIT TESTS -------------------

# 1. Basic Test Cases

def test_empty_set_returns_false():
    # Test with empty set: should return False
    codeflash_output = _use_gl(set()) # 4.64μs -> 3.44μs (35.0% faster)

def test_single_plot_webgl_true():
    # Single Plot with output_backend='webgl'
    p = Plot(output_backend="webgl")
    codeflash_output = _use_gl({p}) # 4.42μs -> 3.29μs (34.6% faster)

def test_single_plot_non_webgl_false():
    # Single Plot with output_backend not 'webgl'
    p = Plot(output_backend="canvas")
    codeflash_output = _use_gl({p}) # 4.25μs -> 2.99μs (42.1% faster)

def test_single_non_plot_object():
    # Single object that is not a Plot
    class Dummy(HasProps):
        pass
    d = Dummy()
    codeflash_output = _use_gl({d}) # 4.48μs -> 3.23μs (38.6% faster)

def test_mixed_objects_one_plot_webgl():
    # Mixed set: only one Plot with 'webgl'
    class Dummy(HasProps):
        pass
    d = Dummy()
    p = Plot(output_backend="webgl")
    codeflash_output = _use_gl({d, p}) # 4.57μs -> 3.38μs (35.1% faster)

def test_mixed_objects_no_plot_webgl():
    # Mixed set: Plot with non-webgl, others not Plot
    class Dummy(HasProps):
        pass
    d = Dummy()
    p = Plot(output_backend="canvas")
    codeflash_output = _use_gl({d, p}) # 4.61μs -> 3.29μs (40.1% faster)

# 2. Edge Test Cases

def test_plot_with_output_backend_none():
    # Plot with output_backend=None
    p = Plot(output_backend=None)
    codeflash_output = _use_gl({p}) # 4.07μs -> 2.96μs (37.3% faster)

def test_plot_with_output_backend_empty_string():
    # Plot with output_backend=''
    p = Plot(output_backend="")
    codeflash_output = _use_gl({p}) # 4.02μs -> 2.79μs (44.4% faster)

def test_plot_with_output_backend_case_sensitivity():
    # Plot with output_backend='WebGL' (should be case-sensitive)
    p = Plot(output_backend="WebGL")
    codeflash_output = _use_gl({p}) # 3.96μs -> 2.80μs (41.3% faster)

def test_multiple_plots_all_non_webgl():
    # Multiple Plot objects, all with non-webgl output_backend
    plots = {Plot("canvas"), Plot("svg"), Plot("bitmap")}
    codeflash_output = _use_gl(plots) # 4.25μs -> 3.12μs (35.9% faster)

def test_multiple_plots_one_webgl():
    # Multiple Plot objects, one with webgl
    plots = {Plot("canvas"), Plot("webgl"), Plot("svg")}
    codeflash_output = _use_gl(plots) # 3.95μs -> 3.06μs (28.9% faster)

def test_non_plot_with_output_backend_webgl():
    # Non-Plot object with attribute 'output_backend' == 'webgl'
    class NotAPlot(HasProps):
        def __init__(self):
            self.output_backend = "webgl"
    n = NotAPlot()
    codeflash_output = _use_gl({n}) # 4.23μs -> 3.09μs (36.9% faster)

def test_plot_missing_output_backend_attribute():
    # Plot object missing output_backend attribute (simulate by deleting)
    p = Plot("webgl")
    del p.output_backend
    codeflash_output = _use_gl({p}) # 4.05μs -> 2.78μs (45.6% faster)

def test_object_with_output_backend_but_not_plot():
    # Object with output_backend but not a Plot
    class Dummy(HasProps):
        def __init__(self):
            self.output_backend = "webgl"
    d = Dummy()
    codeflash_output = _use_gl({d}) # 4.10μs -> 2.95μs (38.9% faster)

def test_plot_with_output_backend_integer():
    # Plot with output_backend as integer (nonsensical, but possible)
    p = Plot(output_backend=123)
    codeflash_output = _use_gl({p}) # 4.04μs -> 2.74μs (47.8% faster)

def test_plot_with_output_backend_bool():
    # Plot with output_backend as boolean
    p = Plot(output_backend=True)
    codeflash_output = _use_gl({p}) # 3.83μs -> 2.69μs (42.0% faster)

def test_plot_with_output_backend_list():
    # Plot with output_backend as a list
    p = Plot(output_backend=["webgl"])
    codeflash_output = _use_gl({p}) # 3.90μs -> 2.77μs (40.7% faster)

def test_set_with_none_object():
    # Set containing None
    p = Plot("webgl")
    objs = {p, None}
    # Should still return True, None should be ignored
    codeflash_output = _use_gl(objs) # 4.17μs -> 2.93μs (42.6% faster)

def test_set_with_non_hasprops_object():
    # Set containing object not inheriting from HasProps
    class NotHasProps:
        pass
    n = NotHasProps()
    p = Plot("webgl")
    objs = {p, n}
    # Should still return True, only Plot with webgl matters
    codeflash_output = _use_gl(objs) # 4.26μs -> 2.92μs (46.1% faster)

# 3. Large Scale Test Cases

def test_large_set_no_webgl():
    # Large set of Plot objects, none with webgl
    plots = {Plot("canvas") for _ in range(500)}
    codeflash_output = _use_gl(plots) # 46.6μs -> 27.8μs (67.3% faster)

def test_large_set_one_webgl():
    # Large set with one Plot having webgl
    plots = {Plot("canvas") for _ in range(499)}
    plots.add(Plot("webgl"))
    codeflash_output = _use_gl(plots) # 45.9μs -> 26.9μs (70.6% faster)

def test_large_set_multiple_webgl():
    # Large set with several Plots having webgl
    plots = {Plot("canvas") for _ in range(495)}
    plots.update({Plot("webgl") for _ in range(5)})
    codeflash_output = _use_gl(plots) # 46.4μs -> 26.7μs (73.7% faster)

def test_large_set_mixed_types():
    # Large set with mixture of Plot and non-Plot objects
    class Dummy(HasProps):
        pass
    objs = {Plot("canvas") for _ in range(400)}
    objs.update({Dummy() for _ in range(400)})
    objs.add(Plot("webgl"))
    codeflash_output = _use_gl(objs) # 70.7μs -> 40.4μs (74.8% faster)

def test_large_set_all_non_plot():
    # Large set of non-Plot objects
    class Dummy(HasProps):
        pass
    objs = {Dummy() for _ in range(999)}
    codeflash_output = _use_gl(objs) # 86.4μs -> 49.0μs (76.2% faster)

def test_large_set_all_webgl():
    # Large set, all Plots with webgl
    plots = {Plot("webgl") for _ in range(999)}
    codeflash_output = _use_gl(plots) # 86.1μs -> 49.0μs (75.7% faster)

def test_large_set_with_duplicates():
    # Large set with duplicate Plot objects (should not matter)
    p = Plot("webgl")
    plots = {p for _ in range(500)}
    codeflash_output = _use_gl(plots) # 4.74μs -> 3.53μs (34.4% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes git checkout codeflash/optimize-_use_gl-mhb721vb and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 28, 2025 23:24
@codeflash-ai codeflash-ai bot added the "⚡️ codeflash" (Optimization PR opened by Codeflash AI) and "🎯 Quality: High" (Optimization Quality according to Codeflash) labels Oct 28, 2025