@codeflash-ai codeflash-ai bot commented Oct 28, 2025

📄 172% (1.72x) speedup for _pop_renderer_args in src/bokeh/plotting/_renderer.py

⏱️ Runtime: 9.04 milliseconds → 3.32 milliseconds (best of 307 runs)
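
The headline percentage is consistent with computing speedup as (original - optimized) / optimized. That formula is an assumption inferred from the two runtimes above, not something stated in this PR; a small sketch of the arithmetic:

```python
def speedup_percent(original: float, optimized: float) -> float:
    """Relative speedup in percent: how much faster the optimized run is."""
    return (original - optimized) / optimized * 100.0


# Runtimes reported above, in milliseconds.
headline = speedup_percent(9.04, 3.32)
print(f"{headline:.0f}%")  # prints "172%"
```

Under this definition a 172% speedup means the optimized code takes roughly 1/2.72 of the original time, since 9.04 / 3.32 is about 2.72.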

📝 Explanation and details

The optimization replaces the original's kwargs.pop('source', ColumnDataSource()) pattern with an explicit conditional check. In the original code, pop() with a default argument always constructs the ColumnDataSource() object, even when 'source' exists in kwargs. The optimized version only creates the expensive ColumnDataSource() object when 'source' is actually missing.
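
The eager evaluation described above can be reproduced without Bokeh. This is a minimal sketch, assuming a hypothetical `ExpensiveDefault` class as a stand-in for `ColumnDataSource`; the class, its counter, and the two helper functions are illustrative and not part of the PR:

```python
class ExpensiveDefault:
    """Hypothetical stand-in for an object that is costly to construct."""
    instances = 0

    def __init__(self) -> None:
        ExpensiveDefault.instances += 1


def pop_eager(kwargs: dict) -> object:
    # The default expression is evaluated before pop() runs,
    # so an ExpensiveDefault is built on every call.
    return kwargs.pop("source", ExpensiveDefault())


def pop_lazy(kwargs: dict) -> object:
    # The default is built only when the key is actually missing.
    if "source" in kwargs:
        return kwargs.pop("source")
    return ExpensiveDefault()


pop_eager({"source": "given"})
eager_count = ExpensiveDefault.instances  # one construction, immediately discarded

ExpensiveDefault.instances = 0
pop_lazy({"source": "given"})
lazy_count = ExpensiveDefault.instances   # no construction at all
```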

From the line profiler, the original code spent 99.4% of its time (25.2ms) on the kwargs.pop('source', ColumnDataSource()) line. In the optimized version, ColumnDataSource() creation still accounts for 98.2% of the time, but the total drops to 8.8ms because the constructor is only called when 'source' is missing.

The test results show where the optimization helps and where it does not:

  • 'source' is present in kwargs: 10,000-20,000% speedup in many tests, because no ColumnDataSource() is constructed
  • Large kwargs dicts or large data sources: the same order of speedup as the small cases, as long as 'source' is supplied
  • 'source' is absent from kwargs: no meaningful change (within about ±3%), since ColumnDataSource() still has to be constructed

The 172% speedup comes from eliminating wasteful object construction. This is a common Python performance anti-pattern: the default argument to pop() is evaluated before the call, so the object is built whether or not it is needed.
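
A minimal sketch of the before and after shape of the function, written without importing Bokeh so it runs anywhere. `StandInSource` is a hypothetical placeholder for `ColumnDataSource`, and the `data_source` result key is an assumption; the actual diff is not reproduced in this comment.

```python
from typing import Any

RENDERER_ARGS = ['name', 'coordinates', 'x_range_name', 'y_range_name',
                 'level', 'view', 'visible', 'muted']


class StandInSource:
    """Hypothetical placeholder for bokeh.models.ColumnDataSource."""


def pop_renderer_args_original(kwargs: dict[str, Any]) -> dict[str, Any]:
    result = {attr: kwargs.pop(attr) for attr in RENDERER_ARGS if attr in kwargs}
    # The default is constructed on every call, even when 'source' is present.
    # 'data_source' as the result key is an assumption of this sketch.
    result['data_source'] = kwargs.pop('source', StandInSource())
    return result


def pop_renderer_args_optimized(kwargs: dict[str, Any]) -> dict[str, Any]:
    result = {attr: kwargs.pop(attr) for attr in RENDERER_ARGS if attr in kwargs}
    # A membership test keeps an explicit source=None intact, as pop() would.
    if 'source' in kwargs:
        result['data_source'] = kwargs.pop('source')
    else:
        result['data_source'] = StandInSource()
    return result
```

Testing membership with `in`, rather than falling back on truthiness (for example `kwargs.get('source') or StandInSource()`), matters for equivalence: a caller passing `source=None` or `source=''` gets that value back unchanged in both versions.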

Correctness verification report:

| Test | Status |
|------|--------|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 31 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 1 Passed |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests and Runtime
from __future__ import annotations

from typing import Any, TypeAlias

# imports
import pytest  # used for our unit tests
from bokeh.models import ColumnDataSource
from bokeh.plotting._renderer import _pop_renderer_args

# function to test
#-----------------------------------------------------------------------------
# Copyright (c) Anaconda, Inc., and Bokeh Contributors.
# All rights reserved.
#
# The full license is in the file LICENSE.txt, distributed with this software.
#-----------------------------------------------------------------------------


RENDERER_ARGS = ['name', 'coordinates', 'x_range_name', 'y_range_name',
                 'level', 'view', 'visible', 'muted']

Attrs: TypeAlias = dict[str, Any]

#-----------------------------------------------------------------------------
# Code
#-----------------------------------------------------------------------------

# ---------------------------
# Basic Test Cases
# ---------------------------

def test_pop_renderer_args_basic_single_arg():
    # Test with only one renderer arg present
    kwargs = {'name': 'plot1'}
    codeflash_output = _pop_renderer_args(kwargs); res = codeflash_output # 298μs -> 303μs (1.74% slower)

def test_pop_renderer_args_basic_multiple_args():
    # Test with several renderer args and a source
    source = ColumnDataSource({'x': [1,2]})
    kwargs = {
        'name': 'plot2',
        'coordinates': 'cartesian',
        'x_range_name': 'x_range',
        'source': source,
        'extra': 123
    }
    codeflash_output = _pop_renderer_args(kwargs); res = codeflash_output # 261μs -> 1.93μs (13404% faster)

def test_pop_renderer_args_basic_no_args():
    # Test with no renderer args and no source
    kwargs = {'foo': 'bar'}
    codeflash_output = _pop_renderer_args(kwargs); res = codeflash_output # 292μs -> 291μs (0.435% faster)

def test_pop_renderer_args_basic_all_args():
    # Test with all renderer args and source
    source = ColumnDataSource({'y': [5,6]})
    kwargs = {
        'name': 'plot3',
        'coordinates': 'polar',
        'x_range_name': 'x_range3',
        'y_range_name': 'y_range3',
        'level': 'overlay',
        'view': 'custom_view',
        'visible': False,
        'muted': True,
        'source': source,
        'other': 42
    }
    codeflash_output = _pop_renderer_args(kwargs); res = codeflash_output # 261μs -> 2.23μs (11645% faster)
    for key in RENDERER_ARGS:
        pass

def test_pop_renderer_args_basic_source_default():
    # Test that source defaults to new ColumnDataSource if not provided
    kwargs = {'name': 'plot4'}
    codeflash_output = _pop_renderer_args(kwargs); res = codeflash_output # 298μs -> 291μs (2.54% faster)
    # Should not be the same object as any other ColumnDataSource
    other_source = ColumnDataSource()

# ---------------------------
# Edge Test Cases
# ---------------------------

def test_pop_renderer_args_edge_empty_kwargs():
    # Test with completely empty kwargs
    kwargs = {}
    codeflash_output = _pop_renderer_args(kwargs); res = codeflash_output # 291μs -> 288μs (1.03% faster)

def test_pop_renderer_args_edge_unexpected_types():
    # Renderer args present but with unexpected types
    kwargs = {
        'name': 123,  # int instead of str
        'coordinates': None,
        'x_range_name': ['a', 'b'],
        'source': "not_a_datasource"
    }
    codeflash_output = _pop_renderer_args(kwargs); res = codeflash_output # 297μs -> 1.85μs (15980% faster)

def test_pop_renderer_args_edge_muted_and_visible_false():
    # Test with boolean args
    kwargs = {
        'muted': False,
        'visible': False,
    }
    codeflash_output = _pop_renderer_args(kwargs); res = codeflash_output # 294μs -> 300μs (1.84% slower)

def test_pop_renderer_args_edge_source_none():
    # Test with 'source' explicitly set to None
    kwargs = {'source': None}
    codeflash_output = _pop_renderer_args(kwargs); res = codeflash_output # 295μs -> 1.42μs (20748% faster)

def test_pop_renderer_args_edge_renderer_args_with_falsey_values():
    # Renderer args with values that are falsey
    kwargs = {
        'name': '',
        'coordinates': 0,
        'x_range_name': [],
        'y_range_name': {},
        'level': None,
        'view': False,
        'visible': 0,
        'muted': ''
    }
    codeflash_output = _pop_renderer_args(kwargs); res = codeflash_output # 298μs -> 303μs (1.55% slower)

def test_pop_renderer_args_edge_kwargs_mutation():
    # Ensure kwargs is mutated (renderer args and source removed)
    kwargs = {
        'name': 'plot5',
        'source': ColumnDataSource(),
        'extra': 'keepme'
    }
    _pop_renderer_args(kwargs) # 257μs -> 1.79μs (14333% faster)

# ---------------------------
# Large Scale Test Cases
# ---------------------------

def test_pop_renderer_args_large_many_non_renderer_args():
    # Large number of non-renderer args, renderer args present
    kwargs = {f'key{i}': i for i in range(950)}
    kwargs.update({'name': 'plot6', 'source': ColumnDataSource({'x': [1]})})
    codeflash_output = _pop_renderer_args(kwargs); res = codeflash_output # 259μs -> 2.01μs (12777% faster)
    # All non-renderer args should remain
    for i in range(950):
        pass

def test_pop_renderer_args_large_all_renderer_args_and_large_source():
    # All renderer args present, source is a large ColumnDataSource
    data = {f'col{i}': list(range(1000)) for i in range(10)}
    source = ColumnDataSource(data)
    kwargs = {key: f'val_{key}' for key in RENDERER_ARGS}
    kwargs['source'] = source
    codeflash_output = _pop_renderer_args(kwargs); res = codeflash_output # 263μs -> 2.10μs (12437% faster)
    for key in RENDERER_ARGS:
        pass

def test_pop_renderer_args_large_only_non_renderer_args():
    # Only large number of non-renderer args, no renderer args
    kwargs = {f'foo{i}': i for i in range(999)}
    codeflash_output = _pop_renderer_args(kwargs); res = codeflash_output # 307μs -> 313μs (1.69% slower)
    for i in range(999):
        pass

def test_pop_renderer_args_large_source_is_large_list():
    # Source is a large list (not a ColumnDataSource)
    large_list = list(range(999))
    kwargs = {'source': large_list}
    codeflash_output = _pop_renderer_args(kwargs); res = codeflash_output # 301μs -> 1.38μs (21808% faster)

def test_pop_renderer_args_large_renderer_args_with_large_values():
    # Renderer args with large values
    big_str = 'x' * 1000
    big_list = list(range(1000))
    kwargs = {
        'name': big_str,
        'coordinates': big_list,
        'source': ColumnDataSource({'x': big_list})
    }
    codeflash_output = _pop_renderer_args(kwargs); res = codeflash_output # 260μs -> 1.86μs (13892% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from __future__ import annotations

from typing import Any, TypeAlias

# imports
import pytest  # used for our unit tests
from bokeh.models import ColumnDataSource
from bokeh.plotting._renderer import _pop_renderer_args

# function to test
#-----------------------------------------------------------------------------
# Copyright (c) Anaconda, Inc., and Bokeh Contributors.
# All rights reserved.
#
# The full license is in the file LICENSE.txt, distributed with this software.
#-----------------------------------------------------------------------------


RENDERER_ARGS = ['name', 'coordinates', 'x_range_name', 'y_range_name',
                 'level', 'view', 'visible', 'muted']

Attrs: TypeAlias = dict[str, Any]

#-----------------------------------------------------------------------------
# Unit Tests
#-----------------------------------------------------------------------------

# Basic Test Cases

def test_basic_all_renderer_args_present():
    # All renderer args and source are present
    source = ColumnDataSource(data={'x': [1,2,3]})
    kwargs = {
        'name': 'my_renderer',
        'coordinates': 'data',
        'x_range_name': 'x_range',
        'y_range_name': 'y_range',
        'level': 'glyph',
        'view': 'view_obj',
        'visible': True,
        'muted': False,
        'source': source,
        'extra_arg': 42
    }
    codeflash_output = _pop_renderer_args(kwargs); result = codeflash_output # 261μs -> 2.28μs (11344% faster)
    # All renderer args should be present in result
    for key in RENDERER_ARGS:
        pass

def test_basic_some_renderer_args_missing():
    # Only some renderer args present, source present
    source = ColumnDataSource()
    kwargs = {
        'name': 'renderer_name',
        'visible': False,
        'source': source,
        'foo': 'bar'
    }
    codeflash_output = _pop_renderer_args(kwargs); result = codeflash_output # 258μs -> 1.84μs (13952% faster)
    # Missing keys should not be present
    for key in RENDERER_ARGS:
        if key not in kwargs:
            pass

def test_basic_no_renderer_args_present():
    # No renderer args, only source
    source = ColumnDataSource()
    kwargs = {'source': source, 'other': 123}
    codeflash_output = _pop_renderer_args(kwargs); result = codeflash_output # 256μs -> 1.47μs (17325% faster)

def test_basic_no_source_key():
    # No source key, should create new ColumnDataSource
    kwargs = {'name': 'abc', 'level': 'glyph'}
    codeflash_output = _pop_renderer_args(kwargs); result = codeflash_output # 296μs -> 293μs (1.11% faster)

# Edge Test Cases

def test_edge_empty_kwargs():
    # Empty dict
    kwargs = {}
    codeflash_output = _pop_renderer_args(kwargs); result = codeflash_output # 299μs -> 293μs (2.11% faster)

def test_edge_source_is_none():
    # Source is explicitly None
    kwargs = {'source': None}
    codeflash_output = _pop_renderer_args(kwargs); result = codeflash_output # 292μs -> 1.41μs (20662% faster)

def test_edge_non_standard_types():
    # Renderer args are non-standard types
    kwargs = {
        'name': 123,
        'coordinates': [1,2,3],
        'x_range_name': {'foo': 'bar'},
        'y_range_name': None,
        'level': 3.14,
        'view': object(),
        'visible': "yes",
        'muted': "",
        'source': "not_a_datasource"
    }
    codeflash_output = _pop_renderer_args(kwargs); result = codeflash_output # 298μs -> 2.02μs (14688% faster)

def test_edge_kwargs_with_overlapping_keys():
    # kwargs contains keys that overlap with renderer args but not in RENDERER_ARGS
    kwargs = {
        'name': 'renderer',
        'source': 'ds',
        'name_extra': 'should_remain'
    }
    codeflash_output = _pop_renderer_args(kwargs); result = codeflash_output # 299μs -> 1.58μs (18829% faster)

def test_edge_mutation_of_original_kwargs():
    # Ensure original dict is mutated (keys popped)
    kwargs = {'name': 'renderer', 'source': 'ds', 'foo': 'bar'}
    _pop_renderer_args(kwargs) # 294μs -> 1.56μs (18766% faster)

# Large Scale Test Cases

def test_large_kwargs_many_non_renderer_args():
    # Large number of non-renderer args
    kwargs = {f'key{i}': i for i in range(900)}
    # Add renderer args and source
    kwargs.update({
        'name': 'large_test',
        'level': 'glyph',
        'source': ColumnDataSource(data={'x': [1,2,3]})
    })
    codeflash_output = _pop_renderer_args(kwargs); result = codeflash_output # 262μs -> 2.22μs (11699% faster)
    # All non-renderer keys should remain
    for i in range(900):
        pass

def test_large_kwargs_all_renderer_args_and_many_extra():
    # All renderer args + many extra
    source = ColumnDataSource()
    kwargs = {key: f'value_{key}' for key in RENDERER_ARGS}
    kwargs['source'] = source
    # Add 500 extra keys
    kwargs.update({f'extra_{i}': i for i in range(500)})
    codeflash_output = _pop_renderer_args(kwargs); result = codeflash_output # 264μs -> 2.07μs (12672% faster)
    for key in RENDERER_ARGS:
        pass
    # All extra keys should remain
    for i in range(500):
        pass
    # Renderer args and source should be popped
    for key in RENDERER_ARGS:
        pass

def test_large_kwargs_only_extra_keys():
    # Only extra keys, no renderer args or source
    kwargs = {f'foo{i}': i for i in range(1000)}
    codeflash_output = _pop_renderer_args(kwargs); result = codeflash_output # 304μs -> 304μs (0.027% faster)
    # All extra keys should remain
    for i in range(1000):
        pass

def test_large_source_is_large_datasource():
    # Source is a large ColumnDataSource
    data = {f'col{i}': list(range(1000)) for i in range(10)}
    source = ColumnDataSource(data=data)
    kwargs = {'source': source}
    codeflash_output = _pop_renderer_args(kwargs); result = codeflash_output # 267μs -> 1.59μs (16706% faster)

def test_large_renderer_args_with_large_values():
    # Renderer args have large values
    large_list = list(range(999))
    large_dict = {str(i): i for i in range(999)}
    kwargs = {
        'name': 'big',
        'coordinates': large_list,
        'x_range_name': large_dict,
        'source': ColumnDataSource()
    }
    codeflash_output = _pop_renderer_args(kwargs); result = codeflash_output # 256μs -> 2.06μs (12310% faster)

# Edge: Ensure function does not pop keys not in RENDERER_ARGS or 'source'
def test_edge_does_not_pop_unrelated_keys():
    kwargs = {'foo': 1, 'bar': 2, 'baz': 3}
    codeflash_output = _pop_renderer_args(kwargs); result = codeflash_output # 296μs -> 298μs (0.794% slower)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from bokeh.plotting._renderer import _pop_renderer_args

def test__pop_renderer_args():
    _pop_renderer_args({'source': ''})
🔎 Concolic Coverage Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|--------------------------|-------------|--------------|---------|
| codeflash_concolic_cthbg6_3/tmpnk7ine9m/test_concolic_coverage.py::test__pop_renderer_args | 289μs | 1.42μs | 20318%✅ |

To edit these changes, run `git checkout codeflash/optimize-_pop_renderer_args-mhb5w7c6`, commit your edits, and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 28, 2025 22:52
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Oct 28, 2025
Comment on lines 13 to 14
from bokeh.models import ColumnDataSource

Suggested change
from bokeh.models import ColumnDataSource

from bokeh.models import ColumnDataSource

import logging # isort:skip

Suggested change

Signed-off-by: Saurabh Misra <misra.saurabh1@gmail.com>
Signed-off-by: Saurabh Misra <misra.saurabh1@gmail.com>