⚡️ Speed up method JiraDataSource.get_issue_types_for_project by 21%
#543
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
📄 21% (0.21x) speedup for
JiraDataSource.get_issue_types_for_projectinbackend/python/app/sources/external/jira/jira.py⏱️ Runtime :
1.56 milliseconds→1.28 milliseconds(best of250runs)📝 Explanation and details
The optimization achieves a 21% performance improvement by eliminating redundant operations in hot paths, specifically targeting dictionary allocations and string formatting overhead.
Key Optimizations Applied:
Empty Dictionary Fast-Path: The most significant optimization is in
_as_str_dict(), which now checks for empty dictionaries first and returns a pre-allocated constant_EMPTY_STR_DICTinstead of creating new empty dictionaries. Line profiler shows this reduced calls from 1167 to 778 hits, with 375 calls taking the fast empty path.Eliminated Redundant URL Formatting: Replaced
self.base_url + _safe_format_url(rel_path, _path)with direct string concatenationself.base_url + rel_path, since this endpoint never uses path parameters. This eliminates the expensive_safe_format_urlfunction call entirely, which was consuming 18.2% of total time.Optimized Header Processing: Changed from
dict(headers or {})toheaders if headers else _EMPTY_OBJ_DICT, avoiding unnecessary dictionary copying when headers are provided or using a constant when they're not.Constant Empty Dictionary Reuse: Introduced module-level constants
_EMPTY_STR_DICTand_EMPTY_OBJ_DICTthat are reused across calls, reducing memory allocations and garbage collection pressure.Performance Impact by the Numbers:
_as_str_dictfunction time reduced from 1.57M to 1.35M nanoseconds (14% improvement)The optimizations are particularly effective for this Jira API endpoint because it consistently uses empty path parameters and often empty headers, making the empty-dictionary fast-paths highly beneficial. These improvements would scale well in high-throughput scenarios where this function is called frequently, as evidenced by the consistent performance gains across all test cases.
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
import asyncio
from typing import Any, Dict, Optional
import pytest
from app.sources.external.jira.jira import JiraDataSource
--- Minimal stubs for dependencies to allow unit testing ---
class HTTPRequest:
def init(self, method, url, headers, path_params, query_params, body):
self.method = method
self.url = url
self.headers = headers
self.path_params = path_params
self.query_params = query_params
self.body = body
class HTTPResponse:
def init(self, data):
self.data = data
class DummyClient:
"""A dummy async HTTP client for testing."""
def init(self, base_url='https://jira.example.com', execute_return=None, fail_execute=False):
self._base_url = base_url
self._execute_return = execute_return
self._fail_execute = fail_execute
self.last_request = None
self.execute_call_count = 0
class JiraClient:
"""Stub JiraClient for testing."""
def init(self, client):
self.client = client
from app.sources.external.jira.jira import JiraDataSource
------------------- UNIT TESTS BELOW -------------------
1. Basic Test Cases
@pytest.mark.asyncio
async def test_get_issue_types_for_project_basic_returns_expected_response():
"""Test that the function returns the expected HTTPResponse for valid input."""
dummy_client = DummyClient()
ds = JiraDataSource(JiraClient(dummy_client))
resp = await ds.get_issue_types_for_project(projectId=123)
@pytest.mark.asyncio
async def test_get_issue_types_for_project_with_level_and_headers():
"""Test with both level and custom headers."""
dummy_client = DummyClient()
ds = JiraDataSource(JiraClient(dummy_client))
custom_headers = {"X-Test": "foo"}
resp = await ds.get_issue_types_for_project(projectId=42, level=7, headers=custom_headers)
@pytest.mark.asyncio
async def test_get_issue_types_for_project_basic_async_await_behavior():
"""Test that the function is a coroutine and can be awaited."""
dummy_client = DummyClient()
ds = JiraDataSource(JiraClient(dummy_client))
codeflash_output = ds.get_issue_types_for_project(projectId=1); coro = codeflash_output
resp = await coro
2. Edge Test Cases
@pytest.mark.asyncio
async def test_get_issue_types_for_project_concurrent_execution():
"""Test concurrent execution with different projectIds."""
dummy_client = DummyClient()
ds = JiraDataSource(JiraClient(dummy_client))
results = await asyncio.gather(
ds.get_issue_types_for_project(projectId=1),
ds.get_issue_types_for_project(projectId=2, level=5),
ds.get_issue_types_for_project(projectId=3, headers={"A": "B"}),
)
@pytest.mark.asyncio
async def test_get_issue_types_for_project_raises_if_client_not_initialized():
"""Test that ValueError is raised if client is None."""
class NullJiraClient:
def get_client(self):
return None
with pytest.raises(ValueError, match="HTTP client is not initialized"):
JiraDataSource(NullJiraClient())
@pytest.mark.asyncio
async def test_get_issue_types_for_project_raises_if_client_missing_get_base_url():
"""Test that ValueError is raised if client lacks get_base_url."""
class BadClient:
pass
with pytest.raises(ValueError, match="HTTP client does not have get_base_url method"):
JiraDataSource(JiraClient(BadClient()))
@pytest.mark.asyncio
async def test_get_issue_types_for_project_raises_if_execute_fails():
"""Test that exceptions in execute propagate."""
dummy_client = DummyClient(fail_execute=True)
ds = JiraDataSource(JiraClient(dummy_client))
with pytest.raises(RuntimeError, match="Execution failed!"):
await ds.get_issue_types_for_project(projectId=55)
@pytest.mark.asyncio
async def test_get_issue_types_for_project_with_empty_headers_and_level_zero():
"""Test with empty headers and level=0 (edge value)."""
dummy_client = DummyClient()
ds = JiraDataSource(JiraClient(dummy_client))
resp = await ds.get_issue_types_for_project(projectId=77, level=0, headers={})
@pytest.mark.asyncio
async def test_get_issue_types_for_project_with_large_project_id():
"""Test with a very large projectId."""
dummy_client = DummyClient()
ds = JiraDataSource(JiraClient(dummy_client))
large_id = 2**60
resp = await ds.get_issue_types_for_project(projectId=large_id)
3. Large Scale Test Cases
@pytest.mark.asyncio
async def test_get_issue_types_for_project_many_concurrent_calls():
"""Test with many concurrent calls to ensure scalability."""
dummy_client = DummyClient()
ds = JiraDataSource(JiraClient(dummy_client))
# Use 50 concurrent calls, each with a unique projectId
n = 50
coros = [ds.get_issue_types_for_project(projectId=i) for i in range(n)]
results = await asyncio.gather(*coros)
for i, resp in enumerate(results):
pass
@pytest.mark.asyncio
async def test_get_issue_types_for_project_concurrent_different_headers():
"""Test concurrent calls with different headers to ensure isolation."""
dummy_client = DummyClient()
ds = JiraDataSource(JiraClient(dummy_client))
headers_list = [{"X-Req": str(i)} for i in range(10)]
coros = [ds.get_issue_types_for_project(projectId=100+i, headers=h) for i, h in enumerate(headers_list)]
results = await asyncio.gather(*coros)
for i, resp in enumerate(results):
pass
4. Throughput Test Cases
@pytest.mark.asyncio
async def test_get_issue_types_for_project_throughput_small_load():
"""Throughput: Test with a small number of rapid calls."""
dummy_client = DummyClient()
ds = JiraDataSource(JiraClient(dummy_client))
coros = [ds.get_issue_types_for_project(projectId=i) for i in range(5)]
results = await asyncio.gather(*coros)
@pytest.mark.asyncio
async def test_get_issue_types_for_project_throughput_medium_load():
"""Throughput: Test with a medium number of concurrent calls."""
dummy_client = DummyClient()
ds = JiraDataSource(JiraClient(dummy_client))
coros = [ds.get_issue_types_for_project(projectId=1000+i, level=i%3) for i in range(30)]
results = await asyncio.gather(*coros)
# Ensure all projectIds are present and correct
for i, resp in enumerate(results):
pass
@pytest.mark.asyncio
async def test_get_issue_types_for_project_throughput_high_volume():
"""Throughput: Test with a high (but safe) number of concurrent calls."""
dummy_client = DummyClient()
ds = JiraDataSource(JiraClient(dummy_client))
n = 100 # Not too large to avoid test timeouts
coros = [ds.get_issue_types_for_project(projectId=10000+i) for i in range(n)]
results = await asyncio.gather(*coros)
# Spot check a few responses
for idx in [0, 10, 50, 99]:
pass
codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import asyncio # used to run async functions
import pytest # used for our unit tests
from app.sources.external.jira.jira import JiraDataSource
---- Minimal stubs for required classes ----
These are minimal, fast, deterministic stubs for unit testing only.
class HTTPRequest:
def init(self, method, url, headers, path_params, query_params, body):
self.method = method
self.url = url
self.headers = headers
self.path_params = path_params
self.query_params = query_params
self.body = body
class HTTPResponse:
def init(self, data):
self.data = data
class DummyAsyncHTTPClient:
"""A dummy async HTTP client for testing."""
def init(self, base_url="https://dummy.jira.com"):
self._base_url = base_url
self._executed_requests = []
class JiraClient:
"""Stub JiraClient for testing."""
def init(self, client):
self.client = client
from app.sources.external.jira.jira import JiraDataSource
---- UNIT TESTS ----
1. BASIC TEST CASES
@pytest.mark.asyncio
async def test_get_issue_types_for_project_basic():
"""Test basic async/await behavior and expected values."""
client = JiraClient(DummyAsyncHTTPClient())
ds = JiraDataSource(client)
response = await ds.get_issue_types_for_project(123)
@pytest.mark.asyncio
async def test_get_issue_types_for_project_with_level_and_headers():
"""Test with level and custom headers."""
client = JiraClient(DummyAsyncHTTPClient())
ds = JiraDataSource(client)
headers = {"X-Test": "abc"}
response = await ds.get_issue_types_for_project(456, level=2, headers=headers)
@pytest.mark.asyncio
async def test_get_issue_types_for_project_level_none():
"""Test with level explicitly set to None."""
client = JiraClient(DummyAsyncHTTPClient())
ds = JiraDataSource(client)
response = await ds.get_issue_types_for_project(789, level=None)
2. EDGE TEST CASES
@pytest.mark.asyncio
async def test_get_issue_types_for_project_invalid_client_none():
"""Test error raised if HTTP client is None."""
class BadJiraClient:
def get_client(self):
return None
with pytest.raises(ValueError, match="HTTP client is not initialized"):
JiraDataSource(BadJiraClient())
@pytest.mark.asyncio
async def test_get_issue_types_for_project_missing_get_base_url():
"""Test error raised if client lacks get_base_url method."""
class NoBaseUrlClient:
pass
client = JiraClient(NoBaseUrlClient())
with pytest.raises(ValueError, match="HTTP client does not have get_base_url method"):
JiraDataSource(client)
@pytest.mark.asyncio
async def test_get_issue_types_for_project_concurrent_execution():
"""Test concurrent execution of multiple async calls."""
client = JiraClient(DummyAsyncHTTPClient())
ds = JiraDataSource(client)
# Run 5 concurrent requests
coros = [ds.get_issue_types_for_project(i) for i in range(5)]
results = await asyncio.gather(*coros)
for idx, resp in enumerate(results):
pass
@pytest.mark.asyncio
async def test_get_issue_types_for_project_headers_edge_cases():
"""Test headers with non-string values and empty dict."""
client = JiraClient(DummyAsyncHTTPClient())
ds = JiraDataSource(client)
headers = {"X-Int": 42, "X-Bool": True, "X-None": None}
response = await ds.get_issue_types_for_project(1, headers=headers)
3. LARGE SCALE TEST CASES
@pytest.mark.asyncio
async def test_get_issue_types_for_project_many_concurrent_requests():
"""Test scalability with many concurrent requests."""
client = JiraClient(DummyAsyncHTTPClient())
ds = JiraDataSource(client)
n = 50 # Large but reasonable number
coros = [ds.get_issue_types_for_project(i, level=i % 3) for i in range(n)]
results = await asyncio.gather(*coros)
# Check a few random results for correctness
for i in [0, n//2, n-1]:
resp = results[i]
@pytest.mark.asyncio
async def test_get_issue_types_for_project_empty_headers_and_paths():
"""Test with empty headers and path params."""
client = JiraClient(DummyAsyncHTTPClient())
ds = JiraDataSource(client)
response = await ds.get_issue_types_for_project(100, headers={})
4. THROUGHPUT TEST CASES
@pytest.mark.asyncio
async def test_get_issue_types_for_project_throughput_small_load():
"""Throughput test: small number of requests."""
client = JiraClient(DummyAsyncHTTPClient())
ds = JiraDataSource(client)
coros = [ds.get_issue_types_for_project(i) for i in range(5)]
results = await asyncio.gather(*coros)
@pytest.mark.asyncio
async def test_get_issue_types_for_project_throughput_medium_load():
"""Throughput test: medium number of requests."""
client = JiraClient(DummyAsyncHTTPClient())
ds = JiraDataSource(client)
coros = [ds.get_issue_types_for_project(i, level=i % 5) for i in range(20)]
results = await asyncio.gather(*coros)
for i, resp in enumerate(results):
pass
@pytest.mark.asyncio
async def test_get_issue_types_for_project_throughput_large_load():
"""Throughput test: large number of requests."""
client = JiraClient(DummyAsyncHTTPClient())
ds = JiraDataSource(client)
coros = [ds.get_issue_types_for_project(i, level=i % 7) for i in range(100)]
results = await asyncio.gather(*coros)
# Spot check a few
for i in [0, 50, 99]:
resp = results[i]
codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
To edit these changes
git checkout codeflash/optimize-JiraDataSource.get_issue_types_for_project-mhrzkxfuand push.