
Conversation


@KRRT7 KRRT7 commented Dec 3, 2025

PR Type

Enhancement, Tests, Bug fix


Description

  • Standardize on pytest for all tests

  • Remove unittest-specific timeout/decorator logic

  • Simplify runners and parsers for pytest-only flow

  • Add robust test file path resolution


Diagram Walkthrough

flowchart LR
  instr["Test instrumentation"] -- "drop timeout_decorator" --> pytestOnly["Pytest-only execution"]
  runner["Test runner"] -- "unify paths, args" --> pytestOnly
  parser["Result parsing"] -- "resolve class paths robustly" --> pytestOnly
  optimizer["Function optimizer"] -- "remove unittest loops" --> pytestOnly
  tests["Unit tests"] -- "update expectations, add cases" --> pytestOnly

File Walkthrough

Relevant files
Enhancement (4 files)

  • instrument_existing_tests.py: Remove unittest timeout decorators and imports (+0/-54)
  • function_optimizer.py: Unify benchmarking/profiling to pytest; simplify loops (+49/-104)
  • parse_test_output.py: Add robust pytest classname-to-file resolver; unify timeout parsing (+62/-27)
  • test_runner.py: Consolidate runners to pytest for all frameworks; remove unittest runner (+19/-97)

Tests (3 files)

  • test_code_utils.py: Add tests for resolving pytest unittest class paths (+129/-0)
  • test_instrument_tests.py: Update expectations to remove timeout_decorator usage (+17/-28)
  • test_test_runner.py: Adjust env and invocation for pytest-based unittest run (+9/-1)

Configuration changes (1 file)

  • e2e-bubblesort-unittest.yaml: Add timeout_decorator install for CI compatibility (+1/-0)

Dependencies (1 file)

  • pyproject.toml: Drop unittest and timeout-decorator deps; keep pytest-timeout (+0/-2)


github-actions bot commented Dec 3, 2025

PR Reviewer Guide 🔍

(Review updated until commit 1c9e886)

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 PR contains tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Robustness

The new resolve_test_file_from_class_path does progressive prefix stripping; verify it cannot mistakenly resolve to an incorrect test file when multiple candidates exist. Consider short-circuiting once a match under the intended tests root is found or validating uniqueness.

def resolve_test_file_from_class_path(test_class_path: str, base_dir: Path) -> Path | None:
    """Resolve test file path from pytest's test class path.

    This function handles various cases where pytest's classname in JUnit XML
    includes parent directories that may already be part of base_dir.

    Args:
        test_class_path: The full class path from pytest (e.g., "project.tests.test_file.TestClass")
        base_dir: The base directory for tests (tests project root)

    Returns:
        Path to the test file if found, None otherwise

    Examples:
        >>> # base_dir = "/path/to/tests"
        >>> # test_class_path = "code_to_optimize.tests.unittest.test_file.TestClass"
        >>> # Should find: /path/to/tests/unittest/test_file.py

    """
    # First try the full path
    test_file_path = file_name_from_test_module_name(test_class_path, base_dir)

    # If we couldn't find the file, try stripping the last component (likely a class name)
    # This handles cases like "module.TestClass" where TestClass is a class, not a module
    if test_file_path is None and "." in test_class_path:
        module_without_class = ".".join(test_class_path.split(".")[:-1])
        test_file_path = file_name_from_test_module_name(module_without_class, base_dir)

    # If still not found, progressively strip prefix components
    # This handles cases where pytest's classname includes parent directories that are
    # already part of base_dir (e.g., "project.tests.unittest.test_file.TestClass"
    # when base_dir is "/.../tests")
    if test_file_path is None:
        parts = test_class_path.split(".")
        # Try stripping 1, 2, 3, ... prefix components
        for num_to_strip in range(1, len(parts)):
            remaining = ".".join(parts[num_to_strip:])
            test_file_path = file_name_from_test_module_name(remaining, base_dir)
            if test_file_path:
                break
            # Also try without the last component (class name)
            if "." in remaining:
                remaining_no_class = ".".join(remaining.split(".")[:-1])
                test_file_path = file_name_from_test_module_name(remaining_no_class, base_dir)
                if test_file_path:
                    break

    return test_file_path
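One way to act on the reviewer's uniqueness concern, sketched here as a hypothetical helper that is not part of the PR, is to collect every candidate a stripping pass produces and refuse to resolve when more than one distinct file matches:

```python
from __future__ import annotations

from pathlib import Path


def resolve_unique(candidates: list[Path | None]) -> Path | None:
    """Return the single resolved path, or None when zero or multiple
    distinct candidates matched (i.e., the resolution is ambiguous)."""
    found = {c for c in candidates if c is not None}
    return found.pop() if len(found) == 1 else None
```

The stripping loop would then gather results from each prefix length into a list and pass it through this check instead of breaking on the first hit, trading a little extra lookup work for a guarantee against silently picking the wrong test file.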
Timeout Detection

The unified timeout parsing now checks for both "failed: timeout >" and "timed out". Ensure this covers all pytest-timeout messages across versions and locales; otherwise, timeouts may be misclassified. Consider more resilient matching or relying on standardized markers.

loop_index = int(testcase.name.split("[ ")[-1][:-2]) if testcase.name and "[" in testcase.name else 1

timed_out = False
if len(testcase.result) > 1:
    logger.debug(f"!!!!!Multiple results for {testcase.name or '<None>'} in {test_xml_file_path}!!!")
if len(testcase.result) == 1:
    message = testcase.result[0].message.lower()
    if "failed: timeout >" in message or "timed out" in message:
        timed_out = True

sys_stdout = testcase.system_out or ""
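A more resilient match, along the lines the reviewer suggests, could centralize the known phrasings in one case-insensitive regex. The pattern below is an illustrative sketch covering the two substrings the PR checks for, not an exhaustive catalog of pytest-timeout messages:

```python
from __future__ import annotations

import re

# Phrasings observed from pytest-timeout failures (illustrative, not exhaustive):
#   "Failed: Timeout >5.0s"   - signal/thread timeout methods
#   "... timed out ..."       - other timeout-related messages
TIMEOUT_PATTERN = re.compile(r"failed:\s*timeout\s*>|timed\s+out", re.IGNORECASE)


def is_timeout_message(message: str | None) -> bool:
    """Return True when a JUnit result message indicates a timeout kill."""
    return bool(message and TIMEOUT_PATTERN.search(message))
```

Keeping the pattern in one named constant also makes it easy to extend if a future pytest-timeout version changes its wording.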
Framework Flag Handling

Several branches now accept both "pytest" and "unittest" but always execute pytest. Ensure upstream components no longer pass other values and that environment flags (e.g., PYTEST_PLUGINS) don’t break legacy unittest-only suites.

def run_behavioral_tests(
    test_paths: TestFiles,
    test_framework: str,
    test_env: dict[str, str],
    cwd: Path,
    *,
    pytest_timeout: int | None = None,
    pytest_cmd: str = "pytest",
    pytest_target_runtime_seconds: int = TOTAL_LOOPING_TIME_EFFECTIVE,
    enable_coverage: bool = False,
) -> tuple[Path, subprocess.CompletedProcess, Path | None, Path | None]:
    if test_framework in {"pytest", "unittest"}:
        test_files: list[str] = []
        for file in test_paths.test_files:
            if file.test_type == TestType.REPLAY_TEST:
                # Replay tests need specific test targeting because one file contains tests for multiple functions
                test_files.extend(
                    [
                        str(file.instrumented_behavior_file_path) + "::" + test.test_function
                        for test in file.tests_in_file
                    ]
                )
            else:
                test_files.append(str(file.instrumented_behavior_file_path))
        pytest_cmd_list = (
            shlex.split(f"{SAFE_SYS_EXECUTABLE} -m pytest", posix=IS_POSIX)
            if pytest_cmd == "pytest"
            else [SAFE_SYS_EXECUTABLE, "-m", *shlex.split(pytest_cmd, posix=IS_POSIX)]
        )
        test_files = list(set(test_files))  # remove multiple calls in the same test function
        common_pytest_args = [
            "--capture=tee-sys",
            "-q",
            "--codeflash_loops_scope=session",
            "--codeflash_min_loops=1",
            "--codeflash_max_loops=1",
            f"--codeflash_seconds={pytest_target_runtime_seconds}",
        ]
        if pytest_timeout is not None:
            common_pytest_args.insert(1, f"--timeout={pytest_timeout}")

        result_file_path = get_run_tmp_file(Path("pytest_results.xml"))
        result_args = [f"--junitxml={result_file_path.as_posix()}", "-o", "junit_logging=all"]

        pytest_test_env = test_env.copy()


github-actions bot commented Dec 3, 2025

PR Code Suggestions ✨

Latest suggestions up to 1c9e886
Explore these optional code suggestions:

Possible issue
Robustify loop index parsing

Parsing loop indices by naive string slicing is fragile and can crash if the test
name format changes. Use a regex to safely extract the trailing number pattern and
default to 1 when not present.

codeflash/verification/parse_test_output.py [326-327]

-loop_index = int(testcase.name.split("[ ")[-1][:-2]) if testcase.name and "[" in testcase.name else 1
+import re  # at top-level of this file
+...
+if testcase.name:
+    m = re.search(r"\[\s*(\d+)\s*\]$", testcase.name)
+    loop_index = int(m.group(1)) if m else 1
+else:
+    loop_index = 1
Suggestion importance[1-10]: 6


Why: The regex-based extraction would handle format variations more safely than brittle slicing in testcase.name. It’s a reasonable correctness hardening, though not critical.

Low
Make timeout flag appending robust

Inserting the timeout flag at a fixed index is brittle and may break if the base
args change. Build the args list conditionally and append the timeout in a
deterministic place to avoid index errors and unexpected ordering.

codeflash/verification/test_runner.py [61-71]

 common_pytest_args = [
     "--capture=tee-sys",
     "-q",
     "--codeflash_loops_scope=session",
     "--codeflash_min_loops=1",
     "--codeflash_max_loops=1",
     f"--codeflash_seconds={pytest_target_runtime_seconds}",
 ]
 if pytest_timeout is not None:
-    common_pytest_args.insert(1, f"--timeout={pytest_timeout}")
+    common_pytest_args.append(f"--timeout={pytest_timeout}")
Suggestion importance[1-10]: 5


Why: Replacing insert with append is a small robustness improvement and matches the nearby pattern used elsewhere (line profiling/benchmarking). Low risk, modest benefit; functionally equivalent.

Low
General
Clarify and standardize return contract

The function previously returned a possible line-profiler output file; callers still
expect two values but the first should be the JUnit XML path. Ensure all call sites
use the returned JUnit XML path and not a line-profiler file name to prevent
downstream parsing errors.

codeflash/verification/test_runner.py [171-182]

+# Ensure this function consistently returns the JUnit XML path for parsing
 return result_file_path, results
Suggestion importance[1-10]: 3


Why: The PR already standardizes this function to return (result_file_path, results); the suggestion reiterates the contract without proposing a change. It's accurate but low impact.

Low

Previous suggestions

Suggestions up to commit b507770
General
Replace assert with clear error

This hard assertion will crash when older code paths pass "unittest". Replace with a
defensive normalization or user-facing error to prevent abrupt termination. Provide
a clear message guiding the user that only pytest is supported now.

codeflash/tracing/replay_test.py [52]

-assert test_framework == "pytest"
+if test_framework != "pytest":
+    raise ValueError("Only pytest is supported for replay tests. Please switch your test framework to 'pytest'.")
Suggestion importance[1-10]: 7


Why: Replacing assert test_framework == "pytest" with a ValueError provides a clearer, non-optimized runtime check and user-facing message. It's accurate to the new pytest-only direction and improves error handling without changing behavior.

Medium
Add base dir fallback

Previously, unittest used project_root_path while pytest used tests_project_rootdir.
If any consumers still pass junit XML from outside tests_project_rootdir, path
resolution may fail. Ensure base_dir is set from tests_project_rootdir when
available, else fall back to project_root_path.

codeflash/verification/parse_test_output.py [213]

-base_dir = test_config.tests_project_rootdir
+base_dir = test_config.tests_project_rootdir or test_config.project_root_path
Suggestion importance[1-10]: 6


Why: The PR unconditionally sets base_dir = test_config.tests_project_rootdir. Adding a fallback to project_root_path can improve resilience when tests XML originates outside the tests root; it's contextually reasonable though not essential.

Low
Possible issue
Restore safe framework fallback

Falling back to detection from test files was removed, which can misconfigure
projects without explicit config files. Restore file-based detection as a fallback
when detected_framework is None to avoid silently defaulting to an incorrect
framework.

codeflash/cli_cmds/cmd_init.py [459]

-autodetected_test_framework = detected_framework
+autodetected_test_framework = detected_framework or "pytest"
Suggestion importance[1-10]: 4


Why: The snippet at line 459 sets autodetected_test_framework = detected_framework, removing the previous fallback. Proposing a default "pytest" is safer than nothing but may still mis-detect; it's a modest robustness improvement, not critical.

Low

restore

restore this too

Revert "first pass"

This reverts commit b507770.
@KRRT7 KRRT7 force-pushed the explicit-unittest branch from 2b2fd42 to db472bb on December 3, 2025 09:09
@github-actions github-actions bot added the workflow-modified This PR modifies GitHub Actions workflows label Dec 3, 2025
@KRRT7 KRRT7 changed the title from "remove unittest code" to "use pytest as the execution engine for all tests" Dec 3, 2025
@KRRT7 KRRT7 marked this pull request as ready for review December 3, 2025 11:56

github-actions bot commented Dec 3, 2025

Persistent review updated to latest commit 1c9e886

@KRRT7 KRRT7 requested a review from misrasaurabh1 December 3, 2025 19:23
- name: Install dependencies (CLI)
  run: |
    uv sync
    uv add timeout_decorator
Contributor


add to toml and update lock file?

Contributor


oh my bad, we're removing this from the dependencies, do we need the workflow file then?
