
Conversation


@KRRT7 KRRT7 commented Dec 3, 2025

PR Type

Enhancement, Tests, Bug fix


Description

  • Standardize on pytest for all tests

  • Remove unittest-specific timeout/decorator logic

  • Simplify runners and parsers for pytest-only flow

  • Add robust test file path resolution


Diagram Walkthrough

flowchart LR
  instr["Test instrumentation"] -- "drop timeout_decorator" --> pytestOnly["Pytest-only execution"]
  runner["Test runner"] -- "unify paths, args" --> pytestOnly
  parser["Result parsing"] -- "resolve class paths robustly" --> pytestOnly
  optimizer["Function optimizer"] -- "remove unittest loops" --> pytestOnly
  tests["Unit tests"] -- "update expectations, add cases" --> pytestOnly

File Walkthrough

Relevant files
Enhancement (4 files)

  • instrument_existing_tests.py: Remove unittest timeout decorators and imports (+0/-54)
  • function_optimizer.py: Unify benchmarking/profiling to pytest; simplify loops (+49/-104)
  • parse_test_output.py: Add robust pytest classname-to-file resolver; unify timeout parsing (+62/-27)
  • test_runner.py: Consolidate runners to pytest for all frameworks; remove unittest runner (+19/-97)

Tests (3 files)

  • test_code_utils.py: Add tests for resolving pytest unittest class paths (+129/-0)
  • test_instrument_tests.py: Update expectations to remove timeout_decorator usage (+17/-28)
  • test_test_runner.py: Adjust env and invocation for pytest-based unittest run (+9/-1)

Configuration changes (1 file)

  • e2e-bubblesort-unittest.yaml: Add timeout_decorator install for CI compatibility (+1/-0)

Dependencies (1 file)

  • pyproject.toml: Drop unittest and timeout-decorator deps; keep pytest-timeout (+0/-2)


github-actions bot commented Dec 3, 2025

PR Reviewer Guide 🔍

(Review updated until commit 1c9e886)

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 PR contains tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Robustness

The new resolve_test_file_from_class_path does progressive prefix stripping; verify it cannot mistakenly resolve to an incorrect test file when multiple candidates exist. Consider short-circuiting once a match under the intended tests root is found or validating uniqueness.

def resolve_test_file_from_class_path(test_class_path: str, base_dir: Path) -> Path | None:
    """Resolve test file path from pytest's test class path.

    This function handles various cases where pytest's classname in JUnit XML
    includes parent directories that may already be part of base_dir.

    Args:
        test_class_path: The full class path from pytest (e.g., "project.tests.test_file.TestClass")
        base_dir: The base directory for tests (tests project root)

    Returns:
        Path to the test file if found, None otherwise

    Examples:
        >>> # base_dir = "/path/to/tests"
        >>> # test_class_path = "code_to_optimize.tests.unittest.test_file.TestClass"
        >>> # Should find: /path/to/tests/unittest/test_file.py

    """
    # First try the full path
    test_file_path = file_name_from_test_module_name(test_class_path, base_dir)

    # If we couldn't find the file, try stripping the last component (likely a class name)
    # This handles cases like "module.TestClass" where TestClass is a class, not a module
    if test_file_path is None and "." in test_class_path:
        module_without_class = ".".join(test_class_path.split(".")[:-1])
        test_file_path = file_name_from_test_module_name(module_without_class, base_dir)

    # If still not found, progressively strip prefix components
    # This handles cases where pytest's classname includes parent directories that are
    # already part of base_dir (e.g., "project.tests.unittest.test_file.TestClass"
    # when base_dir is "/.../tests")
    if test_file_path is None:
        parts = test_class_path.split(".")
        # Try stripping 1, 2, 3, ... prefix components
        for num_to_strip in range(1, len(parts)):
            remaining = ".".join(parts[num_to_strip:])
            test_file_path = file_name_from_test_module_name(remaining, base_dir)
            if test_file_path:
                break
            # Also try without the last component (class name)
            if "." in remaining:
                remaining_no_class = ".".join(remaining.split(".")[:-1])
                test_file_path = file_name_from_test_module_name(remaining_no_class, base_dir)
                if test_file_path:
                    break

    return test_file_path
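One way to act on the reviewer's uniqueness concern, sketched here as a hypothetical helper that is not part of the PR, is to collect every candidate a stripping pass produces and refuse to resolve when more than one distinct file matches:

```python
from __future__ import annotations

from pathlib import Path


def resolve_unique(candidates: list[Path | None]) -> Path | None:
    """Return the single resolved path, or None when zero or multiple
    distinct candidates matched (i.e., the resolution is ambiguous)."""
    found = {c for c in candidates if c is not None}
    return found.pop() if len(found) == 1 else None
```

The stripping loop would then gather results from each prefix length into a list and pass it through this check instead of breaking on the first hit, trading a little extra lookup work for a guarantee against silently picking the wrong test file.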
Timeout Detection

The unified timeout parsing now checks for both "failed: timeout >" and "timed out". Ensure this covers all pytest-timeout messages across versions and locales; otherwise, timeouts may be misclassified. Consider more resilient matching or relying on standardized markers.

loop_index = int(testcase.name.split("[ ")[-1][:-2]) if testcase.name and "[" in testcase.name else 1

timed_out = False
if len(testcase.result) > 1:
    logger.debug(f"!!!!!Multiple results for {testcase.name or '<None>'} in {test_xml_file_path}!!!")
if len(testcase.result) == 1:
    message = testcase.result[0].message.lower()
    if "failed: timeout >" in message or "timed out" in message:
        timed_out = True

sys_stdout = testcase.system_out or ""
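A more resilient match, along the lines the reviewer suggests, could centralize the known phrasings in one case-insensitive regex. The pattern below is an illustrative sketch covering the two substrings the PR checks for, not an exhaustive catalog of pytest-timeout messages:

```python
from __future__ import annotations

import re

# Phrasings observed from pytest-timeout failures (illustrative, not exhaustive):
#   "Failed: Timeout >5.0s"   - signal/thread timeout methods
#   "... timed out ..."       - other timeout-related messages
TIMEOUT_PATTERN = re.compile(r"failed:\s*timeout\s*>|timed\s+out", re.IGNORECASE)


def is_timeout_message(message: str | None) -> bool:
    """Return True when a JUnit result message indicates a timeout kill."""
    return bool(message and TIMEOUT_PATTERN.search(message))
```

Keeping the pattern in one named constant also makes it easy to extend if a future pytest-timeout version changes its wording.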
Framework Flag Handling

Several branches now accept both "pytest" and "unittest" but always execute pytest. Ensure upstream components no longer pass other values and that environment flags (e.g., PYTEST_PLUGINS) don’t break legacy unittest-only suites.

def run_behavioral_tests(
    test_paths: TestFiles,
    test_framework: str,
    test_env: dict[str, str],
    cwd: Path,
    *,
    pytest_timeout: int | None = None,
    pytest_cmd: str = "pytest",
    pytest_target_runtime_seconds: int = TOTAL_LOOPING_TIME_EFFECTIVE,
    enable_coverage: bool = False,
) -> tuple[Path, subprocess.CompletedProcess, Path | None, Path | None]:
    if test_framework in {"pytest", "unittest"}:
        test_files: list[str] = []
        for file in test_paths.test_files:
            if file.test_type == TestType.REPLAY_TEST:
                # Replay tests need specific test targeting because one file contains tests for multiple functions
                test_files.extend(
                    [
                        str(file.instrumented_behavior_file_path) + "::" + test.test_function
                        for test in file.tests_in_file
                    ]
                )
            else:
                test_files.append(str(file.instrumented_behavior_file_path))
        pytest_cmd_list = (
            shlex.split(f"{SAFE_SYS_EXECUTABLE} -m pytest", posix=IS_POSIX)
            if pytest_cmd == "pytest"
            else [SAFE_SYS_EXECUTABLE, "-m", *shlex.split(pytest_cmd, posix=IS_POSIX)]
        )
        test_files = list(set(test_files))  # remove multiple calls in the same test function
        common_pytest_args = [
            "--capture=tee-sys",
            "-q",
            "--codeflash_loops_scope=session",
            "--codeflash_min_loops=1",
            "--codeflash_max_loops=1",
            f"--codeflash_seconds={pytest_target_runtime_seconds}",
        ]
        if pytest_timeout is not None:
            common_pytest_args.insert(1, f"--timeout={pytest_timeout}")

        result_file_path = get_run_tmp_file(Path("pytest_results.xml"))
        result_args = [f"--junitxml={result_file_path.as_posix()}", "-o", "junit_logging=all"]

        pytest_test_env = test_env.copy()


github-actions bot commented Dec 3, 2025

PR Code Suggestions ✨

Latest suggestions up to 1c9e886
Explore these optional code suggestions:

Possible issue
Robustify loop index parsing

Parsing loop indices by naive string slicing is fragile and can crash if the test
name format changes. Use a regex to safely extract the trailing number pattern and
default to 1 when not present.

codeflash/verification/parse_test_output.py [326-327]

-loop_index = int(testcase.name.split("[ ")[-1][:-2]) if testcase.name and "[" in testcase.name else 1
+import re  # at top-level of this file
+...
+if testcase.name:
+    m = re.search(r"\[\s*(\d+)\s*\]$", testcase.name)
+    loop_index = int(m.group(1)) if m else 1
+else:
+    loop_index = 1
Suggestion importance[1-10]: 6


Why: The regex-based extraction would handle format variations more safely than brittle slicing in testcase.name. It’s a reasonable correctness hardening, though not critical.

Low
Make timeout flag appending robust

Inserting the timeout flag at a fixed index is brittle and may break if the base
args change. Build the args list conditionally and append the timeout in a
deterministic place to avoid index errors and unexpected ordering.

codeflash/verification/test_runner.py [61-71]

 common_pytest_args = [
     "--capture=tee-sys",
     "-q",
     "--codeflash_loops_scope=session",
     "--codeflash_min_loops=1",
     "--codeflash_max_loops=1",
     f"--codeflash_seconds={pytest_target_runtime_seconds}",
 ]
 if pytest_timeout is not None:
-    common_pytest_args.insert(1, f"--timeout={pytest_timeout}")
+    common_pytest_args.append(f"--timeout={pytest_timeout}")
Suggestion importance[1-10]: 5


Why: Replacing insert with append is a small robustness improvement and matches the nearby pattern used elsewhere (line profiling/benchmarking). Low risk, modest benefit; functionally equivalent.

Low
General
Clarify and standardize return contract

The function previously returned a possible line-profiler output file; callers still
expect two values but the first should be the JUnit XML path. Ensure all call sites
use the returned JUnit XML path and not a line-profiler file name to prevent
downstream parsing errors.

codeflash/verification/test_runner.py [171-182]

+# Ensure this function consistently returns the JUnit XML path for parsing
 return result_file_path, results
Suggestion importance[1-10]: 3


Why: The PR already standardizes this function to return (result_file_path, results); the suggestion reiterates the contract without proposing a change. It's accurate but low impact.

Low

Previous suggestions

Suggestions up to commit b507770
General
Replace assert with clear error

This hard assertion will crash when older code paths pass "unittest". Replace with a
defensive normalization or user-facing error to prevent abrupt termination. Provide
a clear message guiding the user that only pytest is supported now.

codeflash/tracing/replay_test.py [52]

-assert test_framework == "pytest"
+if test_framework != "pytest":
+    raise ValueError("Only pytest is supported for replay tests. Please switch your test framework to 'pytest'.")
Suggestion importance[1-10]: 7


Why: Replacing assert test_framework == "pytest" with a ValueError provides a clearer, non-optimized runtime check and user-facing message. It's accurate to the new pytest-only direction and improves error handling without changing behavior.

Medium
Add base dir fallback

Previously, unittest used project_root_path while pytest used tests_project_rootdir.
If any consumers still pass junit XML from outside tests_project_rootdir, path
resolution may fail. Ensure base_dir is set from tests_project_rootdir when
available, else fall back to project_root_path.

codeflash/verification/parse_test_output.py [213]

-base_dir = test_config.tests_project_rootdir
+base_dir = test_config.tests_project_rootdir or test_config.project_root_path
Suggestion importance[1-10]: 6


Why: The PR unconditionally sets base_dir = test_config.tests_project_rootdir. Adding a fallback to project_root_path can improve resilience when tests XML originates outside the tests root; it's contextually reasonable though not essential.

Low
Possible issue
Restore safe framework fallback

Falling back to detection from test files was removed, which can misconfigure
projects without explicit config files. Restore file-based detection as a fallback
when detected_framework is None to avoid silently defaulting to an incorrect
framework.

codeflash/cli_cmds/cmd_init.py [459]

-autodetected_test_framework = detected_framework
+autodetected_test_framework = detected_framework or "pytest"
Suggestion importance[1-10]: 4


Why: The snippet at line 459 sets autodetected_test_framework = detected_framework, removing the previous fallback. Proposing a default "pytest" is safer than nothing but may still mis-detect; it's a modest robustness improvement, not critical.

Low

restore

restore this too

Revert "first pass"

This reverts commit b507770.
@KRRT7 KRRT7 force-pushed the explicit-unittest branch from 2b2fd42 to db472bb on December 3, 2025 09:09
@github-actions github-actions bot added the workflow-modified This PR modifies GitHub Actions workflows label Dec 3, 2025
@KRRT7 KRRT7 changed the title from "remove unittest code" to "use pytest as the execution engine for all tests" Dec 3, 2025
@KRRT7 KRRT7 marked this pull request as ready for review December 3, 2025 11:56

github-actions bot commented Dec 3, 2025

Persistent review updated to latest commit 1c9e886

@KRRT7 KRRT7 requested a review from misrasaurabh1 December 3, 2025 19:23
- name: Install dependencies (CLI)
  run: |
    uv sync
    uv add timeout_decorator
Contributor


add to toml and update lock file?

Contributor


oh my bad, we're removing this from the dependencies, do we need the workflow file then?
