Skip to content

Conversation

@matdev83
Copy link
Owner

@matdev83 matdev83 commented Nov 7, 2025

Summary

  • optimize the streaming tool-call repair buffer trimming by working on the encoded payload once and aligning the cutoff with UTF-8 boundaries
  • add a regression test to ensure multi-byte characters are preserved when trimming the buffer

Testing

  • python -m pytest

Codex Task

Summary by CodeRabbit

  • Bug Fixes

    • Improved streaming tool call processing with enhanced buffer management and UTF-8 safety.
    • Better handling of incomplete tool call repairs in streaming scenarios.
  • Tests

    • Re-enabled comprehensive test suite for tool call repair and streaming functionality.

@coderabbitai
Copy link

coderabbitai bot commented Nov 7, 2025

Walkthrough

The pull request rewrites the ToolCallRepairProcessor logic to explicitly manage per-stream buffers with robust UTF-8-safe trimming. It adds synthetic tool-call handling when closing tags are missing and updates state management for buffer persistence. Previously disabled tests for the repair processor are re-enabled.

Changes

Cohort / File(s) Summary
Core streaming tool-call repair processor
src/core/services/streaming/tool_call_repair_processor.py
Rewrites the process method to maintain explicit per-stream buffers with enhanced chunk accumulation and reasoning segment handling. Introduces UTF-8-safe buffer trimming via iterative byte-length reduction to avoid splitting multibyte characters. Adds synthetic tool-call repair logic at the content.done path. Updates state management for buffer persistence based on is_done/is_cancellation flags and pending text. Enriches metadata with detected tool calls and finish reason.
Tool-call repair test suite
tests/unit/core/services/test_tool_call_repair.py
Re-enables comprehensive test suite covering ToolCallRepairService and StreamingToolCallRepairProcessor behaviors, including fixtures, buffering, reasoning extraction, XML/MCP-wrapped tool calls, mixed message formats, integration scenarios, error handling, and performance checks.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Processor as ToolCallRepairProcessor
    participant Buffer as Per-Stream Buffer
    participant Detector as Tool-Call Detector

    Client->>Processor: process(event, stream_id)
    Processor->>Buffer: retrieve(stream_id)
    Note over Processor: Accumulate chunk<br/>Handle reasoning
    Processor->>Detector: detect tool_calls()
    alt Tool calls detected
        Detector-->>Processor: tool_calls, pending_text
        Processor->>Buffer: append to buffer
        Note over Processor,Buffer: UTF-8-safe trim<br/>if exceeds max
    else No tool calls
        Processor->>Buffer: accumulate
    end
    alt content.done event
        Processor->>Processor: Attempt synthetic<br/>repair if needed
        Note over Processor: Check for pending<br/>segments & closing tags
    end
    alt is_done or is_cancellation
        Processor->>Buffer: persist or drop state
    end
    Processor-->>Client: emit event + metadata
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Buffer trimming logic: UTF-8-safe byte-length calculation and iterative reduction requires careful validation to ensure no character corruption across diverse Unicode inputs.
  • State management: Per-stream buffer persistence logic tied to is_done/is_cancellation/pending_text conditions needs verification against all edge cases (cancellation mid-stream, concurrent streams).
  • Synthetic tool-call repair: New logic path at content.done that reconstructs tool calls requires tracing through detection, closing-tag validation, and metadata enrichment.
  • Test re-enablement scope: Large test suite re-activation—verify that all test cases reflect current processor behavior and catch regressions across buffering, reasoning, XML handling, and error paths.

Possibly related PRs

  • Prevent unbounded tool call repair buffering #576: Directly modifies the same ToolCallRepairProcessor streaming buffering logic, including UTF-8-safe trimming and buffer max-size handling—suggests coordinated or sequential changes to the same component.

Poem

🐰 Buffers dance with UTF-8 care,
Trimming bytes through the air,
Tool calls mend when tags are lost,
State preserved, stream lines crossed,
Tests awakened, logic bright,
Streaming repairs done just right!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 47.92% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Optimize streaming buffer trimming' directly reflects the main changes in the PR: UTF-8-safe buffer trimming optimization in the ToolCallRepairProcessor and re-enabled tests.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch codex/find-and-fix-major-performance-issues

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c03159e and 6d8563f.

📒 Files selected for processing (2)
  • src/core/services/streaming/tool_call_repair_processor.py (2 hunks)
  • tests/unit/core/services/test_tool_call_repair.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Avoid using emojis in Python code comments or docstrings
Follow PEP 8 and use type hints for all functions
Import order must be: standard library, third-party, then local imports, separated by blank lines
Naming conventions: snake_case for variables and functions; PascalCase for classes
Error handling: use specific exceptions and include meaningful error messages
Prefer f-strings for string formatting

Files:

  • src/core/services/streaming/tool_call_repair_processor.py
  • tests/unit/core/services/test_tool_call_repair.py
src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.py: Catch specific exceptions; avoid broad except Exception blocks
If a broad exception must be caught, log with exc_info=True and re-raise a specific custom exception
Use the most specific exception class from src.core.common.exceptions that accurately describes the error
Provide clear, helpful error messages and include relevant details in the details dictionary of custom exceptions

Files:

  • src/core/services/streaming/tool_call_repair_processor.py

Comment on lines +97 to 104
if not detected_tool_calls:
trimmed = self._trim_buffer(state["pending_text"])
if trimmed:
repaired_content_parts.append(trimmed)
state["pending_text"] = state["pending_text"][len(trimmed) :]

if content.is_done:
pending_text = state["pending_text"]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Restore pass-through for plain text streaming.

Because _trim_buffer only returns data once pending_text exceeds self._max_buffer_bytes, this branch now emits nothing for ordinary chunks that do not contain tool calls. With the default cap (~64 KiB), plain-text streams return empty chunks until the terminal is_done flush, effectively disabling streaming for normal responses. Please reinstate immediate pass-through for non-tool-call text (keeping only a minimal tail for future detection) so we do not regress streaming behaviour.

🤖 Prompt for AI Agents
In src/core/services/streaming/tool_call_repair_processor.py around lines 97 to
104, the branch that handles "no detected_tool_calls" currently relies on
_trim_buffer and therefore emits nothing until pending_text exceeds
_max_buffer_bytes; restore immediate pass-through for ordinary text by emitting
all but a small tail used for future detection: if pending_text length <=
self._max_buffer_bytes, append the whole pending_text to repaired_content_parts
and clear state["pending_text"]; otherwise append
pending_text[:-self._max_buffer_bytes] to repaired_content_parts and set
state["pending_text"] to the last self._max_buffer_bytes characters; keep the
existing behavior for content.is_done (flush remaining pending_text).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants