-
Notifications
You must be signed in to change notification settings - Fork 1
Optimize streaming buffer trimming #698
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
Conversation
WalkthroughThe pull request rewrites the ToolCallRepairProcessor logic to explicitly manage per-stream buffers with robust UTF-8-safe trimming. It adds synthetic tool-call handling when closing tags are missing and updates state management for buffer persistence. Previously disabled tests for the repair processor are re-enabled. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Processor as ToolCallRepairProcessor
participant Buffer as Per-Stream Buffer
participant Detector as Tool-Call Detector
Client->>Processor: process(event, stream_id)
Processor->>Buffer: retrieve(stream_id)
Note over Processor: Accumulate chunk<br/>Handle reasoning
Processor->>Detector: detect tool_calls()
alt Tool calls detected
Detector-->>Processor: tool_calls, pending_text
Processor->>Buffer: append to buffer
Note over Processor,Buffer: UTF-8-safe trim<br/>if exceeds max
else No tool calls
Processor->>Buffer: accumulate
end
alt content.done event
Processor->>Processor: Attempt synthetic<br/>repair if needed
Note over Processor: Check for pending<br/>segments & closing tags
end
alt is_done or is_cancellation
Processor->>Buffer: persist or drop state
end
Processor-->>Client: emit event + metadata
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Warning Review ran into problems🔥 ProblemsGit: Failed to clone repository. Please run the Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/core/services/streaming/tool_call_repair_processor.py(2 hunks)tests/unit/core/services/test_tool_call_repair.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Avoid using emojis in Python code comments or docstrings
Follow PEP 8 and use type hints for all functions
Import order must be: standard library, third-party, then local imports, separated by blank lines
Naming conventions: snake_case for variables and functions; PascalCase for classes
Error handling: use specific exceptions and include meaningful error messages
Prefer f-strings for string formatting
Files:
src/core/services/streaming/tool_call_repair_processor.pytests/unit/core/services/test_tool_call_repair.py
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Catch specific exceptions; avoid broad except Exception blocks
If a broad exception must be caught, log with exc_info=True and re-raise a specific custom exception
Use the most specific exception class from src.core.common.exceptions that accurately describes the error
Provide clear, helpful error messages and include relevant details in the details dictionary of custom exceptions
Files:
src/core/services/streaming/tool_call_repair_processor.py
| if not detected_tool_calls: | ||
| trimmed = self._trim_buffer(state["pending_text"]) | ||
| if trimmed: | ||
| repaired_content_parts.append(trimmed) | ||
| state["pending_text"] = state["pending_text"][len(trimmed) :] | ||
|
|
||
| if content.is_done: | ||
| pending_text = state["pending_text"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Restore pass-through for plain text streaming.
Because _trim_buffer only returns data once pending_text exceeds self._max_buffer_bytes, this branch now emits nothing for ordinary chunks that do not contain tool calls. With the default cap (~64 KiB), plain-text streams return empty chunks until the terminal is_done flush, effectively disabling streaming for normal responses. Please reinstate immediate pass-through for non-tool-call text (keeping only a minimal tail for future detection) so we do not regress streaming behaviour.
🤖 Prompt for AI Agents
In src/core/services/streaming/tool_call_repair_processor.py around lines 97 to
104, the branch that handles "no detected_tool_calls" currently relies on
_trim_buffer and therefore emits nothing until pending_text exceeds
_max_buffer_bytes; restore immediate pass-through for ordinary text by emitting
all but a small tail used for future detection: if pending_text length <=
self._max_buffer_bytes, append the whole pending_text to repaired_content_parts
and clear state["pending_text"]; otherwise append
pending_text[:-self._max_buffer_bytes] to repaired_content_parts and set
state["pending_text"] to the last self._max_buffer_bytes characters; keep the
existing behavior for content.is_done (flush remaining pending_text).
Summary
Testing
Codex Task
Summary by CodeRabbit
Bug Fixes
Tests