
Conversation


@drnikolaev drnikolaev commented Dec 2, 2025

Overview:

Details:

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

Summary by CodeRabbit

New Features

  • Enhanced stop sequence control: stop sequences can now be configured to be either included in or excluded from the final generated text, giving users finer control over output formatting for their specific use case.


@drnikolaev drnikolaev requested a review from a team as a code owner December 2, 2025 02:22

copy-pr-bot bot commented Dec 2, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.


coderabbitai bot commented Dec 2, 2025

Walkthrough

This PR threads an include_stop_str_in_output boolean parameter through the backend decoding pipeline, distinguishing between hidden and visible stop sequences. The Decoder struct now accepts this parameter during initialization, categorizes stop sequences accordingly, and implements logic to include stop strings in output when triggered, with corresponding updates to the decoding processing loop and related enum variants.
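The categorization step the walkthrough describes can be pictured with a small sketch. This is illustrative Rust, not the crate's actual code: the names (`Decoder`, `hidden_stop_sequences`, `visible_stop_sequences`, `include_stop_str_in_output`) mirror the review text, but the real constructor takes additional arguments.

```rust
// Illustrative sketch only -- names mirror the walkthrough, not the real API.
#[derive(Debug)]
struct Decoder {
    hidden_stop_sequences: Vec<String>,
    visible_stop_sequences: Vec<String>,
}

impl Decoder {
    // The new boolean routes every configured stop sequence into one of
    // two buckets: hidden sequences are stripped from the output,
    // visible ones are kept in it.
    fn new(stop_sequences: Vec<String>, include_stop_str_in_output: bool) -> Self {
        let (visible_stop_sequences, hidden_stop_sequences) = if include_stop_str_in_output {
            (stop_sequences, Vec::new())
        } else {
            (Vec::new(), stop_sequences)
        };
        Decoder {
            hidden_stop_sequences,
            visible_stop_sequences,
        }
    }
}

fn main() {
    let d = Decoder::new(vec!["</s>".to_string()], true);
    assert_eq!(d.visible_stop_sequences, vec!["</s>".to_string()]);
    assert!(d.hidden_stop_sequences.is_empty());
}
```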

Changes

Cohort / File(s): Summary

  • Benchmark decoder call sites (lib/llm/benches/tokenizer.rs): Updated Decoder::new call sites to pass a third boolean argument (false), aligning with the new method signature across all benchmark code paths.
  • Backend decoder implementation (lib/llm/src/backend.rs): Added include_stop_str_in_output parameter threading through Backend::decoder and Decoder::new. Introduced the visible_stop_sequences field on Decoder and the VisibleStopSequenceDetected variant on StopTrigger. Updated the stop sequence categorization logic, the output handling that includes visible stop strings, and the processing pipeline's distinction between hidden and visible sequence detection.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Logic complexity: The distinction between hidden and visible stop sequences introduces conditional branching and state management changes that require careful verification of correctness.
  • Integration points: Stop sequence handling is threaded through multiple layers (Backend, Decoder, processing loop), requiring validation of parameter propagation and state consistency.
  • Output behavior: Changes to jail/content handling and when stop strings are included demand careful review to ensure existing behavior remains intact for hidden sequences while new behavior works correctly for visible ones.
  • Areas requiring extra attention:
    • Verification that hidden_stop_sequences and visible_stop_sequences categorization logic correctly splits sequences based on the include_stop_str_in_output flag
    • Validation of the processing loop's dual-check logic (hidden first, then visible) to ensure correct precedence and termination behavior
    • Confirmation that jail_max_bytes computation incorporates both sequence sets appropriately
    • Testing of edge cases where visible and hidden sequences may overlap or interact
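The dual-check precedence flagged above can be sketched as a standalone helper. This is hypothetical code under the review's description (hidden checked before visible); the real implementation scans an incremental jail buffer rather than a complete string.

```rust
// Hypothetical sketch of the hidden-first precedence; not the crate's code.
#[derive(Debug, PartialEq)]
enum StopTrigger {
    HiddenStopSequenceDetected(String),
    VisibleStopSequenceDetected(String),
}

fn check_stop(text: &str, hidden: &[&str], visible: &[&str]) -> Option<StopTrigger> {
    // Hidden sequences are checked first, so a hidden match takes
    // precedence even when a visible sequence also appears in the chunk.
    for seq in hidden {
        if text.contains(seq) {
            return Some(StopTrigger::HiddenStopSequenceDetected(seq.to_string()));
        }
    }
    for seq in visible {
        if text.contains(seq) {
            return Some(StopTrigger::VisibleStopSequenceDetected(seq.to_string()));
        }
    }
    None
}

fn main() {
    // Both buckets match here; the hidden trigger wins.
    let t = check_stop("abc STOP xyz", &["STOP"], &["xyz"]);
    assert_eq!(
        t,
        Some(StopTrigger::HiddenStopSequenceDetected("STOP".to_string()))
    );
}
```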

Poem

🐰 Stop strings now shine with pride!
Hidden or visible, they don't hide.
Through the decoder, parameters flow,
Making output bright and aglow!
✨ One feature, one flag, one way to go!

Pre-merge checks

❌ Failed checks (1 warning)
  • Description check ⚠️ Warning: The pull request description contains only placeholder template text, with no actual content in any required section (Overview, Details, Where should the reviewer start?, Related Issues). Resolution: fill in the Overview with what this change accomplishes, add Details explaining the implementation, specify which files need review focus, and replace '#xxx' with the actual GitHub issue number.
✅ Passed checks (2 passed)
  • Title check ✅ Passed: The title 'fix: include_stop_str_in_output' directly aligns with the main changes, which introduce and thread the include_stop_str_in_output parameter throughout the decoding backend.
  • Docstring Coverage ✅ Passed: Docstring coverage is 85.71%, which meets the required threshold of 80.00%.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
lib/llm/src/backend.rs (1)

178-185: Missing match arm for VisibleStopSequenceDetected.

The finish_reason match does not handle the new StopTrigger::VisibleStopSequenceDetected variant. When a visible stop sequence is detected, finish_reason will incorrectly remain None instead of being set to FinishReason::Stop.

Apply this diff to handle the new variant:

                     let finish_reason = match &result.stop_trigger {
                         Some(StopTrigger::MaxTokensLimit) => Some(FinishReason::Length),
                         Some(StopTrigger::HiddenStopTokenDetected(_)) => Some(FinishReason::Stop),
                         Some(StopTrigger::HiddenStopSequenceDetected(_)) => {
                             Some(FinishReason::Stop)
                         }
+                        Some(StopTrigger::VisibleStopSequenceDetected(_)) => {
+                            Some(FinishReason::Stop)
+                        }
                         None => None,
                     };
🧹 Nitpick comments (1)
lib/llm/src/backend.rs (1)

291-293: Consider updating this TODO comment.

The visible_stop_sequences feature is now implemented. This comment could be updated to reflect that only visible_stop_ids remains TODO, or removed if visible_stop_ids is not planned.

-// todo - add visible stop conditions
-// visible_stop_ids: HashSet<TokenIdType>,
-// visible_stop_sequences: Vec<String>,
+// todo - consider adding visible_stop_ids: HashSet<TokenIdType>
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5708b70 and dc51439.

📒 Files selected for processing (2)
  • lib/llm/benches/tokenizer.rs (2 hunks)
  • lib/llm/src/backend.rs (11 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-06-24T20:59:35.725Z
Learnt from: ishandhanani
Repo: ai-dynamo/dynamo PR: 1626
File: lib/llm/src/preprocessor.rs:238-239
Timestamp: 2025-06-24T20:59:35.725Z
Learning: In lib/llm/src/preprocessor.rs, the `sampling_options` call in the `preprocess_request` method is placed in the common section after the match statement on `request.prompt_input_type()`, meaning it applies to both `PromptInput::Tokens` and `PromptInput::Text` request types.

Applied to files:

  • lib/llm/src/backend.rs
📚 Learning: 2025-09-10T15:27:42.511Z
Learnt from: ayushag-nv
Repo: ai-dynamo/dynamo PR: 2932
File: lib/llm/src/preprocessor.rs:768-844
Timestamp: 2025-09-10T15:27:42.511Z
Learning: In the tool calling jail implementation in lib/llm/src/preprocessor.rs, the design intentionally emits only the first accumulated choice that contains tool calls during unjailing, dropping other accumulated choices. This is a deliberate design decision, not a bug.

Applied to files:

  • lib/llm/src/backend.rs
🧬 Code graph analysis (1)
lib/llm/benches/tokenizer.rs (2)
lib/llm/src/backend.rs (1)
  • new (382-423)
components/src/dynamo/sglang/protocol.py (1)
  • StopConditions (19-24)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: tests (lib/runtime/examples)
  • GitHub Check: tests (launch/dynamo-run)
  • GitHub Check: tests (lib/bindings/python)
  • GitHub Check: clippy (launch/dynamo-run)
  • GitHub Check: tests (.)
  • GitHub Check: clippy (.)
  • GitHub Check: clippy (lib/bindings/python)
  • GitHub Check: Build and Test - dynamo
🔇 Additional comments (7)
lib/llm/benches/tokenizer.rs (1)

57-57: LGTM!

The benchmark call sites are correctly updated to pass false for the new include_stop_str_in_output parameter, maintaining the previous default behavior where stop strings are hidden from output.

Also applies to: 81-81

lib/llm/src/backend.rs (6)

92-114: LGTM!

The include_stop_str_in_output parameter is cleanly threaded through the decoder method to the Decoder::new constructor.


132-149: LGTM!

The include_stop_str_in_output option is correctly extracted from sampling_options with a safe default of false, maintaining backward compatibility.


313-316: LGTM!

The visible_stop_sequences field is well-documented and follows the same pattern as hidden_stop_sequences.


332-350: LGTM!

The StopTrigger enum is properly extended with VisibleStopSequenceDetected, and should_hide_text correctly returns false for this variant, ensuring the stop string appears in the output.


395-423: LGTM!

The categorization logic correctly routes all stop sequences to either hidden_stop_sequences or visible_stop_sequences based on the include_stop_str_in_output flag. The jail_max_bytes calculation properly accounts for both sets.
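The jail sizing mentioned here reduces to taking the longest sequence across both buckets, so the buffer can always hold a complete match. A minimal sketch, assuming that framing (the function name is illustrative; in the crate, jail_max_bytes is the reviewed computation, not a free function):

```rust
// Sketch: the jail must be large enough to detect the longest stop
// sequence from either the hidden or the visible bucket.
fn jail_max_bytes(hidden: &[&str], visible: &[&str]) -> usize {
    hidden
        .iter()
        .chain(visible.iter())
        .map(|s| s.len())
        .max()
        .unwrap_or(0)
}

fn main() {
    // "<|endoftext|>" is 13 bytes, longer than "</s>" (4 bytes).
    assert_eq!(jail_max_bytes(&["</s>"], &["<|endoftext|>"]), 13);
    assert_eq!(jail_max_bytes(&[], &[]), 0);
}
```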


488-507: LGTM!

The visible stop sequence detection correctly includes the stop string in the output by computing stop_end = offset + seq.len() and returning text up to that point. The edge case handling for partial matches (when stop_end <= pre_append) is appropriate.
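The slicing this comment describes can be reduced to a short sketch (illustrative only; the real code also handles partial matches against bytes that were already emitted, the pre_append edge case noted above):

```rust
// Sketch of including a visible stop string in the returned text.
// The stop string's end offset is offset + seq.len(), and the output
// is everything up to and including the stop string.
fn visible_output(text: &str, seq: &str) -> Option<String> {
    let offset = text.find(seq)?; // byte index where the stop sequence begins
    let stop_end = offset + seq.len(); // keep the stop string itself
    Some(text[..stop_end].to_string())
}

fn main() {
    assert_eq!(
        visible_output("hello</s>trailing", "</s>"),
        Some("hello</s>".to_string())
    );
    assert_eq!(visible_output("hello", "</s>"), None);
}
```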

@grahamking

@drnikolaev Can you look at the CI errors? cargo clippy is one of them.

