
Conversation


@drnikolaev drnikolaev commented Dec 2, 2025

Overview:

Details:

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

Summary by CodeRabbit

New Features

  • Enhanced stop sequence control: stop sequences can now be configured to be either included in or excluded from the final generated text, giving users finer control over output formatting for their specific use case.


@drnikolaev drnikolaev requested a review from a team as a code owner December 2, 2025 02:22

copy-pr-bot bot commented Dec 2, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.


coderabbitai bot commented Dec 2, 2025

Walkthrough

This PR threads an include_stop_str_in_output boolean parameter through the backend decoding pipeline, distinguishing between hidden and visible stop sequences. The Decoder struct now accepts this parameter during initialization, categorizes stop sequences accordingly, and implements logic to include stop strings in output when triggered, with corresponding updates to the decoding processing loop and related enum variants.
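The categorization step the walkthrough describes can be pictured with a small sketch. This is illustrative Rust, not the crate's actual code: the names (`Decoder`, `hidden_stop_sequences`, `visible_stop_sequences`, `include_stop_str_in_output`) mirror the review text, but the real constructor takes additional arguments.

```rust
// Illustrative sketch only -- names mirror the walkthrough, not the real API.
#[derive(Debug)]
struct Decoder {
    hidden_stop_sequences: Vec<String>,
    visible_stop_sequences: Vec<String>,
}

impl Decoder {
    // The new boolean routes every configured stop sequence into one of
    // two buckets: hidden sequences are stripped from the output,
    // visible ones are kept in it.
    fn new(stop_sequences: Vec<String>, include_stop_str_in_output: bool) -> Self {
        let (visible_stop_sequences, hidden_stop_sequences) = if include_stop_str_in_output {
            (stop_sequences, Vec::new())
        } else {
            (Vec::new(), stop_sequences)
        };
        Decoder {
            hidden_stop_sequences,
            visible_stop_sequences,
        }
    }
}

fn main() {
    let d = Decoder::new(vec!["</s>".to_string()], true);
    assert_eq!(d.visible_stop_sequences, vec!["</s>".to_string()]);
    assert!(d.hidden_stop_sequences.is_empty());
}
```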

Changes

Cohort / File(s): Summary

  • Benchmark decoder call sites (lib/llm/benches/tokenizer.rs): Updated Decoder::new call sites to pass a third boolean argument (false), aligning with the new method signature across all benchmark code paths.
  • Backend decoder implementation (lib/llm/src/backend.rs): Added include_stop_str_in_output parameter threading through Backend::decoder and Decoder::new. Introduced the visible_stop_sequences field on Decoder and the VisibleStopSequenceDetected variant on StopTrigger. Updated the stop sequence categorization logic, the output handling that includes visible stop strings, and the processing pipeline's distinction between hidden and visible sequence detection.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Logic complexity: The distinction between hidden and visible stop sequences introduces conditional branching and state management changes that require careful verification of correctness.
  • Integration points: Stop sequence handling is threaded through multiple layers (Backend, Decoder, processing loop), requiring validation of parameter propagation and state consistency.
  • Output behavior: Changes to jail/content handling and when stop strings are included demand careful review to ensure existing behavior remains intact for hidden sequences while new behavior works correctly for visible ones.
  • Areas requiring extra attention:
    • Verification that hidden_stop_sequences and visible_stop_sequences categorization logic correctly splits sequences based on the include_stop_str_in_output flag
    • Validation of the processing loop's dual-check logic (hidden first, then visible) to ensure correct precedence and termination behavior
    • Confirmation that jail_max_bytes computation incorporates both sequence sets appropriately
    • Testing of edge cases where visible and hidden sequences may overlap or interact
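The dual-check precedence flagged above can be sketched as a standalone helper. This is hypothetical code under the review's description (hidden checked before visible); the real implementation scans an incremental jail buffer rather than a complete string.

```rust
// Hypothetical sketch of the hidden-first precedence; not the crate's code.
#[derive(Debug, PartialEq)]
enum StopTrigger {
    HiddenStopSequenceDetected(String),
    VisibleStopSequenceDetected(String),
}

fn check_stop(text: &str, hidden: &[&str], visible: &[&str]) -> Option<StopTrigger> {
    // Hidden sequences are checked first, so a hidden match takes
    // precedence even when a visible sequence also appears in the chunk.
    for seq in hidden {
        if text.contains(seq) {
            return Some(StopTrigger::HiddenStopSequenceDetected(seq.to_string()));
        }
    }
    for seq in visible {
        if text.contains(seq) {
            return Some(StopTrigger::VisibleStopSequenceDetected(seq.to_string()));
        }
    }
    None
}

fn main() {
    // Both buckets match here; the hidden trigger wins.
    let t = check_stop("abc STOP xyz", &["STOP"], &["xyz"]);
    assert_eq!(
        t,
        Some(StopTrigger::HiddenStopSequenceDetected("STOP".to_string()))
    );
}
```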

Poem

🐰 Stop strings now shine with pride!
Hidden or visible, they don't hide.
Through the decoder, parameters flow,
Making output bright and aglow!
✨ One feature, one flag, one way to go!

Pre-merge checks

❌ Failed checks (1 warning)
  • Description check ⚠️ Warning: The pull request description contains only placeholder template text, with no actual content in any required section (Overview, Details, Where should the reviewer start?, Related Issues). Resolution: fill in the Overview with what this change accomplishes, add Details explaining the implementation, specify which files need review focus, and replace '#xxx' with the actual GitHub issue number.
✅ Passed checks (2 passed)
  • Title check ✅ Passed: The title 'fix: include_stop_str_in_output' directly aligns with the main changes, which introduce and thread the include_stop_str_in_output parameter throughout the decoding backend.
  • Docstring Coverage ✅ Passed: Docstring coverage is 85.71%, which meets the required threshold of 80.00%.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
lib/llm/src/backend.rs (1)

178-185: Missing match arm for VisibleStopSequenceDetected.

The finish_reason match does not handle the new StopTrigger::VisibleStopSequenceDetected variant. When a visible stop sequence is detected, finish_reason will incorrectly remain None instead of being set to FinishReason::Stop.

Apply this diff to handle the new variant:

                     let finish_reason = match &result.stop_trigger {
                         Some(StopTrigger::MaxTokensLimit) => Some(FinishReason::Length),
                         Some(StopTrigger::HiddenStopTokenDetected(_)) => Some(FinishReason::Stop),
                         Some(StopTrigger::HiddenStopSequenceDetected(_)) => {
                             Some(FinishReason::Stop)
                         }
+                        Some(StopTrigger::VisibleStopSequenceDetected(_)) => {
+                            Some(FinishReason::Stop)
+                        }
                         None => None,
                     };
🧹 Nitpick comments (1)
lib/llm/src/backend.rs (1)

291-293: Consider updating this TODO comment.

The visible_stop_sequences feature is now implemented. This comment could be updated to reflect that only visible_stop_ids remains TODO, or removed if visible_stop_ids is not planned.

-// todo - add visible stop conditions
-// visible_stop_ids: HashSet<TokenIdType>,
-// visible_stop_sequences: Vec<String>,
+// todo - consider adding visible_stop_ids: HashSet<TokenIdType>
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5708b70 and dc51439.

📒 Files selected for processing (2)
  • lib/llm/benches/tokenizer.rs (2 hunks)
  • lib/llm/src/backend.rs (11 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-06-24T20:59:35.725Z
Learnt from: ishandhanani
Repo: ai-dynamo/dynamo PR: 1626
File: lib/llm/src/preprocessor.rs:238-239
Timestamp: 2025-06-24T20:59:35.725Z
Learning: In lib/llm/src/preprocessor.rs, the `sampling_options` call in the `preprocess_request` method is placed in the common section after the match statement on `request.prompt_input_type()`, meaning it applies to both `PromptInput::Tokens` and `PromptInput::Text` request types.

Applied to files:

  • lib/llm/src/backend.rs
📚 Learning: 2025-09-10T15:27:42.511Z
Learnt from: ayushag-nv
Repo: ai-dynamo/dynamo PR: 2932
File: lib/llm/src/preprocessor.rs:768-844
Timestamp: 2025-09-10T15:27:42.511Z
Learning: In the tool calling jail implementation in lib/llm/src/preprocessor.rs, the design intentionally emits only the first accumulated choice that contains tool calls during unjailing, dropping other accumulated choices. This is a deliberate design decision, not a bug.

Applied to files:

  • lib/llm/src/backend.rs
🧬 Code graph analysis (1)
lib/llm/benches/tokenizer.rs (2)
lib/llm/src/backend.rs (1)
  • new (382-423)
components/src/dynamo/sglang/protocol.py (1)
  • StopConditions (19-24)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: tests (lib/runtime/examples)
  • GitHub Check: tests (launch/dynamo-run)
  • GitHub Check: tests (lib/bindings/python)
  • GitHub Check: clippy (launch/dynamo-run)
  • GitHub Check: tests (.)
  • GitHub Check: clippy (.)
  • GitHub Check: clippy (lib/bindings/python)
  • GitHub Check: Build and Test - dynamo
🔇 Additional comments (7)
lib/llm/benches/tokenizer.rs (1)

57-57: LGTM!

The benchmark call sites are correctly updated to pass false for the new include_stop_str_in_output parameter, maintaining the previous default behavior where stop strings are hidden from output.

Also applies to: 81-81

lib/llm/src/backend.rs (6)

92-114: LGTM!

The include_stop_str_in_output parameter is cleanly threaded through the decoder method to the Decoder::new constructor.


132-149: LGTM!

The include_stop_str_in_output option is correctly extracted from sampling_options with a safe default of false, maintaining backward compatibility.


313-316: LGTM!

The visible_stop_sequences field is well-documented and follows the same pattern as hidden_stop_sequences.


332-350: LGTM!

The StopTrigger enum is properly extended with VisibleStopSequenceDetected, and should_hide_text correctly returns false for this variant, ensuring the stop string appears in the output.


395-423: LGTM!

The categorization logic correctly routes all stop sequences to either hidden_stop_sequences or visible_stop_sequences based on the include_stop_str_in_output flag. The jail_max_bytes calculation properly accounts for both sets.
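The jail sizing mentioned here reduces to taking the longest sequence across both buckets, so the buffer can always hold a complete match. A minimal sketch, assuming that framing (the function name is illustrative; in the crate, jail_max_bytes is the reviewed computation, not a free function):

```rust
// Sketch: the jail must be large enough to detect the longest stop
// sequence from either the hidden or the visible bucket.
fn jail_max_bytes(hidden: &[&str], visible: &[&str]) -> usize {
    hidden
        .iter()
        .chain(visible.iter())
        .map(|s| s.len())
        .max()
        .unwrap_or(0)
}

fn main() {
    // "<|endoftext|>" is 13 bytes, longer than "</s>" (4 bytes).
    assert_eq!(jail_max_bytes(&["</s>"], &["<|endoftext|>"]), 13);
    assert_eq!(jail_max_bytes(&[], &[]), 0);
}
```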


488-507: LGTM!

The visible stop sequence detection correctly includes the stop string in the output by computing stop_end = offset + seq.len() and returning text up to that point. The edge case handling for partial matches (when stop_end <= pre_append) is appropriate.
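The slicing this comment describes can be reduced to a short sketch (illustrative only; the real code also handles partial matches against bytes that were already emitted, the pre_append edge case noted above):

```rust
// Sketch of including a visible stop string in the returned text.
// The stop string's end offset is offset + seq.len(), and the output
// is everything up to and including the stop string.
fn visible_output(text: &str, seq: &str) -> Option<String> {
    let offset = text.find(seq)?; // byte index where the stop sequence begins
    let stop_end = offset + seq.len(); // keep the stop string itself
    Some(text[..stop_end].to_string())
}

fn main() {
    assert_eq!(
        visible_output("hello</s>trailing", "</s>"),
        Some("hello</s>".to_string())
    );
    assert_eq!(visible_output("hello", "</s>"), None);
}
```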

@grahamking

@drnikolaev Can you look at the CI errors? cargo clippy is one of them.

