
Conversation


@drnikolaev drnikolaev commented Dec 2, 2025

Overview:

Context: TODO

Fixes nvbug: https://nvbugs/5662384

Details:

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

Summary by CodeRabbit

  • Bug Fixes
    • Enhanced AI response handling to properly format messages when reasoning content is present without accompanying text.


@drnikolaev drnikolaev requested a review from a team as a code owner December 2, 2025 02:36

copy-pr-bot bot commented Dec 2, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.


coderabbitai bot commented Dec 2, 2025

Walkthrough

A single-file modification updates the From<DeltaChoice> implementation to return an empty string instead of None for content when delta.reasoning_content is present but delta.text is empty, altering how empty content is represented in the ChatChoice conversion.

Changes

Cohort / File(s): ChatChoice conversion logic (lib/llm/src/protocols/openai/chat_completions/aggregator.rs)
Summary: Modified content construction in From<DeltaChoice> to return Some(String::new()) when delta.reasoning_content exists and delta.text is empty, instead of returning None.
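
As a quick orientation, here is a minimal standalone sketch of the new rule, written as a free function rather than the actual From<DeltaChoice> impl; the helper name aggregate_content and its &str / Option<&str> parameters are illustrative, and only the branch logic mirrors the diff quoted later in this thread:

fn aggregate_content(text: &str, reasoning_content: Option<&str>) -> Option<String> {
    if text.is_empty() {
        // New behavior: with reasoning present but no visible text, return
        // Some("") so the serialized field becomes "content": "" rather than null.
        if reasoning_content.is_some() {
            Some(String::new())
        } else {
            None
        }
    } else {
        // Pass-through for the non-empty case (assumed; the quoted diff does
        // not show this branch).
        Some(text.to_string())
    }
}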

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Focus attention on the conditional logic chain to verify the new Some(String::new()) path correctly handles the reasoning content case
  • Consider downstream implications of emitting an empty string versus None for consumers of the content field; see the serialization sketch below
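
A hedged illustration of that difference, using a hypothetical miniature Msg struct rather than the real ChatChoice type (serde with the derive feature and serde_json assumed as dependencies):

use serde::Serialize;

#[derive(Serialize)]
struct Msg {
    content: Option<String>,
}

fn main() {
    // None serializes as {"content":null} (absent a skip_serializing_if attribute)...
    let as_null = serde_json::to_string(&Msg { content: None }).unwrap();
    // ...while Some(String::new()) serializes as {"content":""}.
    let as_empty = serde_json::to_string(&Msg { content: Some(String::new()) }).unwrap();
    println!("{as_null} vs {as_empty}");
}

Strict OpenAI-compatible clients may treat those two forms differently, which is the downstream concern flagged above.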

Poem

🐰 When reasoning flows but words stay mute,
An empty string finds its route,
No longer lost to silent None,
The content shows when thoughts are spun! 🎯

Pre-merge checks

❌ Failed checks (1 warning, 1 inconclusive)
Check name: Description check
Status: ⚠️ Warning
Explanation: The PR description is largely incomplete, with most required sections left empty or containing only placeholder text.
Resolution: Complete the Overview section (currently shows 'TODO'), provide a detailed description of the changes in the Details section, specify files to review, and replace the placeholder GitHub issue reference with the actual issue number.

Check name: Title check
Status: ❓ Inconclusive
Explanation: The title is vague and incomplete: it mentions that the field returns None but gives no context about what was changed, why, or in which component, making it unclear to reviewers scanning history.
Resolution: Revise the title to be more specific and descriptive, e.g., 'fix: Return empty string for ChatChoice content when reasoning present but text absent', to clearly communicate the change.
✅ Passed checks (1 passed)
Check name: Docstring Coverage
Status: ✅ Passed
Explanation: Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5708b70 and c9053a7.

📒 Files selected for processing (1)
  • lib/llm/src/protocols/openai/chat_completions/aggregator.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: nachiketb-nvidia
Repo: ai-dynamo/dynamo PR: 2656
File: lib/llm/src/protocols/openai/chat_completions/delta.rs:320-327
Timestamp: 2025-08-22T19:55:41.608Z
Learning: The create_choice method exists on multiple different objects in the codebase. The DeltaGenerator::create_choice in lib/llm/src/protocols/openai/chat_completions/delta.rs has its own signature that was updated to include reasoning_content, but other objects in lib/llm/src/engines.rs have their own separate create_choice methods with different signatures that are not related to chat completions.
Learnt from: nachiketb-nvidia
Repo: ai-dynamo/dynamo PR: 2656
File: lib/llm/src/protocols/openai/chat_completions/delta.rs:320-327
Timestamp: 2025-08-22T19:55:41.608Z
Learning: There are two separate DeltaGenerator classes in the codebase: one for chat completions (lib/llm/src/protocols/openai/chat_completions/delta.rs with object "chat.completion.chunk") and one for text completions (lib/llm/src/protocols/openai/completions/delta.rs with object "text_completion"). They have different create_choice method signatures and serve different OpenAI API endpoints. The reasoning parsing functionality is only relevant to the chat completions DeltaGenerator.
📚 Learnings: 2025-08-22T19:55:41.608Z (both entries above)

Applied to files:

  • lib/llm/src/protocols/openai/chat_completions/aggregator.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Build and Test - dynamo
  • GitHub Check: clippy (.)
  • GitHub Check: clippy (lib/runtime/examples)
  • GitHub Check: tests (lib/bindings/python)
  • GitHub Check: tests (launch/dynamo-run)
  • GitHub Check: clippy (launch/dynamo-run)
  • GitHub Check: clippy (lib/bindings/python)
  • GitHub Check: tests (.)

Comment on lines 286 to +292

     content: if delta.text.is_empty() {
-        None
+        // If we have reasoning content, provide an empty string instead of None.
+        if delta.reasoning_content.is_some() {
+            Some(String::new())
+        } else {
+            None
+        }

⚠️ Potential issue | 🟡 Minor

Add test coverage for the empty content with reasoning_content scenario.

The code change at lines 286-292 intentionally returns Some(String::new()) for the content field when reasoning_content is present but text is empty. While this behavior is clearly documented in the inline comment, there is no test coverage for this specific scenario. Add a test case to verify this behavior and prevent regressions.

🤖 Prompt for AI Agents
In lib/llm/src/protocols/openai/chat_completions/aggregator.rs around lines
286-292, add a unit test that constructs a delta with empty text and a
Some(reasoning_content) value, runs it through the aggregator code path that
builds the output struct, and asserts that the resulting content field is
Some(String::new()) (not None) while reasoning_content is preserved; place the
test alongside existing aggregator tests, name it clearly (e.g.,
empty_text_with_reasoning_returns_empty_string), and include any necessary
setup/mocks so it executes deterministically in CI.
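
One possible shape for such a test, sketched here against the standalone aggregate_content helper shown earlier in this thread rather than the real From<DeltaChoice> impl, whose full field set is not visible in this conversation:

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn empty_text_with_reasoning_returns_empty_string() {
        // Reasoning present, no visible text: content should be Some(""), not None.
        assert_eq!(
            aggregate_content("", Some("chain of thought")),
            Some(String::new())
        );
        // No reasoning and no text: content stays None.
        assert_eq!(aggregate_content("", None), None);
        // Non-empty text is passed through (assumed behavior for this branch).
        assert_eq!(aggregate_content("hi", None), Some("hi".to_string()));
    }
}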


grahamking commented Dec 2, 2025

@drnikolaev In three months we will hit a problem. The fix is deleting this line.

We will think: "Why did we add this line?", run git blame, and find this PR.

This PR needs to explain what problem you are fixing, and why we should not delete this line in three months (without referring to any NVIDIA internal tools).

     role: delta.role.expect("delta should have a Role"),
     content: if delta.text.is_empty() {
-        None
+        // If we have reasoning content, provide an empty string instead of None.

Why?

@rmccorm4 rmccorm4 requested a review from ayushag-nv December 4, 2025 22:57

rmccorm4 commented Dec 4, 2025

@ayushag-nv to help repro the original issue and review or update the fix on this one


ayushag-nv commented Dec 5, 2025

@rmccorm4

Command Used: python -m dynamo.vllm --model nvidia/Llama-3_3-Nemotron-Super-49B-v1_5 --trust-remote-code --tensor-parallel-size 2 --max-model-len=65536 --gpu-memory-utilization 0.95

curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nvidia/Llama-3_3-Nemotron-Super-49B-v1_5",
    "messages": [
      {"role": "user", "content": "Tell me the answer of 1+1."}
    ],
    "max_tokens": 1000
  }' | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  3159  100  2985  100   174    246     14  0:00:12  0:00:12 --:--:--   761
{
  "id": "chatcmpl-9525e0ad-0fca-4d3c-a299-cb9a241ceb64",
  "choices": [
    {
      "index": 0,
      "message": {
        "content": "<think>\nOkay, so the user asked, \"Tell me the answer of 1+1.\" Hmm, that seems pretty straightforward. But wait, maybe there's a trick here. Let me think. In most contexts, 1+1 is 2. But sometimes people ask this question to test if you're thinking outside the box or considering different number systems or contexts.\n\nLike, in binary, 1+1 is 10, right? But the user didn't specify the number system. They just said \"1+1,\" which in standard arithmetic is definitely 2. But maybe they're thinking of something else, like in computer science, sometimes 1+1 could be represented in a different way. Or maybe in a different cultural context or a riddle.\n\nWait, but the user is probably just looking for the basic math answer here. They might be testing if the AI can handle simple questions. Let me check the history. The user has asked similar questions before, like \"What is 2+2?\" and the answer was 4. So maybe they're just continuing that pattern.\n\nAlternatively, could there be any other interpretation? For example, if you have two objects and you add another one, you have two. But maybe in some abstract algebra context, but that's probably overcomplicating. The user most likely expects the standard answer.\n\nAlso, considering the way the question is phrased: \"Tell me the answer of 1+1.\" The use of \"the answer\" implies a single correct answer, which in common arithmetic is 2. Unless there's some wordplay, but I don't see any obvious puns here. \n\nAnother angle: in some programming languages, if you concatenate strings, \"1\"+\"1\" would be \"11\", but the question uses numbers, not strings. Unless there's a specific context where that's the case, but again, without more information, it's safer to assume the standard math answer.\n\nI think the safest bet is to answer 2, but maybe mention the binary case just in case. But the user might not be expecting that. Since they asked for \"the answer,\" and in most cases, it's 2, I should probably start with that and then offer additional contexts if needed. But maybe the user just wants a straightforward answer. Let me check the previous answers the assistant gave. The user mentioned before asking \"What is 2+2?\" and the answer was 4. So following that pattern, here it's 2. \n\nTherefore, the answer is 2. Unless there's some hidden trick, but given the previous interactions, it's likely expecting the standard response.\n</think>\n\nThe answer to $ 1 + 1 $ is **2** in standard arithmetic. \n\nIf you're asking in a different context (e.g., binary, where $ 1 + 1 = 10 $, or other abstract systems), feel free to clarify! But by default, it’s **2**. 😊",
        "role": "assistant",
        "reasoning_content": null
      },
      "finish_reason": "stop"
    }
  ],
  "created": 1764956455,
  "model": "nvidia/Llama-3_3-Nemotron-Super-49B-v1_5",
  "object": "chat.completion",
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 616,
    "total_tokens": 641
  }
}

Not reproducible on main

Comment on lines +288 to +292

+        if delta.reasoning_content.is_some() {
+            Some(String::new())
+        } else {
+            None
+        }

this fix does not make sense to me.
