
Conversation

@AryanBagade
Contributor

@AryanBagade AryanBagade commented Dec 2, 2025

Implement logprobs functionality for vLLM backend to pass log probability information from vLLM to the frontend.

Overview:

This PR adds logprobs support to the vLLM backend, enabling it to pass log probability information from vLLM's RequestOutput back to Dynamo's frontend. The frontend already supports logprobs in its outputs, but the vLLM backend was not propagating this information in its responses.

Details:

handlers.py changes:

  • Added support for logprobs and prompt_logprobs in build_sampling_params(); these get passed through to vLLM's SamplingParams (see the sketch after this list)
  • New _extract_logprobs() method pulls the logprob data from vLLM's output and formats it for Dynamo
  • Updated generate_tokens() to call the extraction method and add logprobs to the response
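
For orientation, here is a minimal sketch of the output_options mapping, assuming the request carries an output_options dict with logprobs / prompt_logprobs keys; the helper name apply_output_options is illustrative, not the PR's code:

```python
from vllm import SamplingParams


def apply_output_options(request: dict, sampling_params: SamplingParams) -> SamplingParams:
    """Hypothetical helper; the request field names are assumptions."""
    output_options = request.get("output_options") or {}

    # SamplingParams.logprobs is the number of top logprobs vLLM returns per generated token.
    if output_options.get("logprobs") is not None:
        sampling_params.logprobs = int(output_options["logprobs"])

    # prompt_logprobs does the same for the prompt tokens.
    if output_options.get("prompt_logprobs") is not None:
        sampling_params.prompt_logprobs = int(output_options["prompt_logprobs"])

    return sampling_params
```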

Testing:

  • Created validation classes that check logprobs are actually present and correctly structured in responses
  • Added aggregated_logprobs test config that tests both chat and completion endpoints with logprobs enabled
  • Tests validate the full round-trip: request with logprobs → vLLM processing → response with logprobs (example request shapes are sketched below)
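
For reference, request bodies along these lines would exercise that round-trip; the model name and values are illustrative, not the exact test fixtures:

```python
# Illustrative request bodies; the model name and values are assumptions.
chat_request = {
    "model": "Qwen/Qwen3-0.6B",
    "messages": [{"role": "user", "content": "Hello"}],
    "logprobs": True,      # chat convention: boolean flag...
    "top_logprobs": 2,     # ...plus how many alternatives per token
    "stream": False,
}

completion_request = {
    "model": "Qwen/Qwen3-0.6B",
    "prompt": "Hello",
    "logprobs": 2,         # completions convention: integer count
    "stream": False,
}
```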

Where should the reviewer start?

  1. components/src/dynamo/vllm/handlers.py - Core implementation:
  • Lines 99-110: output_options handling in build_sampling_params()
  • Lines 574-636: _extract_logprobs() method implementation
  • Lines 684-691: Integration in generate_tokens() method
  2. tests/utils/payloads.py - Validation logic:
  • Lines 159-188: ChatPayloadWithLogprobs validation
  • Lines 214-239: CompletionPayloadWithLogprobs validation
  3. tests/serve/test_vllm.py - E2E test:
  • Lines 56-78: aggregated_logprobs test configuration

Related Issues: Closes #4683

Summary by CodeRabbit

  • New Features

    • Added support for returning token-level log probabilities (logprobs and top_logprobs) in API responses
    • Chat and completion requests can now enable logprobs via request options
  • Tests

    • Added test payloads and validators to verify logprobs are present and correctly structured
    • New test configurations exercise logprob behavior for both chat and completion flows


@AryanBagade AryanBagade requested review from a team as code owners December 2, 2025 11:21
@copy-pr-bot

copy-pr-bot bot commented Dec 2, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions

github-actions bot commented Dec 2, 2025

👋 Hi AryanBagade! Thank you for contributing to ai-dynamo/dynamo.

Just a reminder: the NVIDIA Test GitHub Validation CI runs an essential subset of the testing framework to quickly catch errors. Your PR reviewers may elect to test the changes comprehensively before approving them.

🚀

@github-actions github-actions bot added the external-contribution (Pull request is from an external contributor) and feat labels on Dec 2, 2025
Implement logprobs functionality for vLLM backend to pass log probability information from vLLM to the frontend.

Signed-off-by: Aryan Bagade <aryan@aryanbagade.com>
@AryanBagade AryanBagade force-pushed the feat/add-logprobs-support-vllm-backend branch from b2760a7 to 3566e82 on December 2, 2025 11:23
@coderabbitai
Contributor

coderabbitai bot commented Dec 2, 2025

Walkthrough

Adds end-to-end logprobs support for the vLLM backend: request output_options are consumed into sampling params, per-token logprobs/top_logprobs are extracted from vLLM CompletionOutput, emitted with token streams, and tests/payload validators added to cover the feature.

Changes

| Cohort / File(s) | Change Summary |
| --- | --- |
| Core vLLM handlers<br>components/src/dynamo/vllm/handlers.py | Added a static _extract_logprobs() to the handler classes; updated build_sampling_params() to read output_options (mapping logprobs and prompt_logprobs); integrated logprob extraction into token generation/streaming payloads, with minor control-flow/logging adjustments. |
| Tests: vLLM config<br>tests/serve/test_vllm.py | Added an aggregated_logprobs VLLMConfig entry; imported and wired chat_payload_with_logprobs and completion_payload_with_logprobs into request_payloads. |
| Tests: payload builders<br>tests/utils/payload_builder.py | Extended chat_payload() and completion_payload() signatures to accept logprobs/top_logprobs; added chat_payload_with_logprobs() and completion_payload_with_logprobs() builders that produce requests with logprobs enabled. |
| Tests: payload validators<br>tests/utils/payloads.py | Added ChatPayloadWithLogprobs and CompletionPayloadWithLogprobs classes that extend the existing validators to assert presence and structure of logprobs, token_logprobs, tokens, and top_logprobs in responses. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Review _extract_logprobs() logic for correct indexing relative to num_output_tokens_so_far and handling of missing fields.
  • Verify build_sampling_params() correctly converts request output_options into integer sampling params and doesn't alter other sampling behavior.
  • Confirm token-stream emission in generate_tokens() includes logprob fields without disrupting existing streaming payload format.
  • Check new test payload validators for robust JSON parsing and assertions that match runtime response shapes.

Poem

🐇 I nibble through outputs, counting each hop,
Tokens and logprobs — a delicious crop.
I fetch top choices and tidy their sums,
Streaming probability carrots for curious crumbs.
Hooray — the backend now sings in full stop!

Pre-merge checks

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00%, below the required threshold of 80.00%. | Run @coderabbitai generate docstrings to improve docstring coverage. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title clearly and specifically describes the main change: adding logprobs support to the vLLM backend. |
| Description check | ✅ Passed | The PR description covers all required template sections, with detailed explanations of the changes, the testing approach, and specific line numbers for reviewer guidance. |
| Linked Issues check | ✅ Passed | The PR implements all requirements from issue #4683: logprobs extraction in handlers, LLMEngineOutput field population, and E2E test coverage for both chat and completion endpoints. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to implementing logprobs support: handler logic, test utilities for payload building/validation, and E2E test configuration; no unrelated modifications detected. |


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (6)
tests/utils/payloads.py (2)

158-188: Strengthen chat logprobs validation to catch null/empty responses

Right now this passes if the backend returns "logprobs": null or "content": [] even when you explicitly requested logprobs. If the intent of this payload is to ensure logprobs are actually produced, consider asserting that logprobs_data is not None and that content_logprobs is non-empty, so regressions where logprobs disappear don’t silently pass these tests.
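
A sketch of what the stricter chat-side check could look like, assuming the OpenAI-style response shape (choices[0].logprobs.content); the function name is illustrative:

```python
def validate_chat_logprobs(response_json: dict) -> None:
    """Stricter check: fail if logprobs were requested but came back null/empty."""
    choice = response_json["choices"][0]
    logprobs_data = choice.get("logprobs")
    assert logprobs_data is not None, "expected logprobs in chat response, got null"

    content_logprobs = logprobs_data.get("content") or []
    assert content_logprobs, "expected non-empty logprobs.content"

    for entry in content_logprobs:
        # Each per-token entry carries the token, its logprob, and the alternatives.
        assert "token" in entry and "logprob" in entry
        assert isinstance(entry.get("top_logprobs", []), list)
```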


213-244: Make completion logprobs checks fail when logprobs are omitted

Similar to the chat variant, this validation will still succeed when choice["logprobs"] is present but None (or when token_logprobs is empty). If this test is meant to guarantee that logprobs are actually returned, you may want to assert that logprobs_data is not None and that token_logprobs is non-empty, in addition to the existing length equality check.
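
The completion-side equivalent, under the same assumption that completions return parallel tokens / token_logprobs / top_logprobs lists:

```python
def validate_completion_logprobs(response_json: dict) -> None:
    """Completion-side equivalent of the stricter check above."""
    choice = response_json["choices"][0]
    logprobs_data = choice.get("logprobs")
    assert logprobs_data is not None, "expected logprobs in completion response, got null"

    tokens = logprobs_data.get("tokens") or []
    token_logprobs = logprobs_data.get("token_logprobs") or []
    assert token_logprobs, "expected non-empty token_logprobs"
    assert len(tokens) == len(token_logprobs), "tokens and token_logprobs must stay aligned"
```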

components/src/dynamo/vllm/handlers.py (2)

99-110: output_options mapping is fine but only covers logprobs/prompt_logprobs

The mapping of output_options["logprobs"] and "prompt_logprobs" into SamplingParams looks reasonable and defensive (casting to int). Just be aware that any other keys added under output_options won’t be honored here; if future output options are introduced, this is the place they’ll need to be plumbed through.


573-637: Keep per-token alignment between token_ids, log_probs, and top_logprobs

The overall extraction logic looks correct, but two details are worth tightening:

  • You continue when token_logprobs_dict is None and only append to log_probs when you find at least one logprob, while top_logprobs always gets an entry (even if empty). This can easily lead to len(token_ids_slice) != len(log_probs) and/or len(log_probs) != len(top_logprobs).
  • Downstream consumers typically assume a 1:1 alignment between emitted tokens and their logprob/top_logprobs entries.

To avoid subtle bugs later, consider always appending a placeholder (e.g., None and []) for positions where logprobs are missing, and ensure log_probs and top_logprobs always have the same length as the new token_ids slice.
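
One way to keep the lists aligned, sketched against vLLM's CompletionOutput.logprobs structure (a list of dicts mapping token id to a Logprob with .logprob and .rank); the start_idx parameter stands in for the num_output_tokens_so_far bookkeeping and is an assumption:

```python
from typing import Any, List, Optional, Tuple


def extract_aligned_logprobs(
    output: Any, start_idx: int
) -> Tuple[List[Optional[float]], List[list]]:
    """Sketch: per-token logprobs and top_logprobs aligned 1:1 with the newly
    emitted token_ids slice. `output` stands in for vLLM's CompletionOutput."""
    token_ids = list(output.token_ids[start_idx:])
    logprobs_list = list(output.logprobs or [])[start_idx:]

    log_probs: List[Optional[float]] = []
    top_logprobs: List[list] = []
    for pos, token_id in enumerate(token_ids):
        entry = logprobs_list[pos] if pos < len(logprobs_list) else None
        if not entry or token_id not in entry:
            # Placeholders keep all three lists the same length even when
            # logprob info is missing for a position.
            log_probs.append(None)
            top_logprobs.append([])
            continue
        log_probs.append(entry[token_id].logprob)
        top_logprobs.append(
            [{"token_id": tid, "logprob": lp.logprob, "rank": lp.rank}
             for tid, lp in entry.items()]
        )
    return log_probs, top_logprobs
```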

tests/serve/test_vllm.py (1)

18-25: New aggregated_logprobs scenario is consistent with existing VLLMConfigs

The additional imports and the aggregated_logprobs entry plug neatly into the existing vllm_configs pattern and should exercise both chat and completion logprobs end-to-end without affecting other scenarios. If you ever want logprobs runs to also validate metrics, you could add a metric_payload_default here similar to "aggregated", but that’s not strictly necessary.

Also applies to: 56-78

tests/utils/payload_builder.py (1)

294-371: New *_with_logprobs builders nicely encapsulate logprobs-specific tests

These helpers make the logprobs test cases much clearer by coupling the request shape with the stronger validation payload types. One minor nit: chat_payload treats logprobs as an Optional[int], while chat_payload_with_logprobs hardcodes "logprobs": True. For long-term maintainability it might be worth aligning on a single convention (either always an int count or always the OpenAI-style bool + top_logprobs) to avoid confusion about what “logprobs” means at the HTTP layer.
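
For context, the two conventions this nit refers to look roughly like this at the HTTP layer (values are illustrative):

```python
# OpenAI-style chat completions: a boolean flag plus a separate count of alternatives.
chat_logprobs_options = {"logprobs": True, "top_logprobs": 2}

# Legacy completions: a single integer meaning "return this many top logprobs per token".
completion_logprobs_options = {"logprobs": 2}
```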

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 71f94ed and 3566e82.

📒 Files selected for processing (4)
  • components/src/dynamo/vllm/handlers.py (4 hunks)
  • tests/serve/test_vllm.py (2 hunks)
  • tests/utils/payload_builder.py (5 hunks)
  • tests/utils/payloads.py (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-24T20:59:35.725Z
Learnt from: ishandhanani
Repo: ai-dynamo/dynamo PR: 1626
File: lib/llm/src/preprocessor.rs:238-239
Timestamp: 2025-06-24T20:59:35.725Z
Learning: In lib/llm/src/preprocessor.rs, the `sampling_options` call in the `preprocess_request` method is placed in the common section after the match statement on `request.prompt_input_type()`, meaning it applies to both `PromptInput::Tokens` and `PromptInput::Text` request types.

Applied to files:

  • components/src/dynamo/vllm/handlers.py
🧬 Code graph analysis (1)
tests/utils/payload_builder.py (1)
tests/utils/payloads.py (3)
  • ChatPayloadWithLogprobs (159-188)
  • CompletionPayload (192-210)
  • CompletionPayloadWithLogprobs (214-243)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build and Test - dynamo
🔇 Additional comments (3)
components/src/dynamo/vllm/handlers.py (1)

684-691: Streaming integration of logprobs looks good

Hooking _extract_logprobs into generate_tokens and attaching "log_probs" / "top_logprobs" to the streamed payloads is clean and isolated; existing fields (token_ids, finish_reason, completion_usage, etc.) are unchanged, so this should be backward-compatible for callers that ignore the new keys.
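
Conceptually, the integration described here amounts to something like the sketch below; the _extract_logprobs signature and the surrounding payload shape are assumptions, not the PR's exact code:

```python
from typing import Any, AsyncGenerator, Dict


class HandlerSketch:
    """Illustration only; not the real handler class."""

    @staticmethod
    def _extract_logprobs(output: Any, start_idx: int):
        # Stand-in for the PR's extraction method; assumed to return two lists
        # aligned with the newly emitted tokens.
        return [], []

    async def generate_tokens(
        self, output: Any, num_output_tokens_so_far: int
    ) -> AsyncGenerator[Dict[str, Any], None]:
        payload: Dict[str, Any] = {
            "token_ids": list(output.token_ids[num_output_tokens_so_far:]),
            "finish_reason": output.finish_reason,
        }
        log_probs, top_logprobs = self._extract_logprobs(output, num_output_tokens_so_far)
        if log_probs:
            # Additive keys: callers that ignore them keep working unchanged.
            payload["log_probs"] = log_probs
            payload["top_logprobs"] = top_logprobs
        yield payload
```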

tests/utils/payload_builder.py (2)

7-14: chat_payload enhancements for logprobs are straightforward and non-breaking

Allowing logprobs and top_logprobs to flow through the generic chat_payload body is a clean extension and doesn’t disturb existing callers. The defaults and behavior for other arguments remain unchanged.

Also applies to: 131-165


167-191: completion_payload logprobs plumbed cleanly

The added logprobs parameter and the explicit body construction keep the function flexible while remaining compatible with prior usage. This also mirrors the pattern used for chat, which helps keep the payload API coherent.

@rmccorm4
Contributor

rmccorm4 commented Dec 3, 2025

/ok to test 2deb598

@AryanBagade
Contributor Author

will resolve the merge conflict!

AryanBagade and others added 3 commits December 9, 2025 01:32
Signed-off-by: Aryan Bagade <73382554+AryanBagade@users.noreply.github.com>
Signed-off-by: Aryan Bagade <aryan@aryanbagade.com>
Signed-off-by: Aryan Bagade <aryan@aryanbagade.com>
@AryanBagade
Contributor Author

Added input validation for logprobs/prompt_logprobs: previously the code would crash if someone passed an invalid value like "abc" or a negative number. Now it validates the value and logs a warning instead of throwing a ValueError.
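
A minimal sketch of that validation, assuming a small helper is used per option (the helper name and messages are hypothetical):

```python
import logging

logger = logging.getLogger(__name__)


def coerce_logprobs_option(raw, name: str):
    """Return a non-negative int, or None (with a warning) for invalid input."""
    if raw is None:
        return None
    try:
        value = int(raw)
    except (TypeError, ValueError):
        logger.warning("Ignoring invalid %s=%r: expected a non-negative integer", name, raw)
        return None
    if value < 0:
        logger.warning("Ignoring invalid %s=%d: must be non-negative", name, value)
        return None
    return value
```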

@AryanBagade AryanBagade requested a review from karen-sy December 9, 2025 10:19
@rmccorm4
Contributor

rmccorm4 commented Dec 9, 2025

/ok to test c3d2f28


Labels

external-contribution (Pull request is from an external contributor), feat, size/L

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURE]: Add logprobs support to vLLM Backend

3 participants