Conversation


@janeyx99 janeyx99 commented Nov 14, 2025

Purpose

In the hadacore_transform kernel, use narrow instead of indexing; narrow is supported by the ABI-stable API, and both compute the same slice. Contributes to #26946.
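For illustration only, a minimal pure-Python sketch of why the two forms are interchangeable (this is not the vLLM C++ code; `narrow1d` is a hypothetical helper that mirrors the semantics of torch's `Tensor::narrow(dim, start, length)` on a flat list):

```python
def narrow1d(data, start, length):
    """Mimic torch's narrow(dim=0, start, length) on a flat Python list:
    return the contiguous run of `length` elements beginning at `start`."""
    if start < 0 or start + length > len(data):
        raise ValueError("narrow out of bounds")
    return data[start:start + length]

out = list(range(10))
numel, had_size = 20, 4   # illustrative values; numel // had_size == 5
k = numel // had_size

# Slice-style indexing (what torch::indexing::Slice(0, k) selects) ...
sliced = out[0:k]
# ... and narrow-style selection pick out the same elements.
narrowed = narrow1d(out, 0, k)
assert sliced == narrowed == [0, 1, 2, 3, 4]
```

The practical difference is not the result but the API surface: narrow takes plain integer arguments, which is what makes it usable from the ABI-stable interface.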

Test Plan

pytest tests/kernels/quantization/test_hadacore.py

Test Result

(vllm) [janeyx@devgpu007.eag6 /data/users/janeyx/local/vllm (python-meta-kernels)]$ pytest tests/kernels/quantization/test_hadacore.py 
======================= test session starts ========================
platform linux -- Python 3.13.9, pytest-9.0.1, pluggy-1.6.0
rootdir: /data/users/janeyx/local/vllm
configfile: pyproject.toml
plugins: anyio-4.11.0
collected 20 items                                                 

tests/kernels/quantization/test_hadacore.py ................ [ 80%]
....                                                         [100%]

========================= warnings summary =========================
<frozen importlib._bootstrap>:488
  <frozen importlib._bootstrap>:488: DeprecationWarning: builtin type SwigPyPacked has no __module__ attribute

<frozen importlib._bootstrap>:488
  <frozen importlib._bootstrap>:488: DeprecationWarning: builtin type SwigPyObject has no __module__ attribute

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================== 20 passed, 2 warnings in 5.94s ==================
<sys>:0: DeprecationWarning: builtin type swigvarlink has no __module__ attribute
(vllm) [janeyx@devgpu007.eag6 /data/users/janeyx/local/vllm (python-meta-kernels)]$

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: Jane Xu <janeyx@meta.com>

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request correctly replaces torch::indexing::Slice with narrow in hadacore_transform to improve ABI stability, which is a valid and good change. However, I've identified a critical bug in the surrounding code. The hadacore_transform function returns uninitialized data when called with inplace=false. This is because the run_fht kernel is hardcoded to write its output to the input tensor x, but when inplace=false, the out tensor is a separate, uninitialized buffer that is never written to. This should be fixed, and a test case for inplace=false should be added to prevent regressions.


  if (numel % 256 != 0) {
-   out = out.index({torch::indexing::Slice(0, numel / had_size)});
+   out = out.narrow(0, 0, numel / had_size);

critical

While this change to use narrow is correct for ABI stability, it operates on the out tensor which contains uninitialized data when inplace=false. This is because the run_fht kernel at line 801 is called to perform an in-place operation on x, but when inplace=false, out is a separate tensor that is never populated with the result. This causes the function to return garbage data.

To fix this critical bug, the kernel should be called to write to out:

// at line 801
hadacore::run_fht<SCALAR_TYPE>(x.data_ptr(), out.data_ptr(), x.numel(), had_size, stream);

Additionally, the existing tests only seem to cover the inplace=true path. Please add a test case with inplace=false to verify the fix and prevent future regressions.

Collaborator


Um, I think gemini is trolling. If there is a problem, then it was pre-existing.

@github-project-automation github-project-automation bot moved this to In review in NVIDIA Nov 14, 2025
@zou3519 zou3519 added the ready ONLY add when PR is ready to merge/full CI is needed label Nov 14, 2025
@vllm-bot vllm-bot merged commit 74b5267 into vllm-project:main Nov 15, 2025
95 of 97 checks passed
@github-project-automation github-project-automation bot moved this from In review to Done in NVIDIA Nov 15, 2025
geodavic pushed a commit to geodavic/vllm that referenced this pull request Nov 16, 2025
…le (vllm-project#28756)

Signed-off-by: Jane Xu <janeyx@meta.com>
Signed-off-by: George D. Torres <gdavtor@gmail.com>
bwasti pushed a commit to bwasti/vllm that referenced this pull request Nov 17, 2025
…le (vllm-project#28756)

Signed-off-by: Jane Xu <janeyx@meta.com>
Signed-off-by: Bram Wasti <bwasti@meta.com>
bringlein pushed a commit to bringlein/vllm that referenced this pull request Nov 26, 2025
devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request Nov 29, 2025

Labels

nvidia ready ONLY add when PR is ready to merge/full CI is needed

Projects

Status: Done

3 participants