-
-
Notifications
You must be signed in to change notification settings - Fork 11.6k
Use narrow over indexing in hadacore_transform to prep for ABI stable
#28756
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Jane Xu <janeyx@meta.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request correctly replaces torch::indexing::Slice with narrow in hadacore_transform to improve ABI stability, which is a valid and good change. However, I've identified a critical bug in the surrounding code. The hadacore_transform function returns uninitialized data when called with inplace=false. This is because the run_fht kernel is hardcoded to write its output to the input tensor x, but when inplace=false, the out tensor is a separate, uninitialized buffer that is never written to. This should be fixed, and a test case for inplace=false should be added to prevent regressions.
|
|
||
| if (numel % 256 != 0) { | ||
| out = out.index({torch::indexing::Slice(0, numel / had_size)}); | ||
| out = out.narrow(0, 0, numel / had_size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While this change to use narrow is correct for ABI stability, it operates on the out tensor which contains uninitialized data when inplace=false. This is because the run_fht kernel at line 801 is called to perform an in-place operation on x, but when inplace=false, out is a separate tensor that is never populated with the result. This causes the function to return garbage data.
To fix this critical bug, the kernel should be called to write to out:
// at line 801
hadacore::run_fht<SCALAR_TYPE>(x.data_ptr(), out.data_ptr(), x.numel(), had_size, stream);Additionally, the existing tests only seem to cover the inplace=true path. Please add a test case with inplace=false to verify the fix and prevent future regressions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Um, I think gemini is trolling. If there is a problem, then it was pre-existing
…le (vllm-project#28756) Signed-off-by: Jane Xu <janeyx@meta.com> Signed-off-by: George D. Torres <gdavtor@gmail.com>
…le (vllm-project#28756) Signed-off-by: Jane Xu <janeyx@meta.com> Signed-off-by: Bram Wasti <bwasti@meta.com>
…le (vllm-project#28756) Signed-off-by: Jane Xu <janeyx@meta.com>
…le (vllm-project#28756) Signed-off-by: Jane Xu <janeyx@meta.com>
Purpose
Instead of using indexing, use
narrow, which is supported by ABI stable in thehadacore_transformkernel. They compute the same slice. Contributes to #26946.Test Plan
pytest tests/kernels/quantization/test_hadacore.pyTest Result
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.