Conversation


@ChenxiQ ChenxiQ commented Dec 1, 2025

What this PR does / why we need it?

  1. Fixes the environment path used to locate custom op shared libraries.
  2. Uses empty tensor initialization for op outputs instead of zero-initialization for better efficiency.
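The second change swaps `at::zeros` for `at::empty` in the C++ binding. A minimal Python sketch of the same idea, using the analogous `torch.zeros` / `torch.empty` calls (the helper names and shapes here are illustrative, not from the PR):

```python
import torch

def make_outputs_zeros(m: int, n: int):
    # Zero-initialization: allocates the buffers and then fills them with
    # zeros, paying for an extra fill the op does not need.
    output = torch.zeros((m, n // 2), dtype=torch.int8)
    output_scale = torch.zeros((m,), dtype=torch.float32)
    return output, output_scale

def make_outputs_empty(m: int, n: int):
    # Empty initialization: allocates only; contents are uninitialized,
    # which is safe when the op overwrites every element before returning.
    output = torch.empty((m, n // 2), dtype=torch.int8)
    output_scale = torch.empty((m,), dtype=torch.float32)
    return output, output_scale

out, scale = make_outputs_empty(4, 8)
print(out.shape, scale.shape)  # torch.Size([4, 4]) torch.Size([4])
```

The saving is the skipped zero-fill kernel; the trade-off is that every element of the output must be written by the op, since `empty` tensors contain arbitrary memory.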

Does this PR introduce any user-facing change?

How was this patch tested?

Signed-off-by: QianChenxi <chenxi.qian.cq@outlook.com>

@gemini-code-assist bot left a comment


Code Review

This pull request includes a bug fix and a performance optimization. In vllm_ascend/platform.py, a duplicated path segment in the construction of CUSTOM_OPP_PATH is corrected, which is a necessary fix. In csrc/torch_binding.cpp, at::zeros is replaced with at::empty for output tensor allocation, which is a good performance improvement. However, I've identified a related critical issue that should be addressed.
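The path fix described above corrects a duplicated path segment. A hedged sketch of the kind of bug involved, with hypothetical environment-root and segment names (the actual `CUSTOM_OPP_PATH` value in `vllm_ascend/platform.py` is not reproduced here):

```python
import os

def custom_opp_path(opp_root: str) -> str:
    # Hypothetical illustration: the buggy version effectively joined the
    # vendor segment twice, e.g.
    #   os.path.join(opp_root, "vendors", "customize", "vendors", "customize")
    # so the loader looked for custom op shared libraries in a path that
    # does not exist. The fix appends the segment once.
    return os.path.join(opp_root, "vendors", "customize")

print(custom_opp_path("/usr/local/Ascend/opp"))
# /usr/local/Ascend/opp/vendors/customize
```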

Comment on lines +571 to +573
at::Tensor output = at::empty({m, n/2}, x.options().dtype(at::kChar));
at::Tensor output_scale = at::empty({m}, x.options().dtype(at::kFloat));
at::Tensor output_offset = at::empty({m}, x.options().dtype(at::kFloat));

critical

The size of these output tensors depends on n, which is calculated on line 567 from the weight TensorList (weight[0].sizes()[1]). This is unsafe because if the weight TensorList is empty, accessing weight[0] will cause a crash. It's crucial to add a check to ensure weight is not empty before its elements are accessed.

For example, you could add the following check before line 567:

TORCH_CHECK(!weight.empty(), "weight tensor list cannot be empty");


github-actions bot commented Dec 1, 2025

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by filling out the PR description, so reviewers and future developers can understand the change.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

@ChenxiQ ChenxiQ changed the title [Bugfix] custom op fix [Bugfix] fix custom op GmmSwigluQuantWeightNzTensorList Dec 1, 2025
@ChenxiQ ChenxiQ force-pushed the br_custom_op_fix branch 3 times, most recently from b1c3922 to 29191fb on December 2, 2025 at 13:04
@wangxiyuan wangxiyuan merged commit 4588cda into vllm-project:main Dec 2, 2025
21 checks passed