Commit 554f16a

[Kernel] add custom op GmmSwigluQuantWeightNzTensorList (#3804)
### What this PR does / why we need it?

This PR introduces support for adding custom CANN `aclnn` ops to `vllm-ascend`, allowing users to define and use their own custom operators.

Key changes include:

- Building and installing custom ops into the `vllm-ascend`-specified directory
- Binding the `aclnn` op interface to the `torch.ops._C_ascend` module
- Enabling invocation of these ops within `vllm-ascend`

This PR includes a sample custom op: `aclnnGroupedMatmulSwigluQuantWeightNzTensorList`, which is adapted from the CANN operator [`aclnnGroupedMatmulSwigluQuantWeightNZ`](https://www.hiascend.com/document/detail/zh/canncommercial/83RC1/API/aolapi/context/aclnnGroupedMatmulSwigluQuantWeightNZ.md). Its input parameters `weight` and `weight_scale` now accept `list[torch.Tensor]` (i.e., `at::TensorList`).

### Does this PR introduce _any_ user-facing change?

No.

- vLLM version: v0.11.2

---------

Signed-off-by: QianChenxi <chenxi.qian.cq@outlook.com>
1 parent 3199fe8 commit 554f16a
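
The description above says custom ops are bound to the `torch.ops._C_ascend` module. A minimal sketch of how a caller might check that a given op is available before invoking it is shown here. The helper `op_is_registered` and the op name `my_custom_op` are hypothetical and do not come from this commit; the sketch also assumes that looking up a missing op on the namespace raises `AttributeError`. A plain `SimpleNamespace` stands in for `torch.ops._C_ascend` so the example runs without PyTorch or an NPU.

```python
from types import SimpleNamespace


def op_is_registered(namespace, op_name: str) -> bool:
    """Return True if `namespace` exposes a callable attribute named `op_name`."""
    # getattr with a default swallows AttributeError for unknown names.
    return callable(getattr(namespace, op_name, None))


# Stand-in for torch.ops._C_ascend; "my_custom_op" is a hypothetical op name.
fake_namespace = SimpleNamespace(my_custom_op=lambda x, weights: x)

has_custom_op = op_is_registered(fake_namespace, "my_custom_op")
has_missing_op = op_is_registered(fake_namespace, "missing_op")
print(has_custom_op, has_missing_op)
```

With a real build, the same helper could be handed `torch.ops._C_ascend` and the Python-side name under which the op was registered, which this page does not show.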

50 files changed: +6934 -7 lines changed

.github/workflows/release_whl.yml

Lines changed: 5 additions & 0 deletions

@@ -96,6 +96,11 @@ jobs:
  --exclude libge_common_base.so \
  --exclude libc10.so \
  --exclude libc_sec.so \
+ --exclude libnnopbase.so \
+ --exclude libprofapi.so \
+ --exclude libgraph_base.so \
+ --exclude libgraph.so \
+ --exclude libexe_graph.so \
  --exclude "libascend*.so" \
  --exclude "libtorch*.so" \
  --exclude "libopapi.so" \

.pre-commit-config.yaml

Lines changed: 2 additions & 2 deletions

@@ -12,7 +12,7 @@ repos:
  - id: codespell
  args: [
  --toml, pyproject.toml,
- '--skip', 'tests/e2e/multicard/test_torchair_graph_mode.py,csrc/mla_preprocess/**,tests/prompts/**,./benchmarks/sonnet.txt,*tests/lora/data/**,build/**,./vllm_ascend.egg-info/**,.github/**,typos.toml',
+ '--skip', 'tests/e2e/multicard/test_torchair_graph_mode.py,csrc/**,tests/prompts/**,./benchmarks/sonnet.txt,*tests/lora/data/**,build/**,./vllm_ascend.egg-info/**,.github/**,typos.toml',
  '-L', 'CANN,cann,NNAL,nnal,ASCEND,ascend,EnQue,CopyIn,ArchType,AND'
  ]
  additional_dependencies:
@@ -37,7 +37,7 @@ repos:
  - id: typos
  args: [
  "--force-exclude",
- "--exclude", "csrc/mla_preprocess/**"
+ "--exclude", "csrc/**"
  ]
  - repo: https://github.com/PyCQA/isort
  rev: 6.0.1
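
Both hooks widen their exclusion from `csrc/mla_preprocess/**` to `csrc/**`, so everything under `csrc/` is now skipped, not just the MLA preprocessing sources. The sketch below illustrates the effect with Python's `fnmatch`. This is only an approximation: `codespell` and `typos` each apply their own glob rules, and the file paths are hypothetical examples, not files taken from this commit.

```python
from fnmatch import fnmatch

OLD_PATTERN = "csrc/mla_preprocess/**"
NEW_PATTERN = "csrc/**"

# Hypothetical paths, chosen only to show which ones each pattern covers.
paths = [
    "csrc/mla_preprocess/op_host/tiling.cpp",
    "csrc/some_custom_op/op_kernel/kernel.cpp",
    "vllm_ascend/ops/linear.py",
]

# In fnmatch, "*" also matches "/" so "**" behaves like "anything below".
old_skipped = [p for p in paths if fnmatch(p, OLD_PATTERN)]
new_skipped = [p for p in paths if fnmatch(p, NEW_PATTERN)]

print("old pattern skips:", old_skipped)
print("new pattern skips:", new_skipped)
```

Under these assumptions the old pattern skips only the first path, while the new one skips both `csrc/` paths and still leaves the Python source to be checked.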

CMakeLists.txt

Lines changed: 1 addition & 0 deletions

@@ -82,6 +82,7 @@ set(
  ${TORCH_NPU_INCLUDE_DIRS}
  ${ASCEND_HOME_PATH}/include
  ${ASCEND_HOME_PATH}/aarch64-linux/include/experiment/platform
+ ${ASCEND_HOME_PATH}/x86_64-linux/include/experiment/platform
  )

  pybind11_add_module(vllm_ascend_C ${VLLM_ASCEND_SRC})
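
The CMake change adds the `x86_64-linux` platform include directory next to the existing `aarch64-linux` one, so both are on the include path regardless of the host. As a rough illustration of how the two paths relate to the build machine, the sketch below derives the matching directory from the host architecture. The helper `platform_include_dir` and the `/opt/ascend` prefix are hypothetical; only the `<arch>-linux/include/experiment/platform` layout comes from the diff.

```python
import platform
from pathlib import PurePosixPath
from typing import Optional

# Map common platform.machine() values to the directory prefixes in the diff.
_ARCH_ALIASES = {
    "x86_64": "x86_64",
    "AMD64": "x86_64",
    "aarch64": "aarch64",
    "arm64": "aarch64",
}


def platform_include_dir(ascend_home: str, machine: Optional[str] = None) -> str:
    """Build the per-architecture platform include path under `ascend_home`."""
    machine = machine or platform.machine()
    arch = _ARCH_ALIASES.get(machine)
    if arch is None:
        raise ValueError(f"unsupported architecture: {machine!r}")
    parts = (f"{arch}-linux", "include", "experiment", "platform")
    return str(PurePosixPath(ascend_home).joinpath(*parts))


# "/opt/ascend" is a placeholder for ASCEND_HOME_PATH.
x86_dir = platform_include_dir("/opt/ascend", "x86_64")
arm_dir = platform_include_dir("/opt/ascend", "aarch64")
print(x86_dir)
print(arm_dir)
```

Listing both directories unconditionally, as the commit does, avoids this kind of branching in the build script, presumably at the cost of one include path that does not exist on a given host.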
