
Commit bc69d7c

Authored by wangxiyuan, MengqingCao, hfadzxy, and leo-pony
upgrade to vllm 0.11.2 (#4400)
Bump vLLM version to v0.11.2.

What's broken and changed by vLLM:
1. structured_output is broken by vllm-project/vllm#26866
2. get_mrope_input_positions is broken by vllm-project/vllm#28399
3. graph mode is broken by vllm-project/vllm#25110; we'll upgrade torch to 2.8 later to fix this
4. embedding is broken by vllm-project/vllm#27583
5. `get_attn_backend_cls` and the attention backend are broken by vllm-project/vllm#28534
6. spec decode is broken by vllm-project/vllm#28771
7. the SP feature is broken by vllm-project/vllm#27126
8. MTP is broken by vllm-project/vllm#27922
9. LoRA is broken by vllm-project/vllm#21068
10. execute_model is broken by vllm-project/vllm#26866
11. the `VLLM_DISABLE_SHARED_EXPERTS_STREAM` env is broken by vllm-project/vllm#28159
12. KV cache is broken by vllm-project/vllm#27753
13. DP is broken by vllm-project/vllm#25110

What's broken and changed on our side:
1. Qwen VL is broken by vllm-project/vllm#28455; we'll remove the model files in the future to avoid this kind of error
2. Engine core is broken by vllm-project/vllm#23691; we'll remove the patch file in the future
3. the Ascend scheduler is broken by vllm-project/vllm#28733; we'll remove the Ascend scheduler later
4. Qwen3-Next is broken by vllm-project/vllm#28083; we'll remove the model files in the future to avoid this kind of error
5. Qwen VL is broken by vllm-project/vllm#27764; we'll remove the model files in the future

Known issues:
1. Ray doesn't work
2. the accuracy of Qwen3-Next is not correct
3. Qwen3-VL is broken
4. prefix cache + Ascend scheduler + DeepSeek-V2-Lite is broken
Co-authored-by: MengqingCao <cmq0113@163.com>
Co-authored-by: hfadzxy <starmoon_zhang@163.com>
Co-authored-by: leo-pony <nengjunma@outlook.com>
Co-authored-by: 22dimensions <waitingwind@foxmail.com>
Co-authored-by: shen-shanshan <467638484@qq.com>

- vLLM version: v0.11.2

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: hfadzxy <starmoon_zhang@163.com>
Signed-off-by: leo-pony <nengjunma@outlook.com>
1 parent d5f77f1 commit bc69d7c


54 files changed: +744 -437 lines

.github/workflows/_e2e_nightly_multi_node.yaml

Lines changed: 1 addition & 1 deletion

@@ -32,7 +32,7 @@ on:
       description: how many pods will be pulled up via lws.yaml, indicates number of nodes we need
     vllm_version:
       required: false
-      default: "2918c1b49c88c29783c86f78d2c4221cb9622379"
+      default: "v0.11.2"
       type: string
       description: vllm version to use
     vllm_ascend_remote_url:

.github/workflows/format_pr_body.yaml

Lines changed: 1 addition & 1 deletion

@@ -36,7 +36,7 @@ jobs:
 
       - name: Get vLLM version
         run: |
-          VLLM_COMMIT=2918c1b49c88c29783c86f78d2c4221cb9622379
+          VLLM_COMMIT=v0.11.2
           echo "VLLM_COMMIT=https://github.com/vllm-project/vllm/commit/$VLLM_COMMIT" >> $GITHUB_ENV
 
       - name: Checkout repository
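The `>> $GITHUB_ENV` line above is GitHub Actions' mechanism for passing a value to later steps: the runner points `GITHUB_ENV` at a file, and every `KEY=value` line appended to it is loaded into the environment of subsequent steps. A minimal local sketch of that mechanism (the temp-file path stands in for the runner's real file):

```shell
# Simulate a step writing to $GITHUB_ENV and a later step reading it.
# The file path here is a stand-in, not the runner's actual location.
GITHUB_ENV=$(mktemp)

# "Step 1": resolve the pinned vLLM ref and export it for later steps.
VLLM_COMMIT=v0.11.2
echo "VLLM_COMMIT=https://github.com/vllm-project/vllm/commit/$VLLM_COMMIT" >> "$GITHUB_ENV"

# Between steps, the runner loads the file into the environment;
# `set -a` exports every assignment sourced from the file.
set -a
. "$GITHUB_ENV"
set +a

# "Step 2" now sees the expanded variable.
echo "$VLLM_COMMIT"
```

Because the file holds the already-expanded URL, later steps get the full link without re-running the resolution logic.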

.github/workflows/nightly_benchmarks.yaml

Lines changed: 1 addition & 1 deletion

@@ -51,7 +51,7 @@ jobs:
     strategy:
       matrix:
         include:
-          - vllm_branch: 2918c1b49c88c29783c86f78d2c4221cb9622379
+          - vllm_branch: v0.11.2
             vllm_ascend_branch: main
       max-parallel: 1
     container:

.github/workflows/vllm_ascend_test.yaml

Lines changed: 7 additions & 4 deletions

@@ -42,7 +42,7 @@ jobs:
   lint:
     uses: ./.github/workflows/pre-commit.yml
     with:
-      vllm: 2918c1b49c88c29783c86f78d2c4221cb9622379
+      vllm: v0.11.2
   changes:
     runs-on: ubuntu-latest
     outputs:
@@ -83,7 +83,7 @@ jobs:
       VLLM_USE_MODELSCOPE: True
     strategy:
       matrix:
-        vllm_version: [2918c1b49c88c29783c86f78d2c4221cb9622379]
+        vllm_version: [v0.11.2]
     steps:
       - name: Install packages
         run: |
@@ -121,7 +121,10 @@ jobs:
           export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/Ascend/ascend-toolkit/latest/x86_64-linux/devlib
           pytest -sv --cov --cov-report=xml:unittests-coverage.xml tests/ut \
             --ignore tests/ut/torchair/models/test_torchair_deepseek_mtp.py \
-            --ignore tests/ut/torchair/models/test_torchair_deepseek_v2.py
+            --ignore tests/ut/torchair/models/test_torchair_deepseek_v2.py \
+            --ignore tests/ut/models/test_qwen2_vl.py \
+            --ignore tests/ut/models/test_qwen2_5_vl.py \
+            --ignore tests/ut/models/test_qwen2_5_vl_without_padding.py
 
       - name: Upload coverage to Codecov
         # only upload coverage when commits merged
@@ -138,7 +141,7 @@ jobs:
     name: e2e-light
     strategy:
       matrix:
-        vllm_version: [2918c1b49c88c29783c86f78d2c4221cb9622379]
+        vllm_version: [v0.11.2]
     # Note (yikun): If CI resource are limited we can split job into two chain jobs
     needs: [lint, changes]
     # only trigger e2e test after lint passed and the change is e2e related with pull request.

.github/workflows/vllm_ascend_test_full.yaml

Lines changed: 1 addition & 1 deletion

@@ -69,7 +69,7 @@ jobs:
     name: e2e-full
     strategy:
       matrix:
-        vllm_version: [2918c1b49c88c29783c86f78d2c4221cb9622379]
+        vllm_version: [v0.11.2]
     needs: [changes]
     if: ${{ needs.changes.outputs.e2e_tracker == 'true' }}
     uses: ./.github/workflows/_e2e_test.yaml

.github/workflows/vllm_ascend_test_nightly_a2.yaml

Lines changed: 2 additions & 2 deletions

@@ -86,7 +86,7 @@ jobs:
           tests: tests/e2e/nightly/ops
     uses: ./.github/workflows/_e2e_nightly_single_node.yaml
     with:
-      vllm: 2918c1b49c88c29783c86f78d2c4221cb9622379
+      vllm: v0.11.2
       runner: ${{ matrix.test_config.os }}
       tests: ${{ matrix.test_config.tests }}
       image: 'swr.cn-southwest-2.myhuaweicloud.com/base_image/ascend-ci/vllm-ascend:nightly-a2'
@@ -125,7 +125,7 @@ jobs:
           - Qwen3-Next-80B-A3B-Instruct
     uses: ./.github/workflows/_e2e_nightly_single_node_models.yaml
     with:
-      vllm: 2918c1b49c88c29783c86f78d2c4221cb9622379
+      vllm: v0.11.2
       runner: ${{ matrix.test_config.os }}
       model_list: ${{ toJson(matrix.test_config.model_list) }}
       image: swr.cn-southwest-2.myhuaweicloud.com/base_image/ascend-ci/cann:8.2.rc1-910b-ubuntu22.04-py3.11

.github/workflows/vllm_ascend_test_nightly_a3.yaml

Lines changed: 1 addition & 1 deletion

@@ -136,7 +136,7 @@ jobs:
           tests: tests/e2e/nightly/models/test_deepseek_v3_2_exp_w8a8.py
     uses: ./.github/workflows/_e2e_nightly_single_node.yaml
     with:
-      vllm: 2918c1b49c88c29783c86f78d2c4221cb9622379
+      vllm: v0.11.2
       runner: ${{ matrix.test_config.os }}
       image: 'swr.cn-southwest-2.myhuaweicloud.com/base_image/ascend-ci/vllm-ascend:nightly-a3'
       tests: ${{ matrix.test_config.tests }}

.github/workflows/vllm_ascend_test_report.yaml

Lines changed: 1 addition & 1 deletion

@@ -72,7 +72,7 @@ jobs:
           - DeepSeek-V2-Lite
     uses: ./.github/workflows/_e2e_nightly_single_node_models.yaml
     with:
-      vllm: 2918c1b49c88c29783c86f78d2c4221cb9622379
+      vllm: v0.11.2
       runner: ${{ matrix.runner }}
       image: swr.cn-southwest-2.myhuaweicloud.com/base_image/ascend-ci/cann:8.3.rc1-910b-ubuntu22.04-py3.11
       model_list: ${{ toJson(matrix.model_list) }}

Dockerfile

Lines changed: 2 additions & 4 deletions

@@ -46,10 +46,8 @@ RUN pip config set global.index-url ${PIP_INDEX_URL}
 
 # Install vLLM
 ARG VLLM_REPO=https://github.com/vllm-project/vllm.git
-ARG VLLM_TAG=2918c1b49c88c29783c86f78d2c4221cb9622379
-# Revert this change once VLLM_TAG is specified to branch or tag
-# RUN git clone --depth 1 $VLLM_REPO --branch $VLLM_TAG /vllm-workspace/vllm
-RUN git clone $VLLM_REPO /vllm-workspace/vllm && (cd /vllm-workspace/vllm && git checkout $VLLM_TAG)
+ARG VLLM_TAG=v0.11.2
+RUN git clone --depth 1 $VLLM_REPO --branch $VLLM_TAG /vllm-workspace/vllm
 # In x86, triton will be installed by vllm. But in Ascend, triton doesn't work correctly. we need to uninstall it.
 RUN VLLM_TARGET_DEVICE="empty" python3 -m pip install -v -e /vllm-workspace/vllm/[audio] --extra-index https://download.pytorch.org/whl/cpu/ && \
     python3 -m pip uninstall -y triton && \
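The Dockerfile change works because `git clone --branch` accepts tags and branch names but not bare commit SHAs, which is why pinning to a commit previously required a full clone followed by `git checkout`. A small sketch of that distinction using a throwaway local repo (the repo, tag name, and paths are illustrative only):

```shell
set -e
# Build a throwaway "origin" repo with one commit and a tag (illustrative).
tmp=$(mktemp -d)
git init -q "$tmp/origin"
git -C "$tmp/origin" -c user.email=ci@example.com -c user.name=ci \
    commit -q --allow-empty -m init
git -C "$tmp/origin" tag v0.11.2
sha=$(git -C "$tmp/origin" rev-parse HEAD)

# Shallow clone by tag: works, and HEAD lands on the tagged commit.
git clone -q --depth 1 --branch v0.11.2 "file://$tmp/origin" "$tmp/by-tag"

# Shallow clone by bare SHA: rejected, since --branch only resolves
# branches and tags. Pinning a SHA needs full clone + checkout instead.
git clone -q --depth 1 --branch "$sha" "file://$tmp/origin" "$tmp/by-sha" \
    2>/dev/null || echo "clone by SHA failed as expected"
```

Once `VLLM_TAG` is a real tag again, the shallow `--branch` form also keeps the image build faster by fetching only one commit of history.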

Dockerfile.310p

Lines changed: 2 additions & 4 deletions

@@ -37,10 +37,8 @@ RUN pip config set global.index-url ${PIP_INDEX_URL}
 
 # Install vLLM
 ARG VLLM_REPO=https://github.com/vllm-project/vllm.git
-ARG VLLM_TAG=2918c1b49c88c29783c86f78d2c4221cb9622379
-# Revert this change once VLLM_TAG is specified to branch or tag
-# RUN git clone --depth 1 $VLLM_REPO --branch $VLLM_TAG /vllm-workspace/vllm
-RUN git clone $VLLM_REPO /vllm-workspace/vllm && (cd /vllm-workspace/vllm && git checkout $VLLM_TAG)
+ARG VLLM_TAG=v0.11.2
+RUN git clone --depth 1 $VLLM_REPO --branch $VLLM_TAG /vllm-workspace/vllm
 # In x86, triton will be installed by vllm. But in Ascend, triton doesn't work correctly. we need to uninstall it.
 RUN VLLM_TARGET_DEVICE="empty" python3 -m pip install -v -e /vllm-workspace/vllm/[audio] --extra-index https://download.pytorch.org/whl/cpu/ && \
     python3 -m pip uninstall -y triton && \
