
Commit 6718755

[Docs] Enable some more markdown lint rules for the docs (#28731)

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

1 parent a425dc2

6 files changed: +10 -15 lines

.markdownlint.yaml

Lines changed: 0 additions & 3 deletions

```diff
@@ -6,9 +6,6 @@ MD024:
 MD031:
   list_items: false
 MD033: false
-MD045: false
 MD046: false
-MD051: false
 MD052: false
-MD053: false
 MD059: false
```
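For context, the three deleted lines stop disabling three rules, which are enforced from this commit onward: MD045 (images should have alternate text), MD051 (link fragments should be valid), and MD053 (link and image reference definitions should be needed). A sketch of a violation and fix for each, using invented paths and anchors rather than anything from this repo:

```diff
-![](img/overview.png)
+![Architecture overview](img/overview.png)

-See the [setup guide](#setup-gude).
+See the [setup guide](#setup-guide).

-[dead ref]: https://example.com/never-referenced
```

The remaining `false` entries (MD033, MD046, MD052, MD059) stay disabled.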

docs/contributing/benchmarks.md

Lines changed: 0 additions & 2 deletions

```diff
@@ -10,8 +10,6 @@ vLLM provides comprehensive benchmarking tools for performance testing and evalu
 - **[Parameter sweeps](#parameter-sweeps)**: Automate `vllm bench` runs for multiple configurations
 - **[Performance benchmarks](#performance-benchmarks)**: Automated CI benchmarks for development
 
-[Benchmark CLI]: #benchmark-cli
-
 ## Benchmark CLI
 
 This section guides you through running benchmark tests with the extensive
```
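This deletion reads as an MD053 fix: the `[Benchmark CLI]: #benchmark-cli` definition was presumably no longer referenced anywhere on the page, so the rule flags it as unneeded. Illustratively (the surrounding prose and the second label here are invented), a definition passes MD053 only while something still references it:

```diff
 See the [Benchmark CLI] section for the full flag list.

 [Benchmark CLI]: #benchmark-cli

-[Serving CLI]: #serving-cli
```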

docs/contributing/ci/update_pytorch_version.md

Lines changed: 1 addition & 1 deletion

```diff
@@ -95,7 +95,7 @@ when manually triggering a build on Buildkite. This branch accomplishes two thin
 to warm it up so that future builds are faster.
 
 <p align="center" width="100%">
-<img width="60%" src="https://github.com/user-attachments/assets/a8ff0fcd-76e0-4e91-b72f-014e3fdb6b94">
+<img width="60%" alt="Buildkite new build popup" src="https://github.com/user-attachments/assets/a8ff0fcd-76e0-4e91-b72f-014e3fdb6b94">
 </p>
 
 ## Update dependencies
```
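This is the same enforcement applied to raw HTML: the repo keeps `MD033: false` (inline HTML allowed), and MD045 appears to check `<img>` elements for an `alt` attribute just as it checks Markdown image syntax. A minimal sketch with an invented file name:

```diff
-<img width="60%" src="popup.png">
+<img width="60%" alt="Buildkite build popup" src="popup.png">
```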

docs/deployment/frameworks/chatbox.md

Lines changed: 2 additions & 2 deletions

```diff
@@ -29,8 +29,8 @@ pip install vllm
 - API Path: `/chat/completions`
 - Model: `qwen/Qwen1.5-0.5B-Chat`
 
-![](../../assets/deployment/chatbox-settings.png)
+![Chatbox settings screen](../../assets/deployment/chatbox-settings.png)
 
 1. Go to `Just chat`, and start to chat:
 
-![](../../assets/deployment/chatbox-chat.png)
+![Chatbot chat screen](../../assets/deployment/chatbox-chat.png)
```

docs/deployment/frameworks/dify.md

Lines changed: 3 additions & 3 deletions

```diff
@@ -46,12 +46,12 @@ And install [Docker](https://docs.docker.com/engine/install/) and [Docker Compos
 - **Model Name for API Endpoint**: `Qwen/Qwen1.5-7B-Chat`
 - **Completion Mode**: `Completion`
 
-![](../../assets/deployment/dify-settings.png)
+![Dify settings screen](../../assets/deployment/dify-settings.png)
 
 1. To create a test chatbot, go to `Studio → Chatbot → Create from Blank`, then select Chatbot as the type:
 
-![](../../assets/deployment/dify-create-chatbot.png)
+![Dify create chatbot screen](../../assets/deployment/dify-create-chatbot.png)
 
 1. Click the chatbot you just created to open the chat interface and start interacting with the model:
 
-![](../../assets/deployment/dify-chat.png)
+![Dify chat screen](../../assets/deployment/dify-chat.png)
```

docs/design/fused_moe_modular_kernel.md

Lines changed: 4 additions & 4 deletions

```diff
@@ -19,9 +19,9 @@ The input activation format completely depends on the All2All Dispatch being use
 
 The FusedMoE operation is generally made of multiple operations, in both the Contiguous and Batched variants, as described in the diagrams below
 
-![](../assets/design/fused_moe_modular_kernel/fused_moe_non_batched.png "FusedMoE Non-Batched")
+![FusedMoE Non-Batched](../assets/design/fused_moe_modular_kernel/fused_moe_non_batched.png)
 
-![](../assets/design/fused_moe_modular_kernel/fused_moe_batched.png "FusedMoE Batched")
+![FusedMoE Batched](../assets/design/fused_moe_modular_kernel/fused_moe_batched.png)
 
 !!! note
     The main difference, in terms of operations, between the Batched and Non-Batched cases is the Permute / Unpermute operations. All other operations remain.
@@ -57,7 +57,7 @@ The `FusedMoEModularKernel` acts as a bridge between the `FusedMoEPermuteExperts
 The `FusedMoEPrepareAndFinalize` abstract class exposes `prepare`, `prepare_no_receive` and `finalize` functions.
 The `prepare` function is responsible for input activation Quantization and All2All Dispatch. If implemented, The `prepare_no_receive` is like `prepare` except it does not wait to receive results from other workers. Instead it returns a "receiver" callback that must be invoked to wait for the final results of worker. It is not required that this method is supported by all `FusedMoEPrepareAndFinalize` classes, but if it is available, it can be used to interleave work with the initial all to all communication, e.g. interleaving shared experts with fused experts. The `finalize` function is responsible for invoking the All2All Combine. Additionally the `finalize` function may or may not do the TopK weight application and reduction (Please refer to the TopKWeightAndReduce section)
 
-![](../assets/design/fused_moe_modular_kernel/prepare_and_finalize_blocks.png "FusedMoEPrepareAndFinalize Blocks")
+![FusedMoEPrepareAndFinalize Blocks](../assets/design/fused_moe_modular_kernel/prepare_and_finalize_blocks.png)
 
 ### FusedMoEPermuteExpertsUnpermute
 
@@ -88,7 +88,7 @@ The core FusedMoE implementation performs a series of operations. It would be in
 It is sometimes efficient to perform TopK weight application and Reduction inside the `FusedMoEPermuteExpertsUnpermute::apply()`. Find an example [here](https://github.com/vllm-project/vllm/pull/20228). We have a `TopKWeightAndReduce` abstract class to facilitate such implementations. Please refer to the TopKWeightAndReduce section.
 `FusedMoEPermuteExpertsUnpermute::finalize_weight_and_reduce_impl()` returns the `TopKWeightAndReduce` object that the implementation wants the `FusedMoEPrepareAndFinalize::finalize()` to use.
 
-![](../assets/design/fused_moe_modular_kernel/fused_experts_blocks.png "FusedMoEPermuteExpertsUnpermute Blocks")
+![FusedMoEPermuteExpertsUnpermute Blocks](../assets/design/fused_moe_modular_kernel/fused_experts_blocks.png)
 
 ### FusedMoEModularKernel
 
```
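All four image edits in this file follow one pattern: the original syntax `![](path "Title")` supplies only a title (the quoted string after the URL, rendered as a hover tooltip), which MD045 does not accept as alternate text, so the commit moves each string into the square brackets, where alt text belongs. Schematically, with an invented path:

```diff
-![](diagram.png "FusedMoE Batched")
+![FusedMoE Batched](diagram.png)
```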
