
Commit 5597c2a

didier-durand authored and devpatelio committed
[Doc]: fix typos in various files (vllm-project#28945)
Signed-off-by: Didier Durand <durand.didier@gmail.com>
1 parent bcf2e93 commit 5597c2a

6 files changed, +7 -7 lines changed


docs/design/moe_kernel_features.md

Lines changed: 2 additions & 2 deletions
@@ -4,7 +4,7 @@ The purpose of this document is to provide an overview of the various MoE kernel

 ## Fused MoE Modular All2All backends

-There are a number of all2all communication backends that are used to implement expert parallelism (EP) for the `FusedMoE` layer. The different `FusedMoEPrepareAndFinalize` sub-classes provide an interface for each all2all backend.
+There are a number of all2all communication backends that are used to implement expert parallelism (EP) for the `FusedMoE` layer. The different `FusedMoEPrepareAndFinalize` subclasses provide an interface for each all2all backend.

 The following table describes the relevant features of each backend, i.e. activation format, supported quantization schemes and async support.

@@ -68,7 +68,7 @@ Modular kernels are supported by the following `FusedMoEMethodBase` classes.

 ## Fused MoE Experts Kernels

-The are a number of MoE experts kernel implementations for different quantization types and architectures. Most follow the general API of the base Triton [`fused_experts`][vllm.model_executor.layers.fused_moe.fused_moe.fused_experts] function. Many have modular kernel adapters so they can be used with compatible all2all backends. This table lists each experts kernel and its particular properties.
+There are a number of MoE experts kernel implementations for different quantization types and architectures. Most follow the general API of the base Triton [`fused_experts`][vllm.model_executor.layers.fused_moe.fused_moe.fused_experts] function. Many have modular kernel adapters so they can be used with compatible all2all backends. This table lists each experts kernel and its particular properties.

 Each kernel must be provided with one of the supported input activation formats. Some flavors of kernels support both standard and batched formats through different entry points, e.g. `TritonExperts` and `BatchedTritonExperts`. Batched format kernels are currently only needed for matching with certain all2all backends, e.g. `pplx`, `DeepEPLLPrepareAndFinalize`.

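The paragraphs in this hunk describe an adapter pattern: a prepare step that dispatches tokens to experts over an all2all backend, and a finalize step that combines the results. Purely as an illustration of that shape (the class name, method names, and signatures below are placeholders, not vLLM's actual `FusedMoEPrepareAndFinalize` interface):

```python
# Illustrative sketch only: names and signatures are simplified placeholders,
# NOT the real vLLM FusedMoEPrepareAndFinalize API.
from abc import ABC, abstractmethod
from typing import Any


class All2AllAdapterSketch(ABC):
    """Hypothetical prepare/finalize adapter wrapping one all2all backend."""

    @abstractmethod
    def prepare(self, hidden_states: Any, topk_ids: Any) -> Any:
        """Dispatch tokens to the ranks that own their selected experts."""

    @abstractmethod
    def finalize(self, expert_output: Any) -> Any:
        """Combine expert outputs back into the original token order."""
```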
docs/design/plugin_system.md

Lines changed: 1 addition & 1 deletion
@@ -49,7 +49,7 @@ Every plugin has three parts:

 - **Platform plugins** (with group name `vllm.platform_plugins`): The primary use case for these plugins is to register custom, out-of-the-tree platforms into vLLM. The plugin function should return `None` when the platform is not supported in the current environment, or the platform class's fully qualified name when the platform is supported.

-- **IO Processor plugins** (with group name `vllm.io_processor_plugins`): The primary use case for these plugins is to register custom pre/post processing of the model prompt and model output for pooling models. The plugin function returns the IOProcessor's class fully qualified name.
+- **IO Processor plugins** (with group name `vllm.io_processor_plugins`): The primary use case for these plugins is to register custom pre-/post-processing of the model prompt and model output for pooling models. The plugin function returns the IOProcessor's class fully qualified name.

 - **Stat logger plugins** (with group name `vllm.stat_logger_plugins`): The primary use case for these plugins is to register custom, out-of-the-tree loggers into vLLM. The entry point should be a class that subclasses StatLoggerBase.

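The platform-plugin contract quoted in the hunk above (return `None` when the platform is unavailable, otherwise the platform class's fully qualified name) can be sketched as a tiny entry-point function. Everything below except the `vllm.platform_plugins` group name is made up for illustration:

```python
# Hypothetical platform-plugin entry point, registered under the
# `vllm.platform_plugins` entry-point group in the plugin's packaging metadata.
# The package, SDK, and class names are invented for this sketch.


def register_my_platform() -> str | None:
    try:
        import my_accelerator_sdk  # assumed vendor SDK; not a real package  # noqa: F401
    except ImportError:
        # Platform not usable in this environment: tell vLLM to skip it.
        return None
    # Fully qualified name of the platform class vLLM should load.
    return "my_vllm_plugin.platform.MyPlatform"
```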
docs/features/quantization/quark.md

Lines changed: 1 addition & 1 deletion
@@ -306,7 +306,7 @@ As examples, we provide some ready-to-use quantized mixed precision model to sho

 ### 2. inference the quantized mixed precision model in vLLM

-Models quantized with AMD Quark using mixed precision can natively be reload in vLLM, and e.g. evaluated using lm-evaluation-harness as follow:
+Models quantized with AMD Quark using mixed precision can natively be reload in vLLM, and e.g. evaluated using lm-evaluation-harness as follows:

 ```bash
 lm_eval --model vllm \

examples/online_serving/prometheus_grafana/README.md

Lines changed: 1 addition & 1 deletion
@@ -46,7 +46,7 @@ Navigate to [`http://localhost:3000`](http://localhost:3000). Log in with the de

 Navigate to [`http://localhost:3000/connections/datasources/new`](http://localhost:3000/connections/datasources/new) and select Prometheus.

-On Prometheus configuration page, we need to add the `Prometheus Server URL` in `Connection`. For this setup, Grafana and Prometheus are running in separate containers, but Docker creates DNS name for each containers. You can just use `http://prometheus:9090`.
+On Prometheus configuration page, we need to add the `Prometheus Server URL` in `Connection`. For this setup, Grafana and Prometheus are running in separate containers, but Docker creates DNS name for each container. You can just use `http://prometheus:9090`.

 Click `Save & Test`. You should get a green check saying "Successfully queried the Prometheus API.".

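Because the `prometheus` hostname only resolves on the shared Docker network, one way to sanity-check the URL Grafana will use is to hit Prometheus's standard HTTP query API from a container on that network. A minimal, hypothetical check in Python (standard library only):

```python
# Hypothetical connectivity check, run from a container attached to the same
# Docker network as Prometheus (the `prometheus` name does not resolve from
# the host).
import json
import urllib.request

URL = "http://prometheus:9090/api/v1/query?query=up"

with urllib.request.urlopen(URL, timeout=5) as resp:
    payload = json.load(resp)

# A healthy scrape target reports the value "1" for the `up` metric.
for result in payload.get("data", {}).get("result", []):
    print(result["metric"].get("instance"), result["value"][1])
```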
vllm/engine/arg_utils.py

Lines changed: 1 addition & 1 deletion
@@ -1500,7 +1500,7 @@ def create_engine_config(
         # Local DP rank = 1, use pure-external LB.
         if data_parallel_external_lb:
             assert self.data_parallel_rank is not None, (
-                "data_parallel_rank or node_rank must be spefified if "
+                "data_parallel_rank or node_rank must be specified if "
                 "data_parallel_external_lb is enable."
             )
             assert self.data_parallel_size_local in (1, None), (

vllm/envs.py

Lines changed: 1 addition & 1 deletion
@@ -1261,7 +1261,7 @@ def get_vllm_port() -> int | None:
     # MoE routing strategy selector.
     # See `RoutingSimulator.get_available_strategies()` # for available
     # strategies.
-    # Cutstom routing strategies can be registered by
+    # Custom routing strategies can be registered by
     # RoutingSimulator.register_strategy()
     # Note: custom strategies may not produce correct model outputs
     "VLLM_MOE_ROUTING_SIMULATION_STRATEGY": lambda: os.environ.get(

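Since the variable touched above is read with `os.environ.get`, selecting a simulated routing strategy is just a matter of setting it before the engine starts. A small usage sketch (the strategy name is a placeholder; the real options come from `RoutingSimulator.get_available_strategies()`, and, as the comment notes, custom strategies may not produce correct model outputs):

```python
# Usage sketch: "uniform_random" is a placeholder strategy name, not a value
# guaranteed to exist; list the real ones via
# RoutingSimulator.get_available_strategies(). Set before vLLM starts.
import os

os.environ["VLLM_MOE_ROUTING_SIMULATION_STRATEGY"] = "uniform_random"
```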
0 commit comments
