Commit 9261eb3

Authored by wangchen615 and hmellor

docs(lora_resolvers): clarify multi-resolver order and storage path requirement (#28153)

Signed-off-by: Chen Wang <Chen.Wang1@ibm.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

1 parent cdd7025

File tree

4 files changed: +226 −17 lines changed

.markdownlint.yaml

Lines changed: 2 additions & 0 deletions

```diff
@@ -3,6 +3,8 @@ MD007:
 MD013: false
 MD024:
   siblings_only: true
+MD031:
+  list_items: false
 MD033: false
 MD045: false
 MD046: false
```

docs/.nav.yml

Lines changed: 4 additions & 1 deletion

```diff
@@ -46,7 +46,10 @@ nav:
     - contributing/model/multimodal.md
     - contributing/model/transcription.md
     - CI: contributing/ci
-  - Design Documents: design
+  - Design Documents:
+    - Plugins:
+      - design/*plugin*.md
+    - design/*
   - API Reference:
     - api/README.md
     - api/vllm
```
Lines changed: 220 additions & 0 deletions
# LoRA Resolver Plugins

This directory contains vLLM's LoRA resolver plugins built on the `LoRAResolver` framework. They automatically discover and load LoRA adapters from a specified local storage path, eliminating the need for manual configuration or server restarts.

## Overview

LoRA Resolver Plugins provide a flexible way to dynamically load LoRA adapters at runtime. When vLLM receives a request for a LoRA adapter that hasn't been loaded yet, the resolver plugins attempt to locate and load the adapter from their configured storage locations. This enables:

- **Dynamic LoRA Loading**: Load adapters on demand without server restarts
- **Multiple Storage Backends**: Support for filesystem, S3, and custom backends. The built-in `lora_filesystem_resolver` requires a local storage path, but custom resolvers can be implemented to fetch from any source.
- **Automatic Discovery**: Seamless integration with existing LoRA workflows
- **Scalable Deployment**: Centralized adapter management across multiple vLLM instances
## Prerequisites

Before using LoRA Resolver Plugins, ensure the following environment variables are configured.

### Required Environment Variables

1. **`VLLM_ALLOW_RUNTIME_LORA_UPDATING`**: must be set to `true` or `1` to enable dynamic LoRA loading

    ```bash
    export VLLM_ALLOW_RUNTIME_LORA_UPDATING=true
    ```

2. **`VLLM_PLUGINS`**: must include the desired resolver plugins (comma-separated list)

    ```bash
    export VLLM_PLUGINS=lora_filesystem_resolver
    ```

3. **`VLLM_LORA_RESOLVER_CACHE_DIR`**: must be set to a valid directory path for the filesystem resolver

    ```bash
    export VLLM_LORA_RESOLVER_CACHE_DIR=/path/to/lora/adapters
    ```

### Optional Environment Variables

- **`VLLM_PLUGINS`**: if not set, all available plugins are loaded. If set to an empty string, no plugins are loaded.
## Available Resolvers

### lora_filesystem_resolver

The filesystem resolver is installed with vLLM by default and enables loading LoRA adapters from a local directory structure.

#### Setup Steps

1. **Create the LoRA adapter storage directory**:

    ```bash
    mkdir -p /path/to/lora/adapters
    ```

2. **Set environment variables**:

    ```bash
    export VLLM_ALLOW_RUNTIME_LORA_UPDATING=true
    export VLLM_PLUGINS=lora_filesystem_resolver
    export VLLM_LORA_RESOLVER_CACHE_DIR=/path/to/lora/adapters
    ```

3. **Start the vLLM server**:

    Your base model can be, for example, `meta-llama/Llama-2-7b-hf`. For gated models like this one, make sure your Hugging Face token is set in your environment: `export HF_TOKEN=<your_token>`.

    ```bash
    python -m vllm.entrypoints.openai.api_server \
        --model your-base-model \
        --enable-lora
    ```
#### Directory Structure Requirements

The filesystem resolver expects LoRA adapters to be organized in the following structure:

```text
/path/to/lora/adapters/
├── adapter1/
│   ├── adapter_config.json
│   ├── adapter_model.bin
│   └── tokenizer files (if applicable)
├── adapter2/
│   ├── adapter_config.json
│   ├── adapter_model.bin
│   └── tokenizer files (if applicable)
└── ...
```

Each adapter directory must contain:

- **`adapter_config.json`**: required configuration file with the following structure:

    ```json
    {
      "peft_type": "LORA",
      "base_model_name_or_path": "your-base-model-name",
      "r": 16,
      "lora_alpha": 32,
      "target_modules": ["q_proj", "v_proj"],
      "bias": "none",
      "modules_to_save": null,
      "use_rslora": false,
      "use_dora": false
    }
    ```

- **`adapter_model.bin`**: the LoRA adapter weights file
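The requirements above can be checked up front. The following is a minimal sketch of such a check; `validate_adapter_dir` and the accepted weight-file names are illustrative helpers for this document, not part of vLLM:

```python
import json
from pathlib import Path

# Weight file names accepted by this sketch; adapters exported by PEFT may
# ship either a .bin or a .safetensors weights file.
WEIGHT_FILES = ("adapter_model.bin", "adapter_model.safetensors")

def validate_adapter_dir(adapter_dir: str) -> list[str]:
    """Return a list of problems found in a LoRA adapter directory (empty means OK)."""
    root = Path(adapter_dir)
    config_path = root / "adapter_config.json"
    if not config_path.is_file():
        return ["missing adapter_config.json"]
    try:
        config = json.loads(config_path.read_text())
    except json.JSONDecodeError as exc:
        return [f"adapter_config.json is not valid JSON: {exc}"]
    problems = []
    if config.get("peft_type") != "LORA":
        problems.append("peft_type must be 'LORA'")
    if not any((root / name).is_file() for name in WEIGHT_FILES):
        problems.append("no adapter weights file found")
    return problems
```

Running this against each subdirectory of `VLLM_LORA_RESOLVER_CACHE_DIR` before starting the server catches the most common configuration mistakes listed under Troubleshooting below.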
#### Usage Example

1. **Prepare your LoRA adapter**:

    ```bash
    # Assuming you have a LoRA adapter in /tmp/my_lora_adapter
    cp -r /tmp/my_lora_adapter /path/to/lora/adapters/my_sql_adapter
    ```

2. **Verify the directory structure**:

    ```bash
    ls -la /path/to/lora/adapters/my_sql_adapter/
    # Should show: adapter_config.json, adapter_model.bin, etc.
    ```

3. **Make a request using the adapter**:

    ```bash
    curl http://localhost:8000/v1/completions \
        -H "Content-Type: application/json" \
        -d '{
            "model": "my_sql_adapter",
            "prompt": "Generate a SQL query for:",
            "max_tokens": 50,
            "temperature": 0.1
        }'
    ```
#### How It Works

1. vLLM receives a request for a LoRA adapter named `my_sql_adapter`
2. The filesystem resolver checks whether `/path/to/lora/adapters/my_sql_adapter/` exists
3. If found, it validates the `adapter_config.json` file
4. If the configuration matches the base model and is valid, the adapter is loaded
5. The request is processed normally with the newly loaded adapter
6. The adapter remains available for future requests
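The steps above can be sketched as a single lookup function. This is a simplified illustration of the flow, not vLLM's actual implementation; `resolve_from_cache_dir` is a hypothetical name:

```python
import json
from pathlib import Path

def resolve_from_cache_dir(cache_dir: str, base_model: str, lora_name: str):
    """Return the adapter directory for `lora_name`, or None if it cannot be resolved."""
    adapter_dir = Path(cache_dir) / lora_name          # step 2: look up by adapter name
    config_path = adapter_dir / "adapter_config.json"
    if not config_path.is_file():                      # steps 2-3: adapter must exist
        return None
    try:
        config = json.loads(config_path.read_text())   # step 3: validate the config
    except json.JSONDecodeError:
        return None
    if config.get("peft_type") != "LORA":
        return None
    if config.get("base_model_name_or_path") != base_model:
        return None                                    # step 4: base model must match
    return str(adapter_dir)                            # steps 5-6: caller loads and caches it
```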
## Advanced Configuration

### Multiple Resolvers

You can configure multiple resolver plugins to load adapters from different sources. Here, `lora_s3_resolver` is an example of a custom resolver that you would need to implement yourself:

```bash
export VLLM_PLUGINS=lora_filesystem_resolver,lora_s3_resolver
```

All listed resolvers are enabled; at request time, vLLM tries them in order until one succeeds.
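The "in order until one succeeds" behavior amounts to a first-non-None-wins loop. A minimal sketch, with resolvers modeled as plain callables rather than vLLM's `LoRAResolver` class:

```python
from typing import Callable, Optional

# A resolver here is any callable mapping (base_model, lora_name) to an
# adapter location, or None when it cannot resolve the name.
Resolver = Callable[[str, str], Optional[str]]

def resolve_in_order(resolvers: list[Resolver], base_model: str, lora_name: str) -> Optional[str]:
    """Try resolvers in registration order; return the first non-None result."""
    for resolver in resolvers:
        result = resolver(base_model, lora_name)
        if result is not None:
            return result
    return None
```

A consequence of this design: resolver order matters. A fast local resolver listed first shadows slower remote ones for any adapter name it can satisfy.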
### Custom Resolver Implementation

To implement your own resolver plugin:

1. **Create a new resolver class**:

    ```python
    from typing import Optional

    from vllm.lora.request import LoRARequest
    from vllm.lora.resolver import LoRAResolver, LoRAResolverRegistry

    class CustomResolver(LoRAResolver):
        async def resolve_lora(self, base_model_name: str, lora_name: str) -> Optional[LoRARequest]:
            # Your custom resolution logic here
            ...
    ```

2. **Register the resolver**:

    ```python
    def register_custom_resolver():
        resolver = CustomResolver()
        LoRAResolverRegistry.register_resolver("Custom Resolver", resolver)
    ```
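For vLLM to discover and call `register_custom_resolver` at startup, the function is typically exposed as a Python entry point. The following is a sketch of the packaging metadata, assuming the `vllm.general_plugins` entry-point group used by vLLM's plugin system; the package and module names are illustrative:

```toml
# pyproject.toml for a hypothetical resolver package
[project]
name = "my-lora-resolver"
version = "0.1.0"

# vLLM scans this entry-point group and calls each registered function.
[project.entry-points."vllm.general_plugins"]
my_lora_resolver = "my_lora_resolver:register_custom_resolver"
```

Once the package is installed, include the plugin name in `VLLM_PLUGINS` (for example, `export VLLM_PLUGINS=lora_filesystem_resolver,my_lora_resolver`).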
## Troubleshooting

### Common Issues

1. **"VLLM_LORA_RESOLVER_CACHE_DIR must be set to a valid directory"**
    - Ensure the directory exists and is accessible
    - Check file permissions on the directory

2. **"LoRA adapter not found"**
    - Verify the adapter directory name matches the requested model name
    - Check that `adapter_config.json` exists and is valid JSON
    - Ensure `adapter_model.bin` exists in the directory

3. **"Invalid adapter configuration"**
    - Verify `peft_type` is set to `"LORA"`
    - Check that `base_model_name_or_path` matches your base model
    - Ensure `target_modules` is properly configured

4. **"LoRA rank exceeds maximum"**
    - Check that the `r` value in `adapter_config.json` doesn't exceed the server's `max_lora_rank` setting
### Debugging Tips

1. **Enable debug logging**:

    ```bash
    export VLLM_LOGGING_LEVEL=DEBUG
    ```

2. **Verify environment variables**:

    ```bash
    echo $VLLM_ALLOW_RUNTIME_LORA_UPDATING
    echo $VLLM_PLUGINS
    echo $VLLM_LORA_RESOLVER_CACHE_DIR
    ```

3. **Test adapter configuration**:

    ```bash
    python -c "
    import json
    with open('/path/to/lora/adapters/my_adapter/adapter_config.json') as f:
        config = json.load(f)
    print('Config valid:', config)
    "
    ```

vllm/plugins/lora_resolvers/README.md

Lines changed: 0 additions & 16 deletions
This file was deleted.