Your current environment
My makefile:
clean:
	rm -rf /tmp/torchinductor_rzou
	rm -rf ~/.cache/vllm/torch_compile_cache
	killall -9 "VLLM::EngineCore"
run:
	VLLM_ENABLE_V1_MULTIPROCESSING=0 CUDA_VISIBLE_DEVICES=2 vllm serve --gpu-memory-utilization 0.1 &
	VLLM_ENABLE_V1_MULTIPROCESSING=0 CUDA_VISIBLE_DEVICES=3 vllm serve --gpu-memory-utilization 0.1 &
	VLLM_ENABLE_V1_MULTIPROCESSING=0 CUDA_VISIBLE_DEVICES=4 vllm serve --gpu-memory-utilization 0.1 &
	VLLM_ENABLE_V1_MULTIPROCESSING=0 CUDA_VISIBLE_DEVICES=5 vllm serve --gpu-memory-utilization 0.1 &
	VLLM_ENABLE_V1_MULTIPROCESSING=0 CUDA_VISIBLE_DEVICES=6 vllm serve --gpu-memory-utilization 0.1 &
	VLLM_ENABLE_V1_MULTIPROCESSING=0 CUDA_VISIBLE_DEVICES=7 vllm serve --gpu-memory-utilization 0.1 &
make run
🐛 Describe the bug
The problem is:
- we launch vllm in multiple processes
- each process tries to compile and save an artifact to disk in the same location
- each artifact is a directory
- the directories get clobbered when several processes write to the same path at once
There are multiple ways I think we can fix the problem. I've tried some of these and haven't gotten them to work yet.
- standalone_compile offers a "binary" option instead of a "directory" option. Writing to these binary files is atomic (we implemented it as a write+rename), and reading from them is multiprocess-safe too (a file that is open in one process is not actually deleted until it is closed).
- If we go with the "directory" option, then we need some sort of read-write lock around CompiledArtifact.{save, load}. vLLM would always acquire the lock when performing a save or load, and we can probably use something like a filelock for this. I've heard that filelocks are unreliable over NFS and other network stores, which seems to be how we recommend vLLM users mount their ~/.cache/vllm directory, so I'm not sure how viable this is. We also have requirements such as the vLLM cache directory sometimes being on a read-only filesystem, so we need to be cognizant of those too.
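To make the two options above concrete, here is a minimal Python sketch of the underlying patterns. It is not the standalone_compile or vLLM implementation: `atomic_write_bytes` and `artifact_lock` are hypothetical helper names, the lock uses the standard-library `fcntl.flock` as a stand-in for a filelock, and it assumes a POSIX system with a writable local filesystem.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


def atomic_write_bytes(path: str, data: bytes) -> None:
    """Hypothetical helper: publish `data` at `path` via write+rename.

    Readers see either the previous file or the complete new one,
    never a partially written file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so the rename
    # never has to cross a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        # os.replace is atomic on POSIX when both paths share a filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temp file if we failed before the rename happened.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


@contextmanager
def artifact_lock(lock_path: str, exclusive: bool):
    """Hypothetical helper: advisory read-write lock around save/load.

    Savers take the lock with exclusive=True, loaders with exclusive=False,
    so many loads can overlap but a save runs alone.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH)
        yield
    finally:
        # Closing the descriptor releases the flock.
        os.close(fd)
```

A saver would wrap its directory write in `with artifact_lock(lock_path, exclusive=True):` and a loader in `with artifact_lock(lock_path, exclusive=False):`. The caveats from the list still apply to this sketch: advisory locks may not behave reliably on NFS, and creating the lock file fails on a read-only filesystem, whereas the write+rename path only needs write access at save time.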