[Bug]: Launching multiple vLLM processes at the same time doesn't work well with vLLM's compile cache #24601

@zou3519

Your current environment

My makefile:

clean:
    rm -rf /tmp/torchinductor_rzou
    rm -rf ~/.cache/vllm/torch_compile_cache
    killall -9  "VLLM::EngineCore"

run:
    VLLM_ENABLE_V1_MULTIPROCESSING=0 CUDA_VISIBLE_DEVICES=2 vllm serve --gpu-memory-utilization 0.1 &
    VLLM_ENABLE_V1_MULTIPROCESSING=0 CUDA_VISIBLE_DEVICES=3 vllm serve --gpu-memory-utilization 0.1 &
    VLLM_ENABLE_V1_MULTIPROCESSING=0 CUDA_VISIBLE_DEVICES=4 vllm serve --gpu-memory-utilization 0.1 &
    VLLM_ENABLE_V1_MULTIPROCESSING=0 CUDA_VISIBLE_DEVICES=5 vllm serve --gpu-memory-utilization 0.1 &
    VLLM_ENABLE_V1_MULTIPROCESSING=0 CUDA_VISIBLE_DEVICES=6 vllm serve --gpu-memory-utilization 0.1 &
    VLLM_ENABLE_V1_MULTIPROCESSING=0 CUDA_VISIBLE_DEVICES=7 vllm serve --gpu-memory-utilization 0.1 &

make run

🐛 Describe the bug

The problem is:

  • we launch vLLM in multiple processes at the same time
  • each process tries to compile and save an artifact to disk in the same location
  • each artifact is a directory
  • the directories get clobbered

I think there are multiple ways we can fix the problem. I've tried some of these and haven't gotten them to work yet.

  • standalone_compile offers a "binary" option instead of a "directory" option. Writing to these binary files is atomic (we implemented it as a write+rename), and reading from them is multiprocess-safe too (a file that a process has open is not deleted until it is closed).
  • if we go with the "directory" option, then we need some sort of read-write lock around CompiledArtifact.{save, load}. vLLM would always acquire the lock when performing a save or load, and we could probably use something like a filelock for this. I've heard that filelocks are unreliable over NFS and other network stores, which does seem to be how we recommend vLLM users mount their ~/.cache/vllm directory, so I'm not sure how viable this is. We also have other requirements, such as the vLLM cache directory sometimes being on a read-only filesystem, so we need to be cognizant of those too.
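
To make the two options concrete, here is a minimal Python sketch of the underlying techniques. It is not vLLM's or PyTorch's actual code. The helper names `atomic_write_bytes` and `artifact_lock` are hypothetical, and the lock uses the standard library's `fcntl.flock` (Unix only) as a stand-in for "something like a filelock". It assumes a local, writable filesystem, so it does not address the NFS or read-only concerns above.

```python
import contextlib
import fcntl
import os
import tempfile


def atomic_write_bytes(path: str, data: bytes) -> None:
    """Write-then-rename: readers see either the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so that the rename stays
    # on one filesystem, which is what makes it atomic on POSIX.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_path)
        raise


@contextlib.contextmanager
def artifact_lock(lock_path: str, exclusive: bool):
    """Hold a shared (reader) or exclusive (writer) advisory lock."""
    # Creating the lock file needs a writable filesystem, so this sketch
    # does not cover the read-only cache directory case.
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH)
        yield
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)
```

With the "directory" option, the idea would be to wrap every save in `artifact_lock(..., exclusive=True)` and every load in `artifact_lock(..., exclusive=False)`, so that a reader never sees a half-written directory. The lock is advisory, so it only helps if every process that touches the cache takes it.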
