[Bug]: Launching multiple vLLM processes at the same time doesn't work well with vLLM's compile cache

### Your current environment

My Makefile:
```
.PHONY: clean run

# Remove the Inductor and vLLM compile caches, then kill any leftover engine processes.
# The leading "-" keeps `make clean` from failing when no engine process is running.
clean:
	rm -rf /tmp/torchinductor_rzou
	rm -rf ~/.cache/vllm/torch_compile_cache
	-killall -9 "VLLM::EngineCore"

# Launch six servers at once, one per GPU, all sharing the same compile cache directory.
run:
	VLLM_ENABLE_V1_MULTIPROCESSING=0 CUDA_VISIBLE_DEVICES=2 vllm serve --gpu-memory-utilization 0.1 &
	VLLM_ENABLE_V1_MULTIPROCESSING=0 CUDA_VISIBLE_DEVICES=3 vllm serve --gpu-memory-utilization 0.1 &
	VLLM_ENABLE_V1_MULTIPROCESSING=0 CUDA_VISIBLE_DEVICES=4 vllm serve --gpu-memory-utilization 0.1 &
	VLLM_ENABLE_V1_MULTIPROCESSING=0 CUDA_VISIBLE_DEVICES=5 vllm serve --gpu-memory-utilization 0.1 &
	VLLM_ENABLE_V1_MULTIPROCESSING=0 CUDA_VISIBLE_DEVICES=6 vllm serve --gpu-memory-utilization 0.1 &
	VLLM_ENABLE_V1_MULTIPROCESSING=0 CUDA_VISIBLE_DEVICES=7 vllm serve --gpu-memory-utilization 0.1 &
```
To reproduce, run `make run`.

### 🐛 Describe the bug

The problem is:
- We launch vLLM in multiple processes at the same time.
- Each process tries to compile and save an artifact to the same location on disk.
- Each artifact is a directory.
- The directories get clobbered.
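
To make the failure mode concrete, here is a minimal sketch that replays one possible interleaving of two savers writing into the same artifact directory. It is only an illustration: the two-file layout (`graph.bin`, `meta.txt`) and all function names are hypothetical, not what vLLM or `standalone_compile` actually write, and the interleaving is replayed sequentially so the outcome is deterministic.

```python
import os
import tempfile


def artifact_files(owner: str) -> dict[str, str]:
    # Hypothetical artifact layout; real compiled artifacts contain other files.
    return {"graph.bin": f"graph from {owner}", "meta.txt": f"meta from {owner}"}


def write_one(directory: str, name: str, payload: str) -> None:
    # Write a single file of the artifact, creating the directory if needed.
    os.makedirs(directory, exist_ok=True)
    with open(os.path.join(directory, name), "w") as f:
        f.write(payload)


def simulate_interleaved_saves(cache_dir: str) -> dict[str, str]:
    """Replay one possible interleaving of processes A and B saving to cache_dir."""
    a, b = artifact_files("A"), artifact_files("B")
    # Each process writes graph.bin and then meta.txt; the steps interleave.
    write_one(cache_dir, "graph.bin", a["graph.bin"])  # A, step 1
    write_one(cache_dir, "graph.bin", b["graph.bin"])  # B, step 1
    write_one(cache_dir, "meta.txt", b["meta.txt"])    # B, step 2
    write_one(cache_dir, "meta.txt", a["meta.txt"])    # A, step 2
    # Read back what ended up on disk.
    result = {}
    for name in sorted(os.listdir(cache_dir)):
        with open(os.path.join(cache_dir, name)) as f:
            result[name] = f.read()
    return result


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(simulate_interleaved_saves(os.path.join(tmp, "artifact")))
```

The directory ends up with `graph.bin` from B and `meta.txt` from A, so neither process's artifact is intact, and a reader that loads the directory partway through would see an even less consistent state.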

There are multiple ways I think we can fix the problem. I've tried some of these and haven't gotten them to work yet.
- `standalone_compile` offers a "binary" option instead of a "directory" option. Writing to these binary files is atomic (we implemented it as a write + rename), and reading from them is multiprocess-safe too (a file that a process has open is not actually deleted until it is closed).
- If we go with the "directory" option, then we need some sort of read-write lock around `CompiledArtifact.{save, load}`. vLLM would always acquire that lock when performing a save or load, and we could probably use something like a filelock for this. I've heard that filelocks are unreliable over NFS and other network stores, which does seem to be how we recommend vLLM users mount their `~/.cache/vllm` directory, so I'm not sure how viable this is. We also have requirements such as the vLLM cache directory sometimes being on a read-only filesystem, so we need to be cognizant of those too.
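
As a rough sketch of the two ideas above (not the actual `standalone_compile` or vLLM code), the snippet below shows a write + rename helper for a single-file artifact and a shared/exclusive lock built on `fcntl.flock`. The names `atomic_write_bytes` and `artifact_lock` are hypothetical. The lock is POSIX-only, needs a writable location for the lock file (so it would not work as-is on a read-only cache), and is subject to the NFS caveats mentioned above.

```python
import fcntl  # POSIX only
import os
import tempfile
from contextlib import contextmanager


def atomic_write_bytes(path: str, data: bytes) -> None:
    """Write data to a temp file in the same directory, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        # Readers see either the old file or the new one, never a partial write.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise


@contextmanager
def artifact_lock(lock_path: str, exclusive: bool):
    """Hold a shared (load) or exclusive (save) advisory lock on lock_path."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH)
        yield
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "artifact.bin")
        lock = os.path.join(tmp, "artifact.lock")
        with artifact_lock(lock, exclusive=True):   # "save" path
            atomic_write_bytes(target, b"compiled")
        with artifact_lock(lock, exclusive=False):  # "load" path
            with open(target, "rb") as f:
                print(f.read())
```

With the single-file approach the lock is not strictly needed, since the rename alone keeps readers from seeing partial data. The lock is what a directory-based artifact would require, because a directory cannot be published in one step the same way.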

### Before submitting a new issue...

- [x] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the [documentation page](https://docs.vllm.ai/en/latest/), which can answer lots of frequently asked questions.

[Bug]: Launching multiple vLLM processes at the same time doesn't work well with vLLM's compile cache #24601
