Description
Proposal to improve performance
By default, vLLM collects model support info in a separate subprocess per model
(added in #9233). Specifically, this
https://github.com/vllm-project/vllm/blob/main/vllm/model_executor/models/registry.py#L336
_run_in_subprocess call.
This adds ~4s when running against a local SSD, and can easily double or more
against a network filesystem in some environments. Collecting the info
in-process does not seem to have adverse effects, at least based on my limited
manual testing, but I lack context on why the subprocess was introduced in the first place.
Can we make this behaviour configurable via a boolean flag or env var, so that
users could opt out? For example, a config field:

```python
collect_model_info_via_subprocess = True
```

used along these lines:

```python
if self.model_config.collect_model_info_via_subprocess:
    return _run_in_subprocess(
        lambda: _ModelInfo.from_model_cls(self.load_model_cls()))
return _ModelInfo.from_model_cls(self.load_model_cls())
```
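An env-var-driven variant could look like the sketch below. The variable name `VLLM_MODEL_INFO_IN_SUBPROCESS` is an assumption for illustration, not an env var vLLM defines today; it defaults to the current behaviour so opting out stays explicit.

```python
import os

def collect_model_info_via_subprocess() -> bool:
    # Hypothetical opt-out knob; "VLLM_MODEL_INFO_IN_SUBPROCESS" is an
    # assumed name. Anything other than "0" keeps today's default of
    # collecting model info in a subprocess.
    return os.environ.get("VLLM_MODEL_INFO_IN_SUBPROCESS", "1") != "0"
```

Defaulting to the existing behaviour keeps the change backwards compatible: only users who set the variable to `0` get in-process collection.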
(Screenshot: the latency appears in the "inspect-model" span, based on my local WIP OpenTelemetry tracing of startup.)
Report of performance regression
No response
Misc discussion on performance
No response
Your current environment (if you think it is necessary)
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
