[Feature][P0]: Switch to Runtime Base Image #28643

@rzabarazesh

🚀 The feature, motivation and pitch

Description

The Dockerfile currently uses nvidia/cuda:12.9.1-devel-ubuntu22.04 as the final base image. The devel variant includes the full CUDA compiler toolchain (~7 GB), which is needed only at build time, not at runtime. Switching the final stage to the runtime variant should significantly reduce the image size.

What You'll Do

  1. Change FINAL_BASE_IMAGE from the devel variant to the runtime variant (line 24 of the Dockerfile)
  2. Determine whether any runtime components actually need build tools
  3. Handle FlashInfer JIT compilation requirements:
    • Test whether the AOT wheels work without build dependencies
    • If they do not, conditionally install a minimal set of build tools
  4. Verify that all GPU functionality works with the runtime image
  5. Update the documentation
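
For steps 2 and 3, one way to see what the runtime image is missing is to probe for build tools from inside the container. The following is a minimal sketch, not part of the issue: the tool list (`nvcc`, `gcc`, `g++`, `ninja`) is an assumption about what a JIT compile path might call, and the helper names are hypothetical. Which tools FlashInfer actually needs has to be confirmed by testing.

```python
import shutil
from typing import Callable, Dict, Iterable, List, Optional

# Assumption: common CUDA/C++ build tools that a JIT path might invoke.
# This list is illustrative and is not taken from FlashInfer's documentation.
CANDIDATE_BUILD_TOOLS = ("nvcc", "gcc", "g++", "ninja")


def find_build_tools(
    tools: Iterable[str] = CANDIDATE_BUILD_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, Optional[str]]:
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: which(tool) for tool in tools}


def missing_build_tools(found: Dict[str, Optional[str]]) -> List[str]:
    """Return the names of the tools that were not found, sorted by name."""
    return sorted(name for name, path in found.items() if path is None)


if __name__ == "__main__":
    found = find_build_tools()
    for name, path in found.items():
        print(f"{name}: {path or 'not found'}")
    missing = missing_build_tools(found)
    if missing:
        print("Not on PATH:", ", ".join(missing))
```

Running this in both the devel-based and runtime-based images gives a quick diff of which tools disappear. A missing tool only matters if a code path really invokes it, so the result is a starting point for the analysis rather than a pass/fail check.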

Deliverables

  • Modified Dockerfile with runtime base image
  • Conditional build dependency installation for FlashInfer (if needed)
  • GPU functionality test results
  • Before/after image size comparison
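
The before/after comparison can be produced with a small script. This is a sketch under stated assumptions: it relies on `docker image inspect --format '{{.Size}}'` reporting the image size in bytes, and the image tags are supplied by the caller (the issue names none). The function names are hypothetical.

```python
import subprocess
import sys


def image_size_bytes(image: str) -> int:
    """Ask the local Docker daemon for an image's size in bytes."""
    result = subprocess.run(
        ["docker", "image", "inspect", "--format", "{{.Size}}", image],
        check=True,
        capture_output=True,
        text=True,
    )
    return int(result.stdout.strip())


def size_report(before: int, after: int) -> str:
    """Format a before/after comparison of two sizes given in bytes."""
    gib = 1024 ** 3
    saved = before - after
    # Guard against division by zero for an empty "before" size.
    pct = 100.0 * saved / before if before else 0.0
    return (
        f"before: {before / gib:.2f} GiB, after: {after / gib:.2f} GiB, "
        f"saved: {saved / gib:.2f} GiB ({pct:.1f}%)"
    )


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python compare_sizes.py <devel-based-tag> <runtime-based-tag>
    before_tag, after_tag = sys.argv[1], sys.argv[2]
    print(size_report(image_size_bytes(before_tag), image_size_bytes(after_tag)))
```

Both images must already be present locally (built or pulled) for `docker image inspect` to succeed. The reported figure is the size on disk, which can differ from the compressed size a registry reports.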


Status

Done