A high-performance RDMA-based distributed file system for fast LLM inference and GPU training
python big-data cpp gpu cuda rdma infiniband distributed-cache kv-cache ucx llm-serving vllm llm-framework sglang gpu-multiplexing
Updated Nov 25, 2025 - C++