
@ngxson (Owner) commented Nov 10, 2025

NOTE:

  • This is a very hacky PoC, only tested on macOS
  • No API yet for downloading a model
  • No API yet for unloading a model
  • No streaming support yet

To download a model to the local cache:

llama-cli -hf ggml-org/gemma-3-4b-it-GGUF:latest

Then, start the server:

llama-server      # note: do not specify -m

API:

List the cached models:

GET http://localhost:8080/models

{
    "models": [
        {
            "model": "ggml-org/Qwen2.5-Omni-3B-GGUF:Q4_K_M",
            "loaded": false
        },
        {
            "model": "ggml-org/gemma-3-4b-it-GGUF:Q4_K_M",
            "loaded": false
        }
    ]
}
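
The same request with curl, assuming the server is running on the default port 8080 as above:

curl http://localhost:8080/models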

Load a model:

POST: http://localhost:8080/models/load
body: { "model": "ggml-org/gemma-3-4b-it-GGUF:latest" }
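
An equivalent curl sketch for the same endpoint and body:

curl -X POST http://localhost:8080/models/load \
  -H "Content-Type: application/json" \
  -d '{"model": "ggml-org/gemma-3-4b-it-GGUF:latest"}'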

Then, run a completion:

POST: http://localhost:8080/v1/chat/completions
body:
{
  "model": "ggml-org/gemma-3-4b-it-GGUF:latest",
  "messages": [
    {
      "role": "user",
      "content": "who are you"
    }
  ],
  "stream": false,
  "max_tokens": 16
}
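
An equivalent curl sketch for the completion request:

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "ggml-org/gemma-3-4b-it-GGUF:latest",
        "messages": [{"role": "user", "content": "who are you"}],
        "stream": false,
        "max_tokens": 16
      }'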

@coderabbitai bot commented Nov 10, 2025

Review skipped: draft detected.

