        "text": "Q: Name the planets in the solar system? A: Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, Neptune and Pluto.",
}
```

## Web Server

`llama-cpp-python` offers a web server which aims to act as a drop-in replacement for the OpenAI API.
This allows you to use llama.cpp compatible models with any OpenAI compatible client (language libraries, services, etc).

To install the server package and get started:

```bash
pip install llama-cpp-python[server]
export MODEL=./models/7B/ggml-model.bin
python3 -m llama_cpp.server
```

Navigate to [http://localhost:8000/docs](http://localhost:8000/docs) to see the OpenAPI documentation.
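
Once the server is running, any OpenAI-style client can talk to it. As a minimal sketch using only the standard library (assuming the default host and port, and the OpenAI-compatible `/v1/completions` route; the prompt here is just an example), the request can be built like this:

```python
import json
import urllib.request

# An OpenAI-style completion request body; "prompt", "max_tokens",
# and "stop" follow the OpenAI completions schema.
payload = {
    "prompt": "Q: Name the planets in the solar system? A: ",
    "max_tokens": 64,
    "stop": ["Q:", "\n"],
}

req = urllib.request.Request(
    "http://localhost:8000/v1/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Sending the request requires the server started above to be running:
# response = urllib.request.urlopen(req)
# print(json.loads(response.read())["choices"][0]["text"])
```

Because the server mimics the OpenAI schema, the same request shape works from any OpenAI client library by pointing its base URL at `http://localhost:8000/v1`.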
## Low-level API

The low-level API is a direct `ctypes` binding to the C API provided by `llama.cpp`.
The entire API can be found in [llama_cpp/llama_cpp.py](https://github.com/abetlen/llama-cpp-python/blob/master/llama_cpp/llama_cpp.py) and should mirror [llama.h](https://github.com/ggerganov/llama.cpp/blob/master/llama.h).
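
Calling into `llama.cpp` this way needs a compiled shared library and a model file, but the binding mechanism itself is plain `ctypes`. As an illustrative sketch of the same pattern (using `libc`'s `strlen` in place of a `llama.h` function, so it runs without a model; `llama_cpp/llama_cpp.py` applies this pattern to each function in `llama.h`):

```python
import ctypes
import ctypes.util

# 1. Load the shared library (falls back to the current process's
#    symbols on POSIX if find_library returns None).
libc = ctypes.CDLL(ctypes.util.find_library("c") or None)

# 2. Declare the C signature so ctypes converts arguments correctly.
libc.strlen.argtypes = [ctypes.c_char_p]  # const char *
libc.strlen.restype = ctypes.c_size_t     # size_t

# 3. Call it like a Python function.
length = libc.strlen(b"llama.cpp")
print(length)  # 9
```

The trade-off is that nothing is compiled on the Python side, but the declared signatures must mirror `llama.h` exactly, which is why the binding file tracks the upstream header.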

# Documentation

Documentation is available at [https://abetlen.github.io/llama-cpp-python](https://abetlen.github.io/llama-cpp-python).
## Development

This package is under active development and I welcome any contributions.

To get started, clone the repository and install the package in development mode: