@@ -7,7 +7,7 @@ Run local LLMs on iGPU, APU and CPU (AMD, Intel, and Qualcomm (Coming Soon)). E
 | Model architectures | Gemma <br /> Llama \* <br /> Mistral + <br /> Phi | | |
 | Platform | Linux <br /> Windows | | |
 | Architecture | x86 <br /> x64 | Arm64 | |
-| Hardware Acceleration | CUDA<br />DirectML<br />IpexLLM | QNN <br /> ROCm | OpenVINO |
+| Hardware Acceleration | CUDA<br />DirectML<br />IpexLLM<br />OpenVINO | QNN <br /> ROCm | |

 \* The Llama model architecture supports similar model families such as CodeLlama, Vicuna, Yi, and more.

@@ -21,9 +21,6 @@ Run local LLMs on iGPU, APU and CPU (AMD, Intel, and Qualcomm (Coming Soon)). E
 ## Table of Contents

 - [Supported Models](#supported-models-quick-start)
-  - [Onnxruntime DirectML Models](./docs/model/onnxruntime_directml_models.md)
-  - [Onnxruntime CPU Models](./docs/model/onnxruntime_cpu_models.md)
-  - [Ipex-LLM Models](./docs/model/ipex_models.md)
 - [Getting Started](#getting-started)
   - [Installation From Source](#installation)
   - [Launch OpenAI API Compatible Server](#launch-openai-api-compatible-server)
@@ -34,22 +31,10 @@ Run local LLMs on iGPU, APU and CPU (AMD, Intel, and Qualcomm (Coming Soon)). E
 - [Acknowledgements](#acknowledgements)

 ## Supported Models (Quick Start)
-
-| Models | Parameters | Context Length | Link |
-| --- | --- | --- | --- |
-| Gemma-2b-Instruct v1 | 2B | 8192 | [EmbeddedLLM/gemma-2b-it-onnx](https://huggingface.co/EmbeddedLLM/gemma-2b-it-onnx) |
-| Llama-2-7b-chat | 7B | 4096 | [EmbeddedLLM/llama-2-7b-chat-int4-onnx-directml](https://huggingface.co/EmbeddedLLM/llama-2-7b-chat-int4-onnx-directml) |
-| Llama-2-13b-chat | 13B | 4096 | [EmbeddedLLM/llama-2-13b-chat-int4-onnx-directml](https://huggingface.co/EmbeddedLLM/llama-2-13b-chat-int4-onnx-directml) |
-| Llama-3-8b-chat | 8B | 8192 | [luweigen/Llama-3-8B-Instruct-int4-onnx-directml](https://huggingface.co/luweigen/Llama-3-8B-Instruct-int4-onnx-directml) |
-| Mistral-7b-v0.3-instruct | 7B | 32768 | [EmbeddedLLM/mistral-7b-instruct-v0.3-onnx](https://huggingface.co/EmbeddedLLM/mistral-7b-instruct-v0.3-onnx) |
-| Phi-3-mini-4k-instruct-062024 | 3.8B | 4096 | [EmbeddedLLM/Phi-3-mini-4k-instruct-062024-onnx](https://huggingface.co/EmbeddedLLM/Phi-3-mini-4k-instruct-062024-onnx/tree/main/onnx/directml/Phi-3-mini-4k-instruct-062024-int4) |
-| Phi3-mini-4k-instruct | 3.8B | 4096 | [microsoft/Phi-3-mini-4k-instruct-onnx](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-onnx) |
-| Phi3-mini-128k-instruct | 3.8B | 128k | [microsoft/Phi-3-mini-128k-instruct-onnx](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct-onnx) |
-| Phi3-medium-4k-instruct | 17B | 4096 | [microsoft/Phi-3-medium-4k-instruct-onnx-directml](https://huggingface.co/microsoft/Phi-3-medium-4k-instruct-onnx-directml) |
-| Phi3-medium-128k-instruct | 17B | 128k | [microsoft/Phi-3-medium-128k-instruct-onnx-directml](https://huggingface.co/microsoft/Phi-3-medium-128k-instruct-onnx-directml) |
-| Openchat-3.6-8b | 8B | 8192 | [EmbeddedLLM/openchat-3.6-8b-20240522-onnx](https://huggingface.co/EmbeddedLLM/openchat-3.6-8b-20240522-onnx) |
-| Yi-1.5-6b-chat | 6B | 32k | [EmbeddedLLM/01-ai_Yi-1.5-6B-Chat-onnx](https://huggingface.co/EmbeddedLLM/01-ai_Yi-1.5-6B-Chat-onnx) |
-| Phi-3-vision-128k-instruct | | 128k | [EmbeddedLLM/Phi-3-vision-128k-instruct-onnx](https://huggingface.co/EmbeddedLLM/Phi-3-vision-128k-instruct-onnx/tree/main/onnx/cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4) |
+* Onnxruntime DirectML Models [Link](./docs/model/onnxruntime_directml_models.md)
+* Onnxruntime CPU Models [Link](./docs/model/onnxruntime_cpu_models.md)
+* Ipex-LLM Models [Link](./docs/model/ipex_models.md)
+* OpenVINO-LLM Models [Link](./docs/model/openvino_models.md)

 ## Getting Started

@@ -122,7 +107,7 @@ Run local LLMs on iGPU, APU and CPU (AMD, Intel, and Qualcomm (Coming Soon)). E

 ### Launch Chatbot Web UI

-1. `ellm_chatbot --port 7788 --host localhost --server_port <ellm_server_port> --server_host localhost`. **Note:** To find out more of the supported arguments. `ellm_chatbot --help`.
+1. `ellm_chatbot --port 7788 --host localhost --server_port <ellm_server_port> --server_host localhost --model_name <served_model_name>`. **Note:** To see all supported arguments, run `ellm_chatbot --help`.
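For context, the new `--model_name` flag points the web UI at a model that is already being served. A minimal end-to-end sketch follows; the `ellm_server` flags, the port number, and the model path are illustrative assumptions, so check `ellm_server --help` and `ellm_chatbot --help` for the actual options:

```bash
# Sketch only: start the OpenAI API compatible server first.
# --model_path and --port are assumed flags; verify with `ellm_server --help`.
ellm_server --model_path ./models/Phi-3-mini-4k-instruct-onnx --port 6979

# In a second terminal, point the chatbot web UI at that server.
# <served_model_name> must match the name the server exposes;
# "phi3-mini-4k" is a placeholder, not a confirmed default.
ellm_chatbot --port 7788 --host localhost --server_port 6979 --server_host localhost --model_name phi3-mini-4k
```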