README.md (+29 −10 lines)
@@ -1,6 +1,6 @@
 <div align="center">
 <h1>
-Codefuse-ModelCache
+ModelCache
 </h1>
 </div>
@@ -25,6 +25,7 @@ Codefuse-ModelCache
 - [Acknowledgements](#Acknowledgements)
 - [Contributing](#Contributing)
 ## news
+- 🔥🔥[2024.04.09] Added Redis Search to store and retrieve embeddings in multi-tenant scenarios; this reduces the interaction time between the cache and vector databases to 10 ms.
 - 🔥🔥[2023.12.10] We integrated LLM embedding frameworks such as 'llmEmb', 'ONNX', 'PaddleNLP', and 'FastText', along with the image embedding framework 'timm', to bolster embedding functionality.
 - 🔥🔥[2023.11.20] codefuse-ModelCache has integrated local storage, such as SQLite and Faiss, providing users with the convenience of quickly initiating tests.
 - [2023.08.26] codefuse-ModelCache...
@@ -39,20 +40,26 @@ The project's startup scripts are divided into flask4modelcache.py and flask4mod
 - Python version: 3.8 and above
 - Package installation
 ```shell
-pip install requirements.txt
+pip install -r requirements.txt
 ```
 ### Service Startup
 #### Demo Service Startup
 1. Download the embedding model bin file from the following address: [https://huggingface.co/shibing624/text2vec-base-chinese/tree/main](https://huggingface.co/shibing624/text2vec-base-chinese/tree/main). Place the downloaded bin file in the model/text2vec-base-chinese folder.
 2. Start the backend service using the flask4modelcache_demo.py script.
+```shell
+cd CodeFuse-ModelCache
+python flask4modelcache_demo.py
+```
 #### Normal Service Startup
 Before starting the service, perform the following environment configuration:
-1. Install the relational database MySQL and import the SQL file to create the data tables. The SQL file can be found at: reference_doc/create_table.sql
+1. Install the relational database MySQL and import the SQL file to create the data tables. The SQL file can be found at `reference_doc/create_table.sql`.
 2. Install the vector database Milvus.
 3. Add the database access information to the configuration files:
-   1. modelcache/config/milvus_config.ini
-   2. modelcache/config/mysql_config.ini
+   1. `modelcache/config/milvus_config.ini`
+   2. `modelcache/config/mysql_config.ini`
 4. Download the embedding model bin file from the following address: [https://huggingface.co/shibing624/text2vec-base-chinese/tree/main](https://huggingface.co/shibing624/text2vec-base-chinese/tree/main). Place the downloaded bin file in the model/text2vec-base-chinese folder.
 5. Start the backend service using the flask4modelcache.py script.
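The configuration files in step 3 are plain INI files. As a hedged sketch only, with section and key names that are illustrative assumptions rather than ModelCache's verified schema, such a file can be read with Python's standard `configparser`:

```python
import configparser

# Hypothetical mysql_config.ini content -- the [mysql] section and its keys
# are assumptions for illustration, not the project's actual schema.
SAMPLE_MYSQL_INI = """\
[mysql]
host = 127.0.0.1
port = 3306
username = modelcache
password = change-me
database = modelcache
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE_MYSQL_INI)
# For the real file: config.read("modelcache/config/mysql_config.ini")

host = config["mysql"]["host"]         # string value
port = config.getint("mysql", "port")  # parsed as int
```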
## Service-Access
@@ -99,7 +106,7 @@ res = requests.post(url, headers=headers, json=json.dumps(data))
 In terms of functionality, we have made several changes to the repository. First, we addressed the network issues with Hugging Face and improved inference speed by introducing local inference for embeddings. Additionally, given the limitations of the SQLAlchemy framework, we completely revamped the module responsible for interacting with relational databases, enabling more flexible database operations. In practical scenarios, LLM products often require integration with multiple users and multiple models; hence, we added multi-tenancy support to ModelCache, along with preliminary compatibility for system commands and multi-turn dialogue.
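The hunk header above shows the README's `requests.post(url, headers=headers, json=json.dumps(data))` call. As a hedged sketch of the multi-tenant query payload it might carry, where the `type`, `scope`, and `query` keys and the model name are assumptions based on that example rather than a verified API contract:

```python
import json

def build_query_payload(model: str, question: str) -> str:
    # "scope.model" carries the tenant/model identifier for multi-tenant
    # isolation; all key names here are assumptions for illustration.
    data = {
        "type": "query",
        "scope": {"model": model},
        "query": [{"role": "user", "content": question}],
    }
    return json.dumps(data)

payload = build_query_payload("my-model", "Hello, ModelCache")
# res = requests.post(url, headers=headers, json=payload)  # as in the README
```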
@@ -244,11 +251,23 @@ In ModelCache, we adopted the main idea of GPTCache, includes core modules: ada
 - Asynchronous log write-back capability for data analysis and statistics.
 - Added model field and data statistics field for feature expansion.
-Future Features Under Development:
+## Todo List
+### Adapter
+- [ ] Register an adapter for Milvus: based on the "model" parameter in the scope, initialize the corresponding Collection and perform the load operation.
+### Embedding model & inference
+- [ ] Inference optimization: improve the speed of embedding inference, compatible with inference engines such as FasterTransformer, TurboTransformers, and ByteTransformer.
+- [ ] Compatibility with Hugging Face models and ModelScope models, offering more methods for model loading.
+### Scalar Storage
+- [ ] Support MongoDB.
+- [ ] Support ElasticSearch.
+### Vector Storage
+- [ ] Adapt Faiss storage to multimodal scenarios.
+### Ranking
+- [ ] Add a ranking model to refine the order of data after embedding recall.
+### Service
+- [ ] Support FastAPI.
+- [ ] Add a visual interface to offer a more direct user experience.
-- [ ] Data isolation based on hyperparameters.
-- [ ] System prompt partitioning storage capability to enhance the accuracy and efficiency of similarity matching.
-- [ ] More versatile embedding models and similarity evaluation algorithms.
 ## Acknowledgements
 This project has referenced the following open-source projects. We would like to express our gratitude to the projects and their developers for their contributions and research.<br />[GPTCache](https://github.com/zilliztech/GPTCache)