This repository was archived by the owner on Oct 25, 2024. It is now read-only.

Commit 989671d

Add neuralchat quick start example (#1401)
1 parent feff1ec commit 989671d


13 files changed

+769
-0
lines changed

Lines changed: 180 additions & 0 deletions
# Build Your Chatbot with Intel® Extension for Transformers neural-chat

# 1 Setup Environment

## 1.1 Install intel-extension-for-transformers

```
conda create -n itrex-chatbot python=3.9
conda activate itrex-chatbot
pip install intel-extension-for-transformers==1.3.2
```
## 1.2 Install neural-chat dependencies

```
pip install accelerate
pip install transformers_stream_generator

git clone https://github.com/intel/intel-extension-for-transformers.git ~/itrex
cd ~/itrex
git checkout v1.3.2

cd ~/itrex/intel_extension_for_transformers/neural_chat
```
To set up a CPU platform, go to [1.2.1](#121-cpu-platform).

To set up a GPU platform, go to [1.2.2](#122-gpu-platform).

### 1.2.1 CPU Platform

`pip install -r requirements_cpu.txt`

Go to [Section 2](#2-run-the-chatbot-in-command-mode).

### 1.2.2 GPU Platform

#### Prerequisite

A GPU driver and oneAPI 2024.0 are required.

`pip install -r requirements_xpu.txt`
41+
42+
# 2 Run the chatbot in command mode
43+
44+
## Usage
45+
46+
Go back to the quick_example folder and run the example
47+
48+
```
49+
source /opt/intel/oneapi/setvars.sh
50+
python chatbot.py
51+
```
```
/home/xiguiwang/anaconda3/envs/test/lib/python3.9/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: ''If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
2024-03-20 11:22:33,191 - datasets - INFO - PyTorch version 2.1.0a0+cxx11.abi available.
2024-03-20 11:22:33,191 - datasets - INFO - TensorFlow version 2.16.1 available.
/home/xiguiwang/anaconda3/envs/test/lib/python3.9/site-packages/transformers/deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
  warnings.warn(
Loading model Intel/neural-chat-7b-v3-1
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:01<00:00,  1.77it/s]
2024-03-20 11:22:38,398 - root - INFO - Model loaded.
Once upon a time, a little girl lived in a quaint village nestled among rolling hills. She had a heart filled with curiosity and dreams of adventure. One day, she decided to leave her cozy home behind and set out on a journey to explore the world beyond her familiar surroundings.

As she ventured forth, she encountered many wondrous sights and met fascinating people along the way. The little girl learned about different cultures, customs, and traditions that broadened her perspective and enriched her life. Her experiences taught her valuable lessons about kindness, courage, and resilience.

Throughout her travels, she made lifelong friends who shared her passion for discovery. Together, they faced challenges and celebrated triumphs, forming unbreakable bonds that would last a lifetime.

Eventually, the little girl returned to her village, now a wise and compassionate young woman. She brought back knowledge and memories that inspired others to dream big and follow their hearts. As she grew older, she continued to share her stories and wisdom with those around her, inspiring future generations to embrace the beauty of the unknown and never stop seeking new horizons.
```
# 3 Run the chatbot in server mode with UI

## 3.1 Start the service

```
python chatbot_server.py
```

Here is the complete output:
```
(/home/xiguiwang/ws2/conda/itrex-rag) xiguiwang@icx02-tce-atsm:~/ws2/AI-examples/chatbot$ python chatbot_server.py
/home/xiguiwang/ws2/conda/itrex-rag/lib/python3.9/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: ''If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
2024-03-18 16:28:01,454 - datasets - INFO - PyTorch version 2.1.0a0+cxx11.abi available.
/home/xiguiwang/ws2/conda/itrex-rag/lib/python3.9/site-packages/transformers/deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
  warnings.warn(
Loading model Intel/neural-chat-7b-v3-1
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████| 2/2 [00:01<00:00,  1.55it/s]
2024-03-18 16:28:08,634 - root - INFO - Model loaded.
Loading config settings from the environment...
INFO:     Started server process [1544268]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
```
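Before testing with curl, you can confirm that the port is actually listening. This is a minimal stdlib sketch (a hypothetical helper, not part of the example code; it assumes the default host and port from the output above):

```python
import socket

def server_is_up(host="127.0.0.1", port=8000, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Prints True once chatbot_server.py above is running.
print(server_is_up())
```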
### 3.1.1 Verify the client connection to the server

Open a new Linux console and run the following command:

`curl -vv -X POST http://127.0.0.1:8000/v1/chat/completions`

Check the output to make sure there are no network connection or proxy issues on the client side.
### 3.1.2 Test a request command at the client side

Send a request to the chatbot server from the client:

```
curl http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Intel/neural-chat-7b-v3-1",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Tell me about Intel Xeon Scalable Processors."}
    ]
  }'
```
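The same request can be issued from Python with only the standard library. The sketch below builds an identical payload; the actual network call is left commented out so it only runs once the server from section 3.1 is up:

```python
import json
import urllib.request

# Same payload as the curl command above.
payload = {
    "model": "Intel/neural-chat-7b-v3-1",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me about Intel Xeon Scalable Processors."},
    ],
}
req = urllib.request.Request(
    "http://127.0.0.1:8000/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# Uncomment when the server is running:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
print(req.get_method())  # POST, because `data` is set
```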
On the server side, you will see a log message:
```
INFO:     127.0.0.1:52532 - "POST /v1/chat/completions HTTP/1.1" 200 OK
```

On the client side, the response is similar to the following message.
It contains the LLM answer and other information about the request.
```
{"id":"chatcmpl-29GVLhfoSJHeHTgqL4HgxP","object":"chat.completion","created":1710750809,"model":"Intel/neural-chat-7b-v3-1","choices":[{"index":0,"message":{"role":"assistant","content":"Intel Xeon Scalable Processors are a series of high-performance central processing units (CPUs) designed for data centers, cloud computing, and other demanding computing environments. They are part of Intel's Xeon family of processors, which are specifically tailored for server and workstation applications.\n\nThe Xeon Scalable Processors were introduced in 2017 and are based on Intel's Skylake microarchitecture. They offer significant improvements in performance, efficiency, and scalability compared to their predecessors. These processors are available in various configurations, including single-socket, dual-socket, and multi-socket systems, catering to different workloads and requirements.\n\nSome key features of Intel Xeon Scalable Processors include:\n\n1. Scalable performance: The processors can be configured to meet specific workload needs, allowing for better resource utilization and improved performance.\n\n2. High-speed memory support: They support up to 6 channels of DDR4 memory, enabling faster data transfer and improved system performance.\n\n3. Advanced security features: The processors come with built-in security features, such as Intel Software Guard Extensions (SGX), which help protect sensitive data and applications from potential threats.\n\n4. Enhanced virtualization capabilities: The Xeon Scalable Processors are designed to support multiple virtual machines, making them suitable for virtualized environments.\n\n5. Improved energy efficiency: The processors are designed to optimize power consumption, reducing operational costs and minimizing environmental impact.\n\nOverall, Intel Xeon Scalable Processors are a powerful and versatile choice for organizations seeking high-performance computing solutions in data centers, cloud environments, and other demanding applications."},"finish_reason":"stop"}],"usage":{"prompt_tokens":0,"total_tokens":0,"completion_tokens":0}}
```
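To pull just the answer text out of a response like the one above, index into `choices[0].message.content`. A small sketch, using a truncated stand-in for the real response:

```python
import json

# Truncated stand-in for the chat.completion response shown above.
raw = (
    '{"id":"chatcmpl-example","object":"chat.completion",'
    '"model":"Intel/neural-chat-7b-v3-1",'
    '"choices":[{"index":0,"message":{"role":"assistant",'
    '"content":"Intel Xeon Scalable Processors are a series of CPUs."},'
    '"finish_reason":"stop"}]}'
)
response = json.loads(raw)
answer = response["choices"][0]["message"]["content"]
print(answer)  # Intel Xeon Scalable Processors are a series of CPUs.
```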
## 3.2 Set up the server mode UI

Create the UI conda environment:
```
conda create -n chatbot-ui python=3.9
conda activate chatbot-ui

cd ~/itrex/intel_extension_for_transformers/neural_chat/ui/gradio/basic
pip install -r requirements.txt

pip install gradio==3.36.0
pip install pydantic==1.10.13
```
## 3.3 Start the web service

Set the default service port by editing line 745 of app.py. For example, set the port to 8008:

```
demo.queue(
    concurrency_count=concurrency_count, status_update_rate=10, api_open=False
).launch(
    server_name=host, server_port=8008, share=share, max_threads=200
)
```
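If you prefer not to hardcode the port, a hypothetical variant (not part of app.py; the `CHATBOT_UI_PORT` variable name is an assumption for illustration) reads it from the environment and falls back to 8008:

```python
import os

def get_server_port(default=8008):
    """Read the UI port from CHATBOT_UI_PORT, falling back to the default."""
    try:
        return int(os.environ.get("CHATBOT_UI_PORT", default))
    except ValueError:
        return default

# Pass the result as server_port=get_server_port() in the launch() call.
print(get_server_port())
```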
Start the service:

`python app.py`

The output is as follows:
```
/home/xiguiwang/ws2/conda/chatbot-ui/lib/python3.9/site-packages/gradio_client/documentation.py:103: UserWarning: Could not get documentation group for <class 'gradio.mix.Parallel'>: No known documentation group for module 'gradio.mix'
  warnings.warn(f"Could not get documentation group for {cls}: {exc}")
/home/xiguiwang/ws2/conda/chatbot-ui/lib/python3.9/site-packages/gradio_client/documentation.py:103: UserWarning: Could not get documentation group for <class 'gradio.mix.Series'>: No known documentation group for module 'gradio.mix'
  warnings.warn(f"Could not get documentation group for {cls}: {exc}")
2024-03-27 11:00:24 | INFO | gradio_web_server | Models: ['Intel/neural-chat-7b-v3-1']
2024-03-27 11:00:26 | ERROR | stderr | sys:1: GradioDeprecationWarning: The `style` method is deprecated. Please set these arguments in the constructor instead.
2024-03-27 11:00:26 | INFO | stdout | Running on local URL: http://0.0.0.0:8008
2024-03-27 11:00:26 | INFO | stdout |
2024-03-27 11:00:26 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
```

The log shows the service started on port 8008.
You can now access the chatbot through a web browser on port 8008.
Lines changed: 21 additions & 0 deletions
# Copyright (c) 2024 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from intel_extension_for_transformers.neural_chat import build_chatbot

chatbot = build_chatbot()

response = chatbot.predict("Once upon a time, a little girl")

print(response)
Lines changed: 18 additions & 0 deletions
# Copyright (c) 2024 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from intel_extension_for_transformers.neural_chat import NeuralChatServerExecutor

server_executor = NeuralChatServerExecutor()
server_executor(config_file="./neuralchat.yaml", log_file="./neuralchat.log")
Lines changed: 28 additions & 0 deletions
# Copyright (c) 2024 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

pip install intel-extension-for-transformers==1.3.2

git clone https://github.com/intel/intel-extension-for-transformers.git itrex
cd itrex
git checkout v1.3.2

cd intel_extension_for_transformers/neural_chat

# Install neural-chat dependency for Intel GPU
pip install -r requirements_xpu.txt

pip install accelerate==0.28.0
Lines changed: 86 additions & 0 deletions
#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# Copyright (c) 2023 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# This is the parameter configuration file for NeuralChat Serving.

#################################################################################
#                             SERVER SETTING                                    #
#################################################################################
host: 0.0.0.0
port: 8000

model_name_or_path: "Intel/neural-chat-7b-v3-1"
# tokenizer_name_or_path: ""
# peft_model_path: ""
device: "auto"

asr:
    enable: false
    args:
        # support cpu, hpu, xpu, cuda
        device: "cpu"
        # support openai/whisper series
        model_name_or_path: "openai/whisper-small"
        # only can be set to true when the device is set to "cpu"
        bf16: false

tts:
    enable: false
    args:
        device: "cpu"
        voice: "default"
        stream_mode: false
        output_audio_path: "./output_audio.wav"

asr_chinese:
    enable: false

tts_chinese:
    enable: false
    args:
        device: "cpu"
        spk_id: 0
        stream_mode: false
        output_audio_path: "./output_audio.wav"

retrieval:
    enable: false
    args:
        retrieval_type: "default"
        input_path: "./text/"
        embedding_model: "BAAI/bge-base-en-v1.5"
        persist_dir: "./output"
        max_length: 512
        process: false

cache:
    enable: false
    args:
        config_dir: "../../pipeline/plugins/caching/cache_config.yaml"
        embedding_model_dir: "hkunlp/instructor-large"

safety_checker:
    enable: false

ner:
    enable: false
    args:
        spacy_model: "en_core_web_lg"


# task choices = ['textchat', 'voicechat', 'retrieval', 'text2image', 'image2image', 'finetune', 'photoai', 'codegen']
tasks_list: ['textchat', 'retrieval']
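The `tasks_list` entries must come from the task choices listed in the comment above. A stdlib-only sketch of that constraint (a hypothetical validation helper mirroring the YAML, not code from the repository):

```python
# Allowed values, copied from the task-choices comment in neuralchat.yaml.
ALLOWED_TASKS = {'textchat', 'voicechat', 'retrieval', 'text2image',
                 'image2image', 'finetune', 'photoai', 'codegen'}

def validate_tasks(tasks):
    """Return tasks unchanged, or raise ValueError on an unknown entry."""
    unknown = set(tasks) - ALLOWED_TASKS
    if unknown:
        raise ValueError(f"unknown tasks: {sorted(unknown)}")
    return tasks

print(validate_tasks(['textchat', 'retrieval']))  # ['textchat', 'retrieval']
```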
