This repository was archived by the owner on Oct 25, 2024. It is now read-only.

Commit 85f2495

[DOC]Add modelscope example (#1578)
* add modelscope example
* Update README.md
* update support model
* Update README.md
* update requirements

Signed-off-by: intellinjun <jun.lin@intel.com>
Signed-off-by: intellinjun <105184542+intellinjun@users.noreply.github.com>
1 parent f978bcf commit 85f2495

File tree

4 files changed: +68 −0 lines changed

examples/modelscope/README.md

Lines changed: 24 additions & 0 deletions

# ModelScope with ITREX

Intel® Extension for Transformers (ITREX) supports almost all LLMs in PyTorch format from ModelScope, such as Phi, Qwen, ChatGLM, Baichuan, Gemma, etc.

## Usage Example

ITREX provides a script that demonstrates the use of ModelScope models. Use numactl to improve performance, and run the script with the following command (replace num_cores with the number of physical cores to use):

```bash
OMP_NUM_THREADS=num_cores numactl -l -C 0-num_cores-1 python run_modelscope_example.py --model=qwen/Qwen-7B --prompt=你好
```

## Supported and Validated Models

We have validated the majority of existing models using modelscope==1.13.1:

* [qwen/Qwen-7B](https://www.modelscope.cn/models/qwen/Qwen-7B/summary)
* [ZhipuAI/ChatGLM-6B](https://www.modelscope.cn/models/ZhipuAI/ChatGLM-6B/summary) (transformers==4.33.1)
* [ZhipuAI/chatglm2-6b](https://www.modelscope.cn/models/ZhipuAI/chatglm2-6b/summary) (transformers==4.33.1)
* [ZhipuAI/chatglm3-6b](https://www.modelscope.cn/models/ZhipuAI/chatglm3-6b/summary) (transformers==4.33.1)
* [baichuan-inc/Baichuan2-7B-Chat](https://www.modelscope.cn/models/baichuan-inc/Baichuan2-7B-Chat/summary) (transformers==4.33.1)
* [baichuan-inc/Baichuan2-13B-Chat](https://www.modelscope.cn/models/baichuan-inc/Baichuan2-13B-Chat/summary) (transformers==4.33.1)
* [LLM-Research/Phi-3-mini-4k-instruct](https://www.modelscope.cn/models/LLM-Research/Phi-3-mini-4k-instruct/summary)
* [LLM-Research/Phi-3-mini-128k-instruct](https://www.modelscope.cn/models/LLM-Research/Phi-3-mini-128k-instruct/summary)
* [AI-ModelScope/gemma-2b](https://www.modelscope.cn/models/AI-ModelScope/gemma-2b/summary)

If you encounter any problems, please let us know.
Lines changed: 13 additions & 0 deletions

intel_extension_for_transformers
neural-speed
lm-eval
sentencepiece
gguf
--extra-index-url https://download.pytorch.org/whl/cpu
torch==2.3.0+cpu
transformers
intel_extension_for_pytorch==2.3.0
tiktoken
transformers_stream_generator
zipfile38
modelscope
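Two of the dependencies above are version-pinned (torch and intel_extension_for_pytorch) while the rest float. A minimal sketch of checking such pins programmatically, assuming the list is a requirements-style file (the helper below is illustrative, not part of the commit):

```python
import re

# Requirements lines copied from the diff above (subset shown).
requirements = [
    "intel_extension_for_transformers",
    "torch==2.3.0+cpu",
    "intel_extension_for_pytorch==2.3.0",
    "modelscope",
]

def pinned_version(lines, package):
    """Return the ==-pinned version for a package, or None if unpinned."""
    for line in lines:
        m = re.match(rf"^{re.escape(package)}==(\S+)$", line.strip())
        if m:
            return m.group(1)
    return None

print(pinned_version(requirements, "torch"))       # 2.3.0+cpu
print(pinned_version(requirements, "modelscope"))  # None (unpinned)
```

Pinning torch to the +cpu wheel matches the --extra-index-url line, which points pip at the CPU-only PyTorch index.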
Lines changed: 30 additions & 0 deletions

from transformers import TextStreamer
from modelscope import AutoTokenizer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM
from typing import List, Optional
import argparse

def main(args_in: Optional[List[str]] = None) -> None:
    parser = argparse.ArgumentParser()
    parser.add_argument("--model", type=str, help="Model name: String", required=True, default="qwen/Qwen-7B")
    parser.add_argument(
        "-p",
        "--prompt",
        type=str,
        help="Prompt to start generation with: String (default: empty)",
        default="你好,你可以做点什么?",
    )
    parser.add_argument("--benchmark", action="store_true")
    parser.add_argument("--use_neural_speed", action="store_true")
    args = parser.parse_args(args_in)
    print(args)
    model_name = args.model  # ModelScope model_id or local model path
    prompt = args.prompt
    model = AutoModelForCausalLM.from_pretrained(model_name, load_in_4bit=True, model_hub="modelscope")
    tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
    inputs = tokenizer(prompt, return_tensors="pt").input_ids
    streamer = TextStreamer(tokenizer)
    outputs = model.generate(inputs, streamer=streamer, max_new_tokens=300)

if __name__ == "__main__":
    main()
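Because main() accepts an explicit argument list, the script's CLI surface can be exercised without downloading any model. A minimal sketch that rebuilds the same parser in isolation (parser definitions copied from the script; the model-loading steps are omitted):

```python
import argparse

# Rebuild the script's argument parser on its own to show what the
# flags parse to; definitions mirror run_modelscope_example.py above.
parser = argparse.ArgumentParser()
parser.add_argument("--model", type=str, required=True, default="qwen/Qwen-7B")
parser.add_argument("-p", "--prompt", type=str, default="你好,你可以做点什么?")
parser.add_argument("--benchmark", action="store_true")
parser.add_argument("--use_neural_speed", action="store_true")

# Parse a sample invocation instead of sys.argv.
args = parser.parse_args(["--model", "qwen/Qwen-7B", "--benchmark"])
print(args.model)             # qwen/Qwen-7B
print(args.benchmark)         # True
print(args.use_neural_speed)  # False
```

Note that --model is both required=True and given a default; argparse ignores the default for required arguments, so the flag must always be passed explicitly.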

intel_extension_for_transformers/transformers/modeling/modeling_auto.py

Lines changed: 1 addition & 0 deletions

@@ -322,6 +322,7 @@ class _BaseQBitsAutoModelClass:
         "whisper",
         "qwen2",
         "gemma",
+        "phi3",
         "tinyllama",
     ]
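The one-line change above adds "phi3" to a list of model types inside _BaseQBitsAutoModelClass. A hedged sketch of how such a membership gate typically behaves (the names come from the diff; the check function itself is illustrative, not ITREX's actual code path):

```python
# Names taken from the diff above; the gating helper is hypothetical.
SUPPORTED_MODEL_TYPES = [
    "whisper",
    "qwen2",
    "gemma",
    "phi3",  # newly added by this commit
    "tinyllama",
]

def is_supported(model_type: str) -> bool:
    """Illustrative check: is this architecture in the whitelist?"""
    return model_type.lower() in SUPPORTED_MODEL_TYPES

print(is_supported("phi3"))  # True
print(is_supported("phi2"))  # False
```

This is consistent with the README change in the same commit, which adds the Phi-3 mini models to the validated-model list.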

0 commit comments
