Commit 0bb013c

Update deploy-openai-llm-byoc.md
1 parent f5b7963 commit 0bb013c

File tree: 1 file changed (+56 −52 lines)


LLM/deploy-openai-llm-byoc.md

@@ -11,53 +11,19 @@ Add these [policies](https://github.com/oracle-samples/oci-data-science-ai-sampl



-```python
+```shell
 # Install required python packages

-!pip install oracle-ads
-!pip install oci
-!pip install huggingface_hub
+pip install oracle-ads
+pip install oci
+pip install huggingface_hub
 ```


-```python
-# Uncomment this code and set the correct proxy links if have to setup proxy for internet
-# import os
-# os.environ['http_proxy']="http://myproxy"
-# os.environ['https_proxy']="http://myproxy"
-
-# Use os.environ['no_proxy'] to route traffic directly
-```
-
-
-```python
-import ads
-import os
-
-ads.set_auth("resource_principal")
-```
-
-
-```python
-# Extract region information from the Notebook environment variables and signer.
-ads.common.utils.extract_region()
-```

 ### Common variables


-```python
-# change as required for your environment
-compartment_id = os.environ["PROJECT_COMPARTMENT_OCID"]
-project_id = os.environ["PROJECT_OCID"]
-
-log_group_id = "ocid1.loggroup.oc1.xxx.xxxxx"
-log_id = "ocid1.log.oc1.xxx.xxxxx"
-
-instance_shape = "BM.GPU.H100.8"
-
-region = "<your-region>"
-```

 ## API Endpoint Usage

@@ -75,41 +41,79 @@ To prepare Model artifacts for LLM model deployment:
 ### Model Download from HuggingFace Model Hub


-```shell
-# Login to huggingface using env variable
-huggingface-cli login --token <HUGGINGFACE_TOKEN>
-```

-[This](https://huggingface.co/docs/huggingface_hub/en/guides/cli#download-an-entire-repository) provides more information on using `huggingface-cli` to download an entire repository at a given revision. Models in the HuggingFace hub are stored in their own repository.
+
+[This documentation](https://huggingface.co/docs/huggingface_hub/en/guides/cli#download-an-entire-repository) provides more information on using `huggingface-cli` to download an entire repository at a given revision. Models in the HuggingFace hub are stored in their own repository.


 ```shell
 # Select the model that you want to deploy.

-huggingface-cli download openai/gpt-oss-120b --local-dir models/gpt-oss-120b
+huggingface-cli download openai/gpt-oss-120b --local-dir models/gpt-oss-120b --exclude "metal/*"
 ```

+Download the tiktoken file:
+
+```shell
+wget -P models/gpt-oss-120b https://openaipublic.blob.core.windows.net/encodings/o200k_base.tiktoken
+```
 ## Upload Model to OCI Object Storage

+**Note**: The bucket has to be a versioned bucket.

 ```shell
-oci os object bulk-upload --src-dir $local_dir --prefix gpt-oss-120b/ -bn <bucket_name> -ns <bucket_namespace> --auth "resource_principal"
+oci os object bulk-upload --src-dir models/gpt-oss-120b --prefix gpt-oss-120b/ -bn <bucket_name> -ns <bucket_namespace> --auth "resource_principal"
 ```

 ## Create Model by Reference using ADS

+```python
+# Uncomment this code and set the correct proxy links if you have to set up a proxy for internet access
+# import os
+# os.environ['http_proxy']="http://myproxy"
+# os.environ['https_proxy']="http://myproxy"
+
+# Use os.environ['no_proxy'] to route traffic directly
+```


+```python
+import ads
+import os
+
+ads.set_auth("resource_principal")
+
+
+# Extract region information from the Notebook environment variables and signer.
+ads.common.utils.extract_region()
+```
+
+```python
+# change as required for your environment
+compartment_id = os.environ["PROJECT_COMPARTMENT_OCID"]
+project_id = os.environ["PROJECT_OCID"]
+
+log_group_id = "ocid1.loggroup.oc1.xxx.xxxxx"
+log_id = "ocid1.log.oc1.xxx.xxxxx"
+
+instance_shape = "BM.GPU.H100.8"
+
+region = ads.common.utils.extract_region()
+```
+
 ```python
 from ads.model.datascience_model import DataScienceModel

-artifact_path = f"oci://{bucket}@{namespace}/{model_prefix}"
+bucket = "<bucket-name>"
+namespace = "<namespace>"
+
+artifact_path = f"oci://{bucket}@{namespace}/gpt-oss-120b"

 model = (
     DataScienceModel()
     .with_compartment_id(compartment_id)
     .with_project_id(project_id)
-    .with_display_name("gpt-oss-120b ")
+    .with_display_name("gpt-oss-120b")
     .with_artifact(artifact_path)
 )

@@ -190,11 +194,13 @@ infrastructure = (
 ```python
 env_var = {
     "MODEL_DEPLOY_PREDICT_ENDPOINT": "/v1/chat/completions",
+    "SHM_SIZE": "10g",
+    "TIKTOKEN_RS_CACHE_DIR": "/opt/ds/model/gpt-oss-120b",
 }

 cmd_var = [
     "--model",
-    f"/opt/ds/model/deployed_model/{model_prefix}",
+    "/opt/ds/model/deployed_model/gpt-oss-120b",
     "--tensor-parallel-size",
     "8",
     "--port",
@@ -228,8 +234,8 @@ container_runtime = (
 ```python
 deployment = (
     ModelDeployment()
-    .with_display_name(f"{model_prefix} MD with BYOC")
-    .with_description(f"Deployment of {model_prefix} MD with vLLM BYOC container")
+    .with_display_name("gpt-oss-120b MD with BYOC")
+    .with_description("Deployment of gpt-oss-120b MD with vLLM BYOC container")
     .with_infrastructure(infrastructure)
     .with_runtime(container_runtime)
 ).deploy(wait_for_completion=False)
@@ -255,8 +261,6 @@ endpoint = f"https://modeldeployment.us-ashburn-1.oci.customer-oci.com/{deployme

 current_date = datetime.now().strftime("%d %B %Y")

-prompt="What amateur radio bands are best to use when there are solar flares?"
-
 body = {
     "model": "openai/gpt-oss-120b", # this is a constant
     "messages":[
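For reference, the request body being assembled in this last hunk can be sketched as a small self-contained helper. The prompt reuses the one removed by this commit; the `max_tokens` field and the `build_chat_body` helper name are illustrative additions, not part of the repository:

```python
import json


def build_chat_body(prompt: str, max_tokens: int = 500) -> dict:
    """Build a /v1/chat/completions request body for the deployment above.

    The model name is a constant expected by the vLLM container;
    max_tokens is an illustrative placeholder.
    """
    return {
        "model": "openai/gpt-oss-120b",  # constant, per the deployed container
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


body = build_chat_body(
    "What amateur radio bands are best to use when there are solar flares?"
)
print(json.dumps(body, indent=2))
```

The resulting dictionary is what would be POSTed to the model deployment's `/predict` endpoint shown earlier in the file.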
