model-deployment/containers/nim/README-Nemotron.md
7 additions & 8 deletions
@@ -3,10 +3,6 @@
The NVIDIA Nemotron family of multimodal models provides state-of-the-art agentic reasoning for graduate-level scientific reasoning, advanced math, coding, instruction following, tool calling, and visual reasoning. Nemotron models excel in vision for enterprise optical character recognition (OCR) and in reasoning for building agentic AI.
In this guide, we are going to deploy [Mistral-Nemo-12B-Instruct](https://catalog.ngc.nvidia.com/orgs/nim/teams/nv-mistralai/containers/mistral-nemo-12b-instruct) on OCI Data Science.

- We will describe an [AirGap solution](https://docs.nvidia.com/nim/large-language-models/latest/deploy-air-gap.html) where all pre-requisities like model and inference engine will be first brought to OCI and then used for deployment. Hence, there will be no need for any external connectivity.
-
- * Download [nvidia/Mistral-NeMo-12B-Instruct model](https://huggingface.co/nvidia/Mistral-NeMo-12B-Instruct/tree/main) from huggingface.
- * Utilising Object storage to store the model and creating a model catalog pointing to Object storage bucket [Refer](https://github.com/oracle-samples/oci-data-science-ai-samples/tree/main/model-deployment/containers/nim/README-MODEL-CATALOG.md)

## Prerequisites
* Access the corresponding NIM container llm. Click Get Container Button and click Request Access for NIM. At the time of writing this blog, you need a business email address to get access to NIM.
@@ -53,8 +49,7 @@ Once you built and pushed the NIM container, you can now use the `Bring Your Own
### Creating Model catalog

- Follow the steps mentioned [here](https://github.com/oracle-samples/oci-data-science-ai-samples/blob/main/model-deployment/containers/llama2/README.md#model-store-export-api-for-creating-model-artifacts-greater-than-6-gb-in-size)), refer the section One time download to OCI Model Catalog.
-
+ Create a dummy model catalog entry to link it with a model deployment using any zip file.
We would utilise the above created model in the next steps to create the Model Deployment.
### Create Model deploy
@@ -69,9 +64,13 @@ We would utilise the above created model in the next steps to create the Model D
* Key: `NIM_MAX_MODEL_LEN`, Value: `4000` Note: Can be increased based on instance shape used
* Under `Models` click on the `Select` button and select the Model Catalog entry we created earlier
* Under `Compute` and then `Specialty and previous generation` select the `VM.GPU.A10.2` instance
- * Under `Networking` choose the `Default Networking` option.
+ * Under `Networking` choose the `Custom Networking` option and bring the VCN and subnet, which allows Internet access.
* Under `Logging` select the Log Group where you've created your predict and access log and select those correspondingly
* Select the custom container option `Use a Custom Container Image` and click `Select`
* Select the OCIR repository and image we pushed earlier
@@ -90,7 +89,7 @@ We would utilise the above created model in the next steps to create the Model D
oci raw-request \
--http-method POST \
--target-uri <MODEL-DEPLOY-ENDPOINT> \
- --request-body '{"model": "/opt/ds/model/deployed_model", "messages": [ { "role":"user", "content":"Hello! How are you?" }, { "role":"assistant", "content":"Hi! I am quite well, how can I help you today?" }, { "role":"user", "content":"Can you write me a song?" } ], "top_p": 1, "n": 1, "max_tokens": 200, "stream": false, "frequency_penalty": 1.0, "stop": ["hello"] }' \
+ --request-body '{"model": "mistral-nemo-12b-instruct", "messages": [ { "role":"user", "content":"Hello! How are you?" }, { "role":"assistant", "content":"Hi! I am quite well, how can I help you today?" }, { "role":"user", "content":"Can you write me a song?" } ], "top_p": 1, "n": 1, "max_tokens": 200, "stream": false, "frequency_penalty": 1.0, "stop": ["hello"] }' \
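
The only difference between the old and new request bodies is the `model` field. As a minimal sketch (not part of this change), the body can be generated from a single parameter so the model name is set in one place. `build_request_body` is a hypothetical helper introduced here for illustration; it is not an OCI CLI or NIM command.

```shell
#!/usr/bin/env bash
# Sketch only: build_request_body is a hypothetical helper, not part of the OCI CLI.
# It prints the chat request body from the README with the model name substituted.
build_request_body() {
  # $1: model name to place in the "model" field
  cat <<EOF
{"model": "$1", "messages": [ { "role":"user", "content":"Hello! How are you?" }, { "role":"assistant", "content":"Hi! I am quite well, how can I help you today?" }, { "role":"user", "content":"Can you write me a song?" } ], "top_p": 1, "n": 1, "max_tokens": 200, "stream": false, "frequency_penalty": 1.0, "stop": ["hello"] }
EOF
}

# New value from this change; the previous value was "/opt/ds/model/deployed_model".
REQUEST_BODY="$(build_request_body "mistral-nemo-12b-instruct")"
echo "${REQUEST_BODY}"
```

The resulting `REQUEST_BODY` value could then be passed as the `--request-body` argument of the `oci raw-request` call shown in the diff.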