# Heterogeneous Model Group Deployment

## Description

A **Heterogeneous Model Group** comprises models built on different ML frameworks, such as PyTorch, TensorFlow, and ONNX. This group type allows diverse model architectures to be deployed within a single serving environment.

> ℹ️ Heterogeneous model groups **do not** require a shared model group artifact, as models in the group may rely on different runtimes.

## Use Case

Heterogeneous groups are ideal for scenarios that require multiple models with different architectures or frameworks to be deployed together behind a unified endpoint.

## Supported Containers

- **BYOC (Bring Your Own Container)** images that satisfy the **BYOC Contract** requirements.
- Customers are encouraged to use the **NVIDIA Triton Inference Server**, which provides built-in support for diverse frameworks.

## Serving Mechanism

- Customers should use the **BYOC** deployment flow.
- The **NVIDIA Triton Inference Server** is recommended for hosting models built with PyTorch, TensorFlow, ONNX Runtime, custom Python, etc.
- Each model is automatically routed to its corresponding backend.
- **Triton** handles load balancing, routing, and execution optimization across model types.

For details on dependency management, refer to the section [Dependency Management for Heterogeneous Model Group](#dependency-management-for-heterogeneous-model-group).

## Heterogeneous Model Group Structure

```json
{
  "modelGroupsDetails": {
    "modelGroupConfigurationDetails": {
      "modelGroupType": "HETEROGENEOUS"
    },
    "modelIds": [
      {
        "inferenceKey": "model1",
        "modelId": "ocid.datasciencemodel.xxx1"
      },
      {
        "inferenceKey": "model2",
        "modelId": "ocid.datasciencemodel.xxx2"
      },
      {
        "inferenceKey": "model3",
        "modelId": "ocid.datasciencemodel.xxx3"
      }
    ]
  }
}
```

> **Note:**
> For **BYOC**, Model Deployment enforces a **contract** that containers must follow:
> - Must expose a web server.
> - Must include all runtime dependencies needed to load and run the ML model binaries.

## Dependency Management for Heterogeneous Model Group

> **Note:** This section applies only when using the **NVIDIA Triton Inference Server** for heterogeneous deployments.

### Overview

Triton supports multiple ML frameworks and serves each through a corresponding backend.

Triton loads models from one or more **model repositories**, each containing framework-specific model files and configuration files.

### Natively Supported Backends

For natively supported backends (e.g., ONNX, TensorFlow, PyTorch), models must be organized according to the **Triton model repository format**.

#### Sample ONNX Model Directory Structure

```
model_repository/
└── onnx_model/
    ├── 1/
    │   └── model.onnx
    └── config.pbtxt
```

#### Sample `config.pbtxt`

```text
name: "onnx_model"
platform: "onnxruntime_onnx"
input [
  {
    name: "input_tensor"
    data_type: TYPE_FP32
    dims: [ -1, 3, 224, 224 ]
  }
]
output [
  {
    name: "output_tensor"
    data_type: TYPE_FP32
    dims: [ -1, 1000 ]
  }
]
```

✅ No dependency conflicts are expected for natively supported models.
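
Once the server is running, a natively served model can be queried through Triton's standard HTTP endpoint. Below is a minimal client sketch using the `tritonclient` package, assuming Triton is reachable on its default HTTP port (8000) and using the tensor names from the sample `config.pbtxt` above:

```python
import numpy as np
import tritonclient.http as httpclient

# Connect to Triton's HTTP endpoint (default port 8000).
client = httpclient.InferenceServerClient(url="localhost:8000")

# Build an input matching the sample config: FP32, shape [-1, 3, 224, 224].
batch = np.random.rand(1, 3, 224, 224).astype(np.float32)
infer_input = httpclient.InferInput("input_tensor", list(batch.shape), "FP32")
infer_input.set_data_from_numpy(batch)

# Request the declared output tensor and run inference against "onnx_model".
result = client.infer(
    model_name="onnx_model",
    inputs=[infer_input],
    outputs=[httpclient.InferRequestedOutput("output_tensor")],
)
print(result.as_numpy("output_tensor").shape)  # -> (1, 1000)
```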

### Using Python Backend

For models that are not supported natively, Triton provides a **Python backend**.

#### Python Model Directory Structure

```
models/
└── add_sub/
    ├── 1/
    │   └── model.py
    └── config.pbtxt
```
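
Inside `model.py`, the Python backend expects a class named `TritonPythonModel` that implements `execute`. A minimal sketch for the `add_sub` example above, assuming two FP32 input tensors named `INPUT0` and `INPUT1` are declared in its `config.pbtxt`:

```python
import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    """Toy model returning the element-wise sum and difference of two inputs."""

    def execute(self, requests):
        responses = []
        for request in requests:
            # Tensor names must match those declared in config.pbtxt.
            in0 = pb_utils.get_input_tensor_by_name(request, "INPUT0").as_numpy()
            in1 = pb_utils.get_input_tensor_by_name(request, "INPUT1").as_numpy()
            out0 = pb_utils.Tensor("OUTPUT0", (in0 + in1).astype(np.float32))
            out1 = pb_utils.Tensor("OUTPUT1", (in0 - in1).astype(np.float32))
            responses.append(
                pb_utils.InferenceResponse(output_tensors=[out0, out1])
            )
        return responses
```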

#### If Python Version Differs (Custom Stub)

If your model requires a Python version different from the one shipped with the Triton container, compile a **custom Python backend stub** and place it alongside the model:

```
models/
└── model_a/
    ├── 1/
    │   └── model.py
    ├── config.pbtxt
    └── triton_python_backend_stub
```

### Models with Custom Execution Environments

Use **conda-pack** to bundle all of a model's Python dependencies into an archive, isolating the environment per model; a sketch of creating the archive follows the configuration example below.

#### Sample Structure with Conda-Pack

```
models/
└── model_a/
    ├── 1/
    │   └── model.py
    ├── config.pbtxt
    ├── env/
    │   └── model_a_env.tar.gz
    └── triton_python_backend_stub
```

#### Add This to `config.pbtxt` for Custom Environment

```text
name: "model_a"
backend: "python"

parameters: {
  key: "EXECUTION_ENV_PATH",
  value: {string_value: "$$TRITON_MODEL_DIRECTORY/env/model_a_env.tar.gz"}
}
```
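
To produce the `model_a_env.tar.gz` archive referenced above, conda-pack can be driven from its Python API. A minimal sketch, assuming a conda environment named `model_a_env` already exists with the model's dependencies installed:

```python
import conda_pack

# Bundle the existing "model_a_env" conda environment into a relocatable
# tarball that Triton unpacks when the model is loaded.
conda_pack.pack(
    name="model_a_env",
    output="model_a_env.tar.gz",
)
```

`$$TRITON_MODEL_DIRECTORY` in the configuration resolves to the model's own directory, so Triton locates the archive relative to the model folder.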
