Commit b4f5054: Added Llama 3.3 Nemotron Super 49B examples (#280)
1 parent: d8882c5

File tree: 5 files changed (+243, -0 lines)
Lines changed: 1 addition & 0 deletions

3.13
Lines changed: 193 additions & 0 deletions

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Detailed Thinking Mode with Llama 3.3 Nemotron Super 49B\n",
    "\n",
    "In this notebook, we'll explore how simple it is to toggle detailed thinking mode on and off using the Llama 3.3 Nemotron Super 49B NIM.\n",
    "\n",
    "If you'd like to learn more about this model, please check out our [blog](https://developer.nvidia.com/blog/build-enterprise-ai-agents-with-advanced-open-nvidia-llama-nemotron-reasoning-models/), which explains exactly how this model was produced.\n",
    "\n",
    "> NOTE: Before proceeding with this notebook, please ensure you've followed the instructions in the [README.md](./README.md)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Detailed Thinking Mode On\n",
    "\n",
    "For the first example, we'll look at the model with detailed thinking \"on\". With the system prompt set to enable detailed thinking, the model behaves as a long-thinking reasoning model and is most effective for complex reasoning tasks. Before producing its final response, the model generates a number of tokens enclosed in \"thinking\" tags."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<think>\n",
      "Okay, so I need to solve the equation x*(sin(x) + 2) = 0. Hmm, let me think. I remember from algebra that if a product of two things is zero, then at least one of them has to be zero. That's the zero product property. So, in this case, either x = 0 or sin(x) + 2 = 0. Let me write that down:\n",
      "\n",
      "1. x = 0\n",
      "2. sin(x) + 2 = 0\n",
      "\n",
      "Alright, starting with the first one, x = 0. That seems straightforward. If x is 0, then plugging back into the original equation: 0*(sin(0) + 2) = 0*(0 + 2) = 0*2 = 0. Yep, that works. So x = 0 is definitely a solution.\n",
      "\n",
      "Now, the second part: sin(x) + 2 = 0. Let me solve for sin(x). Subtract 2 from both sides: sin(x) = -2. Wait a minute, sin(x) equals -2? But the sine function has a range of [-1, 1]. That means sin(x) can never be less than -1 or greater than 1. So sin(x) = -2 is impossible. Therefore, there are no solutions from the second equation.\n",
      "\n",
      "So, putting it all together, the only solution is x = 0. Let me double-check. If x is 0, then the equation holds true. If x is any other number, sin(x) + 2 would be at least -1 + 2 = 1, so the product x*(something at least 1) would only be zero if x is zero. Yeah, that makes sense.\n",
      "\n",
      "Wait, but what if x is a complex number? The problem didn't specify, but usually when solving equations like this without context, we assume real numbers. So I think it's safe to stick with real solutions here. In the complex plane, sine can take on values outside [-1, 1], but solving sin(x) = -2 in complex analysis is more complicated and probably beyond the scope of this problem. The question seems to be expecting real solutions.\n",
      "\n",
      "Therefore, the only real solution is x = 0.\n",
      "\n",
      "**Final Answer**\n",
      "The solution is \\boxed{0}.\n",
      "</think>\n",
      "\n",
      "To solve the equation \\( x(\\sin(x) + 2) = 0 \\), we use the zero product property, which states that if a product of two factors is zero, then at least one of the factors must be zero. This gives us two cases to consider:\n",
      "\n",
      "1. \\( x = 0 \\)\n",
      "2. \\( \\sin(x) + 2 = 0 \\)\n",
      "\n",
      "For the first case, \\( x = 0 \\):\n",
      "- Substituting \\( x = 0 \\) into the original equation, we get \\( 0(\\sin(0) + 2) = 0 \\), which is true. Therefore, \\( x = 0 \\) is a solution.\n",
      "\n",
      "For the second case, \\( \\sin(x) + 2 = 0 \\):\n",
      "- Solving for \\( \\sin(x) \\), we get \\( \\sin(x) = -2 \\). However, the sine function has a range of \\([-1, 1]\\), so \\( \\sin(x) = -2 \\) is impossible. Therefore, there are no solutions from this case.\n",
      "\n",
      "Since the second case yields no solutions, the only solution is \\( x = 0 \\).\n",
      "\n",
      "\\[\n",
      "\\boxed{0}\n",
      "\\]"
     ]
    }
   ],
   "source": [
    "from openai import OpenAI\n",
    "\n",
    "# Point the OpenAI-compatible client at the locally running NIM\n",
    "client = OpenAI(\n",
    "    base_url=\"http://0.0.0.0:8000/v1\",\n",
    "    api_key=\"not used\"\n",
    ")\n",
    "\n",
    "completion = client.chat.completions.create(\n",
    "    model=\"nvidia/llama-3.3-nemotron-super-49b-v1\",\n",
    "    messages=[\n",
    "        {\"role\": \"system\", \"content\": \"detailed thinking on\"},\n",
    "        {\"role\": \"user\", \"content\": \"Solve x*(sin(x)+2)=0\"}\n",
    "    ],\n",
    "    temperature=0.6,\n",
    "    top_p=0.95,\n",
    "    max_tokens=32768,\n",
    "    frequency_penalty=0,\n",
    "    presence_penalty=0,\n",
    "    stream=True\n",
    ")\n",
    "\n",
    "# Print the streamed response token-by-token as it arrives\n",
    "for chunk in completion:\n",
    "    if chunk.choices[0].delta.content is not None:\n",
    "        print(chunk.choices[0].delta.content, end=\"\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Detailed Thinking Mode Off\n",
    "\n",
    "For our second example, we'll look at the model with detailed thinking \"off\". With the system prompt set to disable detailed thinking, the model behaves as a typical instruction-tuned model: it immediately begins generating the final response, with no thinking tokens produced. This mode is most effective for tool calling, chat applications, and other use cases where a direct response is preferred."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "NVIDIA is a leading American technology company known for designing and manufacturing a wide range of products, but most notably for its graphics processing units (GPUs), which have become indispensable in various fields. Here's a breakdown of what NVIDIA is and what it does across its main areas of focus:\n",
      "\n",
      "### 1. **Graphics Processing Units (GPUs) for Gaming**\n",
      "- **Primary Use**: Enhancing gaming experiences by accelerating graphics rendering.\n",
      "- **Products**: GeForce series (e.g., GeForce RTX 30 series) for consumers and enthusiasts.\n",
      "- **Key Features**: High-resolution gaming, ray tracing, artificial intelligence (AI) enhanced graphics, and more.\n",
      "\n",
      "### 2. **Professional Graphics (Quadro)**\n",
      "- **Primary Use**: For professionals requiring high-end graphics capabilities (e.g., 3D modeling, video editing, engineering).\n",
      "- **Products**: Quadro series, designed for reliability and performance in professional applications.\n",
      "\n",
      "### 3. **Datacenter and AI Computing**\n",
      "- **Primary Use**: Accelerating compute-intensive workloads in data centers, including AI, deep learning, and high-performance computing (HPC).\n",
      "- **Products**: Tesla series (for data centers and cloud computing), H100 for AI and HPC.\n",
      "- **Key Technologies**: NVIDIA's CUDA platform for parallel computing, Tensor Cores for AI acceleration.\n",
      "\n",
      "### 4. **Automotive**\n",
      "- **Primary Use**: Developing and supplying technologies for autonomous vehicles, including computer vision, sensor fusion, and AI processing.\n",
      "- **Products/Platforms**: DRIVE series, including hardware (e.g., DRIVE PX) and software (e.g., DRIVE OS) for autonomous vehicle development.\n",
      "\n",
      "### 5. **Other Areas**\n",
      "- **NVIDIA Shield**: A series of Android-based devices for gaming and streaming media.\n",
      "- **OEM (Original Equipment Manufacturer) Supply**: NVIDIA chips and technologies are integrated into various devices by other manufacturers.\n",
      "- **Research and Development**: Actively involved in advancing fields like AI, robotics, and healthcare through technological innovations.\n",
      "\n",
      "### Key Technologies and Initiatives:\n",
      "- **CUDA**: A parallel computing platform and programming model for NVIDIA GPUs.\n",
      "- **TensorRT**: For optimizing and deploying AI models in production environments.\n",
      "- **NVIDIA Research**: Focused on future technologies, including AI, computer vision, and more.\n",
      "- **Acquisitions and Partnerships**: NVIDIA engages in strategic acquisitions (e.g., Arm Ltd. acquisition attempt) and partnerships to expand its ecosystem and capabilities.\n",
      "\n",
      "### Summary:\n",
      "NVIDIA is a multifaceted technology company that:\n",
      "- **Drives Gaming Innovation** with consumer and enthusiast GPUs.\n",
      "- **Empowers Professionals** with high-end graphics solutions.\n",
      "- **Accelerates Datacenter and AI Workloads** with specialized GPUs and software.\n",
      "- **Pioneers Autonomous Vehicle Technologies**.\n",
      "- **Continuously Innovates** across various technological fronts."
     ]
    }
   ],
   "source": [
    "from openai import OpenAI\n",
    "\n",
    "# Point the OpenAI-compatible client at the locally running NIM\n",
    "client = OpenAI(\n",
    "    base_url=\"http://0.0.0.0:8000/v1\",\n",
    "    api_key=\"not used\"\n",
    ")\n",
    "\n",
    "completion = client.chat.completions.create(\n",
    "    model=\"nvidia/llama-3.3-nemotron-super-49b-v1\",\n",
    "    messages=[\n",
    "        {\"role\": \"system\", \"content\": \"detailed thinking off\"},\n",
    "        {\"role\": \"user\", \"content\": \"What is NVIDIA?\"}\n",
    "    ],\n",
    "    temperature=0,\n",
    "    max_tokens=32768,\n",
    "    frequency_penalty=0,\n",
    "    presence_penalty=0,\n",
    "    stream=True\n",
    ")\n",
    "\n",
    "# Print the streamed response token-by-token as it arrives\n",
    "for chunk in completion:\n",
    "    if chunk.choices[0].delta.content is not None:\n",
    "        print(chunk.choices[0].delta.content, end=\"\")"
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
Lines changed: 39 additions & 0 deletions

# Detailed Thinking Mode with Llama 3.3 Nemotron Super 49B

In the notebook in this directory, we'll explore how simple it is to toggle thinking mode on and off using the Llama 3.3 Nemotron Super 49B NIM: a single model that can change how it generates responses through a simple toggle in the system prompt.

If you'd like to learn more about this model, please check out our [blog](https://developer.nvidia.com/blog/build-enterprise-ai-agents-with-advanced-open-nvidia-llama-nemotron-reasoning-models/), which explains exactly how this model was produced.

To begin, we'll first need to download our NIM, following the detailed instructions on the Llama 3.3 Nemotron Super 49B [model card on build.nvidia.com](https://build.nvidia.com/nvidia/llama-3_3-nemotron-super-49b-v1).

## Downloading Our NIM

First, we'll need to generate an API key. You can find this by navigating to the "Deploy" tab on the build.nvidia.com website.

![image](./images/api_key.png)

Next, let's log in to the NVIDIA Container Registry using the following command:

```bash
docker login nvcr.io
```

Next, all we need to do is run the following command and wait for our NIM to spin up:

```bash
export NGC_API_KEY=<PASTE_API_KEY_HERE>
export LOCAL_NIM_CACHE=~/.cache/nim
mkdir -p "$LOCAL_NIM_CACHE"
docker run -it --rm \
  --gpus all \
  --shm-size=16GB \
  -e NGC_API_KEY \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  -u $(id -u) \
  -p 8000:8000 \
  nvcr.io/nim/nvidia/llama-3.3-nemotron-super-49b-v1:latest
```
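The model weights take a while to download and load, so it can help to confirm the server is ready before opening the notebook. The sketch below is one way to poll the NIM readiness route from Python; it assumes the default port mapping above and the standard NIM health endpoint `/v1/health/ready`:

```python
import urllib.error
import urllib.request


def nim_ready(base_url: str = "http://0.0.0.0:8000") -> bool:
    """Return True once the NIM readiness endpoint responds with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/health/ready", timeout=2) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Server not up yet (connection refused or timed out)
        return False
```

Calling `nim_ready()` in a loop with a short sleep is a simple way to block until the container finishes loading the model.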
## Using Our NIM!

We'll follow [this notebook](./Detailed%20Thinking%20Mode%20with%20Llama%203.3%20Nemotron%20Super%2049B.ipynb) for examples of how to use the Llama 3.3 Nemotron Super 49B NIM in both Detailed Thinking On and Off modes.
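As the notebook shows, the only difference between the two modes is the system prompt string. A minimal helper (the name `build_messages` is hypothetical, not part of any NIM API) that builds the message list for either mode might look like:

```python
def build_messages(prompt: str, thinking: bool = True) -> list:
    """Build a chat message list; the system prompt is the only toggle
    between "detailed thinking on" and "detailed thinking off"."""
    mode = "on" if thinking else "off"
    return [
        {"role": "system", "content": f"detailed thinking {mode}"},
        {"role": "user", "content": prompt},
    ]
```

The result can be passed directly as the `messages=` argument to `client.chat.completions.create(...)`, so the same call site serves both reasoning and direct-response workloads.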
Lines changed: 10 additions & 0 deletions

[project]
name = "nimexample"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.13"
dependencies = [
    "jupyter>=1.1.1",
    "openai>=1.66.3",
]
