{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Detailed Thinking Mode with Llama 3.3 Nemotron Super 49B\n",
    "\n",
    "In this notebook, we'll explore how simple it is to toggle detailed thinking mode on and off using the Llama 3.3 Nemotron Super 49B NIM.\n",
    "\n",
    "If you'd like to learn more about this model, please check out our [blog](https://developer.nvidia.com/blog/build-enterprise-ai-agents-with-advanced-open-nvidia-llama-nemotron-reasoning-models/), which explains exactly how this model was produced.\n",
    "\n",
    "> NOTE: Before moving forward in this notebook, please ensure you've followed the setup instructions in the [README.md](./README.md)"
   ]
  },
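  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before calling the model, it can be helpful to confirm the NIM is up. The quick check below is a minimal sketch (not part of the original example) and assumes the NIM is serving its OpenAI-compatible API on the same local endpoint used throughout this notebook (`http://0.0.0.0:8000/v1`)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import requests\n",
    "\n",
    "# Sanity check (assumption: the NIM exposes its OpenAI-compatible API locally on port 8000).\n",
    "# This lists the models the endpoint is serving; we expect to see\n",
    "# nvidia/llama-3.3-nemotron-super-49b-v1 in the response.\n",
    "resp = requests.get(\"http://0.0.0.0:8000/v1/models\", timeout=10)\n",
    "resp.raise_for_status()\n",
    "print(resp.json())"
   ]
  },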
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Detailed Thinking Mode On\n",
    "\n",
    "For the first example, we'll look at the model with detailed thinking \"on\". With the system prompt set to enable detailed thinking, the model behaves as a long-thinking reasoning model, which is most effective for complex reasoning tasks. Before producing its final response, the model generates a number of reasoning tokens enclosed in `<think>...</think>` tags."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<think>\n",
      "Okay, so I need to solve the equation x*(sin(x) + 2) = 0. Hmm, let me think. I remember from algebra that if a product of two things is zero, then at least one of them has to be zero. That's the zero product property. So, in this case, either x = 0 or sin(x) + 2 = 0. Let me write that down:\n",
      "\n",
      "1. x = 0\n",
      "2. sin(x) + 2 = 0\n",
      "\n",
      "Alright, starting with the first one, x = 0. That seems straightforward. If x is 0, then plugging back into the original equation: 0*(sin(0) + 2) = 0*(0 + 2) = 0*2 = 0. Yep, that works. So x = 0 is definitely a solution.\n",
      "\n",
      "Now, the second part: sin(x) + 2 = 0. Let me solve for sin(x). Subtract 2 from both sides: sin(x) = -2. Wait a minute, sin(x) equals -2? But the sine function has a range of [-1, 1]. That means sin(x) can never be less than -1 or greater than 1. So sin(x) = -2 is impossible. Therefore, there are no solutions from the second equation.\n",
      "\n",
      "So, putting it all together, the only solution is x = 0. Let me double-check. If x is 0, then the equation holds true. If x is any other number, sin(x) + 2 would be at least -1 + 2 = 1, so the product x*(something at least 1) would only be zero if x is zero. Yeah, that makes sense.\n",
      "\n",
      "Wait, but what if x is a complex number? The problem didn't specify, but usually when solving equations like this without context, we assume real numbers. So I think it's safe to stick with real solutions here. In the complex plane, sine can take on values outside [-1, 1], but solving sin(x) = -2 in complex analysis is more complicated and probably beyond the scope of this problem. The question seems to be expecting real solutions.\n",
      "\n",
      "Therefore, the only real solution is x = 0.\n",
      "\n",
      "**Final Answer**\n",
      "The solution is \\boxed{0}.\n",
      "</think>\n",
      "\n",
      "To solve the equation \\( x(\\sin(x) + 2) = 0 \\), we use the zero product property, which states that if a product of two factors is zero, then at least one of the factors must be zero. This gives us two cases to consider:\n",
      "\n",
      "1. \\( x = 0 \\)\n",
      "2. \\( \\sin(x) + 2 = 0 \\)\n",
      "\n",
      "For the first case, \\( x = 0 \\):\n",
      "- Substituting \\( x = 0 \\) into the original equation, we get \\( 0(\\sin(0) + 2) = 0 \\), which is true. Therefore, \\( x = 0 \\) is a solution.\n",
      "\n",
      "For the second case, \\( \\sin(x) + 2 = 0 \\):\n",
      "- Solving for \\( \\sin(x) \\), we get \\( \\sin(x) = -2 \\). However, the sine function has a range of \\([-1, 1]\\), so \\( \\sin(x) = -2 \\) is impossible. Therefore, there are no solutions from this case.\n",
      "\n",
      "Since the second case yields no solutions, the only solution is \\( x = 0 \\).\n",
      "\n",
      "\\[\n",
      "\\boxed{0}\n",
      "\\]"
     ]
    }
   ],
   "source": [
    "from openai import OpenAI\n",
    "\n",
    "# Point the OpenAI client at the locally hosted NIM (no API key is required).\n",
    "client = OpenAI(\n",
    "    base_url=\"http://0.0.0.0:8000/v1\",\n",
    "    api_key=\"not used\"\n",
    ")\n",
    "\n",
    "completion = client.chat.completions.create(\n",
    "    model=\"nvidia/llama-3.3-nemotron-super-49b-v1\",\n",
    "    messages=[\n",
    "        # The system prompt toggles reasoning; here we turn detailed thinking on.\n",
    "        {\"role\": \"system\", \"content\": \"detailed thinking on\"},\n",
    "        {\"role\": \"user\", \"content\": \"Solve x*(sin(x)+2)=0\"}\n",
    "    ],\n",
    "    # Sampling settings used for reasoning mode.\n",
    "    temperature=0.6,\n",
    "    top_p=0.95,\n",
    "    max_tokens=32768,\n",
    "    frequency_penalty=0,\n",
    "    presence_penalty=0,\n",
    "    stream=True\n",
    ")\n",
    "\n",
    "# Print the streamed response token-by-token as it is generated.\n",
    "for chunk in completion:\n",
    "    if chunk.choices[0].delta.content is not None:\n",
    "        print(chunk.choices[0].delta.content, end=\"\")"
   ]
  },
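  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When detailed thinking is on, downstream code often only needs the final answer. The cell below is a minimal sketch (not part of the original example) of one way to separate the reasoning trace from the final response by splitting on the closing `</think>` tag seen in the output above. It reuses the `client` created in the previous cell; the `split_reasoning` helper name is just illustrative."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sketch: separate the <think>...</think> reasoning trace from the final answer.\n",
    "# Assumes `client` from the previous cell and the tag format shown in the output above.\n",
    "def split_reasoning(text):\n",
    "    if \"</think>\" in text:\n",
    "        reasoning, answer = text.split(\"</think>\", 1)\n",
    "        return reasoning.replace(\"<think>\", \"\").strip(), answer.strip()\n",
    "    return \"\", text.strip()\n",
    "\n",
    "response = client.chat.completions.create(\n",
    "    model=\"nvidia/llama-3.3-nemotron-super-49b-v1\",\n",
    "    messages=[\n",
    "        {\"role\": \"system\", \"content\": \"detailed thinking on\"},\n",
    "        {\"role\": \"user\", \"content\": \"Solve x*(sin(x)+2)=0\"}\n",
    "    ],\n",
    "    temperature=0.6,\n",
    "    top_p=0.95,\n",
    "    max_tokens=32768\n",
    ")\n",
    "\n",
    "reasoning, answer = split_reasoning(response.choices[0].message.content)\n",
    "print(\"Final answer:\\n\", answer)"
   ]
  },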
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Detailed Thinking Mode Off\n",
    "\n",
    "For our second example, we'll look at the model with detailed thinking \"off\". With the system prompt set to disable detailed thinking, the model behaves as a typical instruction-tuned model: it immediately begins generating the final response, with no thinking tokens produced. This mode is most effective for tool calling, chat applications, and other use cases where a direct response is preferred. Note that this example uses greedy decoding (temperature of 0), rather than the temperature 0.6 and top_p 0.95 used in the reasoning example above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "NVIDIA is a leading American technology company known for designing and manufacturing a wide range of products, but most notably for its graphics processing units (GPUs), which have become indispensable in various fields. Here's a breakdown of what NVIDIA is and what it does across its main areas of focus:\n",
      "\n",
      "### 1. **Graphics Processing Units (GPUs) for Gaming**\n",
      "- **Primary Use**: Enhancing gaming experiences by accelerating graphics rendering.\n",
      "- **Products**: GeForce series (e.g., GeForce RTX 30 series) for consumers and enthusiasts.\n",
      "- **Key Features**: High-resolution gaming, ray tracing, artificial intelligence (AI) enhanced graphics, and more.\n",
      "\n",
      "### 2. **Professional Graphics (Quadro)**\n",
      "- **Primary Use**: For professionals requiring high-end graphics capabilities (e.g., 3D modeling, video editing, engineering).\n",
      "- **Products**: Quadro series, designed for reliability and performance in professional applications.\n",
      "\n",
      "### 3. **Datacenter and AI Computing**\n",
      "- **Primary Use**: Accelerating compute-intensive workloads in data centers, including AI, deep learning, and high-performance computing (HPC).\n",
      "- **Products**: Tesla series (for data centers and cloud computing), H100 for AI and HPC.\n",
      "- **Key Technologies**: NVIDIA's CUDA platform for parallel computing, Tensor Cores for AI acceleration.\n",
      "\n",
      "### 4. **Automotive**\n",
      "- **Primary Use**: Developing and supplying technologies for autonomous vehicles, including computer vision, sensor fusion, and AI processing.\n",
      "- **Products/Platforms**: DRIVE series, including hardware (e.g., DRIVE PX) and software (e.g., DRIVE OS) for autonomous vehicle development.\n",
      "\n",
      "### 5. **Other Areas**\n",
      "- **NVIDIA Shield**: A series of Android-based devices for gaming and streaming media.\n",
      "- **OEM (Original Equipment Manufacturer) Supply**: NVIDIA chips and technologies are integrated into various devices by other manufacturers.\n",
      "- **Research and Development**: Actively involved in advancing fields like AI, robotics, and healthcare through technological innovations.\n",
      "\n",
      "### Key Technologies and Initiatives:\n",
      "- **CUDA**: A parallel computing platform and programming model for NVIDIA GPUs.\n",
      "- **TensorRT**: For optimizing and deploying AI models in production environments.\n",
      "- **NVIDIA Research**: Focused on future technologies, including AI, computer vision, and more.\n",
      "- **Acquisitions and Partnerships**: NVIDIA engages in strategic acquisitions (e.g., Arm Ltd. acquisition attempt) and partnerships to expand its ecosystem and capabilities.\n",
      "\n",
      "### Summary:\n",
      "NVIDIA is a multifaceted technology company that:\n",
      "- **Drives Gaming Innovation** with consumer and enthusiast GPUs.\n",
      "- **Empowers Professionals** with high-end graphics solutions.\n",
      "- **Accelerates Datacenter and AI Workloads** with specialized GPUs and software.\n",
      "- **Pioneers Autonomous Vehicle Technologies**.\n",
      "- **Continuously Innovates** across various technological fronts."
     ]
    }
   ],
   "source": [
    "from openai import OpenAI\n",
    "\n",
    "# Point the OpenAI client at the locally hosted NIM (no API key is required).\n",
    "client = OpenAI(\n",
    "    base_url=\"http://0.0.0.0:8000/v1\",\n",
    "    api_key=\"not used\"\n",
    ")\n",
    "\n",
    "completion = client.chat.completions.create(\n",
    "    model=\"nvidia/llama-3.3-nemotron-super-49b-v1\",\n",
    "    messages=[\n",
    "        # The system prompt toggles reasoning; here we turn detailed thinking off.\n",
    "        {\"role\": \"system\", \"content\": \"detailed thinking off\"},\n",
    "        {\"role\": \"user\", \"content\": \"What is NVIDIA?\"}\n",
    "    ],\n",
    "    # Greedy decoding for non-reasoning mode.\n",
    "    temperature=0,\n",
    "    max_tokens=32768,\n",
    "    frequency_penalty=0,\n",
    "    presence_penalty=0,\n",
    "    stream=True\n",
    ")\n",
    "\n",
    "# Print the streamed response token-by-token as it is generated.\n",
    "for chunk in completion:\n",
    "    if chunk.choices[0].delta.content is not None:\n",
    "        print(chunk.choices[0].delta.content, end=\"\")"
   ]
  },
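  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make it easy to flip between the two behaviors, the cell below is a small sketch (not part of the original example) that wraps the calls above in a single helper. The `nemotron_chat` name is just illustrative; it reuses the `client` created earlier, toggles the `detailed thinking on` / `detailed thinking off` system prompt, and applies the same sampling settings as the examples above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sketch: one helper that toggles detailed thinking via the system prompt.\n",
    "# Assumes `client` from the earlier cells; `nemotron_chat` is an illustrative name.\n",
    "def nemotron_chat(prompt, thinking=True):\n",
    "    system = \"detailed thinking on\" if thinking else \"detailed thinking off\"\n",
    "    # Reasoning mode uses temperature 0.6 / top_p 0.95; non-reasoning mode uses greedy decoding.\n",
    "    sampling = {\"temperature\": 0.6, \"top_p\": 0.95} if thinking else {\"temperature\": 0}\n",
    "    response = client.chat.completions.create(\n",
    "        model=\"nvidia/llama-3.3-nemotron-super-49b-v1\",\n",
    "        messages=[\n",
    "            {\"role\": \"system\", \"content\": system},\n",
    "            {\"role\": \"user\", \"content\": prompt}\n",
    "        ],\n",
    "        max_tokens=32768,\n",
    "        **sampling\n",
    "    )\n",
    "    return response.choices[0].message.content\n",
    "\n",
    "print(nemotron_chat(\"What is NVIDIA?\", thinking=False))"
   ]
  }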
 ],
 "metadata": {
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}