@@ -3,7 +3,13 @@ All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.


-## [2.3.0] - 2025-10-20
+## [2.2] - 2025-11-04
+
+### Changed
+- Updated branding from "vGPU Sizing Advisor" to "AI vWS Sizing Advisor" throughout UI and documentation
+- Improved user-facing verbiage for better clarity and consistency
+
+## [2.1] - 2025-10-20

This release focuses on local deployment improvements, enhanced workload differentiation, and improved user experience with advanced configuration options.

@@ -52,7 +58,7 @@ This release focuses on local deployment improvements, enhanced workload differentiation, and improved user experience with advanced configuration options.
- Better visual feedback and status indicators
- Improved configuration wizard flow

-## [2.2.0] - 2025-10-13
+## [2.0] - 2025-10-13

This release focuses on the AI vWS Sizing Advisor with enhanced deployment capabilities, improved user experience, and zero external dependencies for SSH operations.

@@ -137,8 +143,7 @@ This release focuses on the AI vWS Sizing Advisor with enhanced deployment capabilities, improved user experience, and zero external dependencies for SSH operations.
- SSH key-based authentication (more secure than passwords)
- Automatic key generation with proper permissions (700/600)

-## [2.1.0] - 2025-05-13
-
+## [1.2] - 2025-05-13

This release reduces the overall GPU requirement for deploying the blueprint. It also improves performance and stability for both Docker- and Helm-based deployments.

@@ -168,7 +173,7 @@ This release reduces the overall GPU requirement for deploying the blueprint

A detailed guide is available [here](./docs/migration_guide.md) to ease the developer experience when migrating from older versions.

-## [2.0.0] - 2025-03-18
+## [1.1] - 2025-03-18

This release adds support for multimodal documents using [Nvidia Ingest](https://github.com/NVIDIA/nv-ingest), including parsing of PDF, Word, and PowerPoint documents. It also significantly improves accuracy and performance by refactoring the APIs and architecture, and adds a new developer-friendly UI.

@@ -202,7 +207,7 @@ This release adds support for multimodal documents using [Nvidia Ingest](https://github.com/NVIDIA/nv-ingest)

A detailed guide is available [here](./docs/migration_guide.md) to ease the developer experience when migrating from older versions.

-## [1.0.0] - 2025-01-15
+## [1.0] - 2025-01-15

### Added

@@ -1,8 +1,8 @@
-# vGPU Sizing Advisor for AI vWS
+# AI vWS Sizing Advisor

## Overview

-vGPU Sizing Advisor is a RAG-powered tool that helps you determine the optimal NVIDIA vGPU configuration for AI workloads on NVIDIA AI Virtual Workstation (AI vWS). Using NVIDIA vGPU documentation and best practices, it provides tailored recommendations for optimal performance and resource efficiency.
+AI vWS Sizing Advisor is a RAG-powered tool that helps you determine the optimal NVIDIA vGPU sizing configuration for AI workloads on NVIDIA AI Virtual Workstation (AI vWS). Using NVIDIA vGPU documentation and best practices, it provides tailored recommendations for optimal performance and resource efficiency.

Enter your workload requirements and receive validated recommendations including:

@@ -52,7 +52,7 @@ docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi
**1. Clone and navigate:**
```bash
git clone https://github.com/NVIDIA/GenerativeAIExamples.git
-cd GenerativeAIExamples/community/vgpu-sizing-advisor
+cd GenerativeAIExamples/community/ai-vws-sizing-advisor
```

**2. Set NGC API key:**
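
The key-export command itself is collapsed in this diff. A minimal sketch of the usual form (the variable name is assumed from NVIDIA's conventions; substitute your own key):

```bash
# Assumed step: make your NGC API key available to the deployment scripts
export NGC_API_KEY="<your-ngc-api-key>"
```
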
@@ -145,6 +145,6 @@ Models governed by [NVIDIA AI Foundation Models Community License](https://docs.

---

-**Version:** 2.3.0 (October 2025) - See [CHANGELOG.md](./CHANGELOG.md)
+**Version:** 2.2 (November 2025) - See [CHANGELOG.md](./CHANGELOG.md)

**Support:** [GitHub Issues](https://github.com/NVIDIA/GenerativeAIExamples/issues) | [NVIDIA Forums](https://forums.developer.nvidia.com/)
@@ -619,7 +619,7 @@ export default function ApplyConfigurationForm({
<div className="p-6 border-b border-neutral-700">
<div className="flex items-start justify-between">
<div>
-<h2 className="text-xl font-semibold text-white">Apply Configuration</h2>
+<h2 className="text-xl font-semibold text-white">Deploy Locally</h2>
<p className="text-sm text-gray-400 mt-1">
Deploy vLLM locally using Docker with your recommended configuration
</p>
@@ -715,8 +715,8 @@ export default function ApplyConfigurationForm({
: isSubmitting
? "Deploying..."
: isConfigurationComplete
? "Apply Configuration Again"
: "Apply Configuration"}
? "Deploy Locally Again"
: "Deploy Locally"}
</button>
</form>

@@ -476,7 +476,7 @@ export default function WorkloadConfigWizard({
<div className="bg-gradient-to-r from-green-600 to-green-700 text-white p-6 rounded-t-lg">
<div className="flex items-center justify-between">
<div>
<h2 className="text-xl font-bold">AI Workload Configuration Wizard</h2>
<h2 className="text-xl font-bold">AI vWS Sizing Advisor Wizard</h2>
<p className="text-green-100 text-sm mt-1">
Configure your AI workload to get personalized vGPU recommendations
</p>
@@ -510,7 +510,7 @@ export default function WorkloadConfigWizard({
{currentStep === 1 && (
<div className="space-y-6">
<div>
-<h3 className="text-lg font-semibold text-white mb-4">What type of AI workload do you need?</h3>
+<h3 className="text-lg font-semibold text-white mb-4">What type of AI workload are you running?</h3>
<div className="grid grid-cols-1 md:grid-cols-2 gap-3">
{workloadTypes.map((type) => (
<button
@@ -32,7 +32,7 @@ export default function Header({ onToggleSidebar, activePanel }: HeaderProps) {
width={128}
height={24}
/>
-<span className="text-lg font-semibold text-white">vGPU Sizing Advisor</span>
+<span className="text-lg font-semibold text-white">AI vWS Sizing Advisor</span>
</div>

<div className="absolute left-1/2 -translate-x-1/2 transform"></div>
14 changes: 14 additions & 0 deletions nemo/data-flywheel/embedding-finetuning/README.md
@@ -47,6 +47,20 @@ Refer to the [platform prerequisites and installation guide](https://docs.nvidia

> **NOTE:** Fine-tuning for embedding models is supported starting with NeMo Microservices version 25.8.0. Please ensure you deploy NeMo Microservices Helm chart version 25.8.0 or later to use these notebooks.
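
A quick way to check which chart version is actually deployed (release name and namespace are assumed to match the commands below):

```bash
# The CHART column should show nemo-microservices-helm-chart 25.8.0 or later
helm list -n default
```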

### Register the Base Model

After deploying NeMo Microservices, register the `llama-3.2-nv-embedqa-1b-v2` base model with NeMo Customizer:

```bash
# Enable the embedding base model as a customization target without overriding
# existing targets; the dot in "3.2" is backslash-escaped so Helm parses it as
# part of the key rather than a separator. Then restart Customizer and wait for
# the new pod to become ready.
helm upgrade nemo nmp/nemo-microservices-helm-chart --namespace default --reuse-values \
  --set customizer.customizationTargets.overrideExistingTargets=false \
  --set 'customizer.customizationTargets.targets.nvidia/llama-3\.2-nv-embedqa-1b@v2.enabled=true' && \
kubectl delete pod -n default -l app.kubernetes.io/name=nemo-customizer && \
kubectl wait --for=condition=ready pod -l app.kubernetes.io/name=nemo-customizer -n default --timeout=5m
```

This restarts the customizer to register the model (~2-3 minutes). The base checkpoint downloads from NGC on first use.
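
One way to confirm the target is registered after the restart (the hostname is deployment-specific, and the endpoint path follows the NeMo Microservices Customizer API; verify both against your docs version):

```bash
# Assumes the platform ingress from the quickstart is reachable at http://nemo.test
curl -s http://nemo.test/v1/customization/configs | grep -i embedqa
```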

### Client-Side Requirements

Ensure you have access to: