@lyarwood lyarwood commented Oct 21, 2025

Summary

This PR introduces an initial KubeVirt toolset for the kubernetes-mcp-server, enabling AI agents to create and manage virtual machines through MCP tools.

New Tools

vm_create

Creates VirtualMachines with intelligent parameter handling:

  • Workload resolution: Accepts OS names (fedora, ubuntu, rhel) or full container disk URLs
  • Automatic resource selection: Resolves preferences and instance types from size/performance hints
  • Flexible configuration: Supports explicit instancetype specification or automatic selection
  • Autostart option: Optional parameter to create running VMs (runStrategy: Always)

Example:

{
  "name": "vm_create",
  "arguments": {
    "namespace": "my-namespace",
    "name": "my-vm",
    "workload": "fedora",
    "size": "large",
    "performance": "compute-optimized"
  }
}
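The workload and instance type resolution described above might look roughly like the following sketch. The image references, the performance-to-series mapping, and the function names are illustrative assumptions, not the tables or identifiers used in this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// knownWorkloads maps short OS names to container disk images.
// ASSUMPTION: these image references and tags are placeholders for illustration.
var knownWorkloads = map[string]string{
	"fedora": "quay.io/containerdisks/fedora:latest",
	"ubuntu": "quay.io/containerdisks/ubuntu:latest",
}

// resolveWorkload returns a container disk image for an OS name, or passes the
// input through unchanged when it already looks like an image reference.
func resolveWorkload(workload string) (string, error) {
	w := strings.TrimSpace(workload)
	if w == "" {
		return "", fmt.Errorf("workload must not be empty")
	}
	if strings.Contains(w, "/") {
		return w, nil // treated as a full container disk URL
	}
	if image, ok := knownWorkloads[strings.ToLower(w)]; ok {
		return image, nil
	}
	return "", fmt.Errorf("unknown workload %q", workload)
}

// seriesByPerformance maps a performance hint to an instance type series.
// ASSUMPTION: hypothetical mapping; the PR may use different series names.
var seriesByPerformance = map[string]string{
	"general-purpose":   "u1",
	"compute-optimized": "cx1",
	"memory-optimized":  "m1",
}

// resolveInstancetype builds a "<series>.<size>" name from the hints. It
// returns "" when no size is given and falls back to "u1" for unknown hints.
func resolveInstancetype(size, performance string) string {
	if size == "" {
		return ""
	}
	series, ok := seriesByPerformance[strings.ToLower(performance)]
	if !ok {
		series = "u1"
	}
	return series + "." + strings.ToLower(size)
}

func main() {
	image, err := resolveWorkload("fedora")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(image)
	fmt.Println(resolveInstancetype("large", "compute-optimized"))
}
```

Passing anything containing a slash straight through keeps the full-URL case trivial, at the cost of not validating that the reference actually exists.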

vm_start

Starts halted VirtualMachines by changing runStrategy to Always.

vm_stop

Stops running VirtualMachines by changing runStrategy to Halted.
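As a rough sketch of how both tools could change runStrategy, the snippet below builds a JSON merge patch for spec.runStrategy and skips the patch when the VM is already in the desired state. The function names are hypothetical and the VM is modelled as a plain map rather than a typed client object; the PR's actual code may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// currentRunStrategy reads spec.runStrategy from a VirtualMachine that has
// been decoded into generic maps. It returns "" when the field is absent.
func currentRunStrategy(vm map[string]any) string {
	spec, ok := vm["spec"].(map[string]any)
	if !ok {
		return ""
	}
	s, _ := spec["runStrategy"].(string)
	return s
}

// runStrategyPatch returns a JSON merge patch that sets spec.runStrategy, or
// nil when the VM already has the desired strategy, so repeated calls are no-ops.
func runStrategyPatch(vm map[string]any, desired string) ([]byte, error) {
	if currentRunStrategy(vm) == desired {
		return nil, nil
	}
	return json.Marshal(map[string]any{
		"spec": map[string]any{"runStrategy": desired},
	})
}

func main() {
	vm := map[string]any{"spec": map[string]any{"runStrategy": "Halted"}}

	// vm_start: Halted -> Always
	patch, err := runStrategyPatch(vm, "Always")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(patch))

	// vm_stop on an already halted VM: nothing to do
	noop, _ := runStrategyPatch(vm, "Halted")
	fmt.Println(noop == nil)
}
```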

vm_troubleshoot

Provides comprehensive diagnostic guidance for broken VMs:

  • Step-by-step troubleshooting instructions
  • Common issues and solutions
  • Checks for VirtualMachine, VirtualMachineInstance, DataVolumes, virt-launcher pods, and events

Key Features

  • Single-call VM creation: No multi-stage plan/execute workflow
  • Intelligent defaults: Automatic resolution of container disk images, preferences, and instance types
  • Complete lifecycle management: Create → Start → Stop → Troubleshoot
  • Idempotent operations: Tools can be called multiple times safely
  • Non-destructive: Provides safe alternatives to delete/recreate patterns

Testing Results

Validated using the gevals (Generative AI Evaluations) framework across 5 different AI agents/models with 6 tasks each:

| Version | Success Rate | Notes |
| --- | --- | --- |
| Without toolset (baseline) | 23.3% (7/30) | Generic Kubernetes tools only |
| Original toolset (vm_create + vm_troubleshoot) | 93.3% (28/30) | 300% improvement |
| Improved toolset (+ vm_start/stop) | 100% (30/30) | Perfect reliability |
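The "300% improvement" figure reads as a relative increase over the baseline pass count, (28 - 7) / 7. The sketch below reproduces that arithmetic under this interpretation; the helper name is hypothetical, and a zero baseline is reported as undefined rather than as a number.

```go
package main

import (
	"errors"
	"fmt"
)

// errZeroBaseline signals that a relative improvement cannot be computed.
var errZeroBaseline = errors.New("relative improvement is undefined for a zero baseline")

// relativeImprovement returns the percentage increase from baseline to
// improved pass counts. Both counts must come from the same number of tasks.
func relativeImprovement(baseline, improved int) (float64, error) {
	if baseline == 0 {
		return 0, errZeroBaseline
	}
	return float64(improved-baseline) / float64(baseline) * 100, nil
}

func main() {
	const baselinePassed = 7 // 7/30 without the toolset
	cases := []struct {
		label  string
		passed int
	}{
		{"Original toolset", 28},
		{"Improved toolset", 30},
	}
	for _, c := range cases {
		pct, err := relativeImprovement(baselinePassed, c.passed)
		if err != nil {
			fmt.Println(c.label, "error:", err)
			continue
		}
		fmt.Printf("%s: %+.1f%%\n", c.label, pct)
	}
}
```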

Agent Performance (Improved Toolset)

| Agent | Success Rate | Improvement from Baseline |
| --- | --- | --- |
| Claude Code | 100% (6/6) | +100% |
| Gemini | 100% (6/6) | +500% |
| gemini-2.0-flash | 100% (6/6) | +500% |
| gemini-2.5-pro | 100% (6/6) | +200% |
| Granite 3.3 8B | 100% (6/6) | ∞ (from 0%) |

Key Achievement: The toolset enables even smaller models like Granite 3.3 8B (which failed all tasks without specialized tools) to achieve perfect success rates.

Design Principles

  1. Single-purpose tools: Each tool has one clear responsibility
  2. Intelligent parameter handling: Flexible inputs with smart defaults
  3. Abstraction of complexity: Hides KubeVirt API details from agents
  4. Complete domain coverage: No gaps forcing inappropriate tool usage
  5. Error recovery support: Multiple valid approaches enable graceful fallback

Implementation Details

  • Location: pkg/toolsets/kubevirt/
  • Structure:
    • vm/create/ - VM creation with template rendering
    • vm/start/ - Non-destructive VM startup
    • vm/stop/ - Non-destructive VM shutdown
    • vm/troubleshoot/ - Diagnostic guidance generation
  • Dependencies: Uses Kubernetes dynamic client for VirtualMachine resources
  • Testing: Comprehensive unit tests and integration tests via gevals

A set of gevals test result and report documents is available outside of this PR under pkg/toolsets/kubevirt/tests/results/. They set out how gevals was used to first introduce and then improve the new toolset.

@lyarwood lyarwood force-pushed the kubevirt branch 2 times, most recently from 9aa2732 to 4eac816 Compare November 3, 2025 20:27
@lyarwood lyarwood force-pushed the kubevirt branch 2 times, most recently from a135353 to 24ad072 Compare November 4, 2025 19:02
@lyarwood

This comment was marked as outdated.

@Cali0707
Collaborator

Cali0707 commented Nov 5, 2025

> it would be nice if gevals included /cost and /context data allowing us to assert against it potentially.

@lyarwood +1 from my side, that would be nice. We haven't been able to add it because gevals uses Claude Code in its non-interactive setup, and we haven't figured out how to get that information in that setup. If you have any ideas, let me know and/or open a PR!

Introduce a comprehensive gevals testing framework to validate VM
lifecycle operations including creation with various configurations
(basic, Ubuntu, instancetypes, performance, sizing) and troubleshooting
scenarios. This enables automated verification of the KubeVirt toolset's
functionality and regression prevention.

Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Lee Yarwood <lyarwood@redhat.com>

Introduces a new KubeVirt toolset providing virtual machine management
capabilities through MCP tools. The vm_create tool generates comprehensive
creation plans with pre-creation validation of instance types, preferences,
and container disk images, enabling AI assistants to help users create
VirtualMachines with appropriate resource configurations.

The tool supports:
- Workload specification via OS names or container disk URLs
- Auto-selection of instance types based on size/performance hints
- DataSource integration for common OS images
- Comprehensive validation and planning before resource creation

Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Lee Yarwood <lyarwood@redhat.com>

Add lifecycle management tools for starting and stopping VirtualMachines.
These tools provide simple, single-action operations that prevent
destructive workarounds like delete/recreate.

Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Lee Yarwood <lyarwood@redhat.com>

Add optional autostart parameter to vm_create tool that sets runStrategy
to Always instead of Halted, allowing VMs to be created and started in
a single operation.

Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Lee Yarwood <lyarwood@redhat.com>
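A minimal sketch of how the autostart flag could select the runStrategy when the VirtualMachine manifest is assembled. The builder name and the trimmed-down manifest (no instance type, preference, or resources) are assumptions for illustration; the PR renders its manifests from a template that is not shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// runStrategyFor maps the autostart flag to a KubeVirt runStrategy value.
func runStrategyFor(autostart bool) string {
	if autostart {
		return "Always"
	}
	return "Halted"
}

// buildVirtualMachine returns a deliberately minimal VirtualMachine manifest
// as generic maps. ASSUMPTION: hypothetical helper; a real manifest would also
// carry an instance type, a preference, or explicit resources.
func buildVirtualMachine(namespace, name, image string, autostart bool) map[string]any {
	return map[string]any{
		"apiVersion": "kubevirt.io/v1",
		"kind":       "VirtualMachine",
		"metadata":   map[string]any{"name": name, "namespace": namespace},
		"spec": map[string]any{
			"runStrategy": runStrategyFor(autostart),
			"template": map[string]any{
				"spec": map[string]any{
					"domain": map[string]any{"devices": map[string]any{}},
					"volumes": []any{
						map[string]any{
							"name":          "rootdisk",
							"containerDisk": map[string]any{"image": image},
						},
					},
				},
			},
		},
	}
}

func main() {
	vm := buildVirtualMachine("my-namespace", "my-vm", "quay.io/containerdisks/fedora:latest", true)
	out, err := json.MarshalIndent(vm, "", "  ")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}
```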
@lyarwood lyarwood marked this pull request as ready for review November 7, 2025 15:23
@lyarwood lyarwood changed the title WIP feat(kubevirt): Add VM management toolset feat(kubevirt): Add VM management toolset Nov 7, 2025
@lyarwood lyarwood changed the title feat(kubevirt): Add VM management toolset feat(kubevirt): Add basic VM management toolset Nov 7, 2025

@codingben codingben left a comment

Good work, Lee. Please consider creating a single VM package with a tool.go that holds all VM actions in one place. It would help avoid duplication and would be much better for readability and maintainability.

Add GetRequiredString, GetOptionalString, and GetOptionalBool methods to
ToolHandlerParams type to eliminate code duplication across kubevirt VM
tools. These methods provide a cleaner, reusable API for extracting
parameters from tool call arguments.

Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Lee Yarwood <lyarwood@redhat.com>

Extend GetOptionalString method to accept a default value parameter,
allowing callers to specify what value to return when a parameter is
missing or invalid. This simplifies code by eliminating post-call
default value checks.

Use variadic parameters to make the default value argument optional in
GetOptionalString. Callers can now either provide a default value or
omit it to get empty string behavior. This provides more flexibility and
cleaner call sites when empty string is the desired default.

Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Lee Yarwood <lyarwood@redhat.com>
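The helper behaviour described in the two commits above could be sketched as follows. toolParams is a hypothetical stand-in for ToolHandlerParams, and details such as the error message and treating an empty string as missing are assumptions, not the PR's actual implementation.

```go
package main

import "fmt"

// toolParams is a hypothetical stand-in for ToolHandlerParams that only
// carries the tool call arguments.
type toolParams struct {
	args map[string]any
}

// GetRequiredString returns the named string argument or an error when it is
// missing, empty, or not a string.
func (p toolParams) GetRequiredString(key string) (string, error) {
	v, ok := p.args[key].(string)
	if !ok || v == "" {
		return "", fmt.Errorf("%s parameter required", key)
	}
	return v, nil
}

// GetOptionalString returns the named string argument. When it is missing or
// invalid, it returns the first default value if one was passed, else "".
func (p toolParams) GetOptionalString(key string, defaultValue ...string) string {
	if v, ok := p.args[key].(string); ok && v != "" {
		return v
	}
	if len(defaultValue) > 0 {
		return defaultValue[0]
	}
	return ""
}

// GetOptionalBool returns the named boolean argument, or false when it is
// missing or not a boolean.
func (p toolParams) GetOptionalBool(key string) bool {
	v, _ := p.args[key].(bool)
	return v
}

func main() {
	p := toolParams{args: map[string]any{
		"namespace": "my-namespace",
		"name":      "my-vm",
		"autostart": true,
	}}

	name, err := p.GetRequiredString("name")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(name)
	fmt.Println(p.GetOptionalString("size", "medium")) // default applied
	fmt.Println(p.GetOptionalString("performance") == "")
	fmt.Println(p.GetOptionalBool("autostart"))
}
```

The variadic default keeps call sites short when an empty string is acceptable, though it also silently ignores any extra values a caller passes by mistake.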
Add parameter validation tests for the vm_start and vm_stop tools. These
tests verify that required parameters (namespace, name) are properly
validated and that appropriate error messages are returned for missing
or invalid parameters.

Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Lee Yarwood <lyarwood@redhat.com>

Refactor all kubevirt VM tool tests to use external test packages (_test
suffix) and test only public behavior through the Tools() API. This
ensures tests verify the public interface rather than implementation
details.

Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Lee Yarwood <lyarwood@redhat.com>

Add comprehensive DEV.md documenting the eval-first development
methodology for extending the KubeVirt toolset. This approach ensures
new features are validated with AI agents before implementation.

The guide promotes writing geval tests BEFORE implementing features,
using the test results to drive API design and validate that tools
work well with AI agents. This ensures features are user-focused
and AI-friendly from day one.

Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Lee Yarwood <lyarwood@redhat.com>

@codingben codingben left a comment

Thanks Lee.

Just an opinion: for a start, I see too many folders and scripts for achieving a few VM actions via MCP tooling. I'd ask the project's maintainers for their opinion on this; eventually this repository's codebase could become very large.

@lyarwood
Author

> Thanks Lee.
>
> Just an opinion: for a start, I see too many folders and scripts for achieving a few VM actions via MCP tooling. I'd ask the project's maintainers for their opinion on this; eventually this repository's codebase could become very large.

Assuming you are talking about the test directory, I agree it is indeed pretty large at the moment, but I've already spoken to folks about improvements to the gevals framework that would help reduce that. I plan to work on introducing built-in agent support and configurable models for the openai-agent this week, but until that is merged we will need to carry the extra scripts and config.

Signed-off-by: Lee Yarwood <lyarwood@redhat.com>
Signed-off-by: Lee Yarwood <lyarwood@redhat.com>

4 participants