Sv2 long running pipeline #4112
base: master
Pull Request Overview
This PR adds a new long-running test stage to the SwiftV2 pipeline that repeatedly creates and deletes PodNetwork (pn), PodNetworkInstance (pni), and Pod objects to test the datapath over extended periods.
Key changes:
- New Ginkgo test suite that runs indefinitely, cycling through resource creation and deletion every 35 minutes
- Helper functions for Azure resource queries and Kubernetes operations
- Pipeline configuration updates to run the new test stage with unlimited timeout
- Refactored VNet creation scripts to use loops and dynamic cluster naming
Reviewed Changes
Copilot reviewed 12 out of 13 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| test/integration/swiftv2/longRunningCluster/datapath_test.go | New test suite for long-running datapath tests with resource lifecycle management |
| test/integration/swiftv2/longRunningCluster/datapath.go | Helper functions to create Kubernetes resources from templates |
| test/integration/swiftv2/helpers/az_helpers.go | New helper functions for Azure CLI operations and Kubernetes resource management |
| test/integration/manifests/swiftv2/long-running-cluster/*.yaml | Kubernetes resource templates for PodNetwork, PodNetworkInstance, and Pods |
| hack/aks/Makefile | New targets for delegated subnet creation and dummy cluster provisioning |
| go.mod & go.sum | Updated Go version and dependencies, removed version pinning for Ginkgo/Gomega |
| .pipelines/swiftv2-long-running/template/long-running-pipeline-template.yaml | New test stage with unlimited timeout |
| .pipelines/swiftv2-long-running/scripts/create_vnets.sh | Refactored to use loops and create delegation clusters dynamically |
| .pipelines/swiftv2-long-running/scripts/create_aks.sh | Added kubeconfig export for test consumption |
| .pipelines/swiftv2-long-running/pipeline.yaml | Updated default VM size |
    for {
        iteration++
Copilot
AI
Nov 10, 2025
The infinite loop lacks any graceful termination mechanism or error handling for catastrophic failures. If the test encounters repeated failures, it will continue indefinitely. Consider adding a context with cancellation or a maximum iteration count to allow controlled shutdown.
    func runAzCommand(cmd string, args ...string) string {
        out, err := exec.Command(cmd, args...).CombinedOutput()
        if err != nil {
            panic(fmt.Sprintf("Failed to run %s %v: %s", cmd, args, string(out)))
        }
        return strings.TrimSpace(string(out))
    }

    func GetVnetGUID(rg, vnet string) string {
        return runAzCommand("az", "network", "vnet", "show", "--resource-group", rg, "--name", vnet, "--query", "resourceGuid", "-o", "tsv")
    }

    func GetSubnetARMID(rg, vnet, subnet string) string {
        return runAzCommand("az", "network", "vnet", "subnet", "show", "--resource-group", rg, "--vnet-name", vnet, "--name", subnet, "--query", "id", "-o", "tsv")
    }

    func GetSubnetGUID(rg, vnet, subnet string) string {
        subnetID := GetSubnetARMID(rg, vnet, subnet)
Copilot
AI
Nov 10, 2025
Using panic in a helper function is not idiomatic for Go libraries and makes error handling difficult for callers. Return an error instead and let the caller decide how to handle failures. This is especially important in test code where you want meaningful test failures, not panics.
Suggested change:
    - func runAzCommand(cmd string, args ...string) string {
    -     out, err := exec.Command(cmd, args...).CombinedOutput()
    -     if err != nil {
    -         panic(fmt.Sprintf("Failed to run %s %v: %s", cmd, args, string(out)))
    -     }
    -     return strings.TrimSpace(string(out))
    - }
    - func GetVnetGUID(rg, vnet string) string {
    -     return runAzCommand("az", "network", "vnet", "show", "--resource-group", rg, "--name", vnet, "--query", "resourceGuid", "-o", "tsv")
    - }
    - func GetSubnetARMID(rg, vnet, subnet string) string {
    -     return runAzCommand("az", "network", "vnet", "subnet", "show", "--resource-group", rg, "--vnet-name", vnet, "--name", subnet, "--query", "id", "-o", "tsv")
    - }
    - func GetSubnetGUID(rg, vnet, subnet string) string {
    -     subnetID := GetSubnetARMID(rg, vnet, subnet)
    + func runAzCommand(cmd string, args ...string) (string, error) {
    +     out, err := exec.Command(cmd, args...).CombinedOutput()
    +     if err != nil {
    +         return "", fmt.Errorf("failed to run %s %v: %s", cmd, args, string(out))
    +     }
    +     return strings.TrimSpace(string(out)), nil
    + }
    + func GetVnetGUID(rg, vnet string) (string, error) {
    +     return runAzCommand("az", "network", "vnet", "show", "--resource-group", rg, "--name", vnet, "--query", "resourceGuid", "-o", "tsv")
    + }
    + func GetSubnetARMID(rg, vnet, subnet string) (string, error) {
    +     return runAzCommand("az", "network", "vnet", "subnet", "show", "--resource-group", rg, "--vnet-name", vnet, "--name", subnet, "--query", "id", "-o", "tsv")
    + }
    + func GetSubnetGUID(rg, vnet, subnet string) (string, error) {
    +     subnetID, err := GetSubnetARMID(rg, vnet, subnet)
    +     if err != nil {
    +         return "", fmt.Errorf("failed to get subnet ARM ID: %w", err)
    +     }
    @i=1; \
    for subnet in $(EXTRA_SUBNETS); do \
        prefix=$$(echo $(EXTRA_SUBNET_CIDRS) | cut -d',' -f$$i); \
        $(AZCLI) network vnet subnet create -g $(GROUP) --vnet-name $(VNET) --name $$subnet --address-prefixes $$prefix -o none; \
        i=$$((i+1)); \
Copilot
AI
Nov 10, 2025
The loop increments i after creating each subnet, but if EXTRA_SUBNETS and EXTRA_SUBNET_CIDRS have mismatched counts, this will silently use empty prefixes or wrong prefixes for later subnets. There's no validation that the two lists have the same number of elements. Add validation or use a safer iteration method.
Suggested change:
    - @i=1; \
    - for subnet in $(EXTRA_SUBNETS); do \
    -     prefix=$$(echo $(EXTRA_SUBNET_CIDRS) | cut -d',' -f$$i); \
    -     $(AZCLI) network vnet subnet create -g $(GROUP) --vnet-name $(VNET) --name $$subnet --address-prefixes $$prefix -o none; \
    -     i=$$((i+1)); \
    + @subnets="$(EXTRA_SUBNETS)"; \
    + cidrs="$(EXTRA_SUBNET_CIDRS)"; \
    + subnet_count=$$(echo $$subnets | awk '{print NF}'); \
    + cidr_count=$$(echo $$cidrs | awk -F',' '{print NF}'); \
    + if [ "$$subnet_count" -ne "$$cidr_count" ]; then \
    +     echo "Error: Number of EXTRA_SUBNETS ($$subnet_count) does not match number of EXTRA_SUBNET_CIDRS ($$cidr_count)"; \
    +     exit 1; \
    + fi; \
    + i=1; \
    + for subnet in $$subnets; do \
    +     prefix=$$(echo $$cidrs | cut -d',' -f$$i); \
    +     $(AZCLI) network vnet subnet create -g $(GROUP) --vnet-name $(VNET) --name $$subnet --address-prefixes $$prefix -o none; \
    +     i=$$((i+1)); \
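The same pairing-and-validation idea can be expressed in Go, for instance if the subnet list were ever handled by the test helpers instead of the Makefile. The sketch below is illustrative only: `subnetSpec`, `pairSubnets`, and the sample values are hypothetical. It assumes the formats implied by the recipe above, namely space-separated names in `EXTRA_SUBNETS` and comma-separated prefixes in `EXTRA_SUBNET_CIDRS`, and additionally checks that each prefix parses as a CIDR.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// subnetSpec pairs a subnet name with its address prefix (hypothetical type).
type subnetSpec struct {
	Name string
	CIDR string
}

// pairSubnets matches space-separated subnet names with comma-separated
// CIDRs by position. It fails fast when the counts differ or when a
// prefix is not valid CIDR notation, instead of silently using an empty
// or wrong prefix.
func pairSubnets(names, cidrs string) ([]subnetSpec, error) {
	nameList := strings.Fields(names)
	var cidrList []string
	if strings.TrimSpace(cidrs) != "" {
		cidrList = strings.Split(cidrs, ",")
	}
	if len(nameList) != len(cidrList) {
		return nil, fmt.Errorf("got %d subnet names but %d CIDRs", len(nameList), len(cidrList))
	}
	specs := make([]subnetSpec, 0, len(nameList))
	for i, name := range nameList {
		cidr := strings.TrimSpace(cidrList[i])
		if _, _, err := net.ParseCIDR(cidr); err != nil {
			return nil, fmt.Errorf("subnet %q: invalid CIDR %q: %w", name, cidr, err)
		}
		specs = append(specs, subnetSpec{Name: name, CIDR: cidr})
	}
	return specs, nil
}

func main() {
	// Sample values are made up for illustration.
	specs, err := pairSubnets("podnet-a podnet-b", "10.10.1.0/24,10.10.2.0/24")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, s := range specs {
		fmt.Printf("%s -> %s\n", s.Name, s.CIDR)
	}
}
```

Validating both the count and the prefix format up front means a misconfigured variable stops the run before any subnet is created.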
Reason for Change:
Added a new stage to the long running cluster to create and delete pn, pni and pod objects repeatedly.
Requirements:
Notes: