Skip to content

Conversation

@aThorp96
Copy link
Member

@aThorp96 aThorp96 commented Nov 5, 2025

📝 Description of the Change

Calls to vcs.GetFiles() only result in one or two API calls (paging
aside). However the method is called each time a CEL expression
references "path".pathChanged() or the on-path-change annotation is
checked for matches. Because of this, because matching is evaluated
several times in a row, and because some users have tens of PipelineRuns
in their .tekton directory, in some cases a single push event may cause
PaC to make hundreds of API requests listing the same files repeatedly.

Also fixes a bug in the golang-ci settings and enables exhaustive.default-signifies-exhaustive

👨🏻‍ Linked Jira

https://issues.redhat.com/browse/SRVKP-9228

🔗 Linked GitHub Issue

Fixes #

🚀 Type of Change

  • 🐛 Bug fix (fix:)
  • ✨ New feature (feat:)
  • 💥 Breaking change (feat!:, fix!:)
  • 📚 Documentation update (docs:)
  • ⚙️ Chore (chore:)
  • 💅 Refactor (refactor:)
  • 🔧 Enhancement (enhance:)
  • 📦 Dependency update (deps:)

🧪 Testing Strategy

  • Unit tests
  • Integration tests
  • End-to-end tests
  • Manual testing
  • Not Applicable

🤖 AI Assistance

  • I have not used any AI assistance for this PR.
  • I have used AI assistance for this PR.

If you have used AI assistance, please provide the following details:

Which LLM was used?

  • GitHub Copilot
  • ChatGPT (OpenAI)
  • Claude (Anthropic)
  • Cursor
  • Gemini (Google)
  • Other: ____________

Extent of AI Assistance:

  • Documentation and research only
  • Unit tests or E2E tests only
  • Code generation (parts of the code)
  • Full code generation (most of the PR)
  • PR description and comments
  • Commit message(s)

Important

If the majority of the code in this PR was generated by an AI, please add a Co-authored-by trailer to your commit message.
For example:

Co-authored-by: Gemini gemini@google.com
Co-authored-by: ChatGPT noreply@chatgpt.com
Co-authored-by: Claude noreply@anthropic.com
Co-authored-by: Cursor noreply@cursor.com
Co-authored-by: Copilot Copilot@users.noreply.github.com

**💡You can use the script ./hack/add-llm-coauthor.sh to automatically add
these co-author trailers to your commits.

✅ Submitter Checklist

  • 📝 My commit messages are clear, informative, and follow the project's How to write a git commit message guide. The Gitlint linter ensures in CI it's properly validated
  • ✨ I have ensured my commit message prefix (e.g., fix:, feat:) matches the "Type of Change" I selected above.
  • ♽ I have run make test and make lint locally to check for and fix any
    issues. For an efficient workflow, I have considered installing
    pre-commit and running pre-commit install to
    automate these checks.
  • 📖 I have added or updated documentation for any user-facing changes.
  • 🧪 I have added sufficient unit tests for my code changes.
  • 🎁 I have added end-to-end tests where feasible. See README for more details.
  • 🔎 I have addressed any CI test flakiness or provided a clear reason to bypass it.
  • If adding a provider feature, I have filled in the following and updated the provider documentation:
    • GitHub App
    • GitHub Webhook
    • Gitea/Forgejo
    • GitLab
    • Bitbucket Cloud
    • Bitbucket Data Center

@gemini-code-assist
Copy link

Summary of Changes

Hello @aThorp96, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant performance enhancement by implementing a caching mechanism for changed files within the Git provider integrations. Previously, repeated calls to vcs.GetFiles() could lead to an excessive number of API requests, particularly in scenarios involving complex CEL expressions or numerous PipelineRuns. By caching these results, the system now avoids redundant API calls, thereby improving overall efficiency and reducing the likelihood of hitting API rate limits. Additionally, the PR includes a minor fix to the golang-ci configuration and enables a new linter setting for more comprehensive code analysis.

Highlights

  • Performance Optimization: Implemented caching for changed files in Git providers (Bitbucket Data Center, GitHub, GitLab) to drastically reduce redundant API calls during event processing.
  • Reduced API Calls: Prevents hundreds of repeated API requests for the same files, especially when evaluating CEL expressions or on-path-change annotations, improving efficiency and avoiding rate limits.
  • Linter Configuration Update: Fixed a bug in .golangci.yml by updating misspell linter's ignore-words to ignore-rules and enabled exhaustive.default-signifies-exhaustive.
  • Centralized Metric Reset: Introduced a new utility function pkg/test/metrics.ResetMetrics() to streamline metric cleanup in tests across the codebase.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces caching for the GetFiles method in the Bitbucket Data Center, GitHub, and GitLab providers. This is a great performance enhancement that will reduce the number of API calls made during event processing. The implementation is consistent across all providers, and the tests have been updated to verify the new caching behavior and related metrics. The refactoring of test helper functions into a shared package is also a welcome improvement for maintainability. I've added a few suggestions to further reduce code duplication within the GetFiles methods, which should improve the code's readability and make it easier to maintain in the future.

Comment on lines +364 to 448
switch runevent.TriggerTarget {
case triggertype.PullRequest:
opts := &scm.ListOptions{Page: 1, Size: apiResponseLimit}
changedFiles := changedfiles.ChangedFiles{}
for {
changes, _, err := v.Client().PullRequests.ListChanges(ctx, OrgAndRepo, runevent.PullRequestNumber, opts)
if err != nil {
return changedfiles.ChangedFiles{}, fmt.Errorf("failed to list changes for pull request: %w", err)
}
if c.Deleted {
changedFiles.Deleted = append(changedFiles.Deleted, c.Path)
}
}

// In the Jenkins-x/go-scm package, the `isLastPage` field is not available, and the value of
// `response.Page.Last` is set to `0`. Therefore, to determine if there are more items to fetch,
// we can check if the length of the currently fetched items is less than the specified limit.
// If the length is less than the limit, it indicates that there are no more items to retrieve.
if len(changes) < apiResponseLimit {
break
}
for _, c := range changes {
changedFiles.All = append(changedFiles.All, c.Path)
if c.Added {
changedFiles.Added = append(changedFiles.Added, c.Path)
}
if c.Modified {
changedFiles.Modified = append(changedFiles.Modified, c.Path)
}
if c.Renamed {
changedFiles.Renamed = append(changedFiles.Renamed, c.Path)
}
if c.Deleted {
changedFiles.Deleted = append(changedFiles.Deleted, c.Path)
}
}

opts.Page++
}
return changedFiles, nil
}
// In the Jenkins-x/go-scm package, the `isLastPage` field is not available, and the value of
// `response.Page.Last` is set to `0`. Therefore, to determine if there are more items to fetch,
// we can check if the length of the currently fetched items is less than the specified limit.
// If the length is less than the limit, it indicates that there are no more items to retrieve.
if len(changes) < apiResponseLimit {
break
}

if runevent.TriggerTarget == triggertype.Push {
opts := &scm.ListOptions{Page: 1, Size: apiResponseLimit}
changedFiles := changedfiles.ChangedFiles{}
for {
changes, _, err := v.Client().Git.ListChanges(ctx, OrgAndRepo, runevent.SHA, opts)
if err != nil {
return changedfiles.ChangedFiles{}, fmt.Errorf("failed to list changes for commit %s: %w", runevent.SHA, err)
opts.Page++
}

for _, c := range changes {
changedFiles.All = append(changedFiles.All, c.Path)
if c.Added {
changedFiles.Added = append(changedFiles.Added, c.Path)
}
if c.Modified {
changedFiles.Modified = append(changedFiles.Modified, c.Path)
v.changedFiles = &changedFiles
case triggertype.Push:
opts := &scm.ListOptions{Page: 1, Size: apiResponseLimit}
changedFiles := changedfiles.ChangedFiles{}
for {
changes, _, err := v.Client().Git.ListChanges(ctx, OrgAndRepo, runevent.SHA, opts)
if err != nil {
return changedfiles.ChangedFiles{}, fmt.Errorf("failed to list changes for commit %s: %w", runevent.SHA, err)
}
if c.Renamed {
changedFiles.Renamed = append(changedFiles.Renamed, c.Path)

for _, c := range changes {
changedFiles.All = append(changedFiles.All, c.Path)
if c.Added {
changedFiles.Added = append(changedFiles.Added, c.Path)
}
if c.Modified {
changedFiles.Modified = append(changedFiles.Modified, c.Path)
}
if c.Renamed {
changedFiles.Renamed = append(changedFiles.Renamed, c.Path)
}
if c.Deleted {
changedFiles.Deleted = append(changedFiles.Deleted, c.Path)
}
}
if c.Deleted {
changedFiles.Deleted = append(changedFiles.Deleted, c.Path)

if len(changes) < apiResponseLimit {
break
}
}

if len(changes) < apiResponseLimit {
break
opts.Page++
}

opts.Page++
v.changedFiles = &changedFiles
default:
v.changedFiles = &changedfiles.ChangedFiles{}
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The logic for processing the []*scm.Change slice is duplicated in the PullRequest and Push cases. This could be refactored into a helper function to reduce code duplication and improve maintainability.

For example, you could create a helper function like this:

func processScmChanges(changes []*scm.Change, changedFiles *changedfiles.ChangedFiles) {
	for _, c := range changes {
		changedFiles.All = append(changedFiles.All, c.Path)
		if c.Added {
			changedFiles.Added = append(changedFiles.Added, c.Path)
		}
		if c.Modified {
			changedFiles.Modified = append(changedFiles.Modified, c.Path)
		}
		if c.Renamed {
			changedFiles.Renamed = append(changedFiles.Renamed, c.Path)
		}
		if c.Deleted {
			changedFiles.Deleted = append(changedFiles.Deleted, c.Path)
		}
	}
}

And then call it from both case blocks.

Comment on lines +507 to 582
switch runevent.TriggerTarget {
case triggertype.PullRequest:
opt := &github.ListOptions{PerPage: v.PaginedNumber}
changedFiles := changedfiles.ChangedFiles{}
for {
repoCommit, resp, err := wrapAPI(v, "list_pull_request_files", func() ([]*github.CommitFile, *github.Response, error) {
return v.Client().PullRequests.ListFiles(ctx, runevent.Organization, runevent.Repository, runevent.PullRequestNumber, opt)
})
if err != nil {
return changedfiles.ChangedFiles{}, err
}
for j := range repoCommit {
changedFiles.All = append(changedFiles.All, *repoCommit[j].Filename)
if *repoCommit[j].Status == "added" {
changedFiles.Added = append(changedFiles.Added, *repoCommit[j].Filename)
}
if *repoCommit[j].Status == "removed" {
changedFiles.Deleted = append(changedFiles.Deleted, *repoCommit[j].Filename)
}
if *repoCommit[j].Status == "modified" {
changedFiles.Modified = append(changedFiles.Modified, *repoCommit[j].Filename)
}
if *repoCommit[j].Status == "renamed" {
changedFiles.Renamed = append(changedFiles.Renamed, *repoCommit[j].Filename)
}
}
if resp.NextPage == 0 {
break
}
opt.Page = resp.NextPage
}
v.changedFiles = &changedFiles
case triggertype.Push:
changedFiles := changedfiles.ChangedFiles{}
rC, _, err := wrapAPI(v, "get_commit_files", func() (*github.RepositoryCommit, *github.Response, error) {
return v.Client().Repositories.GetCommit(ctx, runevent.Organization, runevent.Repository, runevent.SHA, &github.ListOptions{})
})
if err != nil {
return changedfiles.ChangedFiles{}, err
}
for j := range repoCommit {
changedFiles.All = append(changedFiles.All, *repoCommit[j].Filename)
if *repoCommit[j].Status == "added" {
changedFiles.Added = append(changedFiles.Added, *repoCommit[j].Filename)
for i := range rC.Files {
changedFiles.All = append(changedFiles.All, *rC.Files[i].Filename)
if *rC.Files[i].Status == "added" {
changedFiles.Added = append(changedFiles.Added, *rC.Files[i].Filename)
}
if *repoCommit[j].Status == "removed" {
changedFiles.Deleted = append(changedFiles.Deleted, *repoCommit[j].Filename)
if *rC.Files[i].Status == "removed" {
changedFiles.Deleted = append(changedFiles.Deleted, *rC.Files[i].Filename)
}
if *repoCommit[j].Status == "modified" {
changedFiles.Modified = append(changedFiles.Modified, *repoCommit[j].Filename)
if *rC.Files[i].Status == "modified" {
changedFiles.Modified = append(changedFiles.Modified, *rC.Files[i].Filename)
}
if *repoCommit[j].Status == "renamed" {
changedFiles.Renamed = append(changedFiles.Renamed, *repoCommit[j].Filename)
if *rC.Files[i].Status == "renamed" {
changedFiles.Renamed = append(changedFiles.Renamed, *rC.Files[i].Filename)
}
}
if resp.NextPage == 0 {
break
}
opt.Page = resp.NextPage
}
return changedFiles, nil
}

if runevent.TriggerTarget == "push" {
changedFiles := changedfiles.ChangedFiles{}
rC, _, err := wrapAPI(v, "get_commit_files", func() (*github.RepositoryCommit, *github.Response, error) {
return v.Client().Repositories.GetCommit(ctx, runevent.Organization, runevent.Repository, runevent.SHA, &github.ListOptions{})
})
if err != nil {
return changedfiles.ChangedFiles{}, err
}
for i := range rC.Files {
changedFiles.All = append(changedFiles.All, *rC.Files[i].Filename)
if *rC.Files[i].Status == "added" {
changedFiles.Added = append(changedFiles.Added, *rC.Files[i].Filename)
}
if *rC.Files[i].Status == "removed" {
changedFiles.Deleted = append(changedFiles.Deleted, *rC.Files[i].Filename)
}
if *rC.Files[i].Status == "modified" {
changedFiles.Modified = append(changedFiles.Modified, *rC.Files[i].Filename)
}
if *rC.Files[i].Status == "renamed" {
changedFiles.Renamed = append(changedFiles.Renamed, *rC.Files[i].Filename)
}
v.changedFiles = &changedFiles
default:
v.changedFiles = &changedfiles.ChangedFiles{}
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The logic for processing []*github.CommitFile is duplicated for PullRequest and Push cases. This can be refactored into a helper function. Additionally, using a switch statement on *repoCommit[j].Status would be more readable and slightly more efficient than multiple if statements.

Example helper function:

func processGithubCommitFiles(files []*github.CommitFile, changedFiles *changedfiles.ChangedFiles) {
	for i := range files {
		filename := *files[i].Filename
		changedFiles.All = append(changedFiles.All, filename)
		switch *files[i].Status {
		case "added":
			changedFiles.Added = append(changedFiles.Added, filename)
		case "removed":
			changedFiles.Deleted = append(changedFiles.Deleted, filename)
		case "modified":
			changedFiles.Modified = append(changedFiles.Modified, filename)
		case "renamed":
			changedFiles.Renamed = append(changedFiles.Renamed, filename)
		}
	}
}

Calls to `vcs.GetFiles()` only result in one or two API calls (paging
aside). However the method is called each time a CEL expression
references `"path".pathChanged()` or the `on-path-change` annotation is
checked for matches. Because of this, because matching is evaluated
several times in a row, and because some users have tens of PipelineRuns
in their `.tekton` directory, in some cases a single push event may cause
PaC to make hundreds of API requests listing the same files repeatedly.
By default the exhaustive CI rule does not count a `default` switch case
as exhaustive.

Also fixes typo in misspell settings
@aThorp96 aThorp96 force-pushed the cache-changed-files branch from 0a29e4f to 838d438 Compare November 6, 2025 13:08
@pipelines-as-code
Copy link

🔍 PR Lint Feedback

Note: This automated check helps ensure your PR follows our contribution guidelines.

⚠️ Items that need attention:

🤖 AI attribution

The following commits lack an explicit AI attribution footer:

  • 658148f feat(perf): cache vcs.GetFiles() to reduce redundant VCS API volume
  • 838d438 chore(lint): set golangci default switch case as exhaustive

If no AI assistance was used for a commit, you can ignore this warning.
Otherwise add an Assisted-by: or Co-authored-by: footer referencing the AI used.


ℹ️ Next Steps

  • Review and address the items above
  • Push new commits to update this PR
  • This comment will be automatically updated when issues are resolved
🔧 Admin Tools (click to expand)

Automated Issue/Ticket Creation:

  • /issue-create - Generate a GitHub issue from this PR content using AI
  • /jira-create - Create a SRVKP Jira ticket from this PR content using AI

⚠️ Important: Always review and edit generated content before finalizing tickets/issues.
The AI-generated content should be used as a starting point and may need adjustments.

These commands are available to maintainers and will post the generated content as PR comments for review.

🤖 This feedback was generated automatically by the PR CI system

}
if c.Renamed {
changedFiles.Renamed = append(changedFiles.Renamed, c.Path)
if v.changedFiles == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aThorp96 another approach is instead of making changes in every provider, I see that wherever in a func GetFiles is called, there has event reference. so can you please check in the code for changedFiles where the GetFiles is called, making changedFiles a field in event struct?

e.g.

if event.changedFiles == nil {
    event.changedFiles = v.GetFiles(ctx, event)
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Development

Successfully merging this pull request may close these issues.

3 participants