Skip to content

Conversation

@paldepind
Copy link

@paldepind paldepind commented Nov 6, 2025

This PR changes the "Compare Results" feature to use codeql bqrs diff in some cases.

This has a few benefits:

  1. When comparing queries results with a large number of results the comparison view fails with "Cannot create a string longer than 0x1fffffe8 characters". This is because the output of bqrs decode is read into a JavaScript string, which apparently can't exceed roughly 512MB in V8. Using bqrs diff fixes this bug so this PR partially addresses Failure to compare large results #1364.

  2. The diff that the extension produces is not always correct. I've observed that the number of added and removed results doesn't always add up to actual difference in results between the compared query results. bqrs diff seems to not have this issue.

  3. bqrs diff is faster.

Unfortunately, we can't use bqrs diff in all cases, as the extension sometimes compares bqrs files with different result set names for each file. bqrs diff does not support this.

I think the way forward is to add this capability to bqrs diff. We'll then be able to completely remove the diffing inside VSCode and fix #1364 in all cases.

Note to reviewer: The first commit is basically me reading the code with a few side-effects. The second commit makes the actual changes.

@paldepind paldepind requested a review from a team as a code owner November 6, 2025 14:33
Copilot AI review requested due to automatic review settings November 6, 2025 14:33
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This pull request refactors the query results comparison functionality to improve performance and code quality. The key change introduces an optimization using bqrs diff CLI command for comparing result sets with the same name, which is more efficient than loading and comparing full result sets in-memory.

  • Optimizes comparison by using bqrs diff CLI command when comparing same-named result sets
  • Performs early validation checks on schemas before loading result data
  • Refactors code for improved readability and consistency (e.g., replacing || with ??, extracting helper functions, using ternary expressions)

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
extensions/ql-vscode/src/compare/compare-view.ts Adds findSchema helper function, implements optimized bqrs diff comparison path, and moves validation checks earlier in the process
extensions/ql-vscode/src/compare/resultsDiff.ts Removes validation checks that were moved upstream to compare-view.ts
extensions/ql-vscode/src/codeql-cli/cli.ts Adds bqrsDiff method, refactors code for readability, and makes several methods private
extensions/ql-vscode/src/common/bqrs-raw-results-mapper.ts Simplifies result set mapping by inlining variable assignments
extensions/ql-vscode/src/view/compare/CompareTable.tsx Moves openQuery function outside component to prevent unnecessary re-creation
extensions/ql-vscode/src/view/compare/Compare.tsx Replaces `
extensions/ql-vscode/src/codeql-cli/query-language.ts Replaces `

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

type OnLineCallback = (
line: string,
) => Promise<string | undefined> | string | undefined;
type OnLineCallback = (line: string) => Promise<string | undefined>;
Copy link

Copilot AI Nov 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type signature change from (line: string) => Promise<string | undefined> | string | undefined to (line: string) => Promise<string | undefined> removes support for synchronous callbacks. While this matches the actual usage in the codebase (line 929 uses an async function), this is a breaking API change. Consider whether this type is part of a public API that external code might depend on.

Suggested change
type OnLineCallback = (line: string) => Promise<string | undefined>;
type OnLineCallback = (line: string) => Promise<string | undefined> | string | undefined;

Copilot uses AI. Check for mistakes.
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not an external API. Breakage is fine and TS would error if this type change was not ok. The changed type matches how we use it, and there's no reason to have it more general than that.

@asgerf
Copy link
Contributor

asgerf commented Nov 7, 2025

Nice!

I noticed that when comparing alerts it still falls back to the old diff method even though the schemas are the same. It would be nice to have that working as well, especially since it's the default and you can't actually switch to another result set without waiting for the slow diff to finish.

Screenshot 2025-11-07 at 10 02 53

@paldepind
Copy link
Author

I can see that alert is handled with a different code path that parses and diffs sarif files instead of bqrs files. That diffing seems to have some logic for doing a diff modulo some canonicalization that was introduced in #3772. We probably want to maintain that, and I don't know if we can replicate it with bqrs diff as-is?

@asgerf
Copy link
Contributor

asgerf commented Nov 7, 2025

For alert diffing we can diff the bqrs, then generate SARIF files from the diffed files using bqrs interpret and then feed the smaller SARIF files into the existing diffing logic. It means we still run the slow diff, but on a smaller input at least.

@paldepind
Copy link
Author

That sounds like a good idea. I'll give it a go.

@paldepind
Copy link
Author

I tried implementing this. It may be an improvement, but it does not preserve the current behavior.

As a test a ran a path query on a db with 32 results. I then added source.getLocation().getStartLine() < 50 to the source predicate and arrived at 16 results. "Compare results" then showed

  • Before this PR: "29 rows removed, 13 rows added"
  • On this PR: "16 rows removed, 0 rows added"

So the current comparison considers most the the completely unchanged results added and removed. I'm guessing there's something in the SARIF that is somehow different about the results? Whereas with the latest changes in this PR we just diff the tuples in #select.

In this particular test the change seems to match what one would expect. But I don't really understand how the SARIF diff works and I'm a bit worried that the change might be a step back in other cases?

(note that CI fails right now, this is expected as I haven't updated the tests b.c. I'm not sure if we want to proceed with the latest changes).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Failure to compare large results

2 participants