feat(ai-assistant): Assistant Eval improvements #7250

mongodben · 2025-08-28T16:31:20Z

Description

Sketch of how I'd improve the evaluation, make the assistant more evaluable, and include system messages. I'd prefer that someone take over the PR from me, rather than me working on it through to the end. I annotated the PR with some thoughts to guide reviewing and final implementation.

We may want to break up the PR into a couple of sub-PRs because it's a bit of a grab bag right now.

Feats:

Refactor LLM call to be more evaluable
Make eval cases arrays grouped by type (that way it's easier to add more)
Add custom system prompt for specified circumstances

Checklist

New tests and/or benchmarks are included
Documentation is changed or added
If this change updates the UI, screenshots/videos are added and a design review is requested
I have signed the MongoDB Contributor License Agreement (https://www.mongodb.com/legal/contributor-agreement)

Motivation and Context

Bugfix
New feature
Dependency update
Misc

Open Questions

Dependents

Types of changes

Backport Needed
Patch (non-breaking change which fixes an issue)
Minor (non-breaking change which adds functionality)
Major (fix or feature that would cause existing functionality to change)

mongodben · 2025-08-28T16:32:11Z

packages/compass-assistant/src/docs-provider-transport.ts

+    // TODO: pass the metadata correctly in a strongly typed manner.
+    // I'm not sure how to do this.
+    metadata,


Please help with the TypeScript typing here. I couldn't easily figure this out.
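
One possible direction, as a minimal sketch only: narrow the untyped metadata with a type guard before passing it along. `AssistantMetadata`, its fields, `isAssistantMetadata`, and `readMetadata` are hypothetical names for illustration and are not taken from this PR; the real metadata shape may differ.

```typescript
// Hypothetical metadata shape; the real fields in compass-assistant may differ.
interface AssistantMetadata {
  displayText?: string;
  instructions?: string;
}

// Type guard: narrows `unknown` to AssistantMetadata at runtime.
function isAssistantMetadata(value: unknown): value is AssistantMetadata {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  const optionalString = (x: unknown): boolean =>
    x === undefined || typeof x === 'string';
  return optionalString(v.displayText) && optionalString(v.instructions);
}

// Returns typed metadata, or undefined when the payload does not match.
function readMetadata(raw: unknown): AssistantMetadata | undefined {
  return isAssistantMetadata(raw) ? raw : undefined;
}
```

With this shape, the call site would pass `readMetadata(metadata)` instead of the raw value, so a mismatched payload is dropped rather than silently forwarded.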

mongodben · 2025-08-28T16:32:34Z

packages/compass-assistant/src/docs-provider-transport.ts

  }
 }
+
+export function makeCreateResponse({


Moved the LLM call to a helper function that is used both here and in the evals.
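
A rough sketch of the idea, assuming the helper is a factory that takes its dependencies as an options object. Only the name `makeCreateResponse` comes from the diff; the option names (`callModel`, `systemPrompt`) and the message types are assumptions made up for this example, and the real helper wraps the actual LLM client.

```typescript
// Hypothetical types for illustration only.
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };
type ModelCall = (messages: ChatMessage[]) => Promise<string>;

interface CreateResponseOptions {
  callModel: ModelCall; // injected, so evals can run the same code path
  systemPrompt?: string; // optional custom system prompt
}

// Factory: returns a function that both the transport and the evals can call.
function makeCreateResponse({ callModel, systemPrompt }: CreateResponseOptions) {
  return async function createResponse(
    messages: ChatMessage[]
  ): Promise<string> {
    // Prepend the system prompt only when one was configured.
    const withSystem: ChatMessage[] = systemPrompt
      ? [{ role: 'system', content: systemPrompt }, ...messages]
      : messages;
    return callModel(withSystem);
  };
}

// Usage with a stub model that reports how many messages it received.
const createResponse = makeCreateResponse({
  callModel: async (messages) => `received ${messages.length} message(s)`,
  systemPrompt: 'You are a MongoDB assistant.',
});
```

Because the model call is injected, the evals exercise the same prompt assembly as production instead of a parallel copy.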

mongodben · 2025-08-28T16:33:16Z

packages/compass-assistant/src/prompts.ts

@@ -1,4 +1,6 @@
-export const buildExplainPlanPrompt = ({
+const explainPlanSystemPrompt = `TODO:....add custom prompt stuff here`;


Add these prompts in a future PR.

Julian, or whoever is working on this, can iterate on the prompts and PR them.

mongodben · 2025-08-28T16:33:57Z

packages/compass-assistant/test/assistant.eval.ts

      expected: {
        messages: [{ text: c.expected, sources: c.expectedSources || [] }],
      },
+      tags: c.tags ?? [],


Tags are useful for looking at the evals as you iterate on them. I recommend including them.
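
As an illustration of why tags help, here is a small sketch that slices a set of eval cases by tag. The `EvalCase` shape loosely follows the fields visible in the diff (`expected`, `expectedSources`, `tags`); the `input` field and both helper functions are hypothetical.

```typescript
// Hypothetical eval case shape, loosely following the fields in the diff.
interface EvalCase {
  input: string;
  expected: string;
  expectedSources?: string[];
  tags?: string[];
}

// Keep only the cases that carry every requested tag.
function filterByTags(cases: EvalCase[], required: string[]): EvalCase[] {
  return cases.filter((c) => required.every((t) => (c.tags ?? []).includes(t)));
}

// Count cases per tag, useful for a quick coverage overview.
function countByTag(cases: EvalCase[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const c of cases) {
    for (const t of c.tags ?? []) counts[t] = (counts[t] ?? 0) + 1;
  }
  return counts;
}
```

Cases without tags simply fall out of any tag-based filter, which matches the `c.tags ?? []` default in the diff.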

mongodben · 2025-08-28T16:35:05Z

packages/compass-assistant/test/assistant.eval.ts

+  // TODO: validate that this works as expected. If not, we must pull the sources out of the stream
+  const sources = (await result.sources).map((source) => {
+    return source.id;
  });


Not sure if this will work as expected. If not, you can get the sources from the events in result.fullStream.
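
The fallback could look roughly like the sketch below, which walks an async stream of events and keeps the ID of each source event. The `StreamEvent` shape is an assumption for illustration; check the actual part types emitted by `result.fullStream` before relying on it. `fakeStream` is a stand-in used only to demonstrate the helper.

```typescript
// Hypothetical event shape; verify against the real stream part types.
type StreamEvent =
  | { type: 'text'; text: string }
  | { type: 'source'; id: string };

// Walk the stream once and keep the id of every source event, in order.
async function collectSourceIds(
  stream: AsyncIterable<StreamEvent>
): Promise<string[]> {
  const ids: string[] = [];
  for await (const event of stream) {
    if (event.type === 'source') ids.push(event.id);
  }
  return ids;
}

// Stand-in for result.fullStream, for demonstration only.
async function* fakeStream(): AsyncGenerator<StreamEvent> {
  yield { type: 'text', text: 'Use $search ...' };
  yield { type: 'source', id: 'doc-1' };
  yield { type: 'source', id: 'doc-2' };
}
```

Note that a stream can usually be consumed only once, so the text and the sources would need to be gathered in the same pass.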

mongodben · 2025-08-28T16:35:42Z

packages/compass-assistant/test/eval-cases/filter-docs-before-search.ts

Renamed to atlas-search.ts to be more general.

mongodben · 2025-08-28T16:36:14Z

packages/compass-assistant/src/prompts.ts

  };
 };
+
+export const buildPrompts = {


More extensible by using a Record data structure.
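
A minimal sketch of the pattern, assuming prompts are keyed by a union of prompt kinds. Only the name `buildPrompts` and the explain plan prompt appear in the diff; `PromptKind`, `PromptContext`, `BuiltPrompt`, the `connectionError` entry, and `buildPrompt` are hypothetical and exist only to show the shape.

```typescript
// Hypothetical kinds and context; the real builders take richer inputs.
type PromptKind = 'explainPlan' | 'connectionError';
type PromptContext = { details: string };
type BuiltPrompt = { prompt: string; systemPrompt?: string };

// Record keyed by kind: adding a kind to the union forces a matching builder.
const buildPrompts: Record<PromptKind, (ctx: PromptContext) => BuiltPrompt> = {
  explainPlan: ({ details }) => ({
    prompt: `Explain this plan:\n${details}`,
    systemPrompt: 'TODO: custom explain plan system prompt',
  }),
  connectionError: ({ details }) => ({
    prompt: `Help diagnose this connection error:\n${details}`,
  }),
};

// Look up a builder by kind; unknown kinds are rejected at compile time.
function buildPrompt(kind: PromptKind, ctx: PromptContext): BuiltPrompt {
  return buildPrompts[kind](ctx);
}
```

The benefit over separate exported functions is that callers and evals can iterate over the keys, and the compiler flags any kind that lacks a builder.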

draft of changes

e8ba8d1

