-
Notifications
You must be signed in to change notification settings - Fork 245
feat(ai-assistant): Assistant Eval improvements #7250
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| // TODO: pass the metadata correctly in a strongly typed manner. | ||
| // I'm not sure how to do this. | ||
| metadata, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls help with the typescripting here. i couldnt easily figure this out
| } | ||
| } | ||
|
|
||
| export function makeCreateResponse({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved LLM call to a helper function which is used both here and in the evals
| @@ -1,4 +1,6 @@ | |||
| export const buildExplainPlanPrompt = ({ | |||
| const explainPlanSystemPrompt = `TODO:....add custom prompt stuff here`; | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add these prompts in future PR.
julian or whoever is working on this can iterate on the prompts and PR them
| expected: { | ||
| messages: [{ text: c.expected, sources: c.expectedSources || [] }], | ||
| }, | ||
| tags: c.tags ?? [], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tags are useful for looking at the evals as you iterate on them. i recommend inclusion
| // TODO: validate that this works as expected. If not, we must pull the sources out of the stream | ||
| const sources = (await result.sources).map((source) => { | ||
| return source.id; | ||
| }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if this will work as expected. if not you can get the sources from the events in the result.fullStream
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renamed to atlas-search.ts to be more general
| }; | ||
| }; | ||
|
|
||
| export const buildPrompts = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
more extensible by using Record data structure
Description
Sketch of how I'd improve the evaluation, the evaluability, and include system messages. I'd prefer if someone takes over the PR from me, rather than I work on it through til the end. I annotated the PR with some thoughts to guide reviewing and final implementation.
May want to break up the PR into a couple sub-PRs b/c it's a bit of a grab bag right now.
Feats:
Checklist
Motivation and Context
Open Questions
Dependents
Types of changes