Run unit tests with real LLM calls #8486
@TomeHirata This is awesome! One thing I want to discuss: I compared the time cost of running the tests before and after this PR, and it seems we are seeing a significant increase, from 2m to 4m. 4m seems all right, but that's the outcome of converting only 6 test cases to use Ollama, which means it could become much slower over time. I am thinking about the following:
Let me know what you think!
Hi, @chenmoneygithub. I agree with both points. I've split the LLM call tests into a separate job. I completely agree with #2: we should limit the use of LLM calls in unit tests because of latency and potential flakiness.
chenmoneygithub
left a comment
The setup looks good!
chenmoneygithub
left a comment
LGTM!


This PR enables running unit tests that require real LLM calls, which have been skipped in CI.
The model for testing is configurable through the `LLM_MODEL` env variable, and we use `ollama/llama3.2:3b` in the branch build to balance quality and latency. The Ollama model pulling only introduces ~13s latency, so this PR just enables real LLM tests in the branch build instead of nightly tests.
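
As a rough illustration of how a test could pick up the model from `LLM_MODEL` and skip itself when the variable is unset, here is a minimal sketch using only the standard library. The helper names `get_test_model` and `requires_llm` are hypothetical and are not taken from this PR; the actual test suite may wire this up differently (for example through a pytest fixture).

```python
import os
import unittest

# Name of the environment variable described in the PR.
MODEL_ENV_VAR = "LLM_MODEL"


def get_test_model(env=None):
    """Return the configured model name, or None if the variable is unset or blank.

    Hypothetical helper: `env` defaults to os.environ and can be replaced
    with a plain dict for testing.
    """
    env = os.environ if env is None else env
    model = env.get(MODEL_ENV_VAR, "").strip()
    return model or None


def requires_llm(test_func):
    """Hypothetical decorator: skip the test when no model is configured.

    The check runs once, at decoration time.
    """
    return unittest.skipIf(
        get_test_model() is None,
        f"{MODEL_ENV_VAR} is not set; skipping real LLM call test",
    )(test_func)
```

With something like this, a branch build that sets `LLM_MODEL=ollama/llama3.2:3b` would run the decorated tests, while a local run without the variable would skip them rather than fail.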