[OMNIML-2244] enable fp8 and int8 ONNX export #594
Signed-off-by: ajrasane <131806219+ajrasane@users.noreply.github.com>
Codecov Report
❌ Patch coverage is
@@            Coverage Diff             @@
##             main     #594      +/-   ##
==========================================
- Coverage   74.64%   74.57%   -0.08%
==========================================
  Files         183      183
  Lines       18389    18412      +23
==========================================
+ Hits        13727    13730       +3
- Misses       4662     4682      +20
galagam left a comment:
LGTM
def replace_zero_scale_with_smallest_nonzero(onnx_model: onnx.ModelProto) -> onnx.ModelProto:
    """Replace zero scale values with smallest nonzero fp16 value in the ONNX model."""
Could you document under what conditions this method needs to be called?
Sorry, I had already set the MR to auto-merge. I will add this in a follow-up MR.
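For readers following the review question above, here is a minimal NumPy-only sketch of the kind of replacement the docstring describes. It is not the PR's implementation: the helper name `replace_zero_scales`, the choice of the smallest subnormal fp16 value (2**-24) rather than the smallest normal one, and the underflow scenario are all assumptions made for illustration. One plausible situation where such a fix matters is when a small float32 scale underflows to zero after being cast to fp16, which would then lead to a division by zero during quantization.

```python
import numpy as np

# Smallest positive (subnormal) fp16 value, 2**-24.
# Assumption: the PR may use a different "smallest nonzero" constant.
FP16_SMALLEST_NONZERO = np.float16(2.0 ** -24)


def replace_zero_scales(scales: np.ndarray) -> np.ndarray:
    """Return a copy of `scales` with exact zeros replaced by the smallest nonzero fp16 value.

    Hypothetical helper operating on a plain array, not on an onnx.ModelProto.
    """
    fixed = np.array(scales, copy=True)
    fixed[fixed == 0] = FP16_SMALLEST_NONZERO.astype(fixed.dtype)
    return fixed


# A float32 scale of 1e-9 is below the fp16 subnormal range and rounds to zero.
scales_fp32 = np.array([0.5, 1e-9, 0.02], dtype=np.float32)
scales_fp16 = scales_fp32.astype(np.float16)

fixed = replace_zero_scales(scales_fp16)
```

In a real model the same replacement would presumably be applied to each scale tensor stored in the ONNX graph; how the PR locates those tensors is not shown in this thread.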
What does this PR do?
Type of change: Example update
Overview:
Usage
Testing
Validated the accuracy and latency of int8 and fp8 models:
Before your PR is "Ready for review"