Commit ca94c96
[5620660][ONNX] Remove toposort after quantization (#524)
## What does this PR do?
**Type of change:** Bug fix
**Overview:** Loading the model with ONNX GraphSurgeon after
quantization and FP16 conversion resulted in an ONNX model with an FP16
output instead of FP32, even though the Cast_to_fp32 layer was correctly
placed at the graph output. This PR fixes that issue by removing the
toposort step that ran after quantization.
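For context, the removed step is a GraphSurgeon round-trip applied to the already-quantized model; a minimal, hypothetical sketch of that pattern (the deleted lines themselves are not reproduced in the diff below):

```python
# Hypothetical sketch of the kind of step this PR removes: an
# onnx_graphsurgeon import -> toposort -> export round-trip applied
# to the quantized model. The function name is illustrative only.
import onnx
import onnx_graphsurgeon as gs

def toposort_roundtrip(onnx_model: onnx.ModelProto) -> onnx.ModelProto:
    graph = gs.import_onnx(onnx_model)  # build a GraphSurgeon graph
    graph.toposort()                    # topologically re-sort the nodes
    return gs.export_onnx(graph)        # re-export to an ONNX ModelProto
```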
## Usage
```bash
$ python -m modelopt.onnx.quantization --onnx_path=$MODEL_NAME.onnx --high_precision_dtype=fp16
```
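After the command completes, the graph output dtype can be verified with the standard `onnx` Python API; a minimal sketch, where `model.quant.onnx` stands in for whatever output filename the tool produces (hypothetical path):

```python
# Sanity check: after quantization with --high_precision_dtype=fp16, the
# graph outputs should still be FP32, since Cast_to_fp32 feeds them.
# "model.quant.onnx" is a hypothetical path for the quantized model.
import onnx

model = onnx.load("model.quant.onnx")
for out in model.graph.output:
    elem_type = out.type.tensor_type.elem_type
    assert elem_type == onnx.TensorProto.FLOAT, (
        f"output '{out.name}' has elem_type {elem_type}, expected FP32"
    )
```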
## Testing
See bug 5620660.
## Before your PR is "*Ready for review*"
- **Make sure you read and follow [Contributor
guidelines](https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/main/CONTRIBUTING.md)**
and your commits are signed.
- **Is this change backward compatible?**: Yes
- **Did you write any new necessary tests?**: No
- **Did you add or update any necessary documentation?**: No
- **Did you update
[Changelog](https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/main/CHANGELOG.rst)?**:
No
Signed-off-by: gcunhase <4861122+gcunhase@users.noreply.github.com>

1 parent fc92e98
1 file changed: +0 −4 (lines 531–534 of the original file deleted).