Commit d006d84
fix(quantization): Skip weight initialization for quantized models
This commit addresses the RuntimeError encountered when loading llmcompressor W8A8 quantized models, where `torch.nn.init.normal_()` is called on `int8` tensors during weight initialization.
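For context, a minimal repro of that failure mode (the tensor here is just a stand-in for a W8A8 weight; it is not taken from the actual model code):

```python
import torch

w = torch.zeros(4, 4, dtype=torch.int8)  # stand-in for an int8 W8A8 weight
try:
    torch.nn.init.normal_(w)  # in-place normal init only supports float/complex dtypes
except RuntimeError as err:
    print(err)  # normal_() is not implemented for integer dtypes
```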
The `_initialize_missing_keys` method in `modeling_utils.py` was unconditionally calling `self.initialize_weights()`. For quantized models, this initialization is unnecessary and causes a `RuntimeError` as `normal_()` does not support integer dtypes.
By adding a check `if not is_quantized:` before calling `self.initialize_weights()`, we ensure that this problematic initialization step is skipped for quantized models, resolving the `RuntimeError` and improving compatibility with `llmcompressor` W8A8 models.
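A minimal sketch of the resulting pattern, with a toy module standing in for the real `PreTrainedModel` (`initialize_weights`, `_initialize_missing_keys`, and `is_quantized` are the names from the commit message; everything else here is illustrative):

```python
import torch
from torch import nn

class ToyModel(nn.Module):
    def __init__(self, is_quantized: bool):
        super().__init__()
        dtype = torch.int8 if is_quantized else torch.float32
        # int8 weights mimic an llmcompressor W8A8 checkpoint
        self.weight = nn.Parameter(
            torch.zeros(8, 8, dtype=dtype), requires_grad=not is_quantized
        )
        self.is_quantized = is_quantized

    def initialize_weights(self):
        nn.init.normal_(self.weight)  # would raise RuntimeError on int8

    def _initialize_missing_keys(self):
        # The fix: only (re)initialize weights for non-quantized models.
        if not self.is_quantized:
            self.initialize_weights()

ToyModel(is_quantized=True)._initialize_missing_keys()   # no-op, no error
ToyModel(is_quantized=False)._initialize_missing_keys()  # normal init runs
```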
Fixes #39366
Signed-off-by: Mauricio Harley <mauricioharley@gmail.com>
1 parent 9f2d566
1 file changed: +2 −1 (modeling_utils.py; the hunk covers lines 5665–5672, removing one line at 5668 and adding two at 5668–5669).
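From the commit message and the +2/−1 stats, the change at line 5668 presumably swaps the unconditional call for a guarded one (the exact indentation and surrounding context lines are assumed):

```diff
-        self.initialize_weights()
+        if not is_quantized:
+            self.initialize_weights()
```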