Fix _init_weights to safely skip int8 tensors in Qwen2_5_VL model #41490
base: main
Conversation
cc @MekkCyber for quantization!
MekkCyber left a comment
Thanks @KaparthyReddy! I think this is the wrong commit.
Thanks for the feedback! I’ve updated the PR to modify the correct file under src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py. _init_weights now safely skips int8 tensors while initializing float tensors correctly.
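For reference, a minimal sketch of the kind of guard described here (the `std` value and the module types checked are illustrative, not the exact diff from this PR):

```python
import torch.nn as nn

def _init_weights(self, module):
    std = 0.02  # illustrative; the real method reads config.initializer_range
    if isinstance(module, nn.Linear):
        # normal_() has no int8 kernel and raises a RuntimeError, so only
        # touch floating-point weights (float16, float32, bfloat16).
        if module.weight.dtype.is_floating_point:
            module.weight.data.normal_(mean=0.0, std=std)
        if module.bias is not None and module.bias.dtype.is_floating_point:
            module.bias.data.zero_()
```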
…Reddy/hf-transformers-contributions into fix-llmcompressor-error
[For maintainers] Suggested jobs to run (before merge) run-slow: qwen2_5_vl
ArthurZucker left a comment
Hey, sorry, we merged a big refactor in #41580!
Summary

This PR fixes the `_init_weights()` method in `Qwen2_5_VLForConditionalGeneration` to safely skip int8 tensors during initialization. Previously, applying `normal_()` on int8 weights caused a RuntimeError when loading quantized models.

Changes

- Updated `_init_weights()` to initialize only floating-point tensors (`float16`, `float32`, `bfloat16`).
Motivation

Quantized models (W8A8, int8 weights) could not be loaded directly due to the previous `_init_weights()` implementation. This fix allows them to load without a RuntimeError, making the model compatible with LLMCompressor-quantized models.
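As a hedged usage sketch (the checkpoint path is a placeholder, not a real repository), loading such a model would look like:

```python
# Hypothetical usage sketch; the checkpoint path is a placeholder.
from transformers import Qwen2_5_VLForConditionalGeneration

model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "path/to/w8a8-quantized-qwen2.5-vl"  # placeholder path
)
# With the guarded _init_weights, loading no longer raises a RuntimeError
# from normal_() being applied to int8 weight tensors.
```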
Verification

- `_init_weights()` safely ignores int8 tensors; the sketch below reproduces the failure and the skip.
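A minimal repro sketch of the failure mode and the guard, with illustrative tensor shapes (not code from the PR):

```python
import torch

w = torch.zeros(4, 4, dtype=torch.int8)  # stand-in for a quantized weight

# normal_() has no int8 kernel, which reproduces the original error:
try:
    w.normal_(mean=0.0, std=0.02)
except RuntimeError as e:
    print("int8 init fails as expected:", e)

# The dtype guard simply skips the tensor, leaving it unchanged:
if w.dtype.is_floating_point:
    w.normal_(mean=0.0, std=0.02)
print(w.dtype)  # torch.int8
```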