
Conversation

@kylesayrs (Collaborator) commented Oct 29, 2025

Purpose

  • Change imports to ensure that modules are registered even when the context is imported from somewhere other than prepare.py
  • Use RegistryMixin for registry functionality to reuse existing code
  • Do not log every time the MoE context is entered (some models are not MoEs, and the message may confuse users)
  • Add support for qwen3_moe_vl
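
The import concern in the first bullet can be illustrated with a small sketch (the registry, decorator, and class names here are illustrative, not the actual llm-compressor layout). Decorator-based registries are populated at import time, so a class is registered only if the module defining it is actually imported:

```python
# Sketch of import-time registration (hypothetical names). The decorator
# runs when this module is imported; if nothing imports the module, the
# class is silently missing from the registry and the corresponding MoE
# block would never be replaced during calibration.
MOE_REGISTRY = {}

def register_moe(name):
    """Record a calibration class under `name` at import time."""
    def decorator(cls):
        MOE_REGISTRY[name] = cls
        return cls
    return decorator

@register_moe("qwen3_moe")
class CalibrateQwen3MoeSparseMoeBlock:
    """Placeholder for a calibration replacement module."""
```

Moving the implementation imports into the package `__init__` guarantees the decorators run no matter which entry point a user imports first.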

Changes

  • Use RegistryMixin for registry functionality
  • Lower the log level of the MoE context-entering message
  • Move registry init imports
  • Add CalibrateQwen3VLMoeTextSparseMoeBlock
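
A minimal sketch of the RegistryMixin pattern these changes move to (class and method names are illustrative; the real mixin lives in a shared library and also handles per-base-class scoping and aliases). A shared base class owns the register/lookup machinery, so each calibration module reuses it instead of a hand-rolled dict:

```python
# Illustrative RegistryMixin-style base class (hypothetical, simplified).
class RegistryMixin:
    _registry = {}

    @classmethod
    def register(cls, name):
        def decorator(subcls):
            cls._registry[name] = subcls
            return subcls
        return decorator

    @classmethod
    def load_from_registry(cls, name):
        return cls._registry[name]


class MoECalibrationModule(RegistryMixin):
    """Base class for modules that replace MoE blocks during calibration."""


@MoECalibrationModule.register("qwen3_vl_moe")
class CalibrateQwen3VLMoeTextSparseMoeBlock(MoECalibrationModule):
    """Calibration replacement for the Qwen3-VL text sparse MoE block."""
```

With this shape, adding support for a new architecture is one decorated class rather than an edit to a central lookup table.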

Testing

  • MoE tests pass; ran the MoE examples and confirmed that modules are being replaced

@github-actions (bot)

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

Note: This is required to complete the testing suite, please only add the label once the PR is code complete and local testing has been performed.

@gemini-code-assist (Contributor)

Summary of Changes

Hello @kylesayrs, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the Mixture-of-Experts (MoE) calibration framework by integrating support for the qwen3_moe_vl model, thereby expanding the range of compatible models. It also addresses logging verbosity by downgrading a frequent log message to debug level, ensuring a cleaner console output during calibration. Additionally, the PR includes minor code refactorings to improve clarity and maintainability across several MoE-related files.

Highlights

  • Qwen3-VL MoE Support: Added a new calibration module (CalibrateQwen3VLMoeTextSparseMoeBlock) and its registration to support qwen3_moe_vl models.
  • Logging Verbosity Reduction: Changed the logging level for entering the MoE calibration context from "info" to "debug" to prevent excessive logging during frequent oneshot calls.
  • Code Cleanup & Refactoring: Performed general code cleanup, including import organization in prepare.py, explicit type hinting in llama4.py, and minor variable naming improvements in moe_context.py.
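
The log-level change can be sketched as follows (a stdlib-logging sketch with a hypothetical logger name and message; the project may use a different logging library):

```python
import logging

logger = logging.getLogger("llmcompressor.moe")  # hypothetical logger name

def enter_moe_calibration_context(model_name: str) -> None:
    # DEBUG instead of INFO: with repeated oneshot calls (and for
    # non-MoE models) the message is noise at the default level, but
    # it remains available when a user raises verbosity to diagnose
    # whether modules are being replaced.
    logger.debug("Entering MoE calibration context for %s", model_name)
```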

@gemini-code-assist (bot) left a comment:
Code Review

This pull request cleans up MoE calibration logic, reduces logging verbosity, and adds support for qwen3_moe_vl. The changes are generally good, improving code structure and readability. However, I've found a few critical issues with the new qwen3_vl_moe implementation that will cause runtime errors. Specifically, the module registration for calibration is incorrect, the deprecated replace function is broken, and the corresponding test is not set up correctly and would fail. Please see my detailed comments for suggestions on how to fix these issues.

@kylesayrs kylesayrs added the ready When a PR is ready for review label Oct 30, 2025
@kylesayrs (Collaborator, Author)

FYI @sairampillai

@sairampillai (Contributor)

Looks awesome! @kylesayrs

@brian-dellabetta (Collaborator) left a comment:

LGTM!

@dsikka (Collaborator) commented Nov 6, 2025

@kylesayrs can you address conflicts?

@fynnsu previously approved these changes Nov 6, 2025
@fynnsu (Collaborator) left a comment:

Looks good!

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
This reverts commit 6dd0320.
@kylesayrs kylesayrs dismissed stale reviews from fynnsu and brian-dellabetta via 215b95c November 7, 2025 19:35
@kylesayrs kylesayrs force-pushed the kylesayrs/moe-patch branch from 2274444 to 215b95c on November 7, 2025 19:35
@dsikka (Collaborator) left a comment:

Just one comment on restore for the qwen3 vl moe

@fynnsu previously approved these changes Nov 7, 2025
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
@kylesayrs kylesayrs enabled auto-merge (squash) November 11, 2025 13:21
@kylesayrs kylesayrs merged commit 407a091 into main Nov 11, 2025
9 checks passed
@kylesayrs kylesayrs deleted the kylesayrs/moe-patch branch November 11, 2025 14:07
@dsikka dsikka added the qwen For any PR / issue related to Qwen support label Nov 11, 2025