[upstream] Expecting future huggingface/transformers incompatibility #2036

@mratsim

Description

Hello team,

I was looking into adding my first MoECalibrationModule for a currently unsupported model (GLM4.5-Air, to prepare for the upcoming GLM4.6-Air and GLM4.6V), as I suspect my previous quant's quality was not up to par (though I can't measure the KL divergence, see #2031).

However, as luck would have it, https://github.com/huggingface/transformers appears to have merged a breaking 8.7K-line PR just yesterday: huggingface/transformers#41580

For example, here is the new definition of the Qwen3 MoE block:

The gate has been removed from Qwen3MoeSparseMoeBlock and moved to a new Qwen3MoeTopKRouter module.

(screenshot: refactored Qwen3MoeSparseMoeBlock / Qwen3MoeTopKRouter in transformers)
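
Roughly, the new layout looks like the sketch below. This is only an illustrative approximation of the refactored modules, not the exact upstream code; the attribute names (e.g. `gate`) and the expert stand-ins are my guesses.

```python
# Illustrative sketch of the post-PR layout (approximate, not the exact
# upstream code; attribute names are a guess). The top-k routing now lives
# in its own nn.Module instead of a bare nn.Linear on the MoE block.
import torch
import torch.nn as nn


class Qwen3MoeTopKRouter(nn.Module):
    def __init__(self, config):
        super().__init__()
        self.top_k = config.num_experts_per_tok
        self.weight = nn.Parameter(torch.empty(config.num_experts, config.hidden_size))

    def forward(self, hidden_states):
        # Routing logits -> softmax -> top-k expert weights and indices.
        logits = nn.functional.linear(hidden_states, self.weight)
        probs = logits.softmax(dim=-1)
        routing_weights, selected_experts = probs.topk(self.top_k, dim=-1)
        return routing_weights, selected_experts


class Qwen3MoeSparseMoeBlock(nn.Module):
    def __init__(self, config):
        super().__init__()
        self.gate = Qwen3MoeTopKRouter(config)  # router is now a submodule
        self.experts = nn.ModuleList(
            nn.Linear(config.hidden_size, config.hidden_size)  # stand-in for Qwen3MoeMLP
            for _ in range(config.num_experts)
        )
```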

This departs from the current implementation in llmcompressor:

(screenshot: current Qwen3 MoE handling in llmcompressor)
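
For reference, the pre-PR layout that the existing calibration code targets keeps the gate as a plain `nn.Linear` on the block itself (simplified sketch, expert MLPs stubbed out):

```python
import torch.nn as nn


class Qwen3MoeSparseMoeBlock(nn.Module):  # pre-refactor shape (simplified)
    def __init__(self, config):
        super().__init__()
        self.gate = nn.Linear(config.hidden_size, config.num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Linear(config.hidden_size, config.hidden_size)  # stand-in for Qwen3MoeMLP
            for _ in range(config.num_experts)
        )
```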

That said, I think we can just register both as calibration targets and it should work ™️.
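
To make the "register both" idea concrete, here is a minimal sketch that maps both the pre-refactor block class and the post-refactor router class to the same calibration handler. The registry and swap helper below are hypothetical stand-ins; llmcompressor's actual MoECalibrationModule registration API may differ.

```python
from typing import Callable, Dict

import torch.nn as nn

# Hypothetical registry: original transformers class name -> factory producing
# a calibration-friendly replacement module. Not llmcompressor's real API.
CALIBRATION_TARGETS: Dict[str, Callable[[nn.Module], nn.Module]] = {}


def register_calibration(*class_names: str):
    def decorator(factory: Callable[[nn.Module], nn.Module]):
        for name in class_names:
            CALIBRATION_TARGETS[name] = factory
        return factory
    return decorator


# Register the same handler under both the old and the new class names, so
# whichever one the installed transformers version exposes gets replaced.
@register_calibration("Qwen3MoeSparseMoeBlock", "Qwen3MoeTopKRouter")
def make_calibration_module(original: nn.Module) -> nn.Module:
    # Placeholder: a real calibration module would route calibration batches
    # through every expert instead of only the selected top-k.
    return original


def swap_for_calibration(model: nn.Module) -> None:
    # Recursively replace any registered module types in place.
    for name, child in model.named_children():
        factory = CALIBRATION_TARGETS.get(type(child).__name__)
        if factory is not None:
            setattr(model, name, factory(child))
        else:
            swap_for_calibration(child)
```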

Note that the Hugging Face PR is tentatively targeted at transformers v5, so we might be OK with the old scheme for a while.
