[upstream] Expecting future huggingface/transformers incompatibility #2036

@mratsim

Description

Hello team,

I was looking into adding my first MoECalibrationModule for a currently unsupported model (GLM4.5-Air, to prepare for the upcoming GLM4.6-Air and GLM4.6V), as I suspect my previous quant's quality was not up to par (though I can't measure the KL divergence, see #2031).

However, as luck would have it, https://github.com/huggingface/transformers appears to have merged a breaking 8.7K-line PR just yesterday: huggingface/transformers#41580

For example, here is the new definition of the Qwen3 MoE block:

The gate has been removed from Qwen3MoeSparseMoeBlock and moved to a new Qwen3MoeTopKRouter module.

(screenshot: refactored Qwen3MoeSparseMoeBlock / Qwen3MoeTopKRouter in transformers)
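
Roughly, the new layout looks like the sketch below. This is only an illustrative approximation of the refactored modules, not the exact upstream code; the attribute names (e.g. `gate`) and the expert stand-ins are my guesses.

```python
# Illustrative sketch of the post-PR layout (approximate, not the exact
# upstream code; attribute names are a guess). The top-k routing now lives
# in its own nn.Module instead of a bare nn.Linear on the MoE block.
import torch
import torch.nn as nn


class Qwen3MoeTopKRouter(nn.Module):
    def __init__(self, config):
        super().__init__()
        self.top_k = config.num_experts_per_tok
        self.weight = nn.Parameter(torch.empty(config.num_experts, config.hidden_size))

    def forward(self, hidden_states):
        # Routing logits -> softmax -> top-k expert weights and indices.
        logits = nn.functional.linear(hidden_states, self.weight)
        probs = logits.softmax(dim=-1)
        routing_weights, selected_experts = probs.topk(self.top_k, dim=-1)
        return routing_weights, selected_experts


class Qwen3MoeSparseMoeBlock(nn.Module):
    def __init__(self, config):
        super().__init__()
        self.gate = Qwen3MoeTopKRouter(config)  # router is now a submodule
        self.experts = nn.ModuleList(
            nn.Linear(config.hidden_size, config.hidden_size)  # stand-in for Qwen3MoeMLP
            for _ in range(config.num_experts)
        )
```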

This departs from the current implementation in llmcompressor:

(screenshot: current Qwen3 MoE handling in llmcompressor)
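
For reference, the pre-PR layout that the existing calibration code targets keeps the gate as a plain `nn.Linear` on the block itself (simplified sketch, expert MLPs stubbed out):

```python
import torch.nn as nn


class Qwen3MoeSparseMoeBlock(nn.Module):  # pre-refactor shape (simplified)
    def __init__(self, config):
        super().__init__()
        self.gate = nn.Linear(config.hidden_size, config.num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Linear(config.hidden_size, config.hidden_size)  # stand-in for Qwen3MoeMLP
            for _ in range(config.num_experts)
        )
```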

That said, I think we can just register both as calibration targets and it should work ™️.
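
To make the "register both" idea concrete, here is a minimal sketch that maps both the pre-refactor block class and the post-refactor router class to the same calibration handler. The registry and swap helper below are hypothetical stand-ins; llmcompressor's actual MoECalibrationModule registration API may differ.

```python
from typing import Callable, Dict

import torch.nn as nn

# Hypothetical registry: original transformers class name -> factory producing
# a calibration-friendly replacement module. Not llmcompressor's real API.
CALIBRATION_TARGETS: Dict[str, Callable[[nn.Module], nn.Module]] = {}


def register_calibration(*class_names: str):
    def decorator(factory: Callable[[nn.Module], nn.Module]):
        for name in class_names:
            CALIBRATION_TARGETS[name] = factory
        return factory
    return decorator


# Register the same handler under both the old and the new class names, so
# whichever one the installed transformers version exposes gets replaced.
@register_calibration("Qwen3MoeSparseMoeBlock", "Qwen3MoeTopKRouter")
def make_calibration_module(original: nn.Module) -> nn.Module:
    # Placeholder: a real calibration module would route calibration batches
    # through every expert instead of only the selected top-k.
    return original


def swap_for_calibration(model: nn.Module) -> None:
    # Recursively replace any registered module types in place.
    for name, child in model.named_children():
        factory = CALIBRATION_TARGETS.get(type(child).__name__)
        if factory is not None:
            setattr(model, name, factory(child))
        else:
            swap_for_calibration(child)
```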

Note that the Hugging Face PR is tentatively targeted at transformers v5, so we might be OK with the old scheme for a while.
