[GPT-OSS] Add HF state dict adapter to support loading from HF checkpoints #2021
As titled, this PR adds an HF state dict adapter to support loading from the GPT-OSS HF checkpoint. The GPT-OSS checkpoint is quantized in MXFP4 format. The de-quantization steps are offloaded to `QuantizedHuggingFaceStorageReader` in `dcp`, so this feature depends on the PR that updates `QuantizedHuggingFaceStorageReader` (pytorch/pytorch#167672). The intended loading flow is sketched below.
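A minimal sketch of this loading flow, assuming a generic `adapter` object with `to_hf()`/`from_hf()` methods and assuming `QuantizedHuggingFaceStorageReader` is importable from `torch.distributed.checkpoint.quantized_hf_storage` (the exact adapter class and import path in this PR may differ):

```python
import torch.distributed.checkpoint as dcp
# Import path is an assumption; it may differ across PyTorch versions.
from torch.distributed.checkpoint.quantized_hf_storage import (
    QuantizedHuggingFaceStorageReader,
)


def load_from_quantized_hf_checkpoint(titan_model, adapter, input_dir: str) -> None:
    """Load a quantized GPT-OSS HF checkpoint into a TorchTitan model.

    `adapter` is assumed to expose to_hf()/from_hf() methods that translate
    between TorchTitan and HF key names (hypothetical API for this sketch).
    """
    # Build an HF-keyed state dict whose entries match the HF checkpoint layout.
    hf_state_dict = adapter.to_hf(titan_model.state_dict())

    # Load the checkpoint; the storage reader handles the MXFP4 de-quantization,
    # so the loaded tensors arrive in a regular floating-point dtype.
    dcp.load(
        hf_state_dict,
        storage_reader=QuantizedHuggingFaceStorageReader(path=input_dir),
    )

    # Map the HF state dict back to TorchTitan naming and load it into the model.
    titan_model.load_state_dict(adapter.from_hf(hf_state_dict))
```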
Test 1. We use `dcp.load(hf_state_dict, storage_reader=QuantizedHuggingFaceStorageReader(path=input_dir))` to load from the GPT-OSS HF checkpoint, and map the `hf_state_dict` back to the TorchTitan state dict. We build one test input and compare two outputs: (1) using the `transformers` library to load the GPT-OSS HF checkpoint and run inference on the test input; (2) using the converted TorchTitan model to run inference on the same test input. We compare the outputs by computing the KL divergence between the two output probability distributions (a sketch of this comparison follows below). The result shows the two models are very similar.

Test 2. We load the model directly from the quantized GPT-OSS HF checkpoint and run a test training.
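A small sketch of the KL divergence comparison used in Test 1, assuming `hf_model` (loaded via `transformers`), the converted `titan_model`, and an `input_ids` tensor for the test input are already set up; these names are placeholders, not the actual test code:

```python
import torch
import torch.nn.functional as F


def compare_outputs(hf_model, titan_model, input_ids: torch.Tensor) -> float:
    """Return KL(P_hf || P_titan) over the output token distributions for one test input."""
    with torch.no_grad():
        hf_logits = hf_model(input_ids).logits   # reference transformers model
        titan_logits = titan_model(input_ids)    # converted TorchTitan model

    # F.kl_div(input, target) computes KL(target || input); both arguments are log-probs here.
    kl = F.kl_div(
        F.log_softmax(titan_logits, dim=-1),
        F.log_softmax(hf_logits, dim=-1),
        log_target=True,
        reduction="batchmean",
    )
    return kl.item()  # values near 0 indicate near-identical output distributions
```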
