@vkuzo vkuzo commented Nov 7, 2025

Summary:

Requires pytorch/ao#3303
Requires https://www.internalfb.com/phabricator/paste/view/P2028176312
Requires huggingface/transformers#41894

time CUDA_LAUNCH_BLOCKING=0 with-proxy python quantize_hf_model_with_torchao.py --model_name "meta-llama/Llama-4-Scout-17B-16E-Instruct" --save_model_to_disk True --device_map "auto" --ffn_only_llama_4_scout True

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

@vkuzo vkuzo force-pushed the 20251106_llama4_expert_quant branch from 4d43646 to 0ae5120 Compare November 7, 2025 12:06
@vkuzo vkuzo force-pushed the 20251106_llama4_expert_quant branch from 0ae5120 to f0be3c9 Compare November 7, 2025 14:47
@vkuzo vkuzo merged commit 1b1cd42 into main Nov 7, 2025