
Commit 3dcb958

Create README.md
1 parent c924f65 commit 3dcb958

1 file changed: +11 -0 lines changed

hf_torchao_vllm/README.md

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
# HF -> torchao -> vLLM convenience scripts

Example

```bash
# save a quantized model to data/nvfp4-Qwen1.5-MoE-A2.7B
python quantize_hf_model_with_torchao.py --model_name "Qwen/Qwen1.5-MoE-A2.7B" --experts_only_qwen_1_5_moe_a_2_7b True --save_model_to_disk True --quant_type nvfp4

# run the model from above in vLLM
python run_quantized_model_in_vllm.py --model_name "data/nvfp4-Qwen1.5-MoE-A2.7B" --compile False
```
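
The two steps can also be chained from Python. The following is a minimal sketch, not part of this commit: `build_commands` and `run_pipeline` are hypothetical helper names, and the output directory pattern `data/<quant_type>-<model basename>` is an assumption inferred from the single example above. Only the script names and flags come from this README.

```python
import shlex
import subprocess


def build_commands(model_name="Qwen/Qwen1.5-MoE-A2.7B", quant_type="nvfp4"):
    """Return the quantize and run command lines as argument lists."""
    # Assumption: the quantize script saves to data/<quant_type>-<basename>,
    # as suggested by the one example path in this README.
    save_dir = f"data/{quant_type}-{model_name.split('/')[-1]}"
    quantize_cmd = [
        "python", "quantize_hf_model_with_torchao.py",
        "--model_name", model_name,
        # Flag copied from the example; it is specific to Qwen1.5-MoE-A2.7B.
        "--experts_only_qwen_1_5_moe_a_2_7b", "True",
        "--save_model_to_disk", "True",
        "--quant_type", quant_type,
    ]
    run_cmd = [
        "python", "run_quantized_model_in_vllm.py",
        "--model_name", save_dir,
        "--compile", "False",
    ]
    return quantize_cmd, run_cmd


def run_pipeline(dry_run=True):
    """Print both commands; execute them in order unless dry_run is set."""
    for cmd in build_commands():
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the pipeline if a step exits non-zero.
            subprocess.run(cmd, check=True)
```

Calling `run_pipeline()` only prints the commands; `run_pipeline(dry_run=False)` would execute them from the `hf_torchao_vllm` directory, so the vLLM step never starts if quantization fails.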
