Commit b2bddca

quantized model specifications
Signed-off-by: Andrea Fasoli <andrea.fasoli@ibm.com>
1 parent d886e1c commit b2bddca

1 file changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+model: granite-3.0-8b-base
+source: https://huggingface.co/ibm-granite/granite-3.0-8b-base
+
+QUANTIZATION
+repo: https://github.com/foundation-model-stack/fms-model-optimizer
+mode: Direct Quantization
+weights: INT8 per-channel max
+activations: INT8 per-token max
+smoothquant: enabled, alpha = 0.5
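
For readers unfamiliar with the settings listed in the spec, the following is a minimal NumPy sketch of what they typically mean for a single linear layer. It is not the fms-model-optimizer API and is not taken from this commit. The function names (`smooth_scales`, `quantize_rows_int8`, `quantized_linear`) are hypothetical, and it assumes that "max" means symmetric scaling by the absolute maximum, that "per-channel" means per output channel of the weight matrix, and that SmoothQuant uses the usual scale s_j = max|X_j|^alpha / max|W_j|^(1 - alpha) with alpha = 0.5.

```python
import numpy as np


def smooth_scales(x, w, alpha=0.5, eps=1e-8):
    """Per-input-channel SmoothQuant scales (assumed standard formulation).

    x: activations, shape (tokens, in_features)
    w: weights, shape (out_features, in_features)
    """
    act_max = np.maximum(np.abs(x).max(axis=0), eps)  # max |X_j| per input channel
    w_max = np.maximum(np.abs(w).max(axis=0), eps)    # max |W_j| per input channel
    return act_max ** alpha / w_max ** (1.0 - alpha)


def quantize_rows_int8(t, eps=1e-8):
    """Symmetric INT8 quantization with one max-based scale per row."""
    scale = np.maximum(np.abs(t).max(axis=1, keepdims=True), eps) / 127.0
    q = np.clip(np.rint(t / scale), -127, 127).astype(np.int8)
    return q, scale


def quantized_linear(x, w, alpha=0.5):
    """Approximate x @ w.T using smoothed INT8 activations and weights."""
    s = smooth_scales(x, w, alpha)
    x_s = x / s  # move activation outliers into the weights
    w_s = w * s  # (x / s) @ (w * s).T equals x @ w.T exactly
    xq, x_scale = quantize_rows_int8(x_s)  # one scale per token (row of x)
    wq, w_scale = quantize_rows_int8(w_s)  # one scale per output channel (row of w)
    acc = xq.astype(np.int32) @ wq.astype(np.int32).T  # integer accumulation
    return acc.astype(np.float32) * x_scale * w_scale.T  # dequantize


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 16)).astype(np.float32)
    x[:, 3] *= 20.0  # simulate an outlier activation channel
    w = rng.normal(size=(8, 16)).astype(np.float32)
    ref = x @ w.T
    err = np.linalg.norm(quantized_linear(x, w) - ref) / np.linalg.norm(ref)
    print(f"relative error: {err:.4f}")
```

With alpha = 0.5 the quantization difficulty is split evenly between activations and weights. Values closer to 1 shift more of the outlier magnitude into the weights.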
