This repository was archived by the owner on Oct 25, 2024. It is now read-only.

Commit b7cbdd8

Migrate trainer INC 1.x API to 2.x (#1605)
* migrate INC 1.x quantization api to 2.x
* [pre-commit.ci] auto fixes from pre-commit.com hooks (for more information, see https://pre-commit.ci)
* add examples

Signed-off-by: changwangss <chang1.wang@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent ffb4c3d commit b7cbdd8

File tree: 160 files changed, +1254 -19439 lines. (Large commits hide some content by default; only the documentation files are shown below.)


docs/api_doc/optimization/optimizer.rst

Lines changed: 0 additions & 7 deletions
This file was deleted.

docs/devcatalog.md

Lines changed: 5 additions & 2 deletions
````diff
@@ -99,7 +99,8 @@ raw_datasets = raw_datasets.map(lambda e: tokenizer(e['sentence'], truncation=Tr
 Documentation for API usage can be found [here](https://github.com/intel/intel-extension-for-transformers/tree/main/docs)

 ```python
-from intel_extension_for_transformers.transformers import QuantizationConfig, metrics, objectives
+from intel_extension_for_transformers.transformers import metrics, objectives
+from neural_compressor.config import PostTrainingQuantConfig
 from intel_extension_for_transformers.transformers.trainer import NLPTrainer
 # load config, model and metric
 config = AutoConfig.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english",num_labels=2)
@@ -120,7 +121,9 @@ trainer = NLPTrainer(model=model,
     tokenizer=tokenizer
 )
 # model quantization using trainer
-q_config = QuantizationConfig(metrics=[metrics.Metric(name="eval_accuracy")])
+tune_metric = metrics.Metric(name="eval_accuracy")
+trainer.metrics = tune_metric
+q_config = PostTrainingQuantConfig()
 model = trainer.quantize(quant_config=q_config)

 # test sentiment analysis with quantization
````
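For reference, here is the migrated devcatalog.md snippet assembled into one runnable sketch. The SST-2 loading and padding choices are illustrative, reconstructed from the hunk context above rather than shown in the diff:

```python
# Sketch of the INC 2.x quantization flow after this migration.
from datasets import load_dataset
from transformers import AutoConfig, AutoModelForSequenceClassification, AutoTokenizer
from intel_extension_for_transformers.transformers import metrics
from intel_extension_for_transformers.transformers.trainer import NLPTrainer
from neural_compressor.config import PostTrainingQuantConfig

model_name = "distilbert-base-uncased-finetuned-sst-2-english"
config = AutoConfig.from_pretrained(model_name, num_labels=2)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, config=config)

# Tokenize SST-2, as in the surrounding devcatalog.md example.
raw_datasets = load_dataset("glue", "sst2")
raw_datasets = raw_datasets.map(
    lambda e: tokenizer(e["sentence"], truncation=True, padding="max_length"),
    batched=True,
)

trainer = NLPTrainer(
    model=model,
    train_dataset=raw_datasets["train"],
    eval_dataset=raw_datasets["validation"],
    tokenizer=tokenizer,
)

# In 2.x the tuning metric moves off the config object and onto the trainer.
trainer.metrics = metrics.Metric(name="eval_accuracy")
q_config = PostTrainingQuantConfig()  # pass approach="dynamic" for dynamic quantization
model = trainer.quantize(quant_config=q_config)
```

The key behavioral change: `QuantizationConfig(metrics=...)` is gone; the metric is assigned to `trainer.metrics`, and the `neural_compressor` config object carries only quantization settings.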

docs/distillation.md

Lines changed: 18 additions & 35 deletions
````diff
@@ -49,39 +49,20 @@ Where $D$ is a distance measurement as before, $F_t^{n_i}$ the output feature of
 ## Usage
 ### Pytorch Script:
 ```python
-from intel_extension_for_transformers.transformers import metric, objectives, DistillationConfig, Criterion
+from intel_extension_for_transformers.transformers import metrics
 from intel_extension_for_transformers.transformers.trainer import NLPTrainer
+from neural_compressor.config import DistillationConfig
 # Replace transformers.Trainer with NLPTrainer
 # trainer = transformers.Trainer(......)
 trainer = NLPTrainer(......)
 metric = metrics.Metric(name="eval_accuracy")
-d_conf = DistillationConfig(metrics=tune_metric)
-model = trainer.distill(
-    distillation_config=d_conf, teacher_model=teacher_model
-)
+trainer.metrics = metric
+d_conf = DistillationConfig(teacher_model=teacher_model, criterion=criterion)
+model = trainer.distill(distillation_config=d_conf)
 ```

 Please refer to [example](../examples/huggingface/pytorch/text-classification/distillation/run_glue.py) for the details.

-### Tensorflow Script:
-```python
-from intel_extension_for_transformers.transformers import (DistillationConfig, metrics)
-from intel_extension_for_transformers.transformers.distillation import Criterion
-
-optimizer = TFOptimization(...)
-metric_ = metrics.Metric(name="eval_accuracy")
-criterion = Criterion(name='KnowledgeLoss',
-                      layer_mappings=[['classifier', 'classifier']],
-                      loss_types=['CE', 'CE'],
-                      loss_weight_ratio=[0.5, 0.5],
-                      add_origin_loss=False)
-distillation_conf = DistillationConfig(metrics=metric_,
-                                       criterion=criterion)
-distilled_model = optimizer.distill(
-    distillation_config=distillation_conf,
-    teacher_model=teacher_model)
-```
-Please refer to [example](../examples/huggingface/tensorflow/text-classification/distillation/run_glue.py) for the details.
 ### Create an Instance of Metric
 The Metric defines which metric will be used to measure the performance of tuned models.
 - example:
@@ -94,19 +75,23 @@ The Metric defines which metric will be used to measure the performance of tuned
 ### Create an Instance of Criterion (Optional)
 The criterion used in the training phase.

-- arguments:
+- KnowledgeDistillationLossConfig arguments:
 |Argument   |Type       |Description                                      |Default value    |
 |:----------|:----------|:-----------------------------------------------|:----------------|
-|name |String|Name of criterion, like:"KnowledgeLoss", "IntermediateLayersLoss" |"KnowledgeLoss"|
 |temperature|Float |parameter for KnowledgeDistillationLoss |1.0 |
 |loss_types|List of string|Type of loss |['CE', 'CE'] |
 |loss_weight_ratio|List of float|weight ratio of loss |[0.5, 0.5] |
+
+- IntermediateLayersKnowledgeDistillationLossConfig arguments:
+|Argument   |Type       |Description                                      |Default value    |
+|:----------|:----------|:-----------------------------------------------|:----------------|
+|loss_types|List of string|Type of loss |['CE', 'CE'] |
+|loss_weight_ratio|List of float|weight ratio of loss |[0.5, 0.5] |
 |layer_mappings|List|parameter for IntermediateLayersLoss |[] |
 |add_origin_loss|bool|parameter for IntermediateLayersLoss |False |
-
 - example:
 ```python
-criterion = Criterion(name='KnowledgeLoss')
+criterion = KnowledgeDistillationLossConfig()
 ```

 ### Create an Instance of DistillationConfig
@@ -115,20 +100,18 @@ The DistillationConfig contains all the information related to the model distill
 - arguments:
 |Argument   |Type       |Description                                      |Default value    |
 |:----------|:----------|:-----------------------------------------------|:----------------|
-|framework |string |which framework you used |"pytorch" |
-|criterion|Criterion |criterion of training |"KnowledgeLoss"|
-|metrics |Metric |Used to evaluate accuracy of tuning model, no need for NoTrainerOptimizer|None |
+|teacher_model |torch.nn.Module |teacher model object |None |
+|criterion|Criterion |criterion of training |KnowledgeLoss object|
+

 - example:
 ```python
-d_conf = DistillationConfig(metrics=metric, criterion=criterion)
+d_conf = DistillationConfig(teacher_model=teacher_model, criterion=criterion)
 ```

 ### Distill with Trainer
 - Distill with Trainer
 NLPTrainer inherits from transformers.Trainer, so you can create a trainer as in the examples of Transformers. Then you can distill the model with the trainer.distill function.
 ```python
-model = trainer.distill(
-    distillation_config=d_conf, teacher_model=teacher_model
-)
+model = trainer.distill(distillation_config=d_conf)
 ```
````
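Assembled from the hunks above, a minimal migrated PyTorch distillation script looks roughly like this (a sketch: the student/teacher checkpoints and the elided trainer arguments are illustrative, not part of the commit):

```python
from transformers import AutoModelForSequenceClassification
from intel_extension_for_transformers.transformers import metrics
from intel_extension_for_transformers.transformers.trainer import NLPTrainer
from neural_compressor.config import DistillationConfig, KnowledgeDistillationLossConfig

# Any pair of compatible checkpoints works as student and teacher.
student = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
teacher_model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

# Datasets/tokenizer are passed exactly as with transformers.Trainer.
trainer = NLPTrainer(model=student, train_dataset=..., eval_dataset=..., tokenizer=...)

trainer.metrics = metrics.Metric(name="eval_accuracy")
# Defaults match the table above: temperature=1.0, loss_types=['CE', 'CE'].
criterion = KnowledgeDistillationLossConfig()
d_conf = DistillationConfig(teacher_model=teacher_model, criterion=criterion)
model = trainer.distill(distillation_config=d_conf)
```

Note that in 2.x the teacher model travels inside `DistillationConfig` rather than as a separate `trainer.distill` argument.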

docs/examples.md

Lines changed: 5 additions & 5 deletions
````diff
@@ -37,8 +37,8 @@ Intel Extension for Transformers is a powerful toolkit with multiple model optim
     <th>Model</th>
     <th>Task</th>
     <th>Dataset</th>
-    <th>PostTrainingDynamic</th>
-    <th>PostTrainingStatic</th>
+    <th>dynamic</th>
+    <th>static</th>
   </tr>
 </thead>
 <tbody align="center">
@@ -177,7 +177,7 @@ Intel Extension for Transformers is a powerful toolkit with multiple model optim
     <th>Model</th>
     <th>Task</th>
     <th>Dataset</th>
-    <th>QuantizationAwareTraining</th>
+    <th>qat</th>
     <th>No Trainer quantization</th>
   </tr>
 </thead>
@@ -206,7 +206,7 @@ Intel Extension for Transformers is a powerful toolkit with multiple model optim
     <th>Model</th>
     <th>Task</th>
     <th>Dataset</th>
-    <th>PostTrainingStatic</th>
+    <th>static</th>
   </tr>
 </thead>
 <tbody align="center">
@@ -232,7 +232,7 @@ Intel Extension for Transformers is a powerful toolkit with multiple model optim
     <th>Model</th>
     <th>Task</th>
     <th>Dataset</th>
-    <th>PostTrainingStatic</th>
+    <th>static</th>
   </tr>
 </thead>
 <tbody align="center">
````

docs/export.md

Lines changed: 3 additions & 3 deletions
````diff
@@ -22,9 +22,9 @@ We support exporting PyTorch models into ONNX models with our well-designed API
 | Input Model | Export FP32 | Export BF16 | Export INT8 |
 | --- | --- | --- | --- |
 | FP32 PyTorch Model | &#10004; | &#10004; | / |
-| INT8 PyTorch Model <br> (PostTrainingDynamic) | / | / | &#10004; |
-| INT8 PyTorch Model <br> (PostTrainingStatic) | / | / | &#10004; |
-| INT8 PyTorch Model <br> (QuantizationAwareTraining) | / | / | &#10004; |
+| INT8 PyTorch Model <br> (dynamic) | / | / | &#10004; |
+| INT8 PyTorch Model <br> (static) | / | / | &#10004; |
+| INT8 PyTorch Model <br> (qat) | / | / | &#10004; |


 ## Examples
````
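The parenthesized labels in the table are the INC 2.x approach names. They correspond directly to the config objects used to produce the INT8 model in the first place; a short mapping sketch (`QuantizationAwareTrainingConfig` comes from the same `neural_compressor.config` module as the configs shown in the other hunks):

```python
from neural_compressor.config import PostTrainingQuantConfig, QuantizationAwareTrainingConfig

dynamic_conf = PostTrainingQuantConfig(approach="dynamic")  # was PostTrainingDynamic
static_conf = PostTrainingQuantConfig(approach="static")    # was PostTrainingStatic
qat_conf = QuantizationAwareTrainingConfig()                # was QuantizationAwareTraining
```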

docs/get_started.md

Lines changed: 9 additions & 7 deletions
````diff
@@ -13,7 +13,7 @@

 ## Quantization
 ```python
-from intel_extension_for_transformers.transformers import QuantizationConfig, metrics, objectives
+from neural_compressor.config import PostTrainingQuantConfig
 from intel_extension_for_transformers.transformers.trainer import NLPTrainer

 config = AutoConfig.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english",num_labels=2)
@@ -27,7 +27,9 @@ trainer = NLPTrainer(model=model,
     eval_dataset=raw_datasets["validation"],
     tokenizer=tokenizer
 )
-q_config = QuantizationConfig(metrics=[metrics.Metric(name="eval_loss", greater_is_better=False)])
+q_config = PostTrainingQuantConfig(
+    approach="static",
+)
 model = trainer.quantize(quant_config=q_config)

 input = tokenizer("I like Intel Extension for Transformers", return_tensors="pt")
@@ -73,17 +75,17 @@ model = trainer.distill(distillation_config=d_conf, teacher_model=teacher_model)
 ## Quantized Length Adaptive Transformer
 Quantized Length Adaptive Transformer leverages sequence-length reduction and low-bit representation techniques to further enhance model inference performance, enabling adaptive sequence-length sizes to accommodate different computational budget requirements with an optimal accuracy-efficiency tradeoff.
 ```python
-from intel_extension_for_transformers.transformers import QuantizationConfig, DynamicLengthConfig, metric, objectives
+from intel_extension_for_transformers.transformers import DynamicLengthConfig, metrics, objectives
+from neural_compressor.config import PostTrainingQuantConfig
 from intel_extension_for_transformers.transformers.trainer import NLPTrainer

 # Replace transformers.Trainer with NLPTrainer
 # trainer = transformers.Trainer(...)
 trainer = NLPTrainer(...)
 metric = metrics.Metric(name="eval_f1", is_relative=True, criterion=0.01)
-q_config = QuantizationConfig(
-    approach="PostTrainingStatic",
-    metrics=[metric],
-    objectives=[objectives.performance]
+trainer.metrics = metric
+q_config = PostTrainingQuantConfig(
+    approach="static"
 )
 # Apply the length config
 dynamic_length_config = DynamicLengthConfig(length_config=length_config)
````
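Putting the pieces together, the migrated Length Adaptive Transformer flow looks roughly like this. It is a sketch: `length_config` comes from a prior length-drop search, and the `trainer.set_dynamic_config` call is assumed from the surrounding examples rather than shown in the hunk above:

```python
from intel_extension_for_transformers.transformers import DynamicLengthConfig, metrics
from intel_extension_for_transformers.transformers.trainer import NLPTrainer
from neural_compressor.config import PostTrainingQuantConfig

trainer = NLPTrainer(...)  # same arguments as transformers.Trainer

# Tolerate at most a 1% relative drop in F1 while tuning.
trainer.metrics = metrics.Metric(name="eval_f1", is_relative=True, criterion=0.01)

length_config = ...  # per-layer sequence lengths produced by the search step
dynamic_length_config = DynamicLengthConfig(length_config=length_config)
trainer.set_dynamic_config(dynamic_config=dynamic_length_config)  # assumed helper, see note above

q_config = PostTrainingQuantConfig(approach="static")
model = trainer.quantize(quant_config=q_config)
```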

docs/pruning.md

Lines changed: 20 additions & 43 deletions
````diff
@@ -7,32 +7,23 @@ Pruning
 ## Introduction
 Pruning is the process of removing redundant parameters of a network. The idea bears similarity to the ["optimal brain damage"](http://yann.lecun.com/exdb/publis/pdf/lecun-90b.pdf) hypothesis by Yann LeCun. There are two types of pruning: unstructured and structured. Unstructured pruning finds and removes the less salient connections in the model, which can be anywhere in the weight matrix. Structured pruning deletes entire blocks, filters, or channels.

-## Pruning types
-
-There are three pruning types in Intel® Extension for Transformers:
-
-- Magnitude (Unstructured)
-  - The algorithm prunes the weight by the lowest absolute value at each layer with a given sparsity target.
-
-- Group Lasso (Structured)
-  - The algorithm uses Group lasso regularization to prune entire rows, columns, or blocks of parameters that result in a smaller dense network.
-
-- Pattern Lock (Unstructured & Structured)
-  - The algorithm locks the sparsity pattern in fine tune phase by freezing those zero values of the weight tensor during the weight update of training.
-
 ## Usage
 ### Script:
 ```python

-from intel_extension_for_transformers.transformers import metrics, objectives, PrunerConfig, PruningConfig,
+from intel_extension_for_transformers.transformers import metrics
+from neural_compressor.config import WeightPruningConfig
 from intel_extension_for_transformers.transformers.trainer import NLPTrainer
 # Replace transformers.Trainer with NLPTrainer
 # trainer = transformers.Trainer(......)
 trainer = NLPTrainer(......)
 metric = metrics.Metric(name="eval_accuracy")
-pruner_config = PrunerConfig(prune_type='BasicMagnitude', target_sparsity_ratio=0.9)
-p_conf = PruningConfig(pruner_config=[pruner_config], metrics=metric)
-model = trainer.prune(pruning_config=p_conf)
+trainer.metrics = metric
+pruning_conf = WeightPruningConfig([{"start_step": 0, "end_step": 2}],
+                                   target_sparsity=0.9,
+                                   pruning_scope="local",
+                                   pruning_type="magnitude")
+model = trainer.prune(pruning_config=pruning_conf)
 ```
 Please refer to [example](../examples/huggingface/pytorch/text-classification/pruning) for the details.

@@ -45,41 +36,27 @@ The Metric defines which metric will be used to measure the performance of tuned

 Please refer to [metrics document](metrics.md) for the details.

-### Create list of an instance of PrunerConfig(Optional)
-PrunerConfig defines which pruning algorithm to use and how to apply it during the training process. Intel® Extension for Transformers supports pruning types "BasicMagnitude", "PatternLock", and "GroupLasso". You can create different pruners for different layers.
+### Create an instance of WeightPruningConfig
+[WeightPruningConfig](https://github.com/intel/neural-compressor/blob/master/neural_compressor/config.py) defines which pruning algorithm to use and how to apply it during the training process. Intel® Extension for Transformers supports pruning types "magnitude", "pattern_lock", and "GroupLasso". You can create different pruners for different layers.

 - arguments:
 |Argument   |Type       |Description                                      |Default value    |
 |:----------|:----------|:-----------------------------------------------|:----------------|
-|epoch_range|list of integer|Which epochs to pruning |[0, 4] |
-|initial_sparsity_ratio|float |Initial sparsity goal |0.0 |
-|target_sparsity_ratio|float |Target sparsity goal |0.97 |
+|pruning_configs |list of dicts|Per-pruner settings, e.g. which steps to prune |[{}] |
+|target_sparsity |float |Target sparsity goal |0.90 |
 |update_frequency|integer|Frequency of updating sparsity |1 |
-|prune_type|string|Pruning algorithm |'BasicMagnitude' |
-|method|string|Pruning method |'per_tensor' |
-|names|list of string|List of weight name to be pruned. If no weight is specified, all weights of the model will be pruned|[]|
-|parameters|dict of string|The hyper-parameters for pruning, refer to [the link](https://github.com/intel/neural-compressor/blob/master/docs/source/pruning.md)|None|
+|pruning_type |string|Pruning algorithm |'snip_momentum' |
+

-- example:
-```python
-pruner_config = PrunerConfig(prune_type='BasicMagnitude', target_sparsity_ratio=0.9)
-```
-
-### Create an instance of PruningConfig
-The PruningConfig contains all the information related to the model pruning behavior. If you have created Metric and PrunerConfig instance, then you can create an instance of PruningConfig. Metric and pruner are optional.
-
-- arguments:
-|Argument   |Type       |Description                                      |Default value    |
-|:----------|:----------|:-----------------------------------------------|:----------------|
-|framework |string |Which framework you used |"pytorch" |
-|initial_sparsity_ratio|float |Initial sparsity goal, if pruner_config argument is defined, it didn't need |0.0|
-|target_sparsity_ratio|float |Target sparsity goal, if pruner argument is defined, it didn't need |0.97|
-|metrics |Metric |Used to evaluate accuracy of tuning model, no need for NoTrainerOptimizer|None |
-|pruner_config |PrunerConfig |Defined pruning behavior, if it is None, then NLP will create a default a pruner with 'BasicMagnitude' pruning type |None |
+The WeightPruningConfig contains all the information related to the model pruning behavior. If you have created a Metric instance, you can then create a WeightPruningConfig. The metric is optional.

 - example:
 ```python
-pruning_conf = PruningConfig(pruner_config=[pruner_config], metrics=tune_metric)
+from neural_compressor.config import WeightPruningConfig
+pruning_conf = WeightPruningConfig([{"start_step": 0, "end_step": 2}],
+                                   target_sparsity=0.9,
+                                   pruning_scope="local",
+                                   pruning_type="magnitude")
 ```

 ### Prune with Trainer
````
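For reference, the migrated pruning flow assembled from the snippets above (a sketch; the trainer construction is elided as in the original):

```python
from intel_extension_for_transformers.transformers import metrics
from intel_extension_for_transformers.transformers.trainer import NLPTrainer
from neural_compressor.config import WeightPruningConfig

trainer = NLPTrainer(...)  # same arguments as transformers.Trainer

trainer.metrics = metrics.Metric(name="eval_accuracy")

# One pruner, active from step 0 to step 2: magnitude pruning to 90% sparsity,
# with the sparsity target enforced per layer ("local" scope).
pruning_conf = WeightPruningConfig(
    [{"start_step": 0, "end_step": 2}],
    target_sparsity=0.9,
    pruning_scope="local",
    pruning_type="magnitude",
)
model = trainer.prune(pruning_config=pruning_conf)
```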
