
Commit 8ae588c

Updates with Poplar SDK 3.0 release
1 parent ffcfb13

728 files changed: +35814 additions, -7585 deletions


.gitignore

Lines changed: 3 additions & 1 deletion
@@ -14,6 +14,8 @@
 **/data
 **/logs

+**/cifar-10-batches-bin
+
 *.pyc
 __pycache__
 .cache
@@ -37,6 +39,7 @@ vars.capnp

 # Remove VIM temp files
 *.swp
+**/.*.sw[a-p]

 # C++ examples build into the "build" directory
 **/build/
@@ -53,4 +56,3 @@ nohup.*

 utils/triton_server/backends
 !vision/cnns/pytorch/tests/tritonserver/models/*/*/*.json
-
.pre-commit-config.yaml

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+# NOTE: The versions can be updated by calling
+#   pre-commit autoupdate
+repos:
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v3.3.0
+    hooks:
+      - id: no-commit-to-branch
+        args: [--branch, master, --branch, main]
+  - repo: https://github.com/pre-commit/mirrors-autopep8
+    rev: v1.6.0  # v1.7.0 is not compatible with python3.6
+    hooks:
+      - id: autopep8
+        args: [--in-place, --list-fixes, --ignore, 'E251,E303,E402,E501,E701,E226,E24,W50,W690']
+  - repo: local
+    hooks:
+      - id: copyright-header-check
+        name: Copyright header check
+        description: Ensures that files have the proper copyright line at the top
+        entry: python3 -m examples_utils test_copyright --amend --exclude_json utils/examples_tests/copyright_header_test_exclude.json
+        pass_filenames: false
+        language: python
+        additional_dependencies:
+          - 'git+https://github.com/graphcore/examples-utils.git@1aded5f35073d93fedcb516ad3782082daba3f87'
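
For context, this configuration is consumed by the pre-commit tool. A minimal usage sketch (not part of this commit; assumes pre-commit is installed in the development environment):

```
pip install pre-commit        # install the tool
pre-commit install            # register the git hooks defined in .pre-commit-config.yaml
pre-commit run --all-files    # run every hook against the whole working tree
pre-commit autoupdate         # bump the pinned hook revisions, as noted in the file header
```

With the hooks installed, the no-commit-to-branch, autopep8 and copyright-header checks run automatically on each `git commit`.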

CODEOWNERS

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+* @graphcore/applications

README.md

Lines changed: 29 additions & 3 deletions
@@ -40,14 +40,15 @@ If you require POD128/256 setup and configuration for our applications, please c
 | Model | Domain | Type |Links |
 | ------- | ------- |------- | ------- |
 | ResNet | Image Classification | Training & Inference | [TensorFlow 1](vision/cnns/tensorflow1/) , [TensorFlow 2](vision/cnns/tensorflow2/), [PyTorch](vision/cnns/pytorch/), [PyTorch Lightning](https://github.com/graphcore/pytorch-lightning-examples/tree/release/applications)|
-| ResNeXt | Image Classification | Training & Inference | [TensorFlow 1](vision/cnns/tensorflow1/) , [PopART (Inference)](vision/resnext_inference/popart)
+| ResNeXt | Image Classification | Training & Inference | [TensorFlow 1](vision/cnns/tensorflow1/) , [PopART (Inference)](vision/resnext_inference/popart), [PyTorch (Inference)](vision/cnns/pytorch/inference)
 | EfficientNet | Image Classification | Training & Inference | [TensorFlow 1](vision/cnns/tensorflow1/) , [PyTorch](vision/cnns/pytorch/), [PyTorch Lightning](https://github.com/graphcore/pytorch-lightning-examples/tree/release/applications)|
 | MobileNet | Image Classification | Inference | [TensorFlow 1](vision/cnns/tensorflow1/inference) |
 | MobileNetv2 | Image Classification | Inference | [TensorFlow 1](vision/cnns/tensorflow1/inference) |
 | MobileNetv3 | Image Classification | Training & Inference | [PyTorch](vision/cnns/pytorch/) |
 | ViT(Vision Transformer) | Image Classification | Training| [PyTorch](vision/vit/pytorch/), [Hugging Face Optimum](https://huggingface.co/Graphcore/vit-base-ipu) |
 | DINO | Image Classification | Training| [PyTorch](vision/dino/pytorch) |
 | Swin | Image Classification | Training | [PyTorch](vision/swin/pytorch) |
+| MAE (Masked AutoEncoder) | Image Classification | Training | [PyTorch](vision/mae/pytorch) |
 | Yolov3 | Object Detection | Training & Inference | [TensorFlow 1](vision/yolo_v3/tensorflow1) |
 | Yolov4-P5 | Object Detection | Inference | [PyTorch](vision/yolo_v4/pytorch) |
 | Faster RCNN | Object Detection | Training & Inference | [PopART](vision/faster_rcnn/popart) |
@@ -66,6 +67,8 @@ If you require POD128/256 setup and configuration for our applications, please c
 | Group BERT | NLP | Training |[TensorFlow 1](nlp/bert/tensorflow1/README.md#GroupBERT_model) |
 | Packed BERT | NLP | Training |[PyTorch](nlp/bert/pytorch), [PopART](nlp/bert/popart) |
 | GPT2 | NLP | Training |[PyTorch](nlp/gpt2/pytorch) , [Hugging Face Optimum](https://huggingface.co/Graphcore/gpt2-medium-ipu) |
+| GPTJ | NLP | Training |[PopXL](nlp/gpt_j/popxl)|
+| GPT3-2.7B | NLP | Training |[PopXL](nlp/gpt3_2.7B/popxl) |
 | RoBERTa | NLP | Training | [Hugging Face Optimum](https://huggingface.co/Graphcore/roberta-large-ipu)|
 | DeBERTa | NLP | Training | [Hugging Face Optimum](https://huggingface.co/Graphcore/deberta-base-ipu)|
 | HuBERT | NLP | Training | [Hugging Face Optimum](https://huggingface.co/Graphcore/hubert-base-ipu)|
@@ -96,7 +99,8 @@ If you require POD128/256 setup and configuration for our applications, please c
 | miniDALL-E | multimodal | Training | [PyTorch](multimodal/mini_dalle/pytorch) |
 | CLIP | multimodal | Training |[PyTorch](multimodal/CLIP/pytorch)|
 | LXMERT | multimodal | Training | [Hugging Face Optimum](https://huggingface.co/Graphcore/lxmert-base-ipu)|
-
+| Frozen in time | multimodal | Training & Inference |[PyTorch](multimodal/frozen_in_time/pytorch)|
+| ruDalle (Preview) | multimodal | Inference |[PopXL](preview/multimodal/rudalle)|

 <br>

@@ -184,6 +188,16 @@ The following applications have been archived. More information can be provided

 <br>

+## Benchmarking tools
+To easily run the examples with tested and optimised configurations and to reproduce the performance shown on our [performance results page](https://www.graphcore.ai/performance-results), you can use the examples-utils benchmarking module, which comes with every example when you install its requirements. To use this simple, shared interface for almost any of the examples provided here, locate and look through the example's `benchmarks.yml` file and run:
+
+```python
+python3 -m examples_utils benchmark --spec <path to benchmarks.yml file> --benchmark <name of benchmark>
+```
+
+For more information on using the examples-utils benchmarking module, please refer to [the README](https://github.com/graphcore/examples-utils/blob/master/examples_utils/benchmarks/README.md).
+
+<br>

 ## PopVision™ Tools
 Visualise your code's inner workings with a user-friendly, graphical interface to optimise your machine learning models.
@@ -193,7 +207,8 @@ Visualise your code's inner workings with a user-friendly, graphical interface t
 <br>

 ## Support
-Please note we are not currently accepting pull requests or issues on this repository. If you are actively using this repository and want to report any issues, please raise a ticket through the [Graphcore support portal](https://support.graphcore.ai/).
+If you encounter a problem or want to suggest an improvement to our examples please raise a Github issue or contact us at
+[support@graphcore.ai](mailto:support@graphcore.ai?subject=General%20Feedback).

 <br>

@@ -211,6 +226,17 @@ Unless otherwise specified by a LICENSE file in a subdirectory, the LICENSE refe
 <br>

 ## Changelog
+
+<details>
+<summary>Sep 2022</summary>
+<br>
+
+* Added those models below to reference models
+* Vision : MAE (PyTorch), G16 EfficientNet (PyTorch)
+* NLP : GPTJ (PopXL), GPT3-2.7B (PopXL)
+* Multimodal : Frozen in time (PyTorchs), ruDalle(Preview) (PopXL)
+</details>
+
 <details>
 <summary>Aug 2022</summary>
 <br>
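
As an illustration of the command added above, a filled-in invocation might look like the following. Both the spec path and the benchmark name are hypothetical placeholders, not taken from this commit; check the `benchmarks.yml` of the example you actually want to run:

```
# Hypothetical example only: spec path and benchmark name are placeholders.
python3 -m examples_utils benchmark \
    --spec vision/cnns/pytorch/benchmarks.yml \
    --benchmark <name of a benchmark listed in that file>
```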

ai_for_simulation/cosmoflow/tensorflow1/README.md

Lines changed: 2 additions & 2 deletions
@@ -33,13 +33,13 @@ This README describes how to run a conv3D based model called CosmoFlow on IPU ha
 - run without tensorflow estimator, with 2 IPUs:
   The workload is heavily IO bound, so merely increasing IPUs without increasing CPU numa-aware threads to pre-process
   the dataset will show marginal scalability. We use poprun to increase threads involved in processing
-  `poprun --num-replicas 2 --num-instances 2 --ipus-per-replica 1 --numa-aware 1 python train.py configs/graphcore.yaml`
+  `poprun --num-replicas 2 --num-instances 2 --ipus-per-replica 1 python train.py configs/graphcore.yaml`

 - run with tensorflow estimator, with 1 IPU:
   `python train.py configs/graphcore.yaml --use-estimator`

 - run with tensorflow estimator, with 2 IPUs:
-  `poprun --num-replicas 2 --num-instances 2 --ipus-per-replica 1 --numa-aware 1 python train.py configs/graphcore.yaml --use-estimator`
+  `poprun --num-replicas 2 --num-instances 2 --ipus-per-replica 1 python train.py configs/graphcore.yaml --use-estimator`

 ## Licensing


ai_for_simulation/deep_drive_md/tensorflow2/cvae/CVAE.py

Lines changed: 8 additions & 6 deletions
@@ -52,11 +52,13 @@ def CVAE(input_shape, steps_per_exec, latent_dim=3):
     return autoencoder


-def create_datasets(x_train, y_train, x_val, y_val, batch_size):
-    train_ds = tf.data.Dataset.from_tensor_slices(x_train).batch(
-        batch_size, drop_remainder=True).repeat().prefetch(16)
-    val_ds = tf.data.Dataset.from_tensor_slices((x_val, y_val)).batch(
-        batch_size, drop_remainder=True).repeat().prefetch(16)
+def create_datasets(x_train, x_val, batch_size):
+    train_ds = tf.data.Dataset.from_tensor_slices(x_train)
+    train_ds = train_ds.map(lambda x : (x, 0.))  # 0. is a dummy value that will be ignored
+    train_ds = train_ds.batch(batch_size, drop_remainder=True).repeat().prefetch(16)
+    val_ds = tf.data.Dataset.from_tensor_slices(x_val)
+    val_ds = val_ds.map(lambda x : (x, 0.))  # 0. is a dummy value that will be ignored
+    val_ds = val_ds.batch(batch_size, drop_remainder=True).repeat().prefetch(16)

     return train_ds, val_ds

@@ -73,7 +75,7 @@ def run_cvae(hyper_dim=3, epochs=10, batch_size=200, cm_data_input=None, validat
     steps_epoch = len(cm_data_train) // batch_size
     steps_val = len(cm_data_val) // batch_size if validation else None

-    train_ds, val_ds = create_datasets(cm_data_train, cm_data_train, cm_data_val, cm_data_val, batch_size=batch_size)
+    train_ds, val_ds = create_datasets(cm_data_train, cm_data_val, batch_size=batch_size)
     cm_data_train = train_ds
     cm_data_val = val_ds if validation else None

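The change to `create_datasets` pairs every sample with a constant dummy label so that the dataset matches the `(inputs, targets)` structure Keras' `fit()` expects, while the real loss is computed inside the model. A minimal, self-contained sketch of that pattern (toy data shape chosen for illustration, not the repository's code):

```python
# Minimal sketch of the dummy-label mapping used in create_datasets() above.
import numpy as np
import tensorflow as tf

x_train = np.random.rand(32, 8).astype(np.float32)

train_ds = tf.data.Dataset.from_tensor_slices(x_train)
train_ds = train_ds.map(lambda x: (x, 0.))  # pair each sample with a throw-away target
train_ds = train_ds.batch(4, drop_remainder=True).repeat().prefetch(16)

# Each element is now an (inputs, targets) tuple, which is what Keras' fit() expects,
# even though the model computes its real loss internally and ignores the 0. target.
print(train_ds.element_spec)
```
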
ai_for_simulation/deep_drive_md/tensorflow2/cvae/vae_conv_new.py

Lines changed: 35 additions & 60 deletions
@@ -81,6 +81,27 @@ def call(self, inputs):
         return z_mean + K.exp(0.5 * z_log_var) * epsilon


+class ReconstructionLossLayer(tf.keras.layers.Layer):
+
+    def call(self, inputs):
+        data, reconstruction = inputs
+        reconstruction_loss = tf.reduce_mean(
+            tf.reduce_sum(
+                binary_crossentropy(data, reconstruction), axis=(1, 2)
+            )
+        )
+        return reconstruction_loss
+
+
+class KLLossLayer(tf.keras.layers.Layer):
+
+    def call(self, inputs):
+        z_mean, z_log_var = inputs
+        kl_loss = -0.5 * (1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var))
+        kl_loss = tf.reduce_mean(tf.reduce_sum(kl_loss, axis=1))
+        return kl_loss
+
+
 def encoder_decoder(latent_dim, channels, image_size, feature_maps, filter_shapes, activation,
                     strides, conv_layers, dense_layers, dense_neurons, dense_dropouts, eps_mean,
                     eps_std):
@@ -240,71 +261,23 @@ def __init__(self, image_size, channels, conv_layers, feature_maps, filter_shape
             latent_dim, channels, image_size, feature_maps, filter_shapes, activation,
             strides, conv_layers, dense_layers, dense_neurons, dense_dropouts, eps_mean,
             eps_std)
-        self.total_loss_tracker = metrics.Mean(name="loss")
-        self.reconstruction_loss_tracker = metrics.Mean(
-            name="reconstruction_loss"
-        )
-        self.kl_loss_tracker = metrics.Mean(name="kl_loss")
+        # Overriding train_step() is not supported at the moment, but the VAEs loss calculation requires customization.
+        # Therefore, we define losses as layers so that they could be caclulated in call().
+        self.reconstruction_loss = ReconstructionLossLayer(name='reconstruction')
+        self.kl_loss = KLLossLayer(name='kl')

         self.optimizer = RMSprop(lr=0.001, rho=0.9, epsilon=1e-08, decay=0.0)
         self.encoder.set_infeed_queue_options(prefetch_depth=16)
         self.decoder.set_infeed_queue_options(prefetch_depth=16)
-        self.compile(optimizer=self.optimizer, steps_per_execution=steps_per_exec)
-        self.inputs = self.encoder.inputs
         self.build(tf.TensorShape((1, image_size[0], image_size[1], channels)))
+        self.compile(optimizer=self.optimizer, loss=self.dummy_loss, steps_per_execution=steps_per_exec)
+        self.inputs = self.encoder.inputs
         self.summary()

-    @property
-    def metrics(self):
-        return [
-            self.total_loss_tracker,
-            self.reconstruction_loss_tracker,
-            self.kl_loss_tracker,
-        ]
-
-    def train_step(self, data):
-        with tf.GradientTape() as tape:
-            z_mean, z_log_var, z = self.encoder(data)
-            reconstruction = self.decoder(z)
-            reconstruction_loss = tf.reduce_mean(
-                tf.reduce_sum(
-                    binary_crossentropy(data, reconstruction), axis=(1, 2)
-                )
-            )
-            kl_loss = -0.5 * (1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var))
-            kl_loss = tf.reduce_mean(tf.reduce_sum(kl_loss, axis=1))
-            total_loss = reconstruction_loss + kl_loss
-        grads = tape.gradient(total_loss, self.trainable_weights)
-        self.optimizer.apply_gradients(zip(grads, self.trainable_weights))
-        self.total_loss_tracker.update_state(total_loss)
-        self.reconstruction_loss_tracker.update_state(reconstruction_loss)
-        self.kl_loss_tracker.update_state(kl_loss)
-        return {
-            "loss": self.total_loss_tracker.result(),
-            "reconstruction_loss": self.reconstruction_loss_tracker.result(),
-            "kl_loss": self.kl_loss_tracker.result(),
-        }
-
-    def test_step(self, data):
-        if isinstance(data, tuple):
-            data = data[0]
-
-        z_mean, z_log_var, z = self.encoder(data)
-        reconstruction = self.decoder(z)
-        reconstruction_loss = tf.reduce_mean(
-            tf.reduce_sum(
-                binary_crossentropy(data, reconstruction), axis=(1, 2)
-            )
-        )
-        kl_loss = 1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var)
-        kl_loss = tf.reduce_mean(kl_loss)
-        kl_loss *= -0.5
-        total_loss = reconstruction_loss + kl_loss
-        return {
-            "loss": total_loss,
-            "reconstruction_loss": reconstruction_loss,
-            "kl_loss": kl_loss,
-        }
+    def dummy_loss(self, y_true, y_pred):
+        # y_pred is already the loss since loss is calculated in call(), so y_true (which we defined as 0.) will be ignored
+        loss = y_pred
+        return loss

     def save(self, filepath):
         '''
@@ -374,9 +347,11 @@ def generate(self, embedding):
         return self.decoder(embedding)

     def call(self, inputs):
-        _, _, z = self.encoder(inputs)
+        z_mean, z_log_var, z = self.encoder(inputs)
         reconstruction = self.decoder(z)
-        return reconstruction
+        reconstruction_loss = self.reconstruction_loss([inputs, reconstruction])
+        kl_loss = self.kl_loss([z_mean, z_log_var])
+        return reconstruction_loss, kl_loss

     def train(self, train_data, validation_data, batch_size, epochs, steps_per_epoch=1, validation_steps=1):
         self.fit(
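
The diff above replaces the custom `train_step()`/`test_step()` with loss layers evaluated in `call()` plus a pass-through loss given to `compile()`. A small, self-contained sketch of that general pattern under simplified assumptions (the toy model, shapes, and data below are illustrative only, not the repository's VAE):

```python
# Sketch of "loss computed in call(), identity loss in compile()".
import numpy as np
import tensorflow as tf


class ToyLossModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.dense = tf.keras.layers.Dense(4)

    def call(self, inputs):
        # Compute the training objective inside call() instead of train_step();
        # whatever call() returns becomes y_pred for the compiled loss below.
        reconstruction = self.dense(inputs)
        return tf.reduce_mean(tf.square(inputs[:, :4] - reconstruction), axis=-1)


def dummy_loss(y_true, y_pred):
    # y_pred is already the per-sample objective from call(); the dummy y_true is ignored.
    return y_pred


x = np.random.rand(32, 8).astype(np.float32)
ds = tf.data.Dataset.from_tensor_slices(x).map(lambda v: (v, 0.)).batch(4, drop_remainder=True)

model = ToyLossModel()
model.compile(optimizer="adam", loss=dummy_loss)
model.fit(ds, epochs=1)
```

This keeps the standard Keras training loop intact, which is the point of the change when overriding `train_step()` is not available.
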
Lines changed: 2 additions & 1 deletion
@@ -1,2 +1,3 @@
 zip
-unzip
+unzip
+libtiff5

gnn/cluster_gcn/tensorflow2/README.md

Lines changed: 16 additions & 31 deletions
@@ -131,6 +131,22 @@ For example, the following configuration will load the data from or download to
 }
 ```

+## Running and benchmarking
+
+To run a tested and optimised configuration and to reproduce the performance shown on our [performance results page](https://www.graphcore.ai/performance-results), please follow the setup instructions in this README to setup the environment, and then use the `examples_utils` module (installed automatically as part of the environment setup) to run one or more benchmarks. For example:
+
+```python
+python3 -m examples_utils benchmark --spec <path to benchmarks.yml file>
+```
+
+Or to run a specific benchmark in the `benchmarks.yml` file provided:
+
+```python
+python3 -m examples_utils benchmark --spec <path to benchmarks.yml file> --benchmark <name of benchmark>
+```
+
+For more information on using the examples-utils benchmarking module, please refer to [the README](https://github.com/graphcore/examples-utils/blob/master/examples_utils/benchmarks/README.md).
+
 ## Run training and validation <a name='training_validation' ></a>

 ```shell
@@ -171,34 +187,3 @@ Note that the `NUM_INSTANCES` should be divisible by `NUM_REPLICAS`
 and it is recommended to use `EPOCHS_PER_EXECUTION` equal to the `NUM_INSTANCES`
 for best balance between accuracy and performance.

-## Benchmarking
-
-To reproduce the benchmarks, please follow the setup instructions in this README to setup the environment, and then from this dir, use the `examples_utils` module to run one or more benchmarks. For example:
-```
-python3 -m examples_utils benchmark --spec benchmarks.yml
-```
-
-or to run a specific benchmark in the `benchmarks.yml` file provided:
-```
-python3 -m examples_utils benchmark --spec benchmarks.yml --benchmark <benchmark_name>
-```
-
-For more information on how to use the examples_utils benchmark functionality, please see the <a>benchmarking readme<a href=<https://github.com/graphcore/examples-utils/tree/master/examples_utils/benchmarks>
-
-## Profiling
-
-Profiling can be done easily via the `examples_utils` module, simply by adding the `--profile` argument when using the `benchmark` submodule (see the <strong>Benchmarking</strong> section above for further details on use). For example:
-```
-python3 -m examples_utils benchmark --spec benchmarks.yml --profile
-```
-Will create folders containing popvision profiles in this applications root directory (where the benchmark has to be run from), each folder ending with "_profile".
-
-The `--profile` argument works by allowing the `examples_utils` module to update the `POPLAR_ENGINE_OPTIONS` environment variable in the environment the benchmark is being run in, by setting:
-```
-POPLAR_ENGINE_OPTIONS = {
-    "autoReport.all": "true",
-    "autoReport.directory": <current_working_directory>,
-    "autoReport.outputSerializedGraph": "false",
-}
-```
-Which can also be done manually by exporting this variable in the benchmarking environment, if custom options are needed for this variable.
