Commit 678b470

Merge: [Syngen] 22.12 Release Demo notebooks description extension
2 parents ed28348 + 5c45d61 commit 678b470

10 files changed: +973 -321 lines changed

Tools/DGLPyTorch/SyntheticGraphGeneration/demos/advanced_examples/e2e_cora_demo.ipynb

Lines changed: 220 additions & 45 deletions
Large diffs are not rendered by default.

Tools/DGLPyTorch/SyntheticGraphGeneration/demos/advanced_examples/e2e_ieee_demo.ipynb

Lines changed: 245 additions & 46 deletions
Large diffs are not rendered by default.

Tools/DGLPyTorch/SyntheticGraphGeneration/demos/advanced_examples/edge_classification_pretraining.ipynb

Lines changed: 96 additions & 30 deletions
@@ -2,7 +2,7 @@
 "cells": [
 {
 "cell_type": "raw",
-"id": "49d2aaf9",
+"id": "ea7ceaf7",
 "metadata": {},
 "source": [
 "# Copyright 2023 NVIDIA Corporation. All Rights Reserved.\n",
@@ -23,24 +23,27 @@
 },
 {
 "cell_type": "markdown",
-"id": "fc0ea279",
+"id": "4e1a2027",
 "metadata": {},
 "source": [
 "# Edge Classification Pretraining demo (IEEE)"
 ]
 },
 {
 "cell_type": "markdown",
-"id": "5a5196e2",
+"id": "05d3d798",
 "metadata": {},
 "source": [
 "## Overview\n",
+"\n",
+"It is often helpful to pre-train a network, or initialize it with learned weights, and then fine-tune it on the downstream task of interest.\n",
+"\n",
 "This notebook demonstrates the steps for pretraining a GNN on synthetic data and finetuning on real data. "
 ]
 },
 {
 "cell_type": "markdown",
-"id": "01a6300c",
+"id": "26f39e76",
 "metadata": {},
 "source": [
 "### Imports"
@@ -49,7 +52,7 @@
 {
 "cell_type": "code",
 "execution_count": 1,
-"id": "69a28c16",
+"id": "f315cfcd",
 "metadata": {},
 "outputs": [
 {
@@ -94,16 +97,28 @@
 },
 {
 "cell_type": "markdown",
-"id": "01eddd70",
+"id": "20e3e3a6",
 "metadata": {},
 "source": [
 "### Generate synthetic data"
 ]
 },
+{
+"cell_type": "markdown",
+"id": "5c3db76c",
+"metadata": {},
+"source": [
+"In the following cells, a synthesizer is instantiated and fitted on the IEEE dataset.\n",
+"\n",
+"Once fitted, the synthesizer is used to generate synthetic data with similar characteristics.\n",
+"\n",
+"For a more detailed explanation, check out `e2e_ieee_demo.ipynb`."
+]
+},
 {
 "cell_type": "code",
 "execution_count": 2,
-"id": "65da8b0a",
+"id": "8f86bf18",
 "metadata": {},
 "outputs": [
 {
@@ -131,7 +146,7 @@
 {
 "cell_type": "code",
 "execution_count": 3,
-"id": "b0b64872",
+"id": "60bb8cfb",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -145,7 +160,7 @@
 {
 "cell_type": "code",
 "execution_count": 4,
-"id": "ac0d50f7",
+"id": "37d4eb69",
 "metadata": {},
 "outputs": [
 {
@@ -164,7 +179,7 @@
 {
 "cell_type": "code",
 "execution_count": 5,
-"id": "84732600",
+"id": "873d0cf2",
 "metadata": {},
 "outputs": [
 {
@@ -204,7 +219,7 @@
 {
 "cell_type": "code",
 "execution_count": 6,
-"id": "b615610c",
+"id": "b08f1603",
 "metadata": {},
 "outputs": [
 {
@@ -251,15 +266,30 @@
 },
 {
 "cell_type": "markdown",
-"id": "66f7a839",
+"id": "03e21408",
 "metadata": {},
 "source": [
 "### Train GNN"
 ]
 },
 {
 "cell_type": "markdown",
-"id": "07805108",
+"id": "a834318e",
+"metadata": {},
+"source": [
+"To train an example GNN we need the following:\n",
+"\n",
+"- a dataset object instantiated using either the synthetic or original data\n",
+"- a defined model, optimizer, and hyperparameters\n",
+"\n",
+"The tool implements an example data loader for edge classification under `syngen/benchmark/data_loader`.\n",
+"\n",
+"This dataset object is used to create the DGL graphs corresponding to both the generated data and real data."
+]
+},
+{
+"cell_type": "markdown",
+"id": "28fabfa9",
 "metadata": {},
 "source": [
 "#### Create datasets"
@@ -268,7 +298,7 @@
 {
 "cell_type": "code",
 "execution_count": 7,
-"id": "0fe941f0",
+"id": "f7e8bd44",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -279,16 +309,24 @@
 },
 {
 "cell_type": "markdown",
-"id": "a8b23137",
+"id": "b830709c",
 "metadata": {},
 "source": [
 "#### Create helper function\n"
 ]
 },
+{
+"cell_type": "markdown",
+"id": "b959a3a2",
+"metadata": {},
+"source": [
+"The helper function defines a simple training loop and standard metrics for edge classification."
+]
+},
 {
 "cell_type": "code",
 "execution_count": 8,
-"id": "f46973e3",
+"id": "5c4bec86",
 "metadata": {},
 "outputs": [],
 "source": [
@@ -329,16 +367,24 @@
 },
 {
 "cell_type": "markdown",
-"id": "6ad092e6",
+"id": "dc4cea06",
 "metadata": {},
 "source": [
 "#### No-Pretrain"
 ]
 },
+{
+"cell_type": "markdown",
+"id": "093203f8",
+"metadata": {},
+"source": [
+"Without pre-training the model is trained from scratch using the original data graph."
+]
+},
 {
 "cell_type": "code",
 "execution_count": 9,
-"id": "d4ad039a",
+"id": "93ab387d",
 "metadata": {},
 "outputs": [
 {
@@ -383,16 +429,26 @@
 },
 {
 "cell_type": "markdown",
-"id": "7f061442",
+"id": "08f5280a",
 "metadata": {},
 "source": [
 "#### Pretrain"
 ]
 },
+{
+"cell_type": "markdown",
+"id": "18bebba4",
+"metadata": {},
+"source": [
+"In this example the model is first trained on the generated data for a certain epoch budget.\n",
+"\n",
+"Subsequently it is further trained on the original data graph."
+]
+},
 {
 "cell_type": "code",
 "execution_count": 10,
-"id": "2f3985b2",
+"id": "e21ab679",
 "metadata": {},
 "outputs": [
 {
@@ -438,7 +494,7 @@
 {
 "cell_type": "code",
 "execution_count": 11,
-"id": "f33bec4f",
+"id": "8b615c76",
 "metadata": {},
 "outputs": [
 {
@@ -458,15 +514,25 @@
 },
 {
 "cell_type": "markdown",
-"id": "a6f0cfbe",
+"id": "69b9e95c",
 "metadata": {},
 "source": [
 "### CLI example"
 ]
 },
 {
 "cell_type": "markdown",
-"id": "2c48ec37",
+"id": "93fd05a0",
+"metadata": {},
+"source": [
+"The tool also provides this functionality through its CLI.\n",
+"\n",
+"The commands below generate the synthetic graph and then pretrain and fine-tune on the downstream task, as done above."
+]
+},
+{
+"cell_type": "markdown",
+"id": "8de441fe",
 "metadata": {},
 "source": [
 "#### Generate synthetic graph"
@@ -475,7 +541,7 @@
 {
 "cell_type": "code",
 "execution_count": 1,
-"id": "b588c44a",
+"id": "af89d214",
 "metadata": {},
 "outputs": [
 {
@@ -553,7 +619,7 @@
 },
 {
 "cell_type": "markdown",
-"id": "01eeff23",
+"id": "7fef4fb7",
 "metadata": {},
 "source": [
 "#### Results without pretraining"
@@ -562,7 +628,7 @@
 {
 "cell_type": "code",
 "execution_count": 2,
-"id": "50238488",
+"id": "c65ab4be",
 "metadata": {},
 "outputs": [
 {
@@ -607,7 +673,7 @@
 },
 {
 "cell_type": "markdown",
-"id": "1a8474cb",
+"id": "e6655f58",
 "metadata": {},
 "source": [
 "#### Pretrain and finetune"
@@ -616,7 +682,7 @@
 {
 "cell_type": "code",
 "execution_count": 3,
-"id": "92039366",
+"id": "fd2b8caf",
 "metadata": {},
 "outputs": [
 {
@@ -668,7 +734,7 @@
 {
 "cell_type": "code",
 "execution_count": null,
-"id": "f0405bf2",
+"id": "2da530b6",
 "metadata": {},
 "outputs": [],
 "source": []
@@ -693,7 +759,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.8.15"
+"version": "3.8.10"
 }
 },
 "nbformat": 4,
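
The markdown cells added in this notebook describe two training regimes: training from scratch on the original data, and pretraining on generated data followed by fine-tuning on the original data. The sketch below illustrates that control flow only. It is not the notebook's code: it assumes a tiny logistic-regression model in place of the GNN, and `make_edges`, `predict`, `log_loss`, and `train` are hypothetical helpers written for this illustration, not part of the tool or of DGL. The toy "synthetic" and "real" datasets are random draws, not IEEE data.

```python
import math
import random


def make_edges(n, shift, seed):
    """Hypothetical toy data: each 'edge' is a 2-feature vector with a 0/1 label."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = shift if label else -shift
        data.append(([rng.gauss(center, 1.0), rng.gauss(center, 1.0)], label))
    return data


def predict(weights, features):
    """Logistic model; the last weight is the bias term."""
    z = weights[-1] + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(weights, data):
    """Mean binary cross-entropy over the dataset."""
    eps = 1e-12
    total = 0.0
    for x, y in data:
        p = min(max(predict(weights, x), eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(data)


def train(weights, data, epochs, lr=0.1):
    """Full-batch gradient descent; returns an updated copy of the weights."""
    weights = list(weights)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in data:
            err = predict(weights, x) - y
            for i, xi in enumerate(x):
                grad[i] += err * xi
            grad[-1] += err
        weights = [w - lr * g / len(data) for w, g in zip(weights, grad)]
    return weights


# Stand-ins for the generated graph and the original graph (assumed toy data).
synthetic = make_edges(200, shift=1.0, seed=0)
real = make_edges(200, shift=1.2, seed=1)
init = [0.0, 0.0, 0.0]

# No-pretrain: train from scratch on the original data.
scratch = train(init, real, epochs=5)

# Pretrain on generated data, then fine-tune on the original data
# with the same epoch budget as the from-scratch run.
pretrained = train(init, synthetic, epochs=20)
finetuned = train(pretrained, real, epochs=5)

print("scratch loss on real data:  ", log_loss(scratch, real))
print("finetuned loss on real data:", log_loss(finetuned, real))
```

The same `train` helper is reused for both phases, so the only difference between the two regimes is the starting weights passed into the fine-tuning call. Whether pretraining helps in practice depends on how closely the generated data matches the original data.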
