NVIDIA
diff --git a/Tools/DGLPyTorch/SyntheticGraphGeneration/demos/advanced_examples/e2e_cora_demo.ipynb b/Tools/DGLPyTorch/SyntheticGraphGeneration/demos/advanced_examples/e2e_cora_demo.ipynb
Lines changed: 220 additions & 45 deletions
diff --git a/Tools/DGLPyTorch/SyntheticGraphGeneration/demos/advanced_examples/e2e_ieee_demo.ipynb b/Tools/DGLPyTorch/SyntheticGraphGeneration/demos/advanced_examples/e2e_ieee_demo.ipynb
Lines changed: 245 additions & 46 deletions
diff --git a/Tools/DGLPyTorch/SyntheticGraphGeneration/demos/advanced_examples/edge_classification_pretraining.ipynb b/Tools/DGLPyTorch/SyntheticGraphGeneration/demos/advanced_examples/edge_classification_pretraining.ipynb
Lines changed: 96 additions & 30 deletions
@@ -2,7 +2,7 @@
  "cells": [
   {
    "cell_type": "raw",
-   "id": "49d2aaf9",
+   "id": "ea7ceaf7",
    "metadata": {},
    "source": [
     "# Copyright 2023 NVIDIA Corporation. All Rights Reserved.\n",
@@ -23,24 +23,27 @@
   },
   {
    "cell_type": "markdown",
-   "id": "fc0ea279",
+   "id": "4e1a2027",
    "metadata": {},
    "source": [
     "# Edge Classification Pretraining demo (IEEE)"
    ]
   },
   {
    "cell_type": "markdown",
-   "id": "5a5196e2",
+   "id": "05d3d798",
    "metadata": {},
    "source": [
     "## Overview\n",
+    "\n",
+    "It is often helpful to pre-train a network, or initialize it with learned weights, and then fine-tune it on the downstream task of interest.\n",
+    "\n",
     "This notebook demonstrates the steps for pretraining a GNN on synthetic data and fine-tuning it on real data. "
    ]
   },
   {
    "cell_type": "markdown",
-   "id": "01a6300c",
+   "id": "26f39e76",
    "metadata": {},
    "source": [
     "### Imports"
@@ -49,7 +52,7 @@
   {
    "cell_type": "code",
    "execution_count": 1,
-   "id": "69a28c16",
+   "id": "f315cfcd",
    "metadata": {},
    "outputs": [
     {
@@ -94,16 +97,28 @@
   },
   {
    "cell_type": "markdown",
-   "id": "01eddd70",
+   "id": "20e3e3a6",
    "metadata": {},
    "source": [
     "### Generate synthetic data"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "5c3db76c",
+   "metadata": {},
+   "source": [
+    "In the following cells, a synthesizer is instantiated and fitted on the IEEE dataset.\n",
+    "\n",
+    "Once fitted, the synthesizer is used to generate synthetic data with similar characteristics.\n",
+    "\n",
+    "For a more detailed explanation, check out the `e2e_ieee_demo.ipynb` notebook."
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 2,
-   "id": "65da8b0a",
+   "id": "8f86bf18",
    "metadata": {},
    "outputs": [
     {
@@ -131,7 +146,7 @@
   {
    "cell_type": "code",
    "execution_count": 3,
-   "id": "b0b64872",
+   "id": "60bb8cfb",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -145,7 +160,7 @@
   {
    "cell_type": "code",
    "execution_count": 4,
-   "id": "ac0d50f7",
+   "id": "37d4eb69",
    "metadata": {},
    "outputs": [
     {
@@ -164,7 +179,7 @@
   {
    "cell_type": "code",
    "execution_count": 5,
-   "id": "84732600",
+   "id": "873d0cf2",
    "metadata": {},
    "outputs": [
     {
@@ -204,7 +219,7 @@
   {
    "cell_type": "code",
    "execution_count": 6,
-   "id": "b615610c",
+   "id": "b08f1603",
    "metadata": {},
    "outputs": [
     {
@@ -251,15 +266,30 @@
   },
   {
    "cell_type": "markdown",
-   "id": "66f7a839",
+   "id": "03e21408",
    "metadata": {},
    "source": [
     "### Train GNN"
    ]
   },
   {
    "cell_type": "markdown",
-   "id": "07805108",
+   "id": "a834318e",
+   "metadata": {},
+   "source": [
+    "To train an example GNN we need the following:\n",
+    "\n",
+    "- a dataset object instantiated using either the synthetic or original data\n",
+    "- the model, optimizer and hyperparameters defined\n",
+    "\n",
+    "In the tool an example dataloader is implemented for edge classification under `syngen/benchmark/data_loader`.\n",
+    "\n",
+    "This dataset object is used to create the DGL graphs corresponding to both the generated data and the real data."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "28fabfa9",
    "metadata": {},
    "source": [
     "#### Create datasets"
@@ -268,7 +298,7 @@
   {
    "cell_type": "code",
    "execution_count": 7,
-   "id": "0fe941f0",
+   "id": "f7e8bd44",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -279,16 +309,24 @@
   },
   {
    "cell_type": "markdown",
-   "id": "a8b23137",
+   "id": "b830709c",
    "metadata": {},
    "source": [
     "#### Create helper function\n"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "b959a3a2",
+   "metadata": {},
+   "source": [
+    "The helper function defines a simple training loop and standard metrics for edge classification."
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 8,
-   "id": "f46973e3",
+   "id": "5c4bec86",
    "metadata": {},
    "outputs": [],
    "source": [
@@ -329,16 +367,24 @@
   },
   {
    "cell_type": "markdown",
-   "id": "6ad092e6",
+   "id": "dc4cea06",
    "metadata": {},
    "source": [
     "#### No-Pretrain"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "093203f8",
+   "metadata": {},
+   "source": [
+    "Without pre-training, the model is trained from scratch on the original data graph."
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 9,
-   "id": "d4ad039a",
+   "id": "93ab387d",
    "metadata": {},
    "outputs": [
     {
@@ -383,16 +429,26 @@
   },
   {
    "cell_type": "markdown",
-   "id": "7f061442",
+   "id": "08f5280a",
    "metadata": {},
    "source": [
     "#### Pretrain"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "18bebba4",
+   "metadata": {},
+   "source": [
+    "In this example, the model is first trained on the generated data for a fixed epoch budget.\n",
+    "\n",
+    "It is then trained further (fine-tuned) on the original data graph."
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 10,
-   "id": "2f3985b2",
+   "id": "e21ab679",
    "metadata": {},
    "outputs": [
     {
@@ -438,7 +494,7 @@
   {
    "cell_type": "code",
    "execution_count": 11,
-   "id": "f33bec4f",
+   "id": "8b615c76",
    "metadata": {},
    "outputs": [
     {
@@ -458,15 +514,25 @@
   },
   {
    "cell_type": "markdown",
-   "id": "a6f0cfbe",
+   "id": "69b9e95c",
    "metadata": {},
    "source": [
     "### CLI example"
    ]
   },
   {
    "cell_type": "markdown",
-   "id": "2c48ec37",
+   "id": "93fd05a0",
+   "metadata": {},
+   "source": [
+    "The tool also provides this functionality through its CLI.\n",
+    "\n",
+    "The commands below generate the synthetic graph, then pretrain and fine-tune on the downstream task, mirroring the steps above."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "8de441fe",
    "metadata": {},
    "source": [
     "#### Generate synthetic graph"
@@ -475,7 +541,7 @@
   {
    "cell_type": "code",
    "execution_count": 1,
-   "id": "b588c44a",
+   "id": "af89d214",
    "metadata": {},
    "outputs": [
     {
@@ -553,7 +619,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "01eeff23",
+   "id": "7fef4fb7",
    "metadata": {},
    "source": [
     "#### Results without pretraining"
@@ -562,7 +628,7 @@
   {
    "cell_type": "code",
    "execution_count": 2,
-   "id": "50238488",
+   "id": "c65ab4be",
    "metadata": {},
    "outputs": [
     {
@@ -607,7 +673,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "1a8474cb",
+   "id": "e6655f58",
    "metadata": {},
    "source": [
     "#### Pretrain and finetune"
@@ -616,7 +682,7 @@
   {
    "cell_type": "code",
    "execution_count": 3,
-   "id": "92039366",
+   "id": "fd2b8caf",
    "metadata": {},
    "outputs": [
     {
@@ -668,7 +734,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "f0405bf2",
+   "id": "2da530b6",
    "metadata": {},
    "outputs": [],
    "source": []
@@ -693,7 +759,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.15"
+   "version": "3.8.10"
   }
  },
  "nbformat": 4,