Started centrality algos notebook

kedarghule · Mats-SX · commit 9991e3e3be7d · 2023-06-19T14:00:30.000+02:00
diff --git a/examples/centrality-algorithms.ipynb b/examples/centrality-algorithms.ipynb
@@ -0,0 +1,227 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Centrality Algorithms"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<a target=\"_blank\" href=\"https://colab.research.google.com/github/neo4j/graph-data-science-client/blob/main/examples/centrality-algorithms.ipynb\">\n",
+    "  <img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/>\n",
+    "</a>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "This Jupyter notebook is hosted [here](https://github.com/neo4j/graph-data-science-client/blob/main/examples/centrality-algorithms.ipynb) in the Neo4j Graph Data Science Client GitHub repository.\n",
+    "\n",
+    "Centrality algorithms are used to understand the role or influence of particular nodes in a graph. This notebook applies centrality algorithms, using the `graphdatascience` library, to the airline travel reachability network dataset, which can be downloaded [here](https://snap.stanford.edu/data/reachability.html).\n",
+    "\n",
+    "< TALK ABOUT TASKS >\n",
+    "\n",
+    "### Setup\n",
+    "\n",
+    "We start by importing our dependencies and setting up our GDS client connection to the database."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Install necessary dependencies (uncomment the next line to run it)\n",
+    "# %pip install graphdatascience pandas"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "C:\\Users\\kedar\\anaconda3\\envs\\graph_stuff\\lib\\site-packages\\tqdm\\auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
+      "  from .autonotebook import tqdm as notebook_tqdm\n"
+     ]
+    }
+   ],
+   "source": [
+    "from graphdatascience import GraphDataScience\n",
+    "import pandas as pd\n",
+    "import os"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# NEO4J_URI = os.environ.get(\"NEO4J_URI\", \"bolt://localhost:7687\")\n",
+    "# NEO4J_AUTH = None\n",
+    "# if os.environ.get(\"NEO4J_USER\") and os.environ.get(\"NEO4J_PASSWORD\"):\n",
+    "#     NEO4J_AUTH = (\n",
+    "#         os.environ.get(\"NEO4J_USER\"),\n",
+    "#         os.environ.get(\"NEO4J_PASSWORD\"),\n",
+    "#     )\n",
+    "\n",
+    "# gds = GraphDataScience(NEO4J_URI, auth=NEO4J_AUTH)\n",
+    "\n",
+    "# Replace with the actual connection URI and credentials\n",
+    "NEO4J_CONNECTION_URI = \"bolt://XXXXXXXXXXXXX\"\n",
+    "NEO4J_USERNAME = \"neo4j\"\n",
+    "NEO4J_PASSWORD = \"XXXXXXXXXXXXX\"\n",
+    "\n",
+    "# Client instantiation\n",
+    "gds = GraphDataScience(\n",
+    "    NEO4J_CONNECTION_URI,\n",
+    "    auth=(NEO4J_USERNAME, NEO4J_PASSWORD)\n",
+    ")\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Importing the dataset\n",
+    "\n",
+    "We first load the dataset's node metadata into a pandas DataFrame."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>node_id</th>\n",
+       "      <th>name</th>\n",
+       "      <th>metro_pop</th>\n",
+       "      <th>latitude</th>\n",
+       "      <th>longitude</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>0</td>\n",
+       "      <td>Abbotsford, BC</td>\n",
+       "      <td>133497.0</td>\n",
+       "      <td>49.051575</td>\n",
+       "      <td>-122.328849</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>1</td>\n",
+       "      <td>Aberdeen, SD</td>\n",
+       "      <td>40878.0</td>\n",
+       "      <td>45.459090</td>\n",
+       "      <td>-98.487324</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>2</td>\n",
+       "      <td>Abilene, TX</td>\n",
+       "      <td>166416.0</td>\n",
+       "      <td>32.449175</td>\n",
+       "      <td>-99.741424</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>3</td>\n",
+       "      <td>Akron/Canton, OH</td>\n",
+       "      <td>701456.0</td>\n",
+       "      <td>40.797810</td>\n",
+       "      <td>-81.371567</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>4</td>\n",
+       "      <td>Alamosa, CO</td>\n",
+       "      <td>9433.0</td>\n",
+       "      <td>37.468180</td>\n",
+       "      <td>-105.873599</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "   node_id              name  metro_pop   latitude   longitude\n",
+       "0        0    Abbotsford, BC   133497.0  49.051575 -122.328849\n",
+       "1        1      Aberdeen, SD    40878.0  45.459090  -98.487324\n",
+       "2        2       Abilene, TX   166416.0  32.449175  -99.741424\n",
+       "3        3  Akron/Canton, OH   701456.0  40.797810  -81.371567\n",
+       "4        4       Alamosa, CO     9433.0  37.468180 -105.873599"
+      ]
+     },
+     "execution_count": 6,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df = pd.read_csv('https://snap.stanford.edu/data/reachability-meta.csv.gz', compression='gzip')\n",
+    "df.head()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.11"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}