
Commit cf79c41

[SN-92] create a foundry + send to annotate workflow example (#1339)
1 parent 88d8f61 commit cf79c41

File tree

1 file changed

+376
-0
lines changed

Lines changed: 376 additions & 0 deletions
@@ -0,0 +1,376 @@
{
"nbformat": 4,
"nbformat_minor": 2,
"metadata": {},
"cells": [
{
"metadata": {},
"source": [
"<td>\n",
" <a target=\"_blank\" href=\"https://labelbox.com\" ><img src=\"https://labelbox.com/blog/content/images/2021/02/logo-v4.svg\" width=190/></a>\n",
"</td>"
],
"cell_type": "markdown"
},
{
"metadata": {},
"source": [
"<td>\n",
"<a href=\"https://colab.research.google.com/github/Labelbox/labelbox-python/blob/master/examples/foundry/object_detection.ipynb\" target=\"_blank\"><img\n",
"src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"></a>\n",
"</td>\n",
"\n",
"<td>\n",
"<a href=\"https://github.com/Labelbox/labelbox-python/tree/master/examples/foundry/object_detection.ipynb\" target=\"_blank\"><img\n",
"src=\"https://img.shields.io/badge/GitHub-100000?logo=github&logoColor=white\" alt=\"GitHub\"></a>\n",
"</td>"
],
"cell_type": "markdown"
},
{
"metadata": {},
"source": [
"# Foundry overview\n",
"\n",
"This notebook covers the basics of Foundry through the Python SDK."
],
"cell_type": "markdown"
},
{
"metadata": {},
"source": [
"Foundry incorporates foundational models into your Labelbox workflow. You can use Foundry to:\n",
"\n",
"* Predict (infer) labels from your data.\n",
"* Compare the performance of different foundational models with your data and ontologies.\n",
"* Prototype, diagnose, and refine a machine learning app to solve specific business needs.\n",
"\n",
"Foundry creates model runs that predict data row annotations based on your input."
],
"cell_type": "markdown"
},
{
"metadata": {},
"source": [
"!pip install -q labelbox"
],
"cell_type": "code",
"outputs": [],
"execution_count": null
},
{
"metadata": {},
"source": [
"import labelbox as lb\n",
"from labelbox.schema.conflict_resolution_strategy import ConflictResolutionStrategy\n",
"import uuid"
],
"cell_type": "code",
"outputs": [],
"execution_count": null
},
{
"metadata": {},
"source": [
"# API Key and Client\n",
"\n",
"Provide a valid API key below to connect to the Labelbox client."
],
"cell_type": "markdown"
},
{
"metadata": {},
"source": [
"# Add your API key\n",
"API_KEY = \"\"\n",
"# To get your API key go to: Workspace settings -> API -> Create API Key\n",
"client = lb.Client(api_key=API_KEY)"
],
"cell_type": "code",
"outputs": [],
"execution_count": null
},
93+
{
94+
"metadata": {},
95+
"source": [
96+
"# End-to-end example: Run foundry and send to annotate from catalog"
97+
],
98+
"cell_type": "markdown"
99+
},
100+
{
101+
"metadata": {},
102+
"source": [
103+
"## Step 1: Import data rows into catelog"
104+
],
105+
"cell_type": "markdown"
106+
},
107+
{
108+
"metadata": {},
109+
"source": [
110+
"# send a sample image as data row for a dataset\n",
111+
"global_key = str(uuid.uuid4())\n",
112+
"\n",
113+
"test_img_url = {\n",
114+
" \"row_data\":\n",
115+
" \"https://storage.googleapis.com/labelbox-datasets/image_sample_data/2560px-Kitano_Street_Kobe01s5s4110.jpeg\",\n",
116+
" \"global_key\":\n",
117+
" global_key\n",
118+
"}\n",
119+
"\n",
120+
"dataset = client.create_dataset(name=\"foundry-demo-dataset\")\n",
121+
"task = dataset.create_data_rows([test_img_url])\n",
122+
"task.wait_till_done()\n",
123+
"\n",
124+
"print(f\"Errors: {task.errors}\")\n",
125+
"print(f\"Failed data rows: {task.failed_data_rows}\")"
126+
],
127+
"cell_type": "code",
128+
"outputs": [
129+
{
130+
"name": "stdout",
131+
"output_type": "stream",
132+
"text": [
133+
"Errors: None\n",
134+
"Failed data rows: None\n"
135+
]
136+
}
137+
],
138+
"execution_count": null
139+
},
{
"metadata": {},
"source": [
"## Step 2: Create or select an ontology that matches the model\n",
"\n",
"Your project should have an ontology set up with all the tools and classifications supported by your model and data type.\n",
"\n",
"For example, when using Amazon Rekognition you would need to create a bounding box annotation in your ontology, since it only supports object detection. Likewise, when using a model that only supports image classification, such as YOLOv8's classification variant, you would need to create a classification annotation in your ontology.\n",
"\n",
"In this tutorial, we will use Amazon Rekognition to detect objects in an image dataset."
],
"cell_type": "markdown"
},
{
"metadata": {},
"source": [
"# Create an ontology with two bounding box tools supported by Amazon Rekognition: Car and Person\n",
"ontology_builder = lb.OntologyBuilder(\n",
"    classifications=[],\n",
"    tools=[\n",
"        lb.Tool(tool=lb.Tool.Type.BBOX, name=\"Car\"),\n",
"        lb.Tool(tool=lb.Tool.Type.BBOX, name=\"Person\")\n",
"    ]\n",
")\n",
"\n",
"ontology = client.create_ontology(\"Image Bounding Box Annotation Demo Foundry\",\n",
"                                  ontology_builder.asdict(),\n",
"                                  media_type=lb.MediaType.Image)"
],
"cell_type": "code",
"outputs": [],
"execution_count": null
},
173+
{
174+
"metadata": {},
175+
"source": [
176+
"## Step 3: Create a labeling project\n",
177+
"\n",
178+
"Connect the ontology to the labeling project"
179+
],
180+
"cell_type": "markdown"
181+
},
182+
{
183+
"metadata": {},
184+
"source": [
185+
"project = client.create_project(name=\"Foundry Image Demo\",\n",
186+
" media_type=lb.MediaType.Image)\n",
187+
"\n",
188+
"project.setup_editor(ontology)"
189+
],
190+
"cell_type": "code",
191+
"outputs": [],
192+
"execution_count": null
193+
},
194+
{
195+
"metadata": {},
196+
"source": [
197+
"## Step 4: Create foundry application in UI\n",
198+
"\n",
199+
"Currently we do not support this workflow through the SDK\n",
200+
"#### Workflow:\n",
201+
"\n",
202+
"1. Navigate to model and select ***Create*** > ***App***\n",
203+
"\n",
204+
"2. Select ***Amazon Rekognition*** and name your foundry application\n",
205+
"\n",
206+
"3. Customize your perimeters and then select ***Save & Create***"
207+
],
208+
"cell_type": "markdown"
209+
},
{
"metadata": {},
"source": [
"# Select your foundry application inside the UI and copy the app ID from the top right corner\n",
"AMAZON_REKOGNITION_APP_ID = \"\""
],
"cell_type": "code",
"outputs": [],
"execution_count": null
},
{
"metadata": {},
"source": [
"## Step 5: Run the foundry app on data rows\n",
"\n",
"This step generates annotations that can later be reused as pre-labels in a project. You must provide your app ID from the previous step for this method to run. See the [Foundry Apps guide](https://docs.labelbox.com/docs/foundry-apps#run-app-using-sdk) for more information.\n"
],
"cell_type": "markdown"
},
{
"metadata": {},
"source": [
"task = client.run_foundry_app(model_run_name=f\"Amazon-{str(uuid.uuid4())}\",\n",
"                              data_rows=lb.GlobalKeys(\n",
"                                  [global_key]  # Provide a list of global keys\n",
"                              ),\n",
"                              app_id=AMAZON_REKOGNITION_APP_ID)\n",
"\n",
"task.wait_till_done()\n",
"\n",
"print(f\"Errors: {task.errors}\")\n",
"\n",
"# Obtain the model run ID from the task\n",
"MODEL_RUN_ID = task.metadata[\"modelRunId\"]"
],
"cell_type": "code",
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Errors: None\n"
]
}
],
"execution_count": null
},
{
"metadata": {},
"source": [
"## Step 6: Map the ontology through the UI\n",
"\n",
"Mapping a model's ontology to a project's ontology is not currently supported through the SDK. To showcase how to send Foundry predictions to a project, we will generate the mapping of the foundry app ontology to the project ontology through the UI.\n",
"\n",
"#### Workflow\n",
"\n",
"1. Navigate to the dataset you created for your model run\n",
"2. Select **Select all** in the top right corner\n",
"3. Select **Manage selection** > **Send to Annotate**\n",
"4. Specify the project we created in the project dropdown menu\n",
"5. Selecting a workflow step is not required, since this notebook does not send annotations to a project from the UI\n",
"6. Mark **Include model predictions**, then scroll down and select **Map**\n",
"7. Select the incoming ontology feature and the matching project ontology feature for both Car and Person\n",
"8. Once both features are mapped, select **Copy ontology mapping as JSON** in the top right corner\n",
"9. Do not save this configuration, since we are not sending predictions to a project using this UI modal; we will send predictions in the following steps using the SDK"
],
"cell_type": "markdown"
},
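{
"metadata": {},
"source": [
"Before pasting, it helps to know the expected shape: the copied mapping appears to be a flat JSON object whose keys are the model run ontology's feature schema IDs and whose values are the matching project ontology feature schema IDs. The IDs below are hypothetical placeholders for illustration only, not real schema IDs."
],
"cell_type": "markdown"
},
{
"metadata": {},
"source": [
"# Hypothetical shape of the copied mapping (placeholder IDs, one entry per mapped feature)\n",
"EXAMPLE_ONTOLOGY_MAPPING = {\n",
"    \"<model-feature-schema-id-for-Car>\": \"<project-feature-schema-id-for-Car>\",\n",
"    \"<model-feature-schema-id-for-Person>\": \"<project-feature-schema-id-for-Person>\"\n",
"}"
],
"cell_type": "code",
"outputs": [],
"execution_count": null
},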
{
"metadata": {},
"source": [
"# Paste the ontology mapping JSON copied from the UI here\n",
"PREDICTIONS_ONTOLOGY_MAPPING = {}"
],
"cell_type": "code",
"outputs": [],
"execution_count": null
},
{
"metadata": {},
"source": [
"## Step 7: Send model-generated annotations from catalog to annotate\n",
"\n",
"### Parameters\n",
"\n",
"When you send predicted data rows to annotate from catalog, you may include or exclude certain parameters; at a minimum, a `predictions_ontology_mapping` must be provided:\n",
"\n",
"* `predictions_ontology_mapping`\n",
"    - A dictionary mapping the model's ontology feature schema IDs to the project's ontology feature schema IDs\n",
"* `exclude_data_rows_in_project`\n",
"    - Excludes data rows that are already in the project\n",
"* `override_existing_annotations_rule`\n",
"    - The strategy for handling conflicts in classifications between data rows that already exist in the project and incoming predictions from the source model run or annotations from the source project\n",
"    - Defaults to `ConflictResolutionStrategy.KeepExisting`\n",
"    - Options include:\n",
"        * `ConflictResolutionStrategy.KeepExisting`\n",
"        * `ConflictResolutionStrategy.OverrideWithPredictions`\n",
"        * `ConflictResolutionStrategy.OverrideWithAnnotations`\n",
"* `batch_priority`\n",
"    - The priority of the batch\n"
],
"cell_type": "markdown"
},
{
"metadata": {},
"source": [
"model_run = client.get_model_run(MODEL_RUN_ID)\n",
"\n",
"send_to_annotations_params = {\n",
"    \"predictions_ontology_mapping\": PREDICTIONS_ONTOLOGY_MAPPING,\n",
"    \"exclude_data_rows_in_project\": False,\n",
"    \"override_existing_annotations_rule\": ConflictResolutionStrategy.OverrideWithPredictions,\n",
"    \"batch_priority\": 5,\n",
"}\n",
"\n",
"task = model_run.send_to_annotate_from_model(\n",
"    destination_project_id=project.uid,\n",
"    task_queue_id=None,  # ID of the workflow task queue; set to None to convert pre-labels to ground truth, or obtain an ID through project.task_queues()\n",
"    batch_name=\"Foundry Demo Batch\",\n",
"    data_rows=lb.GlobalKeys(\n",
"        [global_key]  # Provide a list of global keys from the foundry app task\n",
"    ),\n",
"    params=send_to_annotations_params\n",
")\n",
"\n",
"task.wait_till_done()\n",
"\n",
"print(f\"Errors: {task.errors}\")"
],
"cell_type": "code",
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Errors: None\n"
]
}
],
"execution_count": null
},
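{
"metadata": {},
"source": [
"If you would rather send the pre-labels to a specific workflow step than convert them to ground truth, pass a task queue ID instead of `None`. As a sketch (assuming the queue objects returned by `project.task_queues()` expose `uid` and `name` attributes, which may vary by SDK version):"
],
"cell_type": "markdown"
},
{
"metadata": {},
"source": [
"# List the project's task queues to find an ID to pass as task_queue_id\n",
"for task_queue in project.task_queues():\n",
"    print(task_queue.uid, task_queue.name)"
],
"cell_type": "code",
"outputs": [],
"execution_count": null
},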
{
"metadata": {},
"source": [
"## Clean up"
],
"cell_type": "markdown"
},
{
"metadata": {},
"source": [
"# Uncomment to delete the resources created in this demo\n",
"# project.delete()\n",
"# dataset.delete()\n",
"# model_run.delete()"
],
"cell_type": "code",
"outputs": [],
"execution_count": null
}
]
}
