week05_large_models/practice_part1.ipynb
+13 -9 (13 additions & 9 deletions)
@@ -2,7 +2,7 @@
  "cells": [
   {
    "cell_type": "code",
-   "execution_count": 8,
+   "execution_count": null,
    "metadata": {
     "id": "0TH9Am-9ztHB"
    },
@@ -42,7 +42,7 @@
    },
   {
    "cell_type": "code",
-   "execution_count": 3,
+   "execution_count": null,
    "metadata": {
     "colab": {
      "base_uri": "https://localhost:8080/"
@@ -81,7 +81,7 @@
    },
   {
    "cell_type": "code",
-   "execution_count": 4,
+   "execution_count": null,
    "metadata": {
     "id": "sTuoIY_tNSVk",
     "colab": {
@@ -279,7 +279,7 @@
    },
   {
    "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": null,
    "metadata": {
     "colab": {
      "base_uri": "https://localhost:8080/"
@@ -305,7 +305,7 @@
    },
   {
    "cell_type": "code",
-   "execution_count": 6,
+   "execution_count": null,
    "metadata": {
     "colab": {
      "base_uri": "https://localhost:8080/"
@@ -354,7 +354,7 @@
    },
   {
    "cell_type": "code",
-   "execution_count": 7,
+   "execution_count": null,
    "metadata": {
     "colab": {
      "base_uri": "https://localhost:8080/",
@@ -564,7 +564,7 @@
    },
   {
    "cell_type": "code",
-   "execution_count": 8,
+   "execution_count": null,
    "metadata": {
     "colab": {
      "base_uri": "https://localhost:8080/"
@@ -660,6 +660,7 @@
     "\n",
     "__Task 1.2:__ generate a short sequence given a prefix. You may choose any generation task that requires generating at least 25 consecutive tokens. Here's one example from the NLP course (the generated code is in blue)\n",
     "\n",
+    "\n",
     "\n",
     "You may use model.generate (if your code is compatible with that) or write your own inference loop. If you choose to write your own loop, you are free to use sampling, greedy, top-p, top-k or any other [inference mode supported by HF transformers](https://huggingface.co/docs/transformers/main_classes/text_generation).\n",
     "\n",
@@ -672,7 +673,10 @@
     "- __+1 point__ you can perform forward pass on 128x1024 tokens of actual text data (e.g. the sample data above)\n",
     "- __+1 point__ you can compute gradients with offloading on the same 128x1024 tokens from the real text data\n",
     "- __+1 point__ you can inference the model - and it generates some human-readable text\n",
-    "- __bonus points__ optimize your code so that it would pre-load the next offloaded layer in background\n",
+    "- __bonus points:__ we offer two optional assignments:\n",
+    " - **Selective activation checkpointing (2pt):** there is a gentler version of gradient checkpointing where you don't just remember the layer inputs, but also some activations that are easier to compute - compared to their size. For instance, MLP linear layers are compute-heavy, but the nonlinearity is relatively compute-light for the same amount of memory. You can re-compute only the compute-light operations and keep the compute-heavy ones in memory. There's [a paper](https://arxiv.org/pdf/2205.05198) that describes such an approach in detail (see 'Selective activation checkpointing').\n",
+    " - **Prefetch offloaded layers (2pt):** optimize your code so that it begins pre-loading the next offloaded layer in the background, while computing the current layer. It can be done with a copy with non_blocking=True, or, for fine-grained control, CUDA streams. To get the full grade for this assignment, please demonstrate that your approach is faster than naive offloading, at least during large batch forward/backward pass. This can be done using a profiler.\n",
+    " - Please note that the maximum points for this week are **capped at 14**.\n",
     "\n",
     "__Conditions:__\n",
     "- using more than 10GiB of GPU memory at any point is forbidden (check with [`torch.cuda.max_memory_allocated()`](https://pytorch.org/docs/stable/generated/torch.cuda.max_memory_allocated.html))\n",