Commit 22eb894

Update README.md
1 parent 45e3271 commit 22eb894

1 file changed (+5, -7 lines)

week05_large_models/README.md

Lines changed: 5 additions & 7 deletions
@@ -6,14 +6,12 @@
 
 
 ### Practice / homework
-In this homework, you can choose one of 3 tasks to complete:
-- Option A: [`./homework_a.ipynb`](./homework_a.ipynb) memory-efficient training and inference - recommended if you have a single GPU
-- Option B: [`./homework_b.md`](./homework_b.md) benchmarking ZeRO implementations - requires at least two GPUs and some RAM
-- Option C: [`./homework_c.md`](./homework_c.md) write your own model parallelism - requires at least two GPUs
+This homework consists of two parts:
+- Part 1: [`./homework_part1.ipynb`](./homework_part1.ipynb) - memory-efficient training and inference
+- Part 2: **TBU** - implementing tensor parallelism
 
-You can do more than one, and we'll award bonus points for that, but doing 2 options will yield (much) less than 2x points. If you're an enrolled student, please only submit the files that you changed (i.e. do not submit homework_b.md if you did option A or C)
-
-We recommend that you choose options B and C if you have access to a computer with at least two GPUs. For YSDA and HSE students, you can use either DataSphere or one of the GPU servers available for this course (recommended). If you are an online student, you can try to register for kaggle kernels ([they ley you run on 2x T4](https://www.kaggle.com/discussions/product-feedback/361104)) in jupyter-like interface. That said, implementing assignments B and C in Kaggle is more difficult than intended. For non-enrolled online students, we recommend option A unless you have access to some other multi-GPU-hardware or are intentionally masochistic.
+Part 2 is much more convenient with multiple GPUs - though, it can *potentially* be solved by emulating GPUs with CPU-only code.
+For YSDA and HSE students, you can use either DataSphere or one of the GPU servers available for this course (recommended). If you are an online student, you can try to register for kaggle kernels ([they ley you run on 2x T4](https://www.kaggle.com/discussions/product-feedback/361104)) in jupyter-like interface. That said, implementing assignments B and C in Kaggle is more difficult than intended. For non-enrolled online students, we recommend option A unless you have access to some other multi-GPU-hardware or are intentionally masochistic.
 
 
 ### References