
Commit 17c5171

doc: scenario documentation
1 parent 646d910

File tree

3 files changed: 102 additions & 0 deletions


docs/scenarios/2.md

Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
# The CodeFlare Stack - Scenario 2

### Pre-Train a RoBERTa Language Model from Pre-tokenized Data (Using Demo Data)

[RoBERTa](https://huggingface.co/docs/transformers/model_doc/roberta)
is a robustly optimized method for pretraining natural language
processing (NLP) systems.

**Goals**: Learning about CodeFlare<br>
**You Provide**: nothing, it just works!<br>
**CodeFlare Stack Provides**: [S3](https://aws.amazon.com/s3/) data **|** [Ray](https://www.ray.io/) cluster **|** [Kubernetes](https://kubernetes.io/) management **|** Distributed training job **|** Pop-up Dashboards

---

To start:

```shell
codeflare ml/codeflare/training/roberta
```
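For a feel of what a job like this does under the hood, here is a minimal, purely illustrative sketch of masked-language-model pretraining for RoBERTa with the Hugging Face `transformers` library. It is **not** the training code the scenario runs (the CodeFlare Stack supplies and distributes that job for you); the dataset, model size, and step count below are placeholder assumptions.

```python
# Illustrative sketch only; not the CodeFlare scenario's training code.
# The dataset choice, sizes, and step counts are placeholders.
from datasets import load_dataset
from transformers import (
    DataCollatorForLanguageModeling,
    RobertaConfig,
    RobertaForMaskedLM,
    RobertaTokenizerFast,
    Trainer,
    TrainingArguments,
)

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
model = RobertaForMaskedLM(RobertaConfig())  # randomly initialized, i.e. pre-training from scratch

# The scenario feeds pre-tokenized data from S3; here we tokenize a tiny
# public dataset ourselves just to have input_ids to train on.
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1%]")
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=["text"],
)

# Masked-LM collator: randomly masks 15% of tokens and builds the labels.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="roberta-mlm-demo", max_steps=10),
    train_dataset=dataset,
    data_collator=collator,
)
trainer.train()
```

In the scenario itself, the equivalent work is packaged as a distributed training job on the Ray cluster and fed from the S3-hosted, pre-tokenized demo data, so you do not have to write any of this yourself.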
### CLI In Action

You may run the CodeFlare RoBERTa model architecture against sample
data, as we have done in this recording:

<a href="https://asciinema.org/a/517993" target="_blank"><img src="https://asciinema.org/a/517993.svg" width="600" /></a>

### Pop-up CodeFlare Dashboard In Action

https://user-images.githubusercontent.com/4741620/187531069-12a5dbd3-1b3f-45e8-b8e9-d0940bdc7db1.mp4

[Back to Top](README.md)

docs/scenarios/3.md

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
# The CodeFlare Stack - Scenario 3

### Bring Your Own Code

I have my own training code and want to run it at scale.

**Goals**: Productive Use<br>
**You Provide**: Python source code **|** [S3](https://aws.amazon.com/s3/) data **|** Command-line options to tweak the run parameters<br>
**CodeFlare Stack Provides**: [Ray](https://www.ray.io/) cluster **|** [Kubernetes](https://kubernetes.io/) management **|** Distributed training job **|** Link S3 credentials **|** Pop-up Dashboards

---
This example utilizes the "bring your own code" feature of the
CodeFlare Stack. We will point the CLI to
[this](https://torchtutorialstaging.z5.web.core.windows.net/beginner/hyperparameter_tuning_tutorial.html)
simple example that uses Ray Tune. In this mode, you point the CLI
tool to a working directory that contains a `main.py` and (optionally)
a `requirements.txt`. Make a local directory and download those two
files from
[here](https://github.com/project-codeflare/codeflare-cli/tree/main/tests/kind/inputs/ray-tune-tutorial).

These commands mimic "bringing your own code"; normally, the code
would already be sitting in a directory on your laptop:

```shell
mkdir codeflare-scenario-1 && cd codeflare-scenario-1
curl -LO https://raw.githubusercontent.com/project-codeflare/codeflare-cli/main/tests/kind/inputs/ray-tune-tutorial/main.py
curl -LO https://raw.githubusercontent.com/project-codeflare/codeflare-cli/main/tests/kind/inputs/ray-tune-tutorial/requirements.txt
```
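If you are wondering what shape your own `main.py` needs to have, here is a minimal, hypothetical sketch of a Ray Tune entry point. It is **not** the tutorial file fetched above (that one implements the PyTorch hyperparameter tuning tutorial); the training function, metric, and search space below are made-up placeholders.

```python
# Hypothetical sketch of a Ray Tune "main.py"; NOT the tutorial file
# downloaded by the curl commands above. All names and numbers are placeholders.
from ray import tune


def train_fn(config):
    # Stand-in for a real training loop: report a synthetic "loss" each
    # step so Tune (and the pop-up dashboards) have a metric to track.
    for step in range(10):
        loss = (config["lr"] - 0.05) ** 2 + 1.0 / (step + 1)
        tune.report(loss=loss)


if __name__ == "__main__":
    analysis = tune.run(
        train_fn,
        config={"lr": tune.grid_search([0.01, 0.05, 0.1])},
        metric="loss",
        mode="min",
    )
    print("Best config found:", analysis.best_config)
```

Any working directory with this general shape (a `main.py` that drives the run, plus a `requirements.txt` listing its dependencies) can be handed to the CLI in the same way.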
Then launch the `codeflare` CLI and point it to your directory:

```shell
codeflare ml/codeflare/training/byoc
```
### The CLI In Action

<a href="https://asciinema.org/a/517989" target="_blank"><img src="https://asciinema.org/a/517989.svg" width="600" /></a>

### Pop-up CodeFlare Dashboard In Action

https://user-images.githubusercontent.com/4741620/187532373-556dd733-7eef-4b70-81e1-b841289535da.mp4

[Back to Top](README.md)

docs/scenarios/README.md

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
# The CodeFlare Stack

The CodeFlare Stack is a set of tooling and best-of-breed code and
models to help you productively leverage cloud GPU resources for ML
tasks.

## How Can I Leverage the CodeFlare Stack?

The CodeFlare Stack is set up to accommodate _your choices_. So,
first, choose what you would like to accomplish:

- [**Introductory Demo**](1.md) I want to see the experience in action.<br>
  **Goals**: Learning<br>
  **You Provide**: nothing, it just works!<br>
  **CodeFlare Stack Provides**: [Ray](https://www.ray.io/) cluster **|** [Kubernetes](https://kubernetes.io/) management **|** Distributed training job **|** Pop-up Dashboards
- [**Train a Masked Language Model (Demo)**](2.md)<br>
  **Goals**: Learning<br>
  **You Provide**: nothing, it just works!<br>
  **CodeFlare Stack Provides**: [S3](https://aws.amazon.com/s3/) data **|** [Ray](https://www.ray.io/) cluster **|** [Kubernetes](https://kubernetes.io/) management **|** Distributed training job **|** Pop-up Dashboards
- [**Bring Your Own Code**](3.md) I have my own training code and
  want to run it at scale.<br>
  **Goals**: Productive Use<br>
  **You Provide**: Python source code **|** [S3](https://aws.amazon.com/s3/) data **|** Command-line options to tweak the run parameters<br>
  **CodeFlare Stack Provides**: [Ray](https://www.ray.io/) cluster **|** [Kubernetes](https://kubernetes.io/) management **|** Distributed training job **|** Link S3 credentials **|** Pop-up Dashboards
