Commit 42a85c0

Update README.md (#1416)
1 parent: c2033be

1 file changed: README.md (+111 −23 lines)
<br>

<!-- Delete on release branches -->
<!-- CORTEX_VERSION_README_MINOR -->

[install](https://docs.cortex.dev/install) • [documentation](https://docs.cortex.dev) • [examples](https://github.com/cortexlabs/cortex/tree/0.20/examples) • [we're hiring](https://angel.co/cortex-labs-inc/jobs) • [chat with us](https://gitter.im/cortexlabs/cortex)

<br>

# Model serving at scale

### Deploy

* Deploy TensorFlow, PyTorch, ONNX, scikit-learn, and other models.
* Define preprocessing and postprocessing steps in Python.
* Configure APIs as realtime or batch.
* Deploy multiple models per API (see the sketch below).
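
To illustrate the last point, here is a minimal sketch of a predictor serving two models behind one API. The `task` key in the payload and the choice of pipelines are illustrative assumptions, not part of Cortex's API:

```python
# predictor.py (multi-model sketch)

from transformers import pipeline

class PythonPredictor:
    def __init__(self, config):
        # load both models once, when the API starts (illustrative)
        self.models = {
            "generate": pipeline(task="text-generation"),
            "sentiment": pipeline(task="sentiment-analysis"),
        }

    def predict(self, payload):
        # route each request to a model via a hypothetical "task" field
        model = self.models[payload["task"]]
        return model(payload["text"])[0]
```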

### Manage

* Monitor API performance and track predictions.
* Update APIs with no downtime.
* Stream logs from APIs.
* Perform A/B tests.

### Scale

* Test locally, scale on your AWS account.
* Autoscale to handle production traffic.
* Reduce cost with spot instances.

<br>

## How it works

### Write APIs in Python

Define any real-time or batch inference pipeline as simple Python APIs, regardless of framework.

```python
# predictor.py

from transformers import pipeline

class PythonPredictor:
    def __init__(self, config):
        self.model = pipeline(task="text-generation")

    def predict(self, payload):
        return self.model(payload["text"])[0]
```
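
Once deployed, an API of this shape accepts a JSON body matching what `predict` expects. A minimal client sketch, using the placeholder endpoint shown later in this README:

```python
# client.py (illustrative)

import requests

# placeholder endpoint; `cortex get text-generator` prints the real one
endpoint = "https://example.com/text-generator"

response = requests.post(endpoint, json={"text": "machine learning is"})
print(response.json())
```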

<br>

### Configure infrastructure in YAML

Configure autoscaling, monitoring, compute resources, update strategies, and more.

```yaml
# cortex.yaml

- name: text-generator
  predictor:
    path: predictor.py
  networking:
    api_gateway: public
  compute:
    gpu: 1
  autoscaling:
    min_replicas: 3
```

<br>

### Scale to handle production traffic

Handle traffic with request-based autoscaling. Minimize spend with spot instances and multi-model APIs.

```bash
$ cortex get text-generator

endpoint: https://example.com/text-generator

status   last-update   replicas   requests   latency
live     10h           10         100000     100ms
```

<br>

### Integrate with your stack

Integrate Cortex with any data science platform and CI/CD tooling, without changing your workflow.

```python
# predictor.py

import tensorflow
import torch
import transformers
import mlflow

...
```
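
As one concrete illustration, a predictor can pull a model from an experiment-tracking platform at startup. This sketch assumes a model has already been logged to MLflow's model registry; the `models:/text-classifier/1` URI is a placeholder:

```python
# predictor.py (MLflow sketch)

import mlflow.pyfunc

class PythonPredictor:
    def __init__(self, config):
        # placeholder registry URI; point this at a real logged model
        self.model = mlflow.pyfunc.load_model("models:/text-classifier/1")

    def predict(self, payload):
        # input format depends on the logged model's signature
        return self.model.predict([payload["text"]])
```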

<br>

### Run on your AWS account

Run Cortex on your AWS account (GCP support is coming soon), maintaining control over resource utilization and data access.

```yaml
# cluster.yaml

region: us-west-2
instance_type: g4dn.xlarge
spot: true
min_instances: 1
max_instances: 5
```

<br>

### Focus on machine learning, not DevOps

You don't need to bring your own cluster or containerize your models; Cortex automates your cloud infrastructure.

```bash
$ cortex cluster up

configuring networking ...
configuring logging ...
configuring metrics ...
configuring autoscaling ...

cortex is ready!
```

<br>

## Get started

<!-- CORTEX_VERSION_README_MINOR -->
```bash
bash -c "$(curl -sS https://raw.githubusercontent.com/cortexlabs/cortex/0.20/get
```

<!-- CORTEX_VERSION_README_MINOR -->
See our [installation guide](https://docs.cortex.dev/install), then deploy one of our [examples](https://github.com/cortexlabs/cortex/tree/0.20/examples) or bring your own models to build [realtime APIs](https://docs.cortex.dev/deployments/realtime-api) and [batch APIs](https://docs.cortex.dev/deployments/batch-api).