Commit 42a85c0

Update README.md (#1416)
1 parent: c2033be

1 file changed: README.md (+111 −23 lines)
<br>

<!-- Delete on release branches -->
<!-- CORTEX_VERSION_README_MINOR -->

[install](https://docs.cortex.dev/install) • [documentation](https://docs.cortex.dev) • [examples](https://github.com/cortexlabs/cortex/tree/0.20/examples) • [we're hiring](https://angel.co/cortex-labs-inc/jobs) • [chat with us](https://gitter.im/cortexlabs/cortex)

<br>

# Model serving at scale

### Deploy

* Deploy TensorFlow, PyTorch, ONNX, scikit-learn, and other models.
* Define preprocessing and postprocessing steps in Python.
* Configure APIs as realtime or batch.
* Deploy multiple models per API (see the sketch below).
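
To illustrate the last point, here is a minimal sketch of a predictor serving two models behind one API. The `task` key in the payload and the choice of pipelines are illustrative assumptions, not part of Cortex's API:

```python
# predictor.py (multi-model sketch)

from transformers import pipeline

class PythonPredictor:
    def __init__(self, config):
        # load both models once, when the API starts (illustrative)
        self.models = {
            "generate": pipeline(task="text-generation"),
            "sentiment": pipeline(task="sentiment-analysis"),
        }

    def predict(self, payload):
        # route each request to a model via a hypothetical "task" field
        model = self.models[payload["task"]]
        return model(payload["text"])[0]
```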

### Manage

* Monitor API performance and track predictions.
* Update APIs with no downtime.
* Stream logs from APIs.
* Perform A/B tests.

### Scale

* Test locally, scale on your AWS account.
* Autoscale to handle production traffic.
* Reduce cost with spot instances.

<br>

## How it works

### Write APIs in Python

Define any real-time or batch inference pipeline as simple Python APIs, regardless of framework.

```python
# predictor.py

from transformers import pipeline

class PythonPredictor:
    def __init__(self, config):
        self.model = pipeline(task="text-generation")

    def predict(self, payload):
        return self.model(payload["text"])[0]
```
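
Once deployed, an API of this shape accepts a JSON body matching what `predict` expects. A minimal client sketch, using the placeholder endpoint shown later in this README:

```python
# client.py (illustrative)

import requests

# placeholder endpoint; `cortex get text-generator` prints the real one
endpoint = "https://example.com/text-generator"

response = requests.post(endpoint, json={"text": "machine learning is"})
print(response.json())
```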

<br>

### Configure infrastructure in YAML

Configure autoscaling, monitoring, compute resources, update strategies, and more.

```yaml
# cortex.yaml

- name: text-generator
  predictor:
    path: predictor.py
  networking:
    api_gateway: public
  compute:
    gpu: 1
  autoscaling:
    min_replicas: 3
```

<br>

### Scale to handle production traffic

Handle traffic with request-based autoscaling. Minimize spend with spot instances and multi-model APIs.

```bash
$ cortex get text-generator

endpoint: https://example.com/text-generator

status   last-update   replicas   requests   latency
live     10h           10         100000     100ms
```

<br>

### Integrate with your stack

Integrate Cortex with any data science platform and CI/CD tooling, without changing your workflow.

```python
# predictor.py

import tensorflow
import torch
import transformers
import mlflow

...
```
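
As one concrete illustration, a predictor can pull a model from an experiment-tracking platform at startup. This sketch assumes a model has already been logged to MLflow's model registry; the `models:/text-classifier/1` URI is a placeholder:

```python
# predictor.py (MLflow sketch)

import mlflow.pyfunc

class PythonPredictor:
    def __init__(self, config):
        # placeholder registry URI; point this at a real logged model
        self.model = mlflow.pyfunc.load_model("models:/text-classifier/1")

    def predict(self, payload):
        # input format depends on the logged model's signature
        return self.model.predict([payload["text"]])
```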

<br>

### Run on your AWS account

Run Cortex on your AWS account (GCP support is coming soon), maintaining control over resource utilization and data access.

```yaml
# cluster.yaml

region: us-west-2
instance_type: g4dn.xlarge
spot: true
min_instances: 1
max_instances: 5
```

<br>

### Focus on machine learning, not DevOps

You don't need to bring your own cluster or containerize your models; Cortex automates your cloud infrastructure.

```bash
$ cortex cluster up

configuring networking ...
configuring logging ...
configuring metrics ...
configuring autoscaling ...

cortex is ready!
```

<br>

## Get started

<!-- CORTEX_VERSION_README_MINOR -->
```bash
bash -c "$(curl -sS https://raw.githubusercontent.com/cortexlabs/cortex/0.20/get
```

<!-- CORTEX_VERSION_README_MINOR -->
See our [installation guide](https://docs.cortex.dev/install), then deploy one of our [examples](https://github.com/cortexlabs/cortex/tree/0.20/examples) or bring your own models to build [realtime APIs](https://docs.cortex.dev/deployments/realtime-api) and [batch APIs](https://docs.cortex.dev/deployments/batch-api).