
Commit fd2d99c

Update README.md (#1175)
(cherry picked from commit 6de67ad)
1 parent 457bbb3

1 file changed: +25 −121 lines

README.md

Lines changed: 25 additions & 121 deletions
````diff
@@ -1,146 +1,50 @@
 # Build machine learning APIs
 
-<br>
-
-<!-- Set header Cache-Control=no-cache on the S3 object metadata (see https://help.github.com/en/articles/about-anonymized-image-urls) -->
-![Demo](https://d1zqebknpdh033.cloudfront.net/demo/gif/v0.13_2.gif)
+Cortex makes deploying, scaling, and managing machine learning systems in production simple. We believe that developers in any organization should be able to add natural language processing, computer vision, and other machine learning capabilities to their applications without having to worry about infrastructure.
 
 <br>
 
 ## Key features
 
````
````diff
-* **Multi framework:** deploy TensorFlow, PyTorch, scikit-learn, and other models.
-* **Autoscaling:** automatically scale APIs to handle production workloads.
-* **ML instances:** run inference on G4, P2, Inf1, M5, C5 and other AWS instance types.
-* **Spot instances:** save money with spot instances.
-* **Multi-model APIs:** deploy multiple models in a single API.
-* **Rolling updates:** update deployed APIs with no downtime.
-* **Log streaming:** stream logs from deployed models to your CLI.
-* **Prediction monitoring:** monitor API performance and prediction results.
-
-<br>
-
-## Deploying a model
+### Deploy
 
-### Install the CLI
+* Run Cortex locally or as a production cluster on your AWS account.
+* Deploy TensorFlow, PyTorch, scikit-learn, and other models as web APIs.
+* Define preprocessing and postprocessing steps in Python.
 
````
````diff
-<!-- CORTEX_VERSION_README_MINOR -->
-```bash
-$ bash -c "$(curl -sS https://raw.githubusercontent.com/cortexlabs/cortex/0.18/get-cli.sh)"
-```
-
-### Implement your predictor
+### Manage
 
-```python
-# predictor.py
-
-class PythonPredictor:
-    def __init__(self, config):
-        self.model = download_model()
-
-    def predict(self, payload):
-        return self.model.predict(payload["text"])
-```
+* Update APIs with no downtime.
+* Stream logs from your APIs to your CLI.
+* Monitor API performance and track predictions.
````
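The `PythonPredictor` removed above is a fragment: `download_model` is never defined, so it cannot run as shown. A minimal runnable sketch of the same interface, where the "model" is a hypothetical keyword matcher standing in for real weights (an assumption for illustration, not Cortex code):

```python
# predictor.py -- runnable sketch of the Predictor interface shown in the
# deleted quickstart. DummySentimentModel and download_model are stand-ins
# so the example runs anywhere without trained weights.

class DummySentimentModel:
    """Hypothetical keyword-based sentiment model (not a real model)."""
    def predict(self, text):
        return "positive" if "cool" in text.lower() else "negative"

def download_model():
    # A real predictor would fetch trained weights (e.g. from S3) here.
    return DummySentimentModel()

class PythonPredictor:
    def __init__(self, config):
        # config holds deployment-time options from the API configuration;
        # unused in this sketch.
        self.model = download_model()

    def predict(self, payload):
        # payload is the parsed JSON request body.
        return self.model.predict(payload["text"])
```

Instantiating `PythonPredictor({})` and calling `predict({"text": ...})` mirrors the request flow in the deleted curl example.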

````diff
-### Configure your deployment
+### Scale
 
-```yaml
-# cortex.yaml
-
-- name: sentiment-classifier
-  predictor:
-    type: python
-    path: predictor.py
-  compute:
-    gpu: 1
-    mem: 4G
-```
-
-### Deploy your model
-
-```bash
-$ cortex deploy
-
-creating sentiment-classifier
-```
-
-### Serve predictions
-
-```bash
-$ curl http://localhost:8888 \
-    -X POST -H "Content-Type: application/json" \
-    -d '{"text": "serving models locally is cool!"}'
-
-positive
-```
+* Automatically scale APIs to handle production traffic.
+* Reduce your cloud infrastructure spend with spot instances.
+* Maximize resource utilization by deploying multiple models per API.
 
````
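The deleted curl example assumes a predictor already serving on `localhost:8888`; Cortex itself wires the predictor to a web server and load balancer. The round trip can be sketched with the standard library alone (the server, port, and keyword predictor here are illustrative assumptions, not Cortex internals):

```python
# Loopback sketch of the request/response cycle behind the deleted
# "Serve predictions" section, using only the standard library.
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

class PythonPredictor:
    """Hypothetical keyword-based predictor (stand-in for a real model)."""
    def predict(self, payload):
        return "positive" if "cool" in payload["text"].lower() else "negative"

predictor = PythonPredictor()

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON body and hand it to the predictor, as Cortex's
        # serving layer would.
        body = self.rfile.read(int(self.headers["Content-Length"]))
        result = predictor.predict(json.loads(body))
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(result.encode())

    def log_message(self, *args):
        pass  # silence per-request logging

def serve_in_background(port):
    server = HTTPServer(("127.0.0.1", port), PredictHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

def predict_over_http(text, port):
    req = Request(
        f"http://127.0.0.1:{port}",
        data=json.dumps({"text": text}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(req) as resp:
        return resp.read().decode()
```

Posting `{"text": "serving models locally is cool!"}` to this server returns `positive`, matching the deleted curl transcript.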

````diff
 <br>
 
-## Deploying models at scale
-
-### Spin up a cluster
+## How it works
 
-Cortex clusters are designed to be self-hosted on any AWS account:
-
-```bash
-$ cortex cluster up
-
-aws region: us-east-1
-aws instance type: g4dn.xlarge
-spot instances: yes
-min instances: 0
-max instances: 5
-
-your cluster will cost $0.19 - $2.85 per hour based on cluster size and spot instance pricing/availability
-
-○ spinning up your cluster ...
-
-your cluster is ready!
-```
-
-### Deploy to your cluster with the same code and configuration
-
-```bash
-$ cortex deploy --env aws
-
-creating sentiment-classifier
-```
-
-### Serve predictions at scale
+<!-- Set header Cache-Control=no-cache on the S3 object metadata (see https://help.github.com/en/articles/about-anonymized-image-urls) -->
+![Demo](https://d1zqebknpdh033.cloudfront.net/demo/gif/v0.13_2.gif)
 
````

````diff
-```bash
-$ curl http://***.amazonaws.com/sentiment-classifier \
-    -X POST -H "Content-Type: application/json" \
-    -d '{"text": "serving models at scale is really cool!"}'
+<br>
 
-positive
-```
+## Get started
 
-### Monitor your deployment
+### Install
 
+<!-- CORTEX_VERSION_README_MINOR -->
 ```bash
-$ cortex get sentiment-classifier
-
-status   up-to-date   requested   last update   avg request   2XX
-live     1            1           8s            24ms          12
-
-class     count
-positive  8
-negative  4
+bash -c "$(curl -sS https://raw.githubusercontent.com/cortexlabs/cortex/0.18/get-cli.sh)"
 ```

````diff
-### How it works
-
-The CLI sends configuration and code to the cluster every time you run `cortex deploy`. Each model is loaded into a Docker container, along with any Python packages and request handling code. The model is exposed as a web service using a Network Load Balancer (NLB) and FastAPI / TensorFlow Serving / ONNX Runtime (depending on the model type). The containers are orchestrated on Elastic Kubernetes Service (EKS) while logs and metrics are streamed to CloudWatch.
-
-Cortex manages its own Kubernetes cluster so that end-to-end functionality like request-based autoscaling, GPU support, and spot instance management can work out of the box without any additional DevOps work.
-
-<br>
+<!-- CORTEX_VERSION_README_MINOR -->
+See our [installation guide](https://docs.cortex.dev/install), then deploy one of our [examples](https://github.com/cortexlabs/cortex/tree/0.18/examples) or bring your own models to build [custom APIs](https://docs.cortex.dev/deployments/exporting).
 
-## Examples
+### Learn more
 
-<!-- CORTEX_VERSION_README_MINOR x4 -->
-* [Image classification](https://github.com/cortexlabs/cortex/tree/0.18/examples/tensorflow/image-classifier): deploy an Inception model to classify images.
-* [Search completion](https://github.com/cortexlabs/cortex/tree/0.18/examples/pytorch/search-completer): deploy Facebook's RoBERTa model to complete search terms.
-* [Text generation](https://github.com/cortexlabs/cortex/tree/0.18/examples/pytorch/text-generator): deploy Hugging Face's DistilGPT2 model to generate text.
-* See [all examples](https://github.com/cortexlabs/cortex/tree/0.18/examples)
+Check out our [docs](https://docs.cortex.dev) and join our [community](https://gitter.im/cortexlabs/cortex).
````
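The deleted "How it works" prose mentions request-based autoscaling between min and max instance counts. The core idea reduces to simple replica math: pick the replica count that keeps in-flight requests per replica near a target concurrency, clamped to configured bounds. The formula and numbers below are an illustrative assumption, not Cortex's actual algorithm:

```python
# Illustrative sketch of request-based autoscaling: choose a replica count
# so per-replica load stays near a target concurrency, clamped to the
# configured min/max (like "min instances: 0" / "max instances: 5" above).
# This is an assumption for illustration, not Cortex's implementation.
import math

def desired_replicas(in_flight, target_concurrency, min_replicas, max_replicas):
    """Replicas needed to keep in-flight requests per replica at target."""
    if target_concurrency <= 0:
        raise ValueError("target_concurrency must be positive")
    raw = math.ceil(in_flight / target_concurrency)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, raw))
```

For example, 25 in-flight requests at a target concurrency of 10 per replica calls for 3 replicas; a burst of 1000 requests is clamped at the 5-replica maximum.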

0 commit comments