Commit 30d6e76

Update tutorial to use pytorch/text-generator (#1278)
1 parent 213fc3c commit 30d6e76

37 files changed: +375 -695 lines changed


docs/cluster-management/install.md

Lines changed: 10 additions & 10 deletions
@@ -18,8 +18,8 @@ You must have [Docker](https://docs.docker.com/install) installed to run Cortex
 # clone the Cortex repository
 git clone -b master https://github.com/cortexlabs/cortex.git
 
-# navigate to the TensorFlow iris classification example
-cd cortex/examples/tensorflow/iris-classifier
+# navigate to the PyTorch text generator example
+cd cortex/examples/pytorch/text-generator
 
 # deploy the model as a realtime api
 cortex deploy
@@ -28,18 +28,18 @@ cortex deploy
 cortex get --watch
 
 # stream logs from the api
-cortex logs iris-classifier
+cortex logs text-generator
 
 # get the api's endpoint
-cortex get iris-classifier
+cortex get text-generator
 
 # classify a sample
-curl -X POST -H "Content-Type: application/json" \
-  -d '{ "sepal_length": 5.2, "sepal_width": 3.6, "petal_length": 1.4, "petal_width": 0.3 }' \
-  <API endpoint>
+curl <API endpoint> \
+  -X POST -H "Content-Type: application/json" \
+  -d '{"text": "machine learning is"}'
 
 # delete the api
-cortex delete iris-classifier
+cortex delete text-generator
 ```
@@ -56,12 +56,12 @@ cortex cluster up
 cortex env default aws
 ```
 
-You can now run the same commands shown above to deploy the iris classifier to AWS (if you didn't set the default CLI environment, add `--env aws` to the `cortex` commands).
+You can now run the same commands shown above to deploy the text generator to AWS (if you didn't set the default CLI environment, add `--env aws` to the `cortex` commands).
 
 ## Next steps
 
 <!-- CORTEX_VERSION_MINOR -->
-* Try the [tutorial](../../examples/sklearn/iris-classifier/README.md) to learn more about how to use Cortex.
+* Try the [tutorial](../../examples/pytorch/text-generator/README.md) to learn more about how to use Cortex.
 * Deploy one of our [examples](https://github.com/cortexlabs/cortex/tree/master/examples).
 * See our [exporting guide](../guides/exporting.md) for how to export your model to use in an API.
 * See [uninstall](uninstall.md) if you'd like to spin down your cluster.
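
For readers following the updated quickstart, the `curl` call above maps directly onto any HTTP client. A minimal sketch in Python (assuming the `requests` package; the endpoint URL is a placeholder for the one printed by `cortex get text-generator`):

```python
import requests

# placeholder: substitute the endpoint printed by `cortex get text-generator`
endpoint = "http://***.amazonaws.com/text-generator"

# same JSON body as the curl example above
response = requests.post(endpoint, json={"text": "machine learning is"})
response.raise_for_status()
print(response.text)  # the generated continuation of the prompt
```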

docs/deployments/batch-api/api-configuration.md

Lines changed: 2 additions & 2 deletions
@@ -41,7 +41,7 @@ See additional documentation for [compute](../compute.md), [networking](../netwo
     model_path: <string> # S3 path to an exported model (e.g. s3://my-bucket/exported_model) (either this or 'models' must be provided)
     signature_key: <string> # name of the signature def to use for prediction (required if your model has more than one signature def)
     models: # use this when multiple models per API are desired (either this or 'model_path' must be provided)
-      - name: <string> # unique name for the model (e.g. iris-classifier) (required)
+      - name: <string> # unique name for the model (e.g. text-generator) (required)
         model_path: <string> # S3 path to an exported model (e.g. s3://my-bucket/exported_model) (required)
         signature_key: <string> # name of the signature def to use for prediction (required if your model has more than one signature def)
         ...
@@ -75,7 +75,7 @@ See additional documentation for [compute](../compute.md), [networking](../netwo
     path: <string> # path to a python file with an ONNXPredictor class definition, relative to the Cortex root (required)
     model_path: <string> # S3 path to an exported model (e.g. s3://my-bucket/exported_model.onnx) (either this or 'models' must be provided)
     models: # use this when multiple models per API are desired (either this or 'model_path' must be provided)
-      - name: <string> # unique name for the model (e.g. iris-classifier) (required)
+      - name: <string> # unique name for the model (e.g. text-generator) (required)
         model_path: <string> # S3 path to an exported model (e.g. s3://my-bucket/exported_model.onnx) (required)
         signature_key: <string> # name of the signature def to use for prediction (required if your model has more than one signature def)
         ...

docs/deployments/batch-api/deployment.md

Lines changed: 1 addition & 1 deletion
@@ -122,6 +122,6 @@ deleting my-api
 ## Additional resources
 
 <!-- CORTEX_VERSION_MINOR -->
-* [Tutorial](../../../examples/batch/image-classifier/README.md) provides a step-by-step walkthrough of deploying an iris classifier API
+* [Tutorial](../../../examples/batch/image-classifier/README.md) provides a step-by-step walkthrough of deploying an image classification batch API
 * [CLI documentation](../../miscellaneous/cli.md) lists all CLI commands
 * [Examples](https://github.com/cortexlabs/cortex/tree/master/examples/batch) demonstrate how to deploy models from common ML libraries

docs/deployments/batch-api/predictors.md

Lines changed: 3 additions & 3 deletions
@@ -22,7 +22,7 @@ The following files can also be added at the root of the project's directory:
 For example, if your directory looks like this:
 
 ```text
-./iris-classifier/
+./my-classifier/
 ├── cortex.yaml
 ├── values.json
 ├── predictor.py
@@ -191,7 +191,7 @@ class TensorFlowPredictor:
 <!-- CORTEX_VERSION_MINOR -->
 Cortex provides a `tensorflow_client` to your Predictor's constructor. `tensorflow_client` is an instance of [TensorFlowClient](https://github.com/cortexlabs/cortex/tree/master/pkg/workloads/cortex/lib/client/tensorflow.py) that manages a connection to a TensorFlow Serving container to make predictions using your model. It should be saved as an instance variable in your Predictor, and your `predict()` function should call `tensorflow_client.predict()` to make an inference with your exported TensorFlow model. Preprocessing of the JSON payload and postprocessing of predictions can be implemented in your `predict()` function as well.
 
-When multiple models are defined using the Predictor's `models` field, the `tensorflow_client.predict()` method expects a second argument `model_name` which must hold the name of the model that you want to use for inference (for example: `self.client.predict(payload, "iris-classifier")`). See the [multi model guide](../../guides/multi-model.md#tensorflow-predictor) for more information.
+When multiple models are defined using the Predictor's `models` field, the `tensorflow_client.predict()` method expects a second argument `model_name` which must hold the name of the model that you want to use for inference (for example: `self.client.predict(payload, "text-generator")`). See the [multi model guide](../../guides/multi-model.md#tensorflow-predictor) for more information.
 
 For proper separation of concerns, it is recommended to use the constructor's `config` parameter for information such as from where to download the model and initialization files, or any configurable model parameters. You define `config` in your [API configuration](api-configuration.md), and it is passed through to your Predictor's constructor. The `config` parameters in the `API configuration` can be overridden by providing `config` in the job submission requests.
 
@@ -260,7 +260,7 @@ class ONNXPredictor:
 <!-- CORTEX_VERSION_MINOR -->
 Cortex provides an `onnx_client` to your Predictor's constructor. `onnx_client` is an instance of [ONNXClient](https://github.com/cortexlabs/cortex/tree/master/pkg/workloads/cortex/lib/client/onnx.py) that manages an ONNX Runtime session to make predictions using your model. It should be saved as an instance variable in your Predictor, and your `predict()` function should call `onnx_client.predict()` to make an inference with your exported ONNX model. Preprocessing of the JSON payload and postprocessing of predictions can be implemented in your `predict()` function as well.
 
-When multiple models are defined using the Predictor's `models` field, the `onnx_client.predict()` method expects a second argument `model_name` which must hold the name of the model that you want to use for inference (for example: `self.client.predict(model_input, "iris-classifier")`). See the [multi model guide](../../guides/multi-model.md#onnx-predictor) for more information.
+When multiple models are defined using the Predictor's `models` field, the `onnx_client.predict()` method expects a second argument `model_name` which must hold the name of the model that you want to use for inference (for example: `self.client.predict(model_input, "text-generator")`). See the [multi model guide](../../guides/multi-model.md#onnx-predictor) for more information.
 
 For proper separation of concerns, it is recommended to use the constructor's `config` parameter for information such as from where to download the model and initialization files, or any configurable model parameters. You define `config` in your [API configuration](api-configuration.md), and it is passed through to your Predictor's constructor. The `config` parameters in the `API configuration` can be overridden by providing `config` in the job submission requests.
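
To make the multi-model calls above concrete, here is a minimal sketch of a batch `TensorFlowPredictor` that selects a named model; the constructor signature follows the docs above, while `text-generator` is simply the example model name used throughout this commit:

```python
class TensorFlowPredictor:
    def __init__(self, tensorflow_client, config):
        # save the client that Cortex passes in; it proxies requests
        # to the TensorFlow Serving container
        self.client = tensorflow_client

    def predict(self, payload):
        # the second argument selects among the models declared in the
        # API configuration's `models` list
        return self.client.predict(payload, "text-generator")
```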

docs/deployments/python-packages.md

Lines changed: 3 additions & 3 deletions
@@ -7,7 +7,7 @@ _WARNING: you are on the master branch, please refer to the docs on the branch t
 You can install your required PyPI packages and import them in your Python files using pip. Cortex looks for a `requirements.txt` file in the top level Cortex project directory (i.e. the directory which contains `cortex.yaml`):
 
 ```text
-./iris-classifier/
+./my-classifier/
 ├── cortex.yaml
 ├── predictor.py
 ├── ...
@@ -56,7 +56,7 @@ On GitHub, you can generate a personal access token by following [these steps](h
 Python packages can also be installed by providing a `setup.py` that describes your project's modules. Here's an example directory structure:
 
 ```text
-./iris-classifier/
+./my-classifier/
 ├── cortex.yaml
 ├── predictor.py
 ├── ...
@@ -78,7 +78,7 @@ In this case, `requirements.txt` will have this form:
 Cortex supports installing Conda packages. We recommend only using Conda when your required packages are not available in PyPI. Cortex looks for a `conda-packages.txt` file in the top level Cortex project directory (i.e. the directory which contains `cortex.yaml`):
 
 ```text
-./iris-classifier/
+./my-classifier/
 ├── cortex.yaml
 ├── predictor.py
 ├── ...
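
Since these hunks only show directory layouts, a minimal `setup.py` for the `./my-classifier/` example might look like the sketch below (the package name and module discovery are illustrative assumptions, not part of this commit); `requirements.txt` would then reference the local package, as the surrounding docs describe:

```python
# minimal setup.py sketch for the ./my-classifier/ layout above
# (the name and packages are illustrative assumptions)
from setuptools import find_packages, setup

setup(
    name="my-classifier-helpers",
    version="0.1.0",
    packages=find_packages(),  # picks up helper packages next to predictor.py
)
```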

docs/deployments/realtime-api.md

Lines changed: 1 addition & 1 deletion
@@ -40,7 +40,7 @@ The Cortex Cluster will automatically scale based on the incoming traffic and th
 
 ## Next steps
 
-* Try the [tutorial](../../examples/sklearn/iris-classifier/README.md) to deploy a Realtime API locally or on AWS.
+* Try the [tutorial](../../examples/pytorch/text-generator/README.md) to deploy a Realtime API locally or on AWS.
 * See our [exporting guide](../guides/exporting.md) for how to export your model to use in a Realtime API.
 * See the [Predictor docs](realtime-api/predictors.md) for how to implement a Predictor class.
 * See the [API configuration docs](realtime-api/api-configuration.md) for a full list of features that can be used to deploy your Realtime API.

docs/deployments/realtime-api/api-configuration.md

Lines changed: 2 additions & 2 deletions
@@ -63,7 +63,7 @@ See additional documentation for [parallelism](parallelism.md), [autoscaling](au
     model_path: <string> # S3 path to an exported model (e.g. s3://my-bucket/exported_model) (either this or 'models' must be provided)
     signature_key: <string> # name of the signature def to use for prediction (required if your model has more than one signature def)
     models: # use this when multiple models per API are desired (either this or 'model_path' must be provided)
-      - name: <string> # unique name for the model (e.g. iris-classifier) (required)
+      - name: <string> # unique name for the model (e.g. text-generator) (required)
         model_path: <string> # S3 path to an exported model (e.g. s3://my-bucket/exported_model) (required)
         signature_key: <string> # name of the signature def to use for prediction (required if your model has more than one signature def)
         ...
@@ -119,7 +119,7 @@ See additional documentation for [parallelism](parallelism.md), [autoscaling](au
     path: <string> # path to a python file with an ONNXPredictor class definition, relative to the Cortex root (required)
     model_path: <string> # S3 path to an exported model (e.g. s3://my-bucket/exported_model.onnx) (either this or 'models' must be provided)
     models: # use this when multiple models per API are desired (either this or 'model_path' must be provided)
-      - name: <string> # unique name for the model (e.g. iris-classifier) (required)
+      - name: <string> # unique name for the model (e.g. text-generator) (required)
         model_path: <string> # S3 path to an exported model (e.g. s3://my-bucket/exported_model.onnx) (required)
         signature_key: <string> # name of the signature def to use for prediction (required if your model has more than one signature def)
         ...

docs/deployments/realtime-api/deployment.md

Lines changed: 2 additions & 2 deletions
@@ -26,7 +26,7 @@ $ cortex get my-api
 status   up-to-date   requested   last update   avg request   2XX
 live     1            1           1m            -             -
 
-endpoint: http://***.amazonaws.com/iris-classifier
+endpoint: http://***.amazonaws.com/text-generator
 ...
 ```
 
@@ -63,6 +63,6 @@ deleting my-api
 ## Additional resources
 
 <!-- CORTEX_VERSION_MINOR -->
-* [Tutorial](../../../examples/sklearn/iris-classifier/README.md) provides a step-by-step walkthrough of deploying an iris classifier API
+* [Tutorial](../../../examples/pytorch/text-generator/README.md) provides a step-by-step walkthrough of deploying a text generation API
 * [CLI documentation](../../miscellaneous/cli.md) lists all CLI commands
 * [Examples](https://github.com/cortexlabs/cortex/tree/master/examples) demonstrate how to deploy models from common ML libraries

docs/deployments/realtime-api/predictors.md

Lines changed: 13 additions & 36 deletions
@@ -24,7 +24,7 @@ The following files can also be added at the root of the project's directory:
 For example, if your directory looks like this:
 
 ```text
-./iris-classifier/
+./my-classifier/
 ├── cortex.yaml
 ├── values.json
 ├── predictor.py
@@ -97,48 +97,25 @@ Your `predictor` method can return different types of objects such as `JSON`-par
 Many of the [examples](https://github.com/cortexlabs/cortex/tree/master/examples) use the Python Predictor, including all of the PyTorch examples.
 
 <!-- CORTEX_VERSION_MINOR -->
-Here is the Predictor for [examples/pytorch/iris-classifier](https://github.com/cortexlabs/cortex/tree/master/examples/pytorch/iris-classifier):
+Here is the Predictor for [examples/pytorch/text-generator](https://github.com/cortexlabs/cortex/tree/master/examples/pytorch/text-generator):
 
 ```python
-import re
 import torch
-import boto3
-from model import IrisNet
+from transformers import GPT2Tokenizer, GPT2LMHeadModel
 
-labels = ["setosa", "versicolor", "virginica"]
 
 class PythonPredictor:
     def __init__(self, config):
-        # download the model
-        bucket, key = re.match("s3://(.+?)/(.+)", config["model"]).groups()
-        s3 = boto3.client("s3")
-        s3.download_file(bucket, key, "model.pth")
-
-        # initialize the model
-        model = IrisNet()
-        model.load_state_dict(torch.load("model.pth"))
-        model.eval()
-
-        self.model = model
+        self.device = "cuda" if torch.cuda.is_available() else "cpu"
+        print(f"using device: {self.device}")
+        self.tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
+        self.model = GPT2LMHeadModel.from_pretrained("gpt2").to(self.device)
 
     def predict(self, payload):
-        # Convert the request to a tensor and pass it into the model
-        input_tensor = torch.FloatTensor(
-            [
-                [
-                    payload["sepal_length"],
-                    payload["sepal_width"],
-                    payload["petal_length"],
-                    payload["petal_width"],
-                ]
-            ]
-        )
-
-        # Run the prediction
-        output = self.model(input_tensor)
-
-        # Translate the model output to the corresponding label string
-        return labels[torch.argmax(output[0])]
+        input_length = len(payload["text"].split())
+        tokens = self.tokenizer.encode(payload["text"], return_tensors="pt").to(self.device)
+        prediction = self.model.generate(tokens, max_length=input_length + 20, do_sample=True)
+        return self.tokenizer.decode(prediction[0])
 ```
 
 ### Pre-installed packages
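
The rewritten example can also be exercised outside Cortex. A quick local smoke test might look like this, assuming the class above is saved as `predictor.py` (the module name is taken from the tutorial's directory layout) and that `torch` and `transformers` are installed:

```python
# local smoke test for the text generator example above;
# `predictor` is the module name used in the tutorial's directory layout
from predictor import PythonPredictor

predictor = PythonPredictor(config={})  # this example does not read `config`
print(predictor.predict({"text": "machine learning is"}))
```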
@@ -256,7 +256,7 @@ class TensorFlowPredictor:
 <!-- CORTEX_VERSION_MINOR -->
 Cortex provides a `tensorflow_client` to your Predictor's constructor. `tensorflow_client` is an instance of [TensorFlowClient](https://github.com/cortexlabs/cortex/tree/master/pkg/workloads/cortex/lib/client/tensorflow.py) that manages a connection to a TensorFlow Serving container to make predictions using your model. It should be saved as an instance variable in your Predictor, and your `predict()` function should call `tensorflow_client.predict()` to make an inference with your exported TensorFlow model. Preprocessing of the JSON payload and postprocessing of predictions can be implemented in your `predict()` function as well.
 
-When multiple models are defined using the Predictor's `models` field, the `tensorflow_client.predict()` method expects a second argument `model_name` which must hold the name of the model that you want to use for inference (for example: `self.client.predict(payload, "iris-classifier")`). See the [multi model guide](../../guides/multi-model.md#tensorflow-predictor) for more information.
+When multiple models are defined using the Predictor's `models` field, the `tensorflow_client.predict()` method expects a second argument `model_name` which must hold the name of the model that you want to use for inference (for example: `self.client.predict(payload, "text-generator")`). See the [multi model guide](../../guides/multi-model.md#tensorflow-predictor) for more information.
 
 For proper separation of concerns, it is recommended to use the constructor's `config` parameter for information such as configurable model parameters or download links for initialization files. You define `config` in your [API configuration](api-configuration.md), and it is passed through to your Predictor's constructor.
 
@@ -352,7 +352,7 @@ class ONNXPredictor:
 <!-- CORTEX_VERSION_MINOR -->
 Cortex provides an `onnx_client` to your Predictor's constructor. `onnx_client` is an instance of [ONNXClient](https://github.com/cortexlabs/cortex/tree/master/pkg/workloads/cortex/lib/client/onnx.py) that manages an ONNX Runtime session to make predictions using your model. It should be saved as an instance variable in your Predictor, and your `predict()` function should call `onnx_client.predict()` to make an inference with your exported ONNX model. Preprocessing of the JSON payload and postprocessing of predictions can be implemented in your `predict()` function as well.
 
-When multiple models are defined using the Predictor's `models` field, the `onnx_client.predict()` method expects a second argument `model_name` which must hold the name of the model that you want to use for inference (for example: `self.client.predict(model_input, "iris-classifier")`). See the [multi model guide](../../guides/multi-model.md#onnx-predictor) for more information.
+When multiple models are defined using the Predictor's `models` field, the `onnx_client.predict()` method expects a second argument `model_name` which must hold the name of the model that you want to use for inference (for example: `self.client.predict(model_input, "text-generator")`). See the [multi model guide](../../guides/multi-model.md#onnx-predictor) for more information.
 
 For proper separation of concerns, it is recommended to use the constructor's `config` parameter for information such as configurable model parameters or download links for initialization files. You define `config` in your [API configuration](api-configuration.md), and it is passed through to your Predictor's constructor.
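
As a variation on the multi-model pattern, the `config` parameter described above could drive the model choice at request time. A sketch under the assumption of an illustrative `default_model` key in the API configuration's `config` section (not part of this commit):

```python
class ONNXPredictor:
    def __init__(self, onnx_client, config):
        # the client Cortex passes in wraps an ONNX Runtime session
        self.client = onnx_client
        # `default_model` is an illustrative config key, not from this commit
        self.default_model = config.get("default_model", "text-generator")

    def predict(self, payload):
        # route to a named model, falling back to the configured default
        model_name = payload.get("model", self.default_model)
        return self.client.predict(payload, model_name)
```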

docs/deployments/realtime-api/traffic-splitter.md

Lines changed: 1 addition & 1 deletion
@@ -78,5 +78,5 @@ Note that this will not delete the Realtime APIs targeted by the Traffic Splitte
 
 <!-- CORTEX_VERSION_MINOR -->
 * [Traffic Splitter Tutorial](../../../examples/traffic-splitter/README.md) provides a step-by-step walkthrough for deploying a Traffic Splitter
-* [Realtime API Tutorial](../../../examples/sklearn/iris-classifier/README.md) provides a step-by-step walkthrough of deploying an iris classifier Realtime API
+* [Realtime API Tutorial](../../../examples/pytorch/text-generator/README.md) provides a step-by-step walkthrough of deploying a realtime API for text generation
 * [CLI documentation](../../miscellaneous/cli.md) lists all CLI commands
