Commit 459a25f

Edit docker rate limit guide; add separate guide for self-hosted docker images (#1579)
1 parent ae5b8cc commit 459a25f

4 files changed

+273
-124
lines changed

docs/guides/docker-hub-rate-limiting.md

Lines changed: 128 additions & 119 deletions
@@ -19,147 +19,156 @@ Follow these steps to determine if this issue is affecting your cluster:
1919
5. `kubectl describe pod <pod id> --namespace <pod namespace>`
2020
6. Under the events section, if you see error events related to docker hub rate limiting, then your cluster is likely affected by the rate limiting
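
Steps 5 and 6 can be condensed into a quick scan of recent events. The following is a minimal sketch and not part of the original guide: the helper name `find_rate_limit_events` is hypothetical, and the pattern assumes the event message carries Docker Hub's `toomanyrequests` / "pull rate limit" wording.

```bash
# Hypothetical helper: keeps only the lines on stdin that look like
# Docker Hub rate limit errors (assumes the "toomanyrequests" or
# "pull rate limit" wording appears in the event message).
find_rate_limit_events() {
  grep -i -E 'toomanyrequests|pull rate limit'
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get events --all-namespaces | find_rate_limit_events
```

If the filter prints nothing, the events currently retained by the cluster do not match the pattern; that does not rule out rate limiting on older pulls.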
2121

22-
We are actively working on a long term resolution to this problem. In the meantime, there are two ways to avoid this issue:
22+
There are three ways to avoid this issue:
2323

24-
## Paid Docker Hub subscription
24+
## Use Cortex images from quay.io
2525

26-
One option is to pay for the Docker Hub subscription to remove the limit on the number of image pulls. Docker Hub's updated pricing model allows unlimited pulls on a _Pro_ subscription for individuals as described [here](https://www.docker.com/pricing).
26+
In response to Docker Hub's new image pull policy, we have migrated our images to [quay.io](https://quay.io). This registry allows for unlimited image pulls for unauthenticated users.
2727

28-
By default, the Cortex cluster pulls the images as an anonymous user. Follow [this guide](private-docker.md) to configure your Cortex cluster to pull the images as an authenticated user.
28+
It is possible to configure Cortex to use the images from Quay instead of Docker Hub:
2929

30-
## Push to AWS ECR (Elastic Container Registry)
30+
### Update your cluster configuration file
31+
32+
Add the following to your [cluster configuration file](../cluster-management/config.md) (e.g. `cluster.yaml`). In the image paths below, make sure to set `<VERSION>` to your cluster's version.
33+
34+
```yaml
35+
# cluster.yaml
36+
37+
image_manager: quay.io/cortexlabs/manager:<VERSION>
38+
image_operator: quay.io/cortexlabs/operator:<VERSION>
39+
image_downloader: quay.io/cortexlabs/downloader:<VERSION>
40+
image_request_monitor: quay.io/cortexlabs/request-monitor:<VERSION>
41+
image_cluster_autoscaler: quay.io/cortexlabs/cluster-autoscaler:<VERSION>
42+
image_metrics_server: quay.io/cortexlabs/metrics-server:<VERSION>
43+
image_nvidia: quay.io/cortexlabs/nvidia:<VERSION>
44+
image_inferentia: quay.io/cortexlabs/inferentia:<VERSION>
45+
image_neuron_rtd: quay.io/cortexlabs/neuron-rtd:<VERSION>
46+
image_fluentd: quay.io/cortexlabs/fluentd:<VERSION>
47+
image_statsd: quay.io/cortexlabs/statsd:<VERSION>
48+
image_istio_proxy: quay.io/cortexlabs/istio-proxy:<VERSION>
49+
image_istio_pilot: quay.io/cortexlabs/istio-pilot:<VERSION>
50+
```
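
Rather than editing `<VERSION>` by hand in each line, the entries above could be generated. This is a sketch under stated assumptions: `print_cluster_images` is a hypothetical helper, the image names are copied from the list above, and `0.22.1` in the usage comment is only an example version.

```bash
# Hypothetical helper: prints the cluster.yaml image entries for one version.
# Image names are the ones listed above; hyphens in an image name become
# underscores in the corresponding configuration key.
print_cluster_images() {
  version="$1"
  for image in manager operator downloader request-monitor cluster-autoscaler \
    metrics-server nvidia inferentia neuron-rtd fluentd statsd istio-proxy istio-pilot; do
    key="image_$(printf '%s' "$image" | tr '-' '_')"
    echo "${key}: quay.io/cortexlabs/${image}:${version}"
  done
}

# Example (the version is a placeholder; use your cluster's version):
#   print_cluster_images 0.22.1
```

Review the printed lines before pasting them into your cluster configuration file.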
51+
52+
For cluster version <= `0.20.0`, also add the following two images:
53+
54+
```yaml
55+
image_istio_galley: quay.io/cortexlabs/istio-galley:<VERSION>
56+
image_istio_citadel: quay.io/cortexlabs/istio-citadel:<VERSION>
57+
```
58+
59+
For Cortex cluster version < `0.16.0`, please upgrade your cluster to the latest version.
60+
61+
Once you've updated your cluster configuration file, you can spin up your cluster (e.g. `cortex cluster up --config cluster.yaml`).
62+
63+
### Update your API configuration file(s)
64+
65+
To configure your APIs to use the Quay images, you can update your [API configuration files](../deployments/realtime-api/api-configuration.md). The image paths are specified in `predictor.image` (and `predictor.tensorflow_serving_image` for APIs with `kind: tensorflow`). Be advised that by default, the Docker Hub images are used for your predictors, so you will need to specify the Quay image paths for all of your APIs.
66+
67+
Here is a list of available images (make sure to set `<VERSION>` to your cluster's version):
68+
69+
```text
70+
quay.io/cortexlabs/python-predictor-cpu:<VERSION>
71+
quay.io/cortexlabs/python-predictor-gpu:<VERSION>
72+
quay.io/cortexlabs/python-predictor-inf:<VERSION>
73+
quay.io/cortexlabs/tensorflow-serving-cpu:<VERSION>
74+
quay.io/cortexlabs/tensorflow-serving-gpu:<VERSION>
75+
quay.io/cortexlabs/tensorflow-serving-inf:<VERSION>
76+
quay.io/cortexlabs/tensorflow-predictor:<VERSION>
77+
quay.io/cortexlabs/onnx-predictor-cpu:<VERSION>
78+
quay.io/cortexlabs/onnx-predictor-gpu:<VERSION>
79+
quay.io/cortexlabs/python-predictor-cpu-slim:<VERSION>
80+
quay.io/cortexlabs/python-predictor-gpu-slim:<VERSION>-cuda10.0
81+
quay.io/cortexlabs/python-predictor-gpu-slim:<VERSION>-cuda10.1
82+
quay.io/cortexlabs/python-predictor-gpu-slim:<VERSION>-cuda10.2
83+
quay.io/cortexlabs/python-predictor-gpu-slim:<VERSION>-cuda11.0
84+
quay.io/cortexlabs/python-predictor-inf-slim:<VERSION>
85+
quay.io/cortexlabs/tensorflow-predictor-slim:<VERSION>
86+
quay.io/cortexlabs/onnx-predictor-cpu-slim:<VERSION>
87+
quay.io/cortexlabs/onnx-predictor-gpu-slim:<VERSION>
88+
```
89+
90+
## Paid Docker Hub subscription
91+
92+
Another option is to pay for the Docker Hub subscription to remove the limit on the number of image pulls. Docker Hub's updated pricing model allows unlimited pulls on a _Pro_ subscription for individuals as described [here](https://www.docker.com/pricing).
93+
94+
The advantage of this approach is that there's no need to do a `cortex cluster down`/`cortex cluster up` to authenticate with your Docker Hub account.
3195

32-
You can configure the Cortex cluster to use images from a different registry. A good choice is ECR on AWS. When an ECR repository resides in the same region as your Cortex cluster, there are no costs incurred when pulling images.
96+
By default, the Cortex cluster pulls the images as an anonymous user. To configure your Cortex cluster to pull the images as an authenticated user, follow these steps:
3397

3498
### Step 1
3599

36-
Make sure you have the [aws](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv1.html) and [docker](https://docs.docker.com/get-docker/) CLIs installed.
100+
Install and configure kubectl ([instructions](kubectl-setup.md)).
37101

38102
### Step 2
39103

40-
Export the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables in your current shell, or run `aws configure`. These credentials must have access to push to ECR.
104+
Set the following environment variables, replacing the placeholders with your docker username and password:
41105

42-
### Step 3
106+
```bash
107+
DOCKER_USERNAME=***
108+
DOCKER_PASSWORD=***
109+
```
110+
111+
Run the following commands:
112+
113+
```bash
114+
kubectl create secret docker-registry registry-credentials \
115+
--namespace default \
116+
--docker-username=$DOCKER_USERNAME \
117+
--docker-password=$DOCKER_PASSWORD
118+
119+
kubectl create secret docker-registry registry-credentials \
120+
--namespace kube-system \
121+
--docker-username=$DOCKER_USERNAME \
122+
--docker-password=$DOCKER_PASSWORD
123+
124+
kubectl patch serviceaccount default --namespace default \
125+
-p "{\"imagePullSecrets\": [{\"name\": \"registry-credentials\"}]}"
43126
44-
Choose a region for your cluster and ECR repositories. In this guide, we'll assume the region is `us-west-2`.
127+
kubectl patch serviceaccount operator --namespace default \
128+
-p "{\"imagePullSecrets\": [{\"name\": \"registry-credentials\"}]}"
45129
46-
Also, take note of your AWS account ID. The account ID can be found in the _My Account_ section of your AWS console.
130+
kubectl patch serviceaccount fluentd --namespace default \
131+
-p "{\"imagePullSecrets\": [{\"name\": \"registry-credentials\"}]}"
47132
48-
### Step 4
133+
kubectl patch serviceaccount cluster-autoscaler --namespace kube-system \
134+
-p "{\"imagePullSecrets\": [{\"name\": \"registry-credentials\"}]}"
49135
50-
You can use the script below to push the images from Docker Hub to your ECR registry. Make sure to update the `ecr_region`, `aws_account_id`, and `cortex_version` variables at the top of the file. Copy-paste the contents into a new file (e.g. `ecr.sh`), and then run `chmod +x ecr.sh`, followed by `./ecr.sh`. It is recommended to run this from an EC2 instance in the same region as your ECR repository, since it will be much faster.
136+
kubectl patch serviceaccount metrics-server --namespace kube-system \
137+
-p "{\"imagePullSecrets\": [{\"name\": \"registry-credentials\"}]}"
138+
139+
# Only if you are using Inferentia:
140+
kubectl patch serviceaccount neuron-device-plugin --namespace kube-system \
141+
-p "{\"imagePullSecrets\": [{\"name\": \"registry-credentials\"}]}"
142+
```
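
The repeated `kubectl patch` commands above could also be written as a loop. The following is a sketch, not the guide's own method: the `KUBECTL` variable and the `patch_service_accounts` function are hypothetical names, and by default the script only prints the commands instead of running them.

```bash
# Sketch: apply the same imagePullSecrets patch to each service account.
# KUBECTL defaults to a dry run that only prints the commands;
# set KUBECTL=kubectl to run them against your cluster.
KUBECTL="${KUBECTL:-echo kubectl}"
PATCH='{"imagePullSecrets": [{"name": "registry-credentials"}]}'

patch_service_accounts() {
  # Entries are namespace/name pairs taken from the commands above.
  # Add kube-system/neuron-device-plugin only if you are using Inferentia.
  for entry in default/default default/operator default/fluentd \
    kube-system/cluster-autoscaler kube-system/metrics-server; do
    $KUBECTL patch serviceaccount "${entry#*/}" --namespace "${entry%%/*}" -p "$PATCH"
  done
}

patch_service_accounts
```

To check the result, `kubectl get serviceaccount default --namespace default -o yaml` should list `registry-credentials` under `imagePullSecrets`.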
143+
144+
### Updating your credentials
145+
146+
First remove your old docker credentials from the cluster:
51147

52148
```bash
53-
#!/bin/bash
54-
set -euo pipefail
55-
56-
# user set variables
57-
ecr_region="us-west-2"
58-
aws_account_id="620970939130" # example account ID
59-
cortex_version="0.22.1"
60-
61-
source_registry="docker.io/cortexlabs"
62-
destination_registry="${aws_account_id}.dkr.ecr.${ecr_region}.amazonaws.com/cortexlabs"
63-
64-
aws ecr get-login-password --region $ecr_region | docker login --username AWS --password-stdin $destination_registry
65-
66-
# images for the cluster
67-
cluster_images=(
68-
"manager"
69-
"request-monitor"
70-
"downloader"
71-
"operator"
72-
"cluster-autoscaler"
73-
"metrics-server"
74-
"inferentia"
75-
"neuron-rtd"
76-
"nvidia"
77-
"fluentd"
78-
"statsd"
79-
"istio-proxy"
80-
"istio-pilot"
81-
)
82-
83-
# images for the APIs (you may delete any images that your APIs don't use)
84-
api_images=(
85-
"python-predictor-cpu"
86-
"python-predictor-gpu"
87-
"python-predictor-inf"
88-
"tensorflow-serving-cpu"
89-
"tensorflow-serving-gpu"
90-
"tensorflow-serving-inf"
91-
"tensorflow-predictor"
92-
"onnx-predictor-cpu"
93-
"onnx-predictor-gpu"
94-
"python-predictor-cpu-slim"
95-
"python-predictor-gpu-slim"
96-
"python-predictor-inf-slim"
97-
"tensorflow-predictor-slim"
98-
"onnx-predictor-cpu-slim"
99-
"onnx-predictor-gpu-slim"
100-
)
101-
images=( "${cluster_images[@]}" "${api_images[@]}" )
102-
103-
extra_tags_for_slim_python_predictor=(
104-
"cuda10.0"
105-
"cuda10.1"
106-
"cuda10.2"
107-
"cuda11.0"
108-
)
109-
110-
# create the image repositories
111-
for image in "${images[@]}"; do
112-
aws ecr create-repository --repository-name=cortexlabs/$image --region=$ecr_region || true
113-
done
114-
115-
# pull the images from Docker Hub and push them to ECR
116-
for image in "${images[@]}"; do
117-
if [ "$image" = "python-predictor-gpu-slim" ]; then
118-
for extra_tag in "${extra_tags_for_slim_python_predictor[@]}"; do
119-
docker image pull "$source_registry/$image:$cortex_version-$extra_tag"
120-
docker image tag "$source_registry/$image:$cortex_version-$extra_tag" "$destination_registry/$image:$cortex_version-$extra_tag"
121-
docker image push "$destination_registry/$image:$cortex_version-$extra_tag"
122-
echo
123-
done
124-
else
125-
docker image pull "$source_registry/$image:$cortex_version"
126-
docker image tag "$source_registry/$image:$cortex_version" "$destination_registry/$image:$cortex_version"
127-
docker image push "$destination_registry/$image:$cortex_version"
128-
echo
129-
fi
130-
done
131-
132-
echo "###############################################"
133-
echo
134-
echo "add the following images to your cortex cluster configuration file (e.g. cluster.yaml):"
135-
echo "-----------------------------------------------"
136-
for cluster_image in "${cluster_images[@]}"; do
137-
echo "image_$cluster_image: $destination_registry/$cluster_image:$cortex_version"
138-
done
139-
echo -e "-----------------------------------------------\n"
140-
141-
echo "use the following images in your API configuration files (e.g. cortex.yaml):"
142-
echo "-----------------------------------------------"
143-
for api_image in "${api_images[@]}"; do
144-
if [ "$api_image" = "python-predictor-gpu-slim" ]; then
145-
for extra_tag in "${extra_tags_for_slim_python_predictor[@]}"; do
146-
echo "$destination_registry/$api_image:$cortex_version-$extra_tag"
147-
done
148-
else
149-
echo "$destination_registry/$api_image:$cortex_version"
150-
fi
151-
done
152-
echo "-----------------------------------------------"
149+
kubectl delete secret --namespace default registry-credentials
150+
kubectl delete secret --namespace kube-system registry-credentials
153151
```
154152

155-
The first list of images that were printed (the cluster images) can be directly copy-pasted in your [cluster configuration file](../cluster-management/config.md) before spinning up your cluster.
153+
Then repeat step 2 above with your updated credentials.
156154

157-
The second list of images that were printed (the API images) can be used in your [API configuration files](../deployments/realtime-api/api-configuration.md). The images are specified in `predictor.image` (and `predictor.tensorflow_serving_image` for APIs with `kind: tensorflow`). Be advised that by default, the Docker Hub images are used for your predictors, so you will need to specify your ECR image paths for all of your APIs.
155+
### Removing your credentials
158156

159-
## Step 6
157+
To remove your docker credentials from the cluster, run the following commands:
160158

161-
Spin up your Cortex cluster using your updated cluster configuration file (e.g. `cortex cluster up --config cluster.yaml`).
159+
```bash
160+
kubectl delete secret --namespace default registry-credentials
161+
kubectl delete secret --namespace kube-system registry-credentials
162+
163+
kubectl patch serviceaccount default --namespace default -p "{\"imagePullSecrets\": []}"
164+
kubectl patch serviceaccount operator --namespace default -p "{\"imagePullSecrets\": []}"
165+
kubectl patch serviceaccount fluentd --namespace default -p "{\"imagePullSecrets\": []}"
166+
kubectl patch serviceaccount cluster-autoscaler --namespace kube-system -p "{\"imagePullSecrets\": []}"
167+
kubectl patch serviceaccount metrics-server --namespace kube-system -p "{\"imagePullSecrets\": []}"
168+
# Only if you are using Inferentia:
169+
kubectl patch serviceaccount neuron-device-plugin --namespace kube-system -p "{\"imagePullSecrets\": []}"
170+
```
162171

163-
## Cleanup
172+
## Push to AWS ECR (Elastic Container Registry)
164173

165-
You can delete your ECR images from the [AWS ECR dashboard](https://console.aws.amazon.com/ecr/repositories) (set your region in the upper right corner). Make sure all of your Cortex clusters have been deleted before deleting any ECR images.
174+
You can also push the Cortex images to ECR on your AWS account, and pull from your ECR repository in your cluster. Follow [this guide](self-hosted-images.md) to do this.

docs/guides/private-docker.md

Lines changed: 3 additions & 5 deletions
@@ -31,14 +31,13 @@ kubectl create secret docker-registry registry-credentials \
3131
--docker-username=$DOCKER_USERNAME \
3232
--docker-password=$DOCKER_PASSWORD
3333

34-
kubectl patch serviceaccount default \
35-
--namespace default \
34+
kubectl patch serviceaccount default --namespace default \
3635
-p "{\"imagePullSecrets\": [{\"name\": \"registry-credentials\"}]}"
3736
```
3837

3938
### Updating your credentials
4039

41-
To remove your docker credentials from the cluster, run this command:
40+
First remove your old docker credentials from the cluster:
4241

4342
```bash
4443
kubectl delete secret --namespace default registry-credentials
@@ -53,7 +52,6 @@ To remove your docker credentials from the cluster, run the following commands:
5352
```bash
5453
kubectl delete secret --namespace default registry-credentials
5554

56-
kubectl patch serviceaccount default \
57-
--namespace default \
55+
kubectl patch serviceaccount default --namespace default \
5856
-p "{\"imagePullSecrets\": []}"
5957
```
