Commit 06f6d7d

Schedule post predict in same threadpool as predict (#1367)
1 parent 4fda8a9 commit 06f6d7d

2 files changed: 13 additions, 1 deletion

docs/deployments/realtime-api/predictors.md

Lines changed: 12 additions & 0 deletions
@@ -86,6 +86,10 @@ class PythonPredictor:
         Useful for tasks that the client doesn't need to wait on before
         receiving a response such as recording metrics or storing results.

+        Note: post_predict() and predict() run in the same thread pool. The
+        size of the thread pool can be increased by updating
+        `threads_per_process` in the api configuration yaml.
+
         Args:
             response (optional): The response as returned by the predict method.
             payload (optional): The request payload (see below for the possible
@@ -245,6 +249,10 @@ class TensorFlowPredictor:
         Useful for tasks that the client doesn't need to wait on before
         receiving a response such as recording metrics or storing results.

+        Note: post_predict() and predict() run in the same thread pool. The
+        size of the thread pool can be increased by updating
+        `threads_per_process` in the api configuration yaml.
+
         Args:
             response (optional): The response as returned by the predict method.
             payload (optional): The request payload (see below for the possible
@@ -353,6 +361,10 @@ class ONNXPredictor:
         Useful for tasks that the client doesn't need to wait on before
         receiving a response such as recording metrics or storing results.

+        Note: post_predict() and predict() run in the same thread pool. The
+        size of the thread pool can be increased by updating
+        `threads_per_process` in the api configuration yaml.
+
         Args:
             response (optional): The response as returned by the predict method.
             payload (optional): The request payload (see below for the possible
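
As a rough illustration of the hook these docstrings describe, the sketch below defines a predictor whose `post_predict()` records a simple metric after a response has been produced. Only the `response` and `payload` parameter names come from the docstring; the `config` dictionary, the `scale` setting, the `recorded` list, and the toy model are assumptions made for the example.

```python
import time


class PythonPredictor:
    def __init__(self, config):
        # `config`, `scale`, and `recorded` are illustrative names only.
        self.scale = config.get("scale", 2)
        self.recorded = []

    def predict(self, payload):
        # Toy "model": multiply the input by a constant.
        return {"value": payload["x"] * self.scale}

    def post_predict(self, response, payload):
        # Bookkeeping the client does not need to wait for, such as
        # recording metrics or storing results.
        self.recorded.append((payload["x"], response["value"], time.time()))


predictor = PythonPredictor({"scale": 3})
payload = {"x": 2}
response = predictor.predict(payload)
# A server would schedule this call after building the response;
# here it is called directly to keep the sketch self-contained.
predictor.post_predict(response=response, payload=payload)
```

Because `post_predict()` shares worker threads with `predict()`, slow work in this hook can occupy threads that would otherwise serve requests, which is why the note points at `threads_per_process`.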

pkg/workloads/cortex/serve/serve.py

Lines changed: 1 addition & 1 deletion
@@ -214,7 +214,7 @@ def predict(request: Request):

     if util.has_method(predictor_impl, "post_predict"):
         kwargs = build_post_predict_kwargs(prediction, request)
-        tasks.add_task(predictor_impl.post_predict, **kwargs)
+        request_thread_pool.submit(predictor_impl.post_predict, **kwargs)

     if len(tasks.tasks) > 0:
         response.background = tasks
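
The effect of this change can be sketched with the standard library's `concurrent.futures.ThreadPoolExecutor`. This is a minimal stand-in, not the server's actual code: it assumes `request_thread_pool` behaves like a `ThreadPoolExecutor` whose `max_workers` corresponds to `threads_per_process`, and `handle_request`, `events`, and the toy `predict`/`post_predict` functions are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Assumption: stands in for the server's request thread pool, with
# max_workers playing the role of `threads_per_process`.
request_thread_pool = ThreadPoolExecutor(max_workers=2)

events = []
lock = threading.Lock()


def predict(payload):
    with lock:
        events.append(("predict", payload))
    return payload * 2


def post_predict(response, payload):
    with lock:
        events.append(("post_predict", payload, response))


def handle_request(payload):
    # predict() runs on a pool worker and the caller waits for its result.
    prediction = request_thread_pool.submit(predict, payload).result()
    # post_predict() is submitted to the same pool and not awaited, so it
    # competes with later predict() calls for the same worker threads.
    post_future = request_thread_pool.submit(
        post_predict, response=prediction, payload=payload
    )
    return prediction, post_future


prediction, post_future = handle_request(21)
post_future.result()  # awaited here only to make the example deterministic
request_thread_pool.shutdown(wait=True)
```

With a small pool, a long-running `post_predict()` can hold a worker that an incoming request needs, so raising the pool size is the natural remedy when the hook does heavy work.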
