Skip to content

Commit cb49628

Browse files
diptanueabatalov
andauthored
Data Applications Serverside Impl (#1728)
* v3 proto * v3 proto * v3 proto * update proto * Add definitions for new Application and Function HTTP objects + added route for registering/updating application and parsing its manifest. * update executor api v3 * update executor api * update proto * Add metadata_offset and metadata_size to DataPayload * Add positional_args and keyword_args to TaskAllocation * update proto * update proto * update proto for indexify v3 * add missing fields to proto * remove initial_value from ReduceOp * update to proto * function sdk implementation * fixed tests * removing example.py * removing task cache * fix: update state store to use new driver * review comments * review comments * lint * Add id to ReduceOp * add comments to ReduceOp * Fix the reducer code path * added a new RPC call to make blocking function calls * Rename Executor V3 API proto fields to use consistent new SDK terms Renaming only at the API boundary as this is most problematic to refactor later. Helps me to not write new code in Executor that uses old and new terms inconsistently. * More Server API protos cleanups Just again renaming a few things and removing not used stuff. * Main Executor side changes to work with new Server and FE APIs Executor starts up, didn't run anything besides this. * Remove is_reducer from ApplicationFunction There are no statically defined reducer functions anymore. User decides to run a reduce operation using a function in runtime. * Fixes in app deployment and run code, Executor dummy bugfixes * A few more improvements * Minor Server fixes + function call tree output value workaround * Add root_function_call_id to ExecutionPlanUpdates * Tracking the root function call id in the graph updates * Fixes for complex graph * Propagate value output up through the call tree to its consumer functions * Set node id of propagated data payloads correctly so Executor can restore the tree even with incorrect id in propagated data payload metadata. * Work with user supplied DataPayload content types. * Optimizing how we store the source of output when FunctionRuns return execution plan updates * Removed un-necessary code * updated code to use the new scheduler update request * lint * fix rust tests * fixed tests * Fix function calls without parameters and rename "tasks" to "requests" * propagate output of function runs to consumers * fixed reduce op * Reduce fixes, fix idempotency check, added comments to code paths * Fix FE initialization failure handling code We were ignoring FE initilization failures previously and were failing when running alloc instead. * updated indexify tests * Add class instance init timeout feature and fix a bunch of tests * More test fixes * Fix the rest of the Indexify tests * Remove not implemented settings from tensorlake.Retries * Dynamically override function output serializer if it's outputs are API outputs Indexify side of the change. * Fixing app update * clippy fixes * update deps * clippy fixes * clippy fixes * clippy fixes * fmt * fmt * fmt --------- Co-authored-by: Eugene Batalov <eugene@tensorlake.ai>
1 parent 0fce198 commit cb49628

File tree

112 files changed

+5924
-8331
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

112 files changed

+5924
-8331
lines changed

Cargo.lock

Lines changed: 120 additions & 104 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

docs/tags.md

Lines changed: 6 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,8 +1,11 @@
11
Unified tags for indexify / PE / FE stack. All events/metrics/spans should use these attribute names when relevant.
22

3-
- graph - Name of graph.
4-
- graph_version - Version of the graph
5-
- invocation_id = ID of the invocation.
3+
- graph - Name of graph. (Deprecated by application)
4+
- graph_version - Version of the graph (Deprecated by app_version)
5+
- invocation_id = ID of the invocation. (Deprecated by request_id)
6+
- app - Name of application.
7+
- app_version - Version of the application
8+
- request_id = ID of the request.
69
- fn - Name of the compute function
710
- allocation_id - ID of the allocation
811
- task_id - ID of the task.

indexify/Makefile

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -6,13 +6,13 @@ build: build_proto
66
@poetry build
77

88
SERVER_API_PY_CLIENT_PROTO_DIR_PATH=indexify/proto
9-
SERVER_API_PROTO_DIR_PATH=../server/proto
9+
SERVER_API_PROTO_PATH=../server/proto/executor_api_v3.proto
1010

11-
build_proto: ${SERVER_API_PROTO_DIR_PATH}/executor_api.proto
11+
build_proto: ${SERVER_API_PROTO_PATH}
1212
@poetry install
1313
@# .proto file and generated Python files have to be in the same directory.
1414
@# See known issue https://github.com/grpc/grpc/issues/29459.
15-
@cp ${SERVER_API_PROTO_DIR_PATH}/executor_api.proto src/${SERVER_API_PY_CLIENT_PROTO_DIR_PATH}/executor_api.proto
15+
@cp ${SERVER_API_PROTO_PATH} src/${SERVER_API_PY_CLIENT_PROTO_DIR_PATH}/executor_api.proto
1616
@cd src && poetry run python -m grpc_tools.protoc \
1717
--proto_path=. \
1818
--python_out=. \

indexify/poetry.lock

Lines changed: 62 additions & 60 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

indexify/pyproject.toml

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -29,9 +29,9 @@ boto3 = "^1.40.15"
2929
structlog = "25.4.0"
3030
# Adds function-executor binary, utils lib, sdk used in indexify-cli commands.
3131
# We need to specify the tensorlake version exactly because pip install doesn't respect poetry.lock files.
32-
tensorlake = "0.2.47"
32+
# tensorlake = "0.2.48"
3333
# Uncomment the next line to use local tensorlake package (only for development!)
34-
# tensorlake = { path = "../tensorlake", develop = true }
34+
tensorlake = { path = "../tensorlake", develop = true }
3535
# grpcio is provided by tensorlake
3636
# grpcio-tools is provided by tensorlake
3737

indexify/src/indexify/cli/executor.py

Lines changed: 1 addition & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -18,8 +18,6 @@
1818
import structlog
1919

2020
from indexify.executor.blob_store.blob_store import BLOBStore
21-
from indexify.executor.blob_store.local_fs_blob_store import LocalFSBLOBStore
22-
from indexify.executor.blob_store.s3_blob_store import S3BLOBStore
2321
from indexify.executor.executor import Executor
2422
from indexify.executor.function_executor.server.subprocess_function_executor_server_factory import (
2523
SubprocessFunctionExecutorServerFactory,
@@ -68,7 +66,7 @@
6866
default=[],
6967
multiple=True,
7068
help="Functions that the executor will run "
71-
"specified as <namespace>:<workflow>:<function>:<version>"
69+
"specified as <namespace>:<application>:<function>:<version>"
7270
"version is optional, not specifying it will make the server send any version"
7371
"of the function. Any number of --function arguments can be passed.",
7472
)

indexify/src/indexify/executor/executor.py

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@
22
import signal
33
from pathlib import Path
44
from socket import gethostname
5-
from typing import Dict, List, Optional
5+
from typing import Dict, List
66

77
import structlog
88

@@ -48,7 +48,7 @@ def __init__(
4848
function_executor_server_factory: FunctionExecutorServerFactory,
4949
server_addr: str,
5050
grpc_server_addr: str,
51-
config_path: Optional[str],
51+
config_path: str | None,
5252
monitoring_server_host: str,
5353
monitoring_server_port: int,
5454
blob_store: BLOBStore,
@@ -100,8 +100,8 @@ def __init__(
100100
reported_state_handler=ReportedStateHandler(self._state_reporter),
101101
desired_state_handler=DesiredStateHandler(self._state_reconciler),
102102
)
103-
self._run_aio_task: Optional[asyncio.Task] = None
104-
self._shutdown_aio_task: Optional[asyncio.Task] = None
103+
self._run_aio_task: asyncio.Task | None = None
104+
self._shutdown_aio_task: asyncio.Task | None = None
105105

106106
executor_info: Dict[str, str] = {
107107
"id": id,

0 commit comments

Comments
 (0)