Log store abstraction #4111
Open
bcdurak wants to merge 93 commits into develop from feature/log-store
+3,165
−1,644
bcdurak
commented
Nov 18, 2025
stefannica
approved these changes
Dec 3, 2025
Labels
enhancement
New feature or request
internal
To filter out internal PRs and issues
release-notes
Release notes will be attached and used publicly for this PR.
run-slow-ci
Tag that is used to trigger the slow-ci
This is a large PR, so I would advise you to read the whole description before starting the review.
What do we capture now?
Previously, ZenML added a new handler to the root logger. This handler was used to capture all the logs that go through the root logger and store them in the artifact store. Additionally, we wrapped the built-in `print` function so that `print`ed messages were stored as well. However, this approach missed a couple of sources, such as messages from loggers that do not propagate to the root logger and anything written to stdout/stderr other than log messages and `print` statements.
Now, we do the following:
- `stdout` and `stderr` are now wrapped. We keep the original `stdout` and `stderr`.
- `stdout`/`stderr` still go through the original `stdout`/`stderr`.
- `classmethod` called `LoggingContext.emit(...)` (explained in the following section).
- `console_handler` and the `zenml_handler`. The `console_handler` formats and writes things to the console; the `zenml_handler` is responsible for routing all incoming log messages to `LoggingContext.emit(...)` as well.
The new `LoggingContext` class
We have a new `LoggingContext` class now that replaces the old `PipelineLogsContext`. It's still a context manager, but operates a bit differently.
When you `__init__` this class, it stores the reference to the log store within your active stack. Every time the `__enter__` method gets called, it checks a context variable called `active_logging_context`; if there is one, it stores it and replaces the context variable with itself. Similarly, when `__exit__` gets called, it removes itself from the context variable and puts the old value back.
One of the most critical parts is the fact that you require a `LogsResponse` to initiate a `LoggingContext` now. So, when we ultimately call the `emit(...)` classmethod, it passes the message and the active logging context (along with the corresponding `LogsResponse`) to the `emit(...)` method of the log store.
The new `LogStore` component
We have a new type of stack component called a `LogStore`. It handles log collection and retrieval. Different implementations can plug into this interface to provide different storage backends without changing how logs are captured or accessed.
This PR also introduces three layers of implementation, described below.
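Before going through the layers, the context handling and routing described above can be sketched roughly as follows. This is a simplified illustration using only the standard library, not the code from this PR: `ListLogStore` and `RoutingHandler` are hypothetical stand-ins (for a log store and for the `zenml_handler`), a plain string stands in for the `LogsResponse`, and the method signatures are assumptions.

```python
from __future__ import annotations

import contextvars
import logging
from typing import Any, List, Optional, Tuple

# Context variable holding the currently active logging context, if any.
active_logging_context: contextvars.ContextVar[Optional["LoggingContext"]] = (
    contextvars.ContextVar("active_logging_context", default=None)
)


class ListLogStore:
    """Hypothetical log store that only collects what it receives."""

    def __init__(self) -> None:
        self.records: List[Tuple[Any, str]] = []

    def emit(self, record: logging.LogRecord, context: "LoggingContext") -> None:
        self.records.append((context.log_model, record.getMessage()))


class LoggingContext:
    """Simplified sketch; the real class takes a LogsResponse (assumed API)."""

    def __init__(self, log_store: ListLogStore, log_model: Any) -> None:
        self.log_store = log_store
        self.log_model = log_model  # stands in for a LogsResponse
        self._previous: Optional[LoggingContext] = None

    def __enter__(self) -> "LoggingContext":
        # Remember whatever context was active, then take its place.
        self._previous = active_logging_context.get()
        active_logging_context.set(self)
        return self

    def __exit__(self, *exc_info: Any) -> None:
        # Put the previously active context (or None) back.
        active_logging_context.set(self._previous)

    @classmethod
    def emit(cls, record: logging.LogRecord) -> None:
        # Route the record to the log store of the active context, if any.
        context = active_logging_context.get()
        if context is not None:
            context.log_store.emit(record, context)


class RoutingHandler(logging.Handler):
    """Hypothetical handler that forwards every record to LoggingContext.emit."""

    def emit(self, record: logging.LogRecord) -> None:
        LoggingContext.emit(record)


store = ListLogStore()
logger = logging.getLogger("logging_context_sketch")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(RoutingHandler())

logger.info("dropped: no active context")
with LoggingContext(store, "outer"):
    logger.info("a")
    with LoggingContext(store, "inner"):
        logger.info("b")
    logger.info("c")
```

In this sketch, nested contexts restore their predecessor on exit, so the message logged after the inner block is attributed to the outer context again, and messages logged outside any context are not stored.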
1. Layer: `BaseLogStore`
This layer introduces the main abstraction for the new stack component. Main abstract methods include:
- `emit(...)`: receives log records and sends them to a specific backend
- `fetch(...)`: retrieves stored logs for the dashboard and API based on time filters and limits
- `finalize(...)`: finalizes the stream of logs associated with a specific log response
2. Layer: `OtelLogStore`
This is another abstraction built on the base log store that implements the core OTEL infrastructure:
- `emit(...)`: Activates the log store if not yet activated, translates log record objects into OTEL format with ZenML-specific attributes (e.g., `zenml.log_id`, `zenml.log_uri`, `zenml.log_store_id`), and emits them through the OTEL logger
- `activate(...)`: Sets up the OpenTelemetry pipeline including the `LoggerProvider`, `BatchLogRecordProcessor`, and `LoggingHandler`
- `deactivate(...)`: Flushes pending logs and shuts down the processor and its background thread
Moreover, it introduces configuration options for the OTEL-standardized logs including:
`service_name`, `service_version`, `max_queue_size`, `schedule_delay_millis`, `max_export_batch_size`
The following abstract methods are exposed and must be implemented by subclasses:
- `get_exporter(...)`: Returns the specific `LogExporter` instance for the backend
- `fetch(...)`: Backend-specific log retrieval (since each backend has different query mechanisms)
3. Layer: Concrete Implementations
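Before the two implementations shipped in this PR, here is a toy illustration of what a concrete store built on the three base methods could look like. It is a sketch under assumptions, not code from this PR: `BaseLogStoreSketch` and `InMemoryLogStore` are hypothetical names, the `log_id`, `start_time`, `end_time` and `limit` parameters are guesses at the shape of the interface, and the OTEL layer is skipped entirely.

```python
from __future__ import annotations

import logging
from abc import ABC, abstractmethod
from collections import defaultdict
from datetime import datetime, timezone
from typing import Dict, List, Optional, Set, Tuple


class BaseLogStoreSketch(ABC):
    """Hypothetical mirror of the three abstract methods named above."""

    @abstractmethod
    def emit(self, record: logging.LogRecord, log_id: str) -> None: ...

    @abstractmethod
    def fetch(
        self,
        log_id: str,
        start_time: Optional[datetime] = None,
        end_time: Optional[datetime] = None,
        limit: int = 100,
    ) -> List[str]: ...

    @abstractmethod
    def finalize(self, log_id: str) -> None: ...


class InMemoryLogStore(BaseLogStoreSketch):
    """Toy backend that keeps log entries in a dictionary (illustration only)."""

    def __init__(self) -> None:
        self._entries: Dict[str, List[Tuple[datetime, str]]] = defaultdict(list)
        self._finalized: Set[str] = set()

    def emit(self, record: logging.LogRecord, log_id: str) -> None:
        if log_id in self._finalized:
            raise RuntimeError(f"log stream {log_id!r} is already finalized")
        timestamp = datetime.fromtimestamp(record.created, tz=timezone.utc)
        self._entries[log_id].append((timestamp, record.getMessage()))

    def fetch(
        self,
        log_id: str,
        start_time: Optional[datetime] = None,
        end_time: Optional[datetime] = None,
        limit: int = 100,
    ) -> List[str]:
        messages: List[str] = []
        for timestamp, message in self._entries.get(log_id, []):
            if start_time is not None and timestamp < start_time:
                continue
            if end_time is not None and timestamp > end_time:
                continue
            messages.append(message)
            if len(messages) >= limit:
                break
        return messages

    def finalize(self, log_id: str) -> None:
        # Mark the stream as closed; later emits for it are rejected.
        self._finalized.add(log_id)


store = InMemoryLogStore()
for i in range(3):
    record = logging.LogRecord(
        "demo", logging.INFO, "sketch.py", 0, f"message {i}", None, None
    )
    store.emit(record, "logs-1")
store.finalize("logs-1")
fetched = store.fetch("logs-1", limit=2)
```

The point of the sketch is the division of labour: capture code only ever calls `emit(...)` and `finalize(...)`, while the dashboard and API only call `fetch(...)`, so a backend can be swapped without touching either side.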
ArtifactLogStore
The artifact log store writes logs directly to the artifact store, providing a zero-configuration logging solution that works out of the box:
- `ArtifactLogExporter` that writes `LogEntry` objects to the artifact store (compatible with our previous approach)
- `from_artifact_store(...)` class method
- `END_OF_STREAM_MESSAGE` marker that triggers file merging on immutable filesystems and version removal on others
DatadogLogStore
- `OTLPLogExporter` configured with Datadog's OTLP endpoint
- `api_key` for log ingestion and an `application_key` for log retrieval
- `service` and `zenml.log_id`
Interaction of the stack with log stores
Similar to the image builders, if you don't have a log store within your active stack, an `ArtifactLogStore` flavor will be used instead. Since our default approach requires the `opentelemetry-sdk`, it is added to the pyproject.toml as an additional dependency to the base package.
Various other changes
- `redirected`, which defaulted to `False` and was never used afterwards. That is now removed.
Notes
Pre-requisites
Please ensure you have done the following:
- `develop` and the open PR is targeting `develop`. If your branch wasn't based on develop, read the Contribution guide on rebasing branch to develop.
Types of changes