Skip to content

Conversation

@keivenchang
Copy link
Contributor

@keivenchang keivenchang commented Dec 2, 2025

Overview:

This PR removes the legacy NATS metrics reporting infrastructure from the runtime, eliminating ~1200 lines of unused code while maintaining all passing tests.

Details:

  • Removed NATS metric constants and prometheus metric structs (ComponentNatsServerPrometheusMetrics, DRTNatsClientPrometheusMetrics)
  • Removed nats_metrics_worker background task that collected NATS service metrics
  • Removed stats_handler infrastructure including EndpointStatsHandler type and component/service.rs file
  • Removed start_stats_service() and add_stats_service() methods from DistributedRuntime
  • Updated runtime examples to remove stats_handler usage
  • Updated tests to remove NATS metrics filtering
  • Clarified distributed_runtime.md documentation

Where should the reviewer start?

  • lib/runtime/src/metrics.rs - Main metrics removal (~573 lines)
  • lib/runtime/src/distributed.rs - Removed stats service methods
  • lib/runtime/src/component/service.rs - Deleted file

Related Issues:

/coderabbit profile chill

.metrics_registry
.add_update_callback(nats_client_callback);
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also remove start_stats_service and add_stats_service from this file, and all of lib/runtime/src/component/service.rs ?

Copy link
Contributor Author

@keivenchang keivenchang Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I'll remove these unused code with pleasure. Coming up!

@grahamking
Copy link
Contributor

Is it Xmas already?!

@keivenchang keivenchang marked this pull request as ready for review December 2, 2025 22:12
@keivenchang keivenchang requested review from a team as code owners December 2, 2025 22:12
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 2, 2025

Walkthrough

The PR removes Prometheus metrics instrumentation for NATS throughout the runtime codebase. This includes deleting NATS-specific metric definitions, metrics wrapper structs, metrics integration in initialization paths, test helpers, and adjusting test assertions to account for fewer metrics.

Changes

Cohort / File(s) Summary
NATS Metrics Infrastructure
lib/runtime/src/metrics/prometheus_names.rs, lib/runtime/src/metrics.rs
Removed public modules nats_client and nats_service, and public constants DRT_NATS_METRICS and COMPONENT_NATS_METRICS. Updated metrics.rs to remove NATS-related imports and test helpers like remove_nats_lines and extract_nats_lines.
NATS Metrics Wrapper Structs
lib/runtime/src/transports/nats.rs, lib/runtime/src/service.rs
Removed public struct DRTNatsClientPrometheusMetrics from nats.rs and ComponentNatsServerPrometheusMetrics from service.rs, along with their associated impl blocks and methods.
Runtime Initialization
lib/runtime/src/distributed.rs, lib/runtime/src/component.rs
Removed NATS metrics initialization in DistributedRuntime setup (metrics creation, callback registration, and background task spawning). Removed ComponentNatsServerPrometheusMetrics import from component.rs.
Test Updates
lib/runtime/src/metrics.rs, lib/runtime/src/system_status_server.rs, tests/utils/payloads.py
Removed NATS metrics filtering logic in test assertions, disabled NATS-related integration tests in metrics.rs, and updated minimum dynamo_component_\* metric requirement from 23 to 11 in payloads.py.
Documentation
docs/design_docs/distributed_runtime.md
Updated Component initialization description to clarify NATS service group naming using {namespace_name}.{service_name} format and its registration in the internal registry.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • lib/runtime/src/distributed.rs: Large block of metrics initialization code removed; verify all cleanup is complete and no dangling references remain.
  • lib/runtime/src/metrics.rs: Significant test suite modifications and removal of NATS-related integration tests; ensure test logic remains sound after filtering removal.
  • lib/runtime/src/metrics/prometheus_names.rs: Public API surface reduction with module and constant removals; verify no external dependents rely on these exports.
  • tests/utils/payloads.py: Metric threshold change from 23 to 11; confirm this aligns with the actual metric reduction and doesn't mask test failures.

Poem

🐰 Out with NATS metrics tangled tight,
Simpler signals, cleaner sight!
Runtime hops with lighter grace,
Prometheus clutter finds its place. 🌟

Pre-merge checks

✅ Passed checks (3 passed)
Check name Status Explanation
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Title check ✅ Passed The title accurately describes the main objective of the changeset: removing legacy NATS metrics and stats_handler infrastructure.
Description check ✅ Passed The description follows the template structure with Overview, Details, Where should the reviewer start, and Related Issues sections, providing comprehensive information about the changes.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@keivenchang keivenchang force-pushed the keivenchang/remove-legacy-nats-metrics-code branch from 883c4a6 to ec1c7bb Compare December 2, 2025 22:44
@keivenchang keivenchang changed the title refactor: remove legacy NATS metrics reporting infrastructure refactor: remove legacy NATS metrics and stats_handler Dec 2, 2025
@keivenchang keivenchang force-pushed the keivenchang/remove-legacy-nats-metrics-code branch from 3c68913 to 674e945 Compare December 2, 2025 22:52
Remove all NATS metrics collection and reporting code from the runtime:

- Remove NATS metric constants (nats_client, nats_service modules)
- Remove ComponentNatsServerPrometheusMetrics struct and impl
- Remove DRTNatsClientPrometheusMetrics struct and impl
- Remove nats_metrics_worker background task
- Remove NATS metrics tests
- Remove NATS metrics helper functions
- Remove stats_handler infrastructure (EndpointStatsHandler, service.rs)
- Update tests to not filter NATS metrics
- Update runtime examples to remove stats_handler usage
- Clarify distributed_runtime.md documentation

This removes ~1200 lines of code while maintaining all passing tests.
The system no longer collects or reports NATS client/service metrics.

Fix assert_eq! argument order in metrics tests

The assert_eq! macro arguments were swapped relative to the error message
template. Fixed by swapping 'Expected' and 'Actual' labels in the error
message to match the actual order of arguments (actual, expected).

Signed-off-by: Keiven Chang <keivenchang@users.noreply.github.com>
@keivenchang keivenchang force-pushed the keivenchang/remove-legacy-nats-metrics-code branch from 674e945 to 424d2da Compare December 3, 2025 01:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants