From 3bb655ec92f824aaf613f3d7bf008c06455eb9d8 Mon Sep 17 00:00:00 2001 From: Claude Date: Thu, 20 Nov 2025 12:18:23 +0000 Subject: [PATCH] Add comprehensive architecture documentation This commit adds two new documentation files: 1. architecture.md - Comprehensive architecture documentation covering: - Architectural layers and organization - Detailed crate descriptions and responsibilities - Dependency hierarchy and relationships - Key architectural patterns (traits, builders, channels, etc.) - Core data flows (startup, xDS updates, request processing) - Important types and traits - Design decisions and rationale 2. architecture-diagrams.md - Visual documentation with Mermaid diagrams: - Crate dependency graph - Architectural layers diagram - Application startup flow - xDS configuration update flow - Request processing flow - Access logging flow - Cluster selection & load balancing - Component interaction overview - Configuration channel architecture - Thread architecture These documents provide developers with a complete understanding of the orion-kmesh workspace structure, how components interact, and the design principles behind the implementation. --- docs/architecture-diagrams.md | 814 ++++++++++++++++++++++++++++++++++ docs/architecture.md | 787 ++++++++++++++++++++++++++++++++ 2 files changed, 1601 insertions(+) create mode 100644 docs/architecture-diagrams.md create mode 100644 docs/architecture.md diff --git a/docs/architecture-diagrams.md b/docs/architecture-diagrams.md new file mode 100644 index 00000000..e8a71366 --- /dev/null +++ b/docs/architecture-diagrams.md @@ -0,0 +1,814 @@ +# Orion-Kmesh Architecture Diagrams + +This document provides visual representations of the Orion-Kmesh architecture using Mermaid diagrams. These diagrams illustrate the relationships between crates, data flows, and major code interactions. + +## Table of Contents + +1. [Crate Dependency Graph](#crate-dependency-graph) +2. [Architectural Layers](#architectural-layers) +3. 
[Application Startup Flow](#application-startup-flow) +4. [xDS Configuration Update Flow](#xds-configuration-update-flow) +5. [Request Processing Flow](#request-processing-flow) +6. [Access Logging Flow](#access-logging-flow) +7. [Cluster Selection & Load Balancing](#cluster-selection--load-balancing) +8. [Component Interaction Overview](#component-interaction-overview) +9. [Configuration Channel Architecture](#configuration-channel-architecture) +10. [Thread Architecture](#thread-architecture) + +## Crate Dependency Graph + +This diagram shows the dependencies between all orion-* crates. + +```mermaid +graph TD + %% Layer 0 - Foundation + error[orion-error<br/>Error handling] + header[orion-http-header<br/>HTTP headers] + interner[orion-interner<br/>String interning] + dataplane[orion-data-plane-api<br/>Envoy protos] + + %% Layer 1 - Utilities + format[orion-format<br/>Access log formatting] + metrics[orion-metrics<br/>OpenTelemetry metrics] + tracing[orion-tracing<br/>Distributed tracing] + + %% Layer 2 - Configuration + config[orion-configuration<br/>Config parsing] + + %% Layer 3 - Control Plane + xds[orion-xds<br/>xDS client] + + %% Layer 4 - Runtime + lib[orion-lib<br/>Proxy runtime] + + %% Layer 5 - Application + proxy[orion-proxy<br/>Main application] + + %% Dependencies - Layer 1 + format --> header + format --> interner + metrics --> config + metrics --> interner + tracing --> header + tracing --> interner + tracing --> config + tracing --> error + + %% Dependencies - Layer 2 + config --> error + config --> format + config --> interner + config -.-> dataplane + + %% Dependencies - Layer 3 + xds --> config + xds --> dataplane + xds --> error + + %% Dependencies - Layer 4 + lib --> config + lib --> dataplane + lib --> error + lib --> format + lib --> header + lib --> interner + lib --> metrics + lib --> tracing + lib --> xds + + %% Dependencies - Layer 5 + proxy --> config + proxy --> error + proxy --> format + proxy --> lib + proxy --> metrics + proxy --> tracing + proxy --> xds + + %% Styling + classDef layer0 fill:#e1f5ff,stroke:#01579b,stroke-width:2px + classDef layer1 fill:#f3e5f5,stroke:#4a148c,stroke-width:2px + classDef layer2 fill:#fff3e0,stroke:#e65100,stroke-width:2px + classDef layer3 fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px + classDef layer4 fill:#fce4ec,stroke:#880e4f,stroke-width:2px + classDef layer5 fill:#fff9c4,stroke:#f57f17,stroke-width:2px + + class error,header,interner,dataplane layer0 + class format,metrics,tracing layer1 + class config layer2 + class xds layer3 + class lib layer4 + class proxy layer5 +``` + +## Architectural Layers + +This diagram shows the layered architecture of Orion-Kmesh. + +```mermaid +graph TB + subgraph "Layer 5: Application" + proxy[orion-proxy<br/>Main Orchestrator] + end + + subgraph "Layer 4: Runtime" + lib[orion-lib<br/>Proxy Runtime<br/>Listeners, Clusters, Transport] + end + + subgraph "Layer 3: Control Plane" + xds[orion-xds<br/>xDS Client] + end + + subgraph "Layer 2: Configuration" + config[orion-configuration<br/>Config Parser & Validator] + end + + subgraph "Layer 1: Utilities & Observability" + format[orion-format<br/>Access Logging] + metrics[orion-metrics<br/>Metrics] + tracing[orion-tracing<br/>Tracing] + end + + subgraph "Layer 0: Foundation" + error[orion-error<br/>Error Handling] + header[orion-http-header<br/>Headers] + interner[orion-interner<br/>String Interning] + dataplane[orion-data-plane-api<br/>Envoy Protos] + end + + proxy --> lib + proxy --> xds + lib --> xds + lib --> config + xds --> config + lib --> format + lib --> metrics + lib --> tracing + config --> format + metrics --> config + tracing --> config + format --> header + format --> interner + metrics --> interner + tracing --> header + tracing --> interner + config --> error + xds --> error + tracing --> error + lib --> error + proxy --> error + config -.-> dataplane + xds --> dataplane + lib --> dataplane + + classDef layer0 fill:#e1f5ff,stroke:#01579b,stroke-width:2px + classDef layer1 fill:#f3e5f5,stroke:#4a148c,stroke-width:2px + classDef layer2 fill:#fff3e0,stroke:#e65100,stroke-width:2px + classDef layer3 fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px + classDef layer4 fill:#fce4ec,stroke:#880e4f,stroke-width:2px + classDef layer5 fill:#fff9c4,stroke:#f57f17,stroke-width:2px + + class error,header,interner,dataplane layer0 + class format,metrics,tracing layer1 + class config layer2 + class xds layer3 + class lib layer4 + class proxy layer5 +``` + +## Application Startup Flow + +This diagram illustrates the initialization sequence when the proxy starts.
+ +```mermaid +sequenceDiagram + participant Main as main.rs + participant Proxy as orion_proxy::run() + participant Config as Config::new() + participant TracingMgr as TracingManager + participant Runtime as proxy::run_orion() + participant Workers as Worker Threads + + Main->>Main: Setup allocator (jemalloc/dhat) + Main->>Proxy: Call run() + + Proxy->>TracingMgr: Initialize logging + activate TracingMgr + TracingMgr-->>Proxy: Logging ready + deactivate TracingMgr + + Proxy->>Config: Parse CLI options & config files + activate Config + Config->>Config: Deserialize Bootstrap YAML + Config->>Config: Extract Listeners/Clusters/Secrets + Config-->>Proxy: Configuration loaded + deactivate Config + + Proxy->>Proxy: Set RUNTIME_CONFIG global + Proxy->>TracingMgr: Update with LogConfig + + Proxy->>Runtime: Call run_orion() + activate Runtime + + Runtime->>Runtime: launch_runtimes() + Runtime->>Runtime: Compute thread allocation + Runtime->>Runtime: Calculate core affinity + + loop For each worker thread + Runtime->>Workers: Create Tokio runtime + activate Workers + Runtime->>Workers: Spawn thread with core affinity + Workers->>Workers: Initialize listeners + Workers->>Workers: Initialize clusters + Workers->>Workers: Start accepting connections + end + + Runtime->>Runtime: Spawn xDS handler (if configured) + Runtime->>Runtime: Spawn admin API server + Runtime->>Runtime: Setup signal handlers + + Runtime-->>Proxy: All workers running + deactivate Runtime +``` + +## xDS Configuration Update Flow + +This diagram shows how dynamic configuration updates flow through the system. + +```mermaid +sequenceDiagram + participant XDS as xDS Server<br/>(Control Plane) + participant Client as DeltaDiscoveryClient + participant Handler as XdsConfigurationHandler + participant Converter as Type Converters + participant Channels as Config Channels + participant Workers as Worker Threads + participant Listeners as ListenersManager + participant Clusters as ClusterManager + participant Secrets as SecretManager + + Handler->>Handler: resolve_endpoints() + Handler->>Handler: Find xDS cluster from config + Handler->>Client: Connect to xDS server + activate Client + Client->>XDS: Subscribe to resources + + loop Configuration Updates + XDS->>Client: DeltaDiscoveryResponse + Client->>Client: Deserialize protobuf + Client->>Handler: Send XdsResourcePayload + deactivate Client + + activate Handler + Handler->>Handler: Match resource type + + alt Listener Update + Handler->>Converter: Convert Listener proto + Converter-->>Handler: Listener config + Handler->>Channels: Send ListenerConfigurationChange + Channels->>Workers: Broadcast to all workers + Workers->>Listeners: AddOrUpdate(Listener) + Listeners->>Listeners: Rebind or update routes + else Cluster Update + Handler->>Converter: Convert Cluster proto + Converter-->>Handler: Cluster config + Handler->>Channels: Send ClusterConfigurationChange + Channels->>Workers: Broadcast to all workers + Workers->>Clusters: Update cluster endpoints + Clusters->>Clusters: Start health checks + else Route Update + Handler->>Converter: Convert RouteConfig proto + Converter-->>Handler: RouteConfiguration + Handler->>Channels: Send RouteConfigurationChange + Channels->>Workers: Broadcast to all workers + Workers->>Listeners: Update route tables + else Endpoint Update + Handler->>Converter: Convert ClusterLoadAssignment + Converter-->>Handler: Endpoints + Handler->>Clusters: Update load assignment + Clusters->>Clusters: Rebalance connections + else Secret Update + Handler->>Converter: Convert Secret proto + Converter-->>Handler: TLS certificates/keys + Handler->>Secrets: Update secrets +
Secrets->>Listeners: Reload TLS contexts + end + + Handler->>Client: Send ACK + activate Client + Client->>XDS: ACK with version + deactivate Client + deactivate Handler + end +``` + +## Request Processing Flow + +This diagram shows the complete lifecycle of an HTTP request through the proxy. + +```mermaid +sequenceDiagram + participant Client as Client + participant Listener as Listener + participant FilterChain as Filter Chain Matcher + participant TLS as TLS Handler + participant HTTP as HTTP Handler + participant Router as Route Matcher + participant Filters as HTTP Filters + participant Cluster as Cluster Manager + participant LB as Load Balancer + participant Upstream as Upstream Server + participant AccessLog as Access Logger + + Client->>Listener: TCP connection + activate Listener + + Listener->>FilterChain: Match connection + activate FilterChain + FilterChain->>FilterChain: Check SNI + FilterChain->>FilterChain: Check ALPN + FilterChain->>FilterChain: Check source IP + FilterChain-->>Listener: Selected filter chain + deactivate FilterChain + + alt TLS configured + Listener->>TLS: TLS handshake + activate TLS + TLS->>TLS: Validate certificate + TLS-->>Listener: TLS session + deactivate TLS + end + + Listener->>HTTP: Create HTTP handler + activate HTTP + Client->>HTTP: HTTP request + + HTTP->>AccessLog: InitContext(start_time) + HTTP->>AccessLog: DownstreamContext(request) + + HTTP->>Router: Match route + activate Router + Router->>Router: Check path prefix + Router->>Router: Check headers + Router->>Router: Check methods + Router-->>HTTP: Selected route + deactivate Router + + HTTP->>Filters: Apply HTTP filters + activate Filters + Filters->>Filters: RBAC check + Filters->>Filters: Rate limiting + Filters->>Filters: Header manipulation + Filters-->>HTTP: Filter result + deactivate Filters + + HTTP->>Cluster: Get cluster + activate Cluster + HTTP->>AccessLog: UpstreamContext(cluster) + + Cluster->>LB: Select endpoint + activate LB + LB->>LB: Filter healthy 
endpoints + LB->>LB: Apply locality weights + LB->>LB: Round-robin selection + LB-->>Cluster: Selected endpoint + deactivate LB + + Cluster->>Upstream: Forward request + activate Upstream + Upstream-->>Cluster: Response + deactivate Upstream + deactivate Cluster + + HTTP->>AccessLog: DownstreamResponse(response) + HTTP->>Client: Forward response + HTTP->>AccessLog: FinishContext(duration, bytes) + deactivate HTTP + + AccessLog->>AccessLog: Format log message + AccessLog->>AccessLog: Write to log output + + deactivate Listener +``` + +## Access Logging Flow + +This diagram details how access logs are generated and formatted. + +```mermaid +graph TD + subgraph "Request Processing" + A[Request Start] --> B[InitContext] + B --> C[DownstreamContext] + C --> D[Route Matching] + D --> E[UpstreamContext] + E --> F[Forward to Cluster] + F --> G[DownstreamResponse] + G --> H[FinishContext] + end + + subgraph "Log Formatting (orion-format)" + I[LogFormatterLocal] --> J{Thread-local cache} + J -->|Miss| K[Parse format string] + J -->|Hit| L[Cached template] + K --> L + L --> M[Template tree] + end + + subgraph "Context Evaluation" + M --> N{For each Placeholder} + N --> O[Evaluate Operator] + O --> P{Category} + P -->|Init| Q[Timestamp data] + P -->|Downstream| R[Request data] + P -->|Upstream| S[Cluster data] + P -->|Response| T[Status data] + P -->|Finish| U[Duration data] + Q --> V[StringType result] + R --> V + S --> V + T --> V + U --> V + end + + subgraph "Output" + V --> W[FormattedMessage] + W --> X{Output format} + X -->|JSON| Y[JSON writer] + X -->|Text| Z[Text writer] + Y --> AA[Log file/stdout] + Z --> AA + end + + H --> I + B --> Q + C --> R + E --> S + G --> T + H --> U + + style A fill:#e8f5e9 + style AA fill:#fff3e0 + style I fill:#f3e5f5 + style M fill:#e1f5ff + style W fill:#fce4ec +``` + +## Cluster Selection & Load Balancing + +This diagram shows the cluster selection and load balancing process. 
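Two steps from the diagram that follows, the health check filter and round-robin selection, can be sketched in a few lines of standard-library Rust. This is an illustrative sketch under stated assumptions, not the code in orion-lib: `Endpoint`, `RoundRobin`, and `pick` are hypothetical names, and locality weighting, connection pooling, and retries are left out.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical endpoint record; the real types in orion-lib may differ.
#[derive(Debug, Clone, PartialEq)]
pub struct Endpoint {
    pub address: String,
    pub healthy: bool,
}

/// Round-robin picker over the healthy subset of a cluster's endpoints.
pub struct RoundRobin {
    endpoints: Vec<Endpoint>,
    next: AtomicUsize,
}

impl RoundRobin {
    pub fn new(endpoints: Vec<Endpoint>) -> Self {
        Self { endpoints, next: AtomicUsize::new(0) }
    }

    /// Returns the next healthy endpoint in rotation, or `None` when no
    /// endpoint is healthy (the "Connection Failed" branch of the diagram).
    pub fn pick(&self) -> Option<&Endpoint> {
        // Collecting the healthy subset on every call keeps the sketch short;
        // a production balancer would likely cache this list.
        let healthy: Vec<&Endpoint> =
            self.endpoints.iter().filter(|e| e.healthy).collect();
        if healthy.is_empty() {
            return None;
        }
        // The atomic counter lets several tasks share one picker.
        let turn = self.next.fetch_add(1, Ordering::Relaxed);
        Some(healthy[turn % healthy.len()])
    }
}
```

Filtering before the modulo means an unhealthy endpoint never receives a turn, at the cost of the rotation shifting whenever the healthy set changes.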
+ +```mermaid +graph TD + A[HTTP Request] --> B{Route Matched} + B --> C[Get Target Cluster] + + C --> D{Cluster Type} + + D -->|Static| E[StaticCluster] + D -->|Dynamic| F[DynamicCluster] + D -->|OriginalDst| G[OriginalDstCluster] + + E --> H[Static Endpoints] + F --> I[DNS Resolution] + I --> J[Resolved Endpoints] + G --> K[Connection Metadata] + K --> L[Original Destination] + + H --> M[Health Check Filter] + J --> M + L --> M + + M --> N{Healthy Endpoints} + N -->|None| O[Connection Failed] + N -->|Available| P[Locality Selection] + + P --> Q{Locality Weights} + Q --> R[Select Locality] + + R --> S{Load Balancing Policy} + + S -->|Round Robin| T[Next in rotation] + S -->|Least Request| U[Endpoint with fewest active] + S -->|Random| V[Random selection] + + T --> W[Selected Endpoint] + U --> W + V --> W + + W --> X{Connection Pool} + X -->|Exists| Y[Reuse connection] + X -->|None| Z[Create new connection] + + Y --> AA[Forward Request] + Z --> AA + + AA --> AB{Response Status} + AB -->|Success| AC[Update Success Metrics] + AB -->|Failure| AD[Update Failure Metrics] + AD --> AE{Retry Policy} + AE -->|Retry| M + AE -->|No Retry| AF[Return Error] + AC --> AG[Return Response] + + style A fill:#e8f5e9 + style W fill:#fff3e0 + style AA fill:#e1f5ff + style AG fill:#e8f5e9 + style O fill:#ffebee + style AF fill:#ffebee +``` + +## Component Interaction Overview + +This high-level diagram shows how major components interact during runtime. 
+ +```mermaid +graph TB + subgraph "Control Plane Communication" + XDS[xDS Server] + XDSClient[DeltaDiscoveryClient] + end + + subgraph "Configuration Management" + Bootstrap[Bootstrap Config] + ConfigChannels[Configuration Channels] + SecretMgr[Secret Manager] + end + + subgraph "Worker Thread 1" + L1[Listeners] + C1[Clusters] + R1[Routes] + end + + subgraph "Worker Thread 2" + L2[Listeners] + C2[Clusters] + R2[Routes] + end + + subgraph "Worker Thread N" + LN[Listeners] + CN[Clusters] + RN[Routes] + end + + subgraph "Observability" + Metrics[OpenTelemetry Metrics] + Tracing[OpenTelemetry Tracing] + AccessLog[Access Logs] + end + + subgraph "Admin API" + Admin[Admin Server] + end + + XDS -.->|gRPC Stream| XDSClient + XDSClient -->|Updates| ConfigChannels + Bootstrap -->|Initial Config| ConfigChannels + + ConfigChannels -->|Listener Updates| L1 + ConfigChannels -->|Listener Updates| L2 + ConfigChannels -->|Listener Updates| LN + + ConfigChannels -->|Cluster Updates| C1 + ConfigChannels -->|Cluster Updates| C2 + ConfigChannels -->|Cluster Updates| CN + + ConfigChannels -->|Route Updates| R1 + ConfigChannels -->|Route Updates| R2 + ConfigChannels -->|Route Updates| RN + + ConfigChannels -->|Secret Updates| SecretMgr + SecretMgr -.->|TLS Contexts| L1 + SecretMgr -.->|TLS Contexts| L2 + SecretMgr -.->|TLS Contexts| LN + + L1 -.->|Request Metrics| Metrics + L2 -.->|Request Metrics| Metrics + LN -.->|Request Metrics| Metrics + + L1 -.->|Trace Spans| Tracing + L2 -.->|Trace Spans| Tracing + LN -.->|Trace Spans| Tracing + + L1 -.->|Access Logs| AccessLog + L2 -.->|Access Logs| AccessLog + LN -.->|Access Logs| AccessLog + + Admin -.->|Query| ConfigChannels + Admin -.->|Query| Metrics + + style XDS fill:#e8f5e9 + style Bootstrap fill:#fff3e0 + style L1 fill:#e1f5ff + style L2 fill:#e1f5ff + style LN fill:#e1f5ff + style Metrics fill:#f3e5f5 + style Tracing fill:#f3e5f5 + style AccessLog fill:#f3e5f5 + style Admin fill:#fff9c4 +``` + +## Configuration Channel Architecture + 
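The fan-out this section describes, where each configuration change is cloned to every worker's receiver, can be sketched with the standard library alone. This is a hedged illustration rather than Orion's implementation: it uses `std::sync::mpsc` and OS threads instead of async channels and Tokio runtimes, and `ListenerChange`, `Broadcaster`, and `run_workers` are hypothetical names.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// Illustrative stand-in for a listener update event.
#[derive(Clone, Debug, PartialEq)]
pub enum ListenerChange {
    AddOrUpdate(String),
    Remove(String),
}

/// Holds one sender per worker so a change can be cloned to each of them.
pub struct Broadcaster {
    senders: Vec<Sender<ListenerChange>>,
}

impl Broadcaster {
    /// Creates `workers` channels and returns the receiving ends.
    pub fn new(workers: usize) -> (Self, Vec<Receiver<ListenerChange>>) {
        let (senders, receivers): (Vec<_>, Vec<_>) =
            (0..workers).map(|_| channel::<ListenerChange>()).unzip();
        (Self { senders }, receivers)
    }

    /// Sends a clone of `change` to every worker and returns how many
    /// workers were still listening.
    pub fn broadcast(&self, change: &ListenerChange) -> usize {
        self.senders
            .iter()
            .filter(|tx| tx.send(change.clone()).is_ok())
            .count()
    }
}

/// Spawns one thread per receiver. Each worker applies changes to its own
/// listener list until every sender is dropped, then returns that list.
pub fn run_workers(
    receivers: Vec<Receiver<ListenerChange>>,
) -> Vec<thread::JoinHandle<Vec<String>>> {
    receivers
        .into_iter()
        .map(|rx| {
            thread::spawn(move || {
                let mut active: Vec<String> = Vec::new();
                for change in rx {
                    match change {
                        ListenerChange::AddOrUpdate(name) => {
                            if !active.contains(&name) {
                                active.push(name);
                            }
                        }
                        ListenerChange::Remove(name) => active.retain(|n| n != &name),
                    }
                }
                active
            })
        })
        .collect()
}
```

Because every worker owns its receiver and its own listener list, no lock is needed on the request path; the price is one clone of each change per worker.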
+This diagram shows the async channel architecture for configuration distribution. + +```mermaid +graph LR + subgraph "Configuration Sources" + Static[Static Config<br/>Bootstrap YAML] + XDS[xDS Updates] + end + + subgraph "Configuration Handler" + Handler[XdsConfigurationHandler] + Converter[Type Converters] + end + + subgraph "Channel Distribution" + ListenerChan[Listener Channel<br/>mpsc::Sender] + RouteChan[Route Channel<br/>mpsc::Sender] + ClusterChan[Cluster Channel<br/>mpsc::Sender] + end + + subgraph "Worker Thread 1" + ListenerRx1[Listener Receiver] + RouteRx1[Route Receiver] + ClusterRx1[Cluster Receiver] + LM1[ListenersManager] + end + + subgraph "Worker Thread 2" + ListenerRx2[Listener Receiver] + RouteRx2[Route Receiver] + ClusterRx2[Cluster Receiver] + LM2[ListenersManager] + end + + subgraph "Worker Thread N" + ListenerRxN[Listener Receiver] + RouteRxN[Route Receiver] + ClusterRxN[Cluster Receiver] + LMN[ListenersManager] + end + + Static --> Handler + XDS --> Handler + Handler --> Converter + Converter -->|ListenerConfigurationChange| ListenerChan + Converter -->|RouteConfigurationChange| RouteChan + Converter -->|ClusterConfigurationChange| ClusterChan + + ListenerChan -.->|Clone & Broadcast| ListenerRx1 + ListenerChan -.->|Clone & Broadcast| ListenerRx2 + ListenerChan -.->|Clone & Broadcast| ListenerRxN + + RouteChan -.->|Clone & Broadcast| RouteRx1 + RouteChan -.->|Clone & Broadcast| RouteRx2 + RouteChan -.->|Clone & Broadcast| RouteRxN + + ClusterChan -.->|Clone & Broadcast| ClusterRx1 + ClusterChan -.->|Clone & Broadcast| ClusterRx2 + ClusterChan -.->|Clone & Broadcast| ClusterRxN + + ListenerRx1 --> LM1 + RouteRx1 --> LM1 + ClusterRx1 --> LM1 + + ListenerRx2 --> LM2 + RouteRx2 --> LM2 + ClusterRx2 --> LM2 + + ListenerRxN --> LMN + RouteRxN --> LMN + ClusterRxN --> LMN + + style Static fill:#fff3e0 + style XDS fill:#e8f5e9 + style Handler fill:#f3e5f5 + style ListenerChan fill:#e1f5ff + style RouteChan fill:#e1f5ff + style ClusterChan fill:#e1f5ff + style LM1 fill:#fce4ec + style LM2 fill:#fce4ec + style LMN fill:#fce4ec +``` + +## Thread Architecture + +This diagram illustrates the multi-threaded runtime architecture.
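One part of the diagram that follows, secrets shared between the configuration thread and the workers behind a read-write lock, can be sketched with `Arc<RwLock<...>>` from the standard library. The names `SecretStore`, `update`, `get`, and `read_from_workers` are hypothetical and chosen for illustration; the real `SecretManager` API is not shown in this document, and this sketch stores plain strings where the proxy would hold certificates and keys.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical shared secret store: cheap to clone, safe to share.
#[derive(Clone, Default)]
pub struct SecretStore {
    inner: Arc<RwLock<HashMap<String, String>>>,
}

impl SecretStore {
    /// Called from the configuration side; takes the write lock briefly.
    pub fn update(&self, name: &str, value: &str) {
        self.inner
            .write()
            .unwrap()
            .insert(name.to_string(), value.to_string());
    }

    /// Called from workers; many readers may hold the read lock at once.
    pub fn get(&self, name: &str) -> Option<String> {
        self.inner.read().unwrap().get(name).cloned()
    }
}

/// Spawns `workers` threads that each read the same secret and returns
/// what every thread observed.
pub fn read_from_workers(
    store: &SecretStore,
    name: &str,
    workers: usize,
) -> Vec<Option<String>> {
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // Cloning the store only bumps the Arc reference count.
            let store = store.clone();
            let name = name.to_string();
            thread::spawn(move || store.get(&name))
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

A read-write lock suits this access pattern because secret updates are rare while reads happen whenever a listener needs a TLS context.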
+ +```mermaid +graph TB + subgraph "Main Thread" + Main[main.rs] + ProxyInit[Proxy Initialization] + RuntimeLauncher[Runtime Launcher] + end + + subgraph "Configuration Thread" + XDSHandler[xDS Configuration Handler] + XDSClient[Delta Discovery Client] + ConfigProcessor[Config Update Processor] + end + + subgraph "Admin Thread" + AdminAPI[Admin API Server] + ConfigDump[Config Dump Endpoint] + MetricsEndpoint[Metrics Endpoint] + end + + subgraph "Worker Thread 1 (Core 0)" + Runtime1[Tokio Runtime 1] + Listener1[Listener Tasks] + Cluster1[Cluster Tasks] + Health1[Health Check Tasks] + end + + subgraph "Worker Thread 2 (Core 1)" + Runtime2[Tokio Runtime 2] + Listener2[Listener Tasks] + Cluster2[Cluster Tasks] + Health2[Health Check Tasks] + end + + subgraph "Worker Thread N (Core N-1)" + RuntimeN[Tokio Runtime N] + ListenerN[Listener Tasks] + ClusterN[Cluster Tasks] + HealthN[Health Check Tasks] + end + + subgraph "Shared State (Thread-Safe)" + Secrets[Arc RwLock SecretManager] + GlobalConfig[Arc-swap Config] + Tracers[Arc-swap Tracers] + Interner[Global String Interner] + end + + Main --> ProxyInit + ProxyInit --> RuntimeLauncher + RuntimeLauncher -->|Spawn| XDSHandler + RuntimeLauncher -->|Spawn| AdminAPI + RuntimeLauncher -->|Spawn with affinity| Runtime1 + RuntimeLauncher -->|Spawn with affinity| Runtime2 + RuntimeLauncher -->|Spawn with affinity| RuntimeN + + XDSHandler --> XDSClient + XDSClient --> ConfigProcessor + ConfigProcessor -.->|Update| GlobalConfig + ConfigProcessor -.->|Update| Secrets + + AdminAPI --> ConfigDump + AdminAPI --> MetricsEndpoint + ConfigDump -.->|Read| GlobalConfig + MetricsEndpoint -.->|Read| GlobalConfig + + Runtime1 --> Listener1 + Runtime1 --> Cluster1 + Runtime1 --> Health1 + + Runtime2 --> Listener2 + Runtime2 --> Cluster2 + Runtime2 --> Health2 + + RuntimeN --> ListenerN + RuntimeN --> ClusterN + RuntimeN --> HealthN + + Listener1 -.->|Read| Secrets + Listener2 -.->|Read| Secrets + ListenerN -.->|Read| Secrets + + Listener1 
-.->|Use| Interner + Listener2 -.->|Use| Interner + ListenerN -.->|Use| Interner + + Listener1 -.->|Use| Tracers + Listener2 -.->|Use| Tracers + ListenerN -.->|Use| Tracers + + style Main fill:#fff9c4 + style XDSHandler fill:#e8f5e9 + style AdminAPI fill:#f3e5f5 + style Runtime1 fill:#e1f5ff + style Runtime2 fill:#e1f5ff + style RuntimeN fill:#e1f5ff + style Secrets fill:#ffebee + style GlobalConfig fill:#ffebee + style Tracers fill:#ffebee + style Interner fill:#ffebee +``` + +## Summary + +These diagrams provide a comprehensive visual representation of the Orion-Kmesh architecture: + +1. **Dependency graphs** show the modular structure and crate relationships +2. **Flow diagrams** illustrate the sequence of operations for key scenarios +3. **Component diagrams** demonstrate runtime interactions and data flow +4. **Architecture diagrams** reveal the thread model and shared state management + +Together with the [architecture documentation](./architecture.md), these diagrams provide a complete picture of how Orion-Kmesh is structured and operates. diff --git a/docs/architecture.md b/docs/architecture.md new file mode 100644 index 00000000..7b3b5715 --- /dev/null +++ b/docs/architecture.md @@ -0,0 +1,787 @@ +# Orion-Kmesh Architecture Documentation + +## Table of Contents + +1. [Overview](#overview) +2. [Architectural Layers](#architectural-layers) +3. [Crate Descriptions](#crate-descriptions) +4. [Dependency Hierarchy](#dependency-hierarchy) +5. [Key Architectural Patterns](#key-architectural-patterns) +6. [Core Data Flows](#core-data-flows) +7. [Important Types and Traits](#important-types-and-traits) +8. [Design Decisions](#design-decisions) + +## Overview + +Orion-Kmesh is a high-performance, Envoy-compatible service mesh proxy implemented in Rust. The architecture is organized into a layered, modular design with 11 specialized crates that work together to provide a complete proxy solution with dynamic configuration, observability, and high throughput. 
+ +### Key Features + +- **Envoy Compatibility**: Full support for Envoy's xDS (Discovery Service) protocol and configuration format +- **Multi-threaded Runtime**: Per-worker Tokio runtimes with CPU core affinity for optimal performance +- **Dynamic Configuration**: Real-time configuration updates via xDS without restarts +- **Comprehensive Observability**: OpenTelemetry metrics and tracing, structured access logging +- **Zero-Copy Design**: Extensive use of Arc and static references to minimize allocations +- **Type Safety**: Rust's type system prevents configuration and runtime errors + +## Architectural Layers + +The crates are organized into six distinct layers, from foundation to application: + +### Layer 0: Foundation +- **orion-error**: Custom error handling with contextual information +- **orion-http-header**: HTTP header constant definitions +- **orion-interner**: String interning for memory efficiency +- **orion-data-plane-api**: Envoy protobuf definitions + +### Layer 1: Utilities & Formatting +- **orion-format**: Envoy-compatible access log formatting +- **orion-metrics**: OpenTelemetry metrics collection +- **orion-tracing**: Distributed tracing with OpenTelemetry + +### Layer 2: Configuration +- **orion-configuration**: Complete Envoy-compatible configuration parsing and validation + +### Layer 3: Control Plane +- **orion-xds**: xDS client implementation with delta subscriptions + +### Layer 4: Runtime +- **orion-lib**: Core proxy runtime (listeners, clusters, routing, transport) + +### Layer 5: Application +- **orion-proxy**: Main application orchestrator and entry point + +## Crate Descriptions + +### orion-error + +**Purpose**: Custom error handling framework with context support + +**Key Features**: +- Type-erased error wrapper supporting any error type +- Context trait for error chaining with additional information +- `WithContext` wrapper for attaching context to concrete errors +- Zero-cost abstractions for error propagation + +**Core Types**: 
+- `Error`: Main error type wrapping `Box` +- `Result`: Type alias for `std::result::Result` +- `Context` trait: Enables `.with_context()` method on Results + +**Dependencies**: None (foundation layer) + +### orion-http-header + +**Purpose**: Centralized HTTP header constant definitions + +**Key Features**: +- Macro-based header definitions using `HeaderName::from_static()` +- Support for tracing headers (B3, Jaeger, W3C, Datadog) +- Envoy-specific headers (X-Envoy-Original-Path, X-Envoy-Internal, etc.) +- Orion-specific headers (X-Orion-RateLimited) + +**Core Types**: +- Static `HeaderName` constants + +**Dependencies**: http crate only + +### orion-interner + +**Purpose**: String interning for memory efficiency with static lifetime strings + +**Key Features**: +- Global thread-safe interner using `ThreadedRodeo` from lasso crate +- Converts various string types to `&'static str` for zero-copy sharing +- Implementations for `&str`, `String`, `CompactString`, and HTTP `Version` +- Thread-safe global singleton pattern + +**Core Types**: +- `StringInterner` trait with `to_static_str()` method +- `GLOBAL_INTERNER`: Thread-safe global instance + +**Dependencies**: lasso, compact_str + +### orion-format + +**Purpose**: Envoy-compatible access log formatting with operator evaluation + +**Key Features**: +- Parses Envoy format strings into template trees +- Context-based operator evaluation for request/response/timing data +- Thread-local formatting to avoid per-request allocations +- Support for standard and Istio access log formats +- Hierarchical template structure with placeholders + +**Core Types**: +- `LogFormatter`: Main formatter (thread-safe, immutable) +- `LogFormatterLocal`: Thread-local per-thread formatter state +- `FormattedMessage`: Final formatted output +- `Template`: Hierarchical format template tree +- `Context` trait: Provides data for placeholder evaluation + +**Dependencies**: smol_str, bitflags, uuid, traceparent, orion-http-header, orion-interner + 
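The parse-once, evaluate-per-request split described for orion-format can be illustrated with a small standard-library sketch. It is an assumption-laden simplification, not the crate's API: `Token`, `ValueSource`, `parse`, and `render` are hypothetical names, the template is a flat list rather than a tree, operators are matched by their literal text, and printing `-` for a missing value is a choice made for this sketch.

```rust
use std::collections::HashMap;

/// One piece of a parsed format string (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum Token {
    /// Text copied to the output unchanged.
    Literal(String),
    /// Operator text between two `%` signs, e.g. `REQ(:METHOD)`.
    Placeholder(String),
}

/// Splits a format string on `%` delimiters. Parsing happens once, so the
/// per-request work is only the lookup in `render`.
pub fn parse(format: &str) -> Result<Vec<Token>, String> {
    let mut tokens = Vec::new();
    let mut rest = format;
    while let Some(start) = rest.find('%') {
        if start > 0 {
            tokens.push(Token::Literal(rest[..start].to_string()));
        }
        let after = &rest[start + 1..];
        let end = after
            .find('%')
            .ok_or_else(|| format!("unclosed '%' in {format:?}"))?;
        if end == 0 {
            return Err(format!("empty placeholder in {format:?}"));
        }
        tokens.push(Token::Placeholder(after[..end].to_string()));
        rest = &after[end + 1..];
    }
    if !rest.is_empty() {
        tokens.push(Token::Literal(rest.to_string()));
    }
    Ok(tokens)
}

/// Supplies values for placeholders (hypothetical trait).
pub trait ValueSource {
    fn lookup(&self, operator: &str) -> Option<String>;
}

impl ValueSource for HashMap<&str, &str> {
    fn lookup(&self, operator: &str) -> Option<String> {
        self.get(operator).map(|v| v.to_string())
    }
}

/// Evaluates every placeholder against `source`; missing values become "-".
pub fn render(tokens: &[Token], source: &dyn ValueSource) -> String {
    let mut out = String::new();
    for token in tokens {
        match token {
            Token::Literal(text) => out.push_str(text),
            Token::Placeholder(op) => {
                out.push_str(&source.lookup(op).unwrap_or_else(|| "-".to_string()))
            }
        }
    }
    out
}
```

Keeping the parsed tokens separate from the value source is what allows one parsed format to be reused across requests with different data.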
+### orion-metrics + +**Purpose**: OpenTelemetry metrics collection and export + +**Key Features**: +- Configurable metrics export to OTEL GRPC collectors +- Support for multiple stat sinks with different endpoints +- Configurable export periods and stat prefixes +- Integration with OpenTelemetry SDK and Prometheus exporter +- Sharded metric collection for thread safety + +**Core Types**: +- `Metrics`: Single metrics sink configuration +- `VecMetrics`: Multiple metrics sinks +- `otel_launch_exporter()`: Initializes OTEL exporters + +**Dependencies**: opentelemetry, opentelemetry-otlp, opentelemetry_sdk, dashmap, orion-configuration, orion-interner + +### orion-tracing + +**Purpose**: Distributed tracing with OpenTelemetry and span context management + +**Key Features**: +- Multi-provider support (OpenTelemetry, extensible) +- Request ID generation and propagation +- B3/Jaeger/W3C traceparent header parsing +- Span state management and context passing +- Global tracer map with lock-free updates via Arc-swap + +**Core Types**: +- `TracingConfig`: Tracing provider configuration +- `SupportedTracingProvider`: Enum of tracing providers +- `span_state`: Thread-local span context +- `trace_context`: Distributed trace context +- `GlobalTracers`: Arc-swap for lock-free tracer updates + +**Dependencies**: opentelemetry, opentelemetry-otlp, opentelemetry_sdk, orion-http-header, orion-interner, orion-configuration, orion-error + +### orion-configuration + +**Purpose**: Complete Envoy-compatible configuration parsing and validation + +**Key Features**: +- Hierarchical YAML/JSON configuration parsing +- Bootstrap configuration containing static/dynamic resources +- Cluster definitions with discovery types (Static, StrictDns, OriginalDst) +- Listener definitions with filter chains and TLS support +- HTTP connection manager with routing configuration +- Network filters (TCP Proxy, RBAC, HTTP RBAC, Local Rate Limiting) +- Secret management (TLS certificates and keys) +- Runtime 
configuration for thread counts and CPU affinity +- Optional Envoy protobuf conversions + +**Key Modules**: +- `config/bootstrap.rs`: Bootstrap, Node, DynamicResources +- `config/listener.rs`: Listener, FilterChain, TLS configuration +- `config/cluster.rs`: Cluster, discovery types, health checks +- `config/network_filters/`: HTTP connection manager, routing, RBAC +- `config/secret.rs`: Certificate/key storage +- `options.rs`: CLI argument parsing + +**Core Types**: +- `Bootstrap`: Root configuration container +- `Listener`: Network listener definition +- `Cluster`: Upstream cluster definition +- `ClusterLoadAssignment`: Endpoints for cluster +- `RouteConfiguration`: HTTP routing rules +- `Secret`: TLS certificates and keys +- `Runtime`: Thread and affinity configuration + +**Dependencies**: serde_yaml, serde_json, prost, clap, itertools, regex, base64, orion-error, orion-format, orion-interner, orion-data-plane-api (optional) + +### orion-data-plane-api + +**Purpose**: Bridge between Envoy protobuf definitions and Orion configuration + +**Key Features**: +- Re-exports protobuf definitions from envoy-data-plane-api +- Bootstrap loader for reading Envoy protobuf +- Protobuf message decoding and validation +- xDS resource conversion helpers +- Envoy proto validation against configuration rules + +**Core Types**: +- Protobuf message wrappers +- `Resource`: Generic xDS resource wrapper +- `Any`: Protobuf Any type for dynamic typing + +**Dependencies**: envoy-data-plane-api, prost, tokio, tower, futures + +### orion-xds + +**Purpose**: xDS (Envoy Discovery Service) client implementation with delta subscriptions + +**Key Features**: +- Delta Discovery Protocol (xDS v3) client implementation +- Multi-resource type support (Listeners, Clusters, Routes, Endpoints, Secrets) +- Subscription management with resource tracking +- ACK/NACK handling with backoff retry logic +- Background worker for async gRPC communication +- Type-safe bindings via `TypedXdsBinding` trait + +**Key 
Modules**: +- `xds/model.rs`: XdsResourcePayload, XdsResourceUpdate, XdsError +- `xds/client.rs`: DeltaDiscoveryClient, DiscoveryClientBuilder +- `xds/bindings.rs`: TypedXdsBinding for type-safe client variants + +**Core Types**: +- `XdsResourcePayload`: Enum of resource types (Listener, Cluster, Routes, Endpoints, Secret) +- `XdsResourceUpdate`: Update or Remove operations +- `DeltaDiscoveryClient`: Async client for receiving updates +- `DeltaDiscoverySubscriptionManager`: Subscribe/unsubscribe interface +- `DeltaClientBackgroundWorker`: Background gRPC task +- `XdsError`: Error type with variants + +**Dependencies**: orion-configuration, orion-data-plane-api, orion-error, futures, tokio, tower, async-stream + +### orion-lib + +**Purpose**: Core proxy runtime implementation combining listeners, clusters, routing, and transport + +This is the largest and most complex crate, integrating all other components. + +**Key Features**: +- **Listeners Management**: Listener creation, filter chain evaluation, TLS termination +- **Clusters Management**: Cluster types (Static/Dynamic/OriginalDst), load balancing, endpoint health +- **Transport Layer**: TLS, TCP, gRPC, HTTP/2 connection handling +- **Load Assignment**: Endpoint selection and locality-based routing +- **Health Checking**: Active/passive endpoint health monitoring +- **Access Logging**: Integration with orion-format for structured logs +- **Secrets Management**: TLS certificate/key lifecycle +- **Configuration Runtime**: Channels for receiving configuration updates + +**Key Modules**: +- `configuration.rs`: Main entry point, listeners/clusters conversion +- `clusters/`: ClusterType, PartialClusterType, cluster implementations +- `listeners/`: ListenerFactory, filter chain matching +- `transport/`: TLS, TCP, gRPC, HTTP channels +- `access_log.rs`: Structured access logging +- `secrets/`: Secret management + +**Core Types**: +- `ClusterType`: Enum (Static/Dynamic/OriginalDst with implementations) +- 
`PartialClusterType`: Builder intermediate
+- `ListenerFactory`: Trait for creating listeners
+- `SecretManager`: Thread-safe secret storage wrapper
+- `ConfigurationSenders/Receivers`: Async channels for config updates
+- `ListenerConfigurationChange`: Listener update event enum
+- `RouteConfigurationChange`: Route update event enum
+
+**Dependencies**: All other orion-* crates except orion-proxy
+
+### orion-proxy
+
+**Purpose**: Main application orchestrator tying all components together
+
+**Key Features**:
+- Multi-threaded runtime management with per-thread Tokio runtimes
+- xDS configuration handler with delta update processing
+- Signal handling for graceful shutdown (SIGTERM, SIGINT)
+- Admin API server for debugging (config dump, metrics)
+- Core affinity management for CPU binding
+- Logging and tracing system initialization
+- Access log writer coordination
+
+**Key Modules**:
+- `proxy.rs`: Main orchestration, runtime launch, configuration updates
+- `xds_configurator.rs`: xDS client management and configuration distribution
+- `admin.rs`: Admin API endpoints
+- `runtime.rs`: Per-worker thread management
+- `core_affinity.rs`: CPU affinity and binding
+- `signal.rs`: Signal handling
+
+**Entry Points**:
+- `main.rs`: Jemalloc/dhat setup, calls `orion_proxy::run()`
+- `lib.rs:run()`: Initializes config, logging, then calls `proxy::run_orion()`
+
+**Dependencies**: orion-configuration, orion-error, orion-format, orion-lib, orion-metrics, orion-tracing, orion-xds
+
+## Dependency Hierarchy
+
+```
+Layer 0 (Foundation):
+  orion-error (0 deps)
+  orion-http-header (0 deps)
+  orion-interner (0 deps)
+  orion-data-plane-api (0 orion deps)
+
+Layer 1 (Utilities):
+  orion-format
+    ← orion-http-header
+    ← orion-interner
+
+  orion-metrics
+    ← orion-configuration
+    ← orion-interner
+
+  orion-tracing
+    ← orion-http-header
+    ← orion-interner
+    ← orion-configuration
+    ← orion-error
+
+Layer 2 (Configuration):
+  orion-configuration
+    ← orion-error
+    ← orion-format
+    ← orion-interner
+    ←
orion-data-plane-api (optional)
+
+Layer 3 (Control Plane):
+  orion-xds
+    ← orion-configuration
+    ← orion-data-plane-api
+    ← orion-error
+
+Layer 4 (Runtime):
+  orion-lib
+    ← orion-configuration
+    ← orion-data-plane-api
+    ← orion-error
+    ← orion-format
+    ← orion-http-header
+    ← orion-interner
+    ← orion-metrics
+    ← orion-tracing
+    ← orion-xds
+
+Layer 5 (Application):
+  orion-proxy
+    ← orion-configuration
+    ← orion-error
+    ← orion-format
+    ← orion-lib
+    ← orion-metrics
+    ← orion-tracing
+    ← orion-xds
+```
+
+### Dependency Matrix
+
+Each ✓ means the crate in that row depends on the crate in that column.
+
+|            | error | header | interner | format | metrics | tracing | config | data-plane | xds | lib | proxy |
+|------------|-------|--------|----------|--------|---------|---------|--------|------------|-----|-----|-------|
+| error      | -     | -      | -        | -      | -       | -       | -      | -          | -   | -   | -     |
+| header     | -     | -      | -        | -      | -       | -       | -      | -          | -   | -   | -     |
+| interner   | -     | -      | -        | -      | -       | -       | -      | -          | -   | -   | -     |
+| format     | ✓     | ✓      | ✓        | -      | -       | -       | -      | -          | -   | -   | -     |
+| metrics    | -     | -      | ✓        | -      | -       | -       | ✓      | -          | -   | -   | -     |
+| tracing    | ✓     | ✓      | ✓        | -      | -       | -       | ✓      | -          | -   | -   | -     |
+| config     | ✓     | -      | ✓        | ✓      | -       | -       | -      | ✓*         | -   | -   | -     |
+| data-plane | -     | -      | -        | -      | -       | -       | -      | -          | -   | -   | -     |
+| xds        | ✓     | -      | -        | -      | -       | -       | ✓      | ✓          | -   | -   | -     |
+| lib        | ✓     | ✓      | ✓        | ✓      | ✓       | ✓       | ✓      | ✓          | ✓   | -   | -     |
+| proxy      | ✓     | -      | -        | ✓      | ✓       | ✓       | ✓      | -          | ✓   | ✓   | -     |
+
+*\* = optional feature (envoy-conversions)*
+
+## Key Architectural Patterns
+
+### 1. Trait-Based Extensibility
+
+Traits enable polymorphism and extensibility throughout the codebase:
+
+- **`Context` trait** (orion-error): Enables error chaining with custom data
+- **`StringInterner` trait** (orion-interner): Multiple implementations for string interning
+- **`Grammar` trait** (orion-format): Pluggable format string parsers
+- **`ClusterOps` trait** (orion-lib): Different cluster implementations
+- **`TypedXdsBinding` trait** (orion-xds): Type-safe xDS client variants
+
+### 2.
Builder Pattern
+
+Builders provide fluent APIs for complex object construction:
+
+- **`DiscoveryClientBuilder`** (orion-xds): Fluent API for xDS client construction
+- **`ClusterLoadAssignmentBuilder`** (orion-lib): Incremental cluster configuration
+- **`LogFormatter::try_new()`** (orion-format): Parsing with validation
+
+### 3. Type-Safe Enums with Dispatch
+
+Enums with associated data provide type-safe polymorphism:
+
+- **`XdsResourcePayload`** (orion-xds): Tagged union of resource types
+- **`ClusterType`** (orion-lib): Enum dispatch for Static/Dynamic/OriginalDst clusters
+- **`Template`** (orion-format): Hierarchical format templates
+- **`enum_dispatch` macro**: Trait-method dispatch through a `match` on the enum, avoiding `dyn` trait objects and their heap allocation
+
+### 4. Channel-Based Async Communication
+
+MPSC channels enable decoupled asynchronous communication:
+
+- **`ConfigurationSenders/Receivers`** (orion-lib): Configuration update propagation
+- **`DeltaDiscoveryClient`** (orion-xds): Resource update flow from xDS
+- **Health check channels** (orion-lib): Health status propagation
+
+### 5. Thread-Local Caching & Arc-Swap
+
+Performance optimizations using thread-local storage and lock-free updates:
+
+- **`LogFormatterLocal`** (orion-format): Per-thread formatter state to avoid allocations
+- **`GlobalTracers`** (orion-tracing): Arc-swap for lock-free concurrent tracer updates
+- **`ThreadLocal>`**: Patterns throughout for zero-copy sharing
+
+### 6. Type Erasure
+
+Type erasure enables storing heterogeneous types:
+
+- **`Error` type** (orion-error): Stores boxed `dyn ErrorTrait + Send + Sync`
+- **`BoxedErr`** (orion-configuration): Generic error wrapper
+
+### 7. Converter Pattern
+
+`TryFrom` implementations enable safe type conversions:
+
+- Configuration types → Runtime types (orion-configuration → orion-lib)
+- Envoy protos → Orion config (orion-data-plane-api)
+- xDS resources → Runtime payload (orion-xds)
+
+### 8.
Lifecycle Management
+
+Safe resource lifecycle management patterns:
+
+- **`OnceLock`**: One-time initialization (RUNTIME_CONFIG, GLOBAL_INTERNER, GLOBAL_TRACERS)
+- **`Arc>`**: Concurrent mutable state (SecretManager, configuration)
+- **`CancellationToken`**: Graceful shutdown propagation across tasks
+
+### 9. Observer Pattern
+
+Event propagation for configuration updates:
+
+- **`ListenerConfigurationChange`**: Events pushed through channels
+- **xDS client subscribers**: Push resource updates
+- **Health check events**: Propagate status changes
+
+## Core Data Flows
+
+### Application Startup
+
+```
+main.rs:main()
+  → Set up allocator (jemalloc/dhat)
+  → orion_proxy::run()
+    → Initialize TracingManager (logging setup)
+    → Options::parse_options() (CLI args)
+    → Config::new() - Parse config files
+      → Bootstrap YAML deserialization
+      → Listener/Cluster/Secret extraction
+    → Set RUNTIME_CONFIG global
+    → Update tracing with LogConfig
+    → proxy::run_orion()
+      → launch_runtimes()
+        → Compute thread allocation
+        → Create per-worker tokio::runtime instances
+        → Spawn threads with core affinity
+        → Per-thread: initialize listeners/clusters
+```
+
+### xDS Configuration Updates
+
+```
+XdsConfigurationHandler::run_loop()
+  → resolve_endpoints() - Find xDS cluster
+    → Look up cluster from static resources
+    → Get gRPC connections
+    → Load balance to available endpoints
+  → start_aggregate_client_no_retry_loop()
+    → Connect to AggregatedDiscoveryServiceClient
+    → Create DeltaDiscoveryClient
+    → Loop:
+      → client.recv() - Wait for xDS updates
+      → process_updates()
+        → For each XdsResourcePayload:
+          → Match on type (Listener/Cluster/Route/Endpoint/Secret)
+          → Convert to Orion types
+          → Send through config change channels
+        → Update SecretManager with new secrets
+        → Start health checks for clusters
+      → Send ACK/NACK to xDS server
+```
+
+### Request Processing
+
+```
+ListenerFactory::create_listener()
+  → Parse Listener config
+  → Create filter chains
+    → For each
route:
+      → Parse route match conditions
+      → Set up route actions (forward to cluster)
+  → Bind to socket address
+  → Accept connections:
+    → Match incoming connection to filter chain
+      → Check SNI, ALPN, source IP
+    → Apply network filters (TCP proxy, RBAC)
+    → Set up TLS (if configured)
+    → Create HTTP connection handler
+      → Parse HTTP request
+      → Match against routes
+      → Apply HTTP filters
+      → Forward to selected cluster
+```
+
+### Cluster & Endpoint Selection
+
+```
+ClusterType dispatch:
+  → Static cluster: Use configured endpoints
+  → Dynamic cluster: Resolve via DNS
+  → OriginalDstCluster: Use original destination from connection
+
+Load assignment:
+  → Select locality (geographic/zone-aware)
+  → Within locality: Select endpoint using policy
+    → Round-robin (default)
+    → Least request
+    → Random
+  → Health check filtering:
+    → Skip unhealthy endpoints
+    → Maintain connection pools
+    → Retry on failure
+```
+
+### Access Logging
+
+```
+For each request:
+  → InitContext(start_time)
+  → DownstreamContext(request, headers)
+  → UpstreamContext(cluster, authority)
+  → DownstreamResponse(response, status)
+  → FinishContext(duration, bytes, flags)
+
+LogFormatterLocal::with_context()
+  → For each Placeholder in Template:
+    → Eval operator against context
+    → Store StringType result
+
+FormattedMessage::write_to()
+  → Write to structured log output
+  → Format: JSON or text per config
+```
+
+### Graceful Shutdown
+
+```
+Signal handler (SIGTERM/SIGINT)
+  → tokio::spawn() signal monitoring task
+  → CancellationToken::cancel()
+    → Broadcast to all workers
+  → Listeners stop accepting connections
+  → Existing connections drain gracefully
+    → Request timeout applies
+  → xDS client stops subscription
+  → Admin API server shuts down
+  → All threads join
+```
+
+## Important Types and Traits
+
+### Foundational Traits
+
+| Trait | Crate | Purpose | Key Methods |
+|-------|-------|---------|-------------|
+| `Context` | orion-error | Error contextual information |
`with_context()`, `with_context_msg()`, `with_context_data()` |
+| `StringInterner` | orion-interner | String interning | `to_static_str()` |
+| `Grammar` | orion-format | Format string parsing | `parse(input)` |
+| `ClusterOps` | orion-lib | Cluster operations | (enum dispatch) |
+| `TypedXdsBinding` | orion-xds | Type-safe xDS client | `type_url()` |
+
+### Core Configuration Types
+
+| Type | Crate | Purpose |
+|------|-------|---------|
+| `Bootstrap` | orion-configuration | Root configuration container |
+| `Node` | orion-configuration | Proxy identity (id, cluster_id) |
+| `Listener` | orion-configuration | Network listener definition |
+| `Cluster` | orion-configuration | Upstream cluster definition |
+| `ClusterLoadAssignment` | orion-configuration | Endpoints for cluster |
+| `RouteConfiguration` | orion-configuration | HTTP routing rules |
+| `Secret` | orion-configuration | TLS certificates/keys |
+| `Runtime` | orion-configuration | Thread and affinity config |
+
+### Runtime Types
+
+| Type | Crate | Purpose |
+|------|-------|---------|
+| `ClusterType` | orion-lib | Enum: Static/Dynamic/OriginalDst cluster |
+| `PartialClusterType` | orion-lib | Builder intermediate for clusters |
+| `ListenerFactory` | orion-lib | Creates listeners from config |
+| `SecretManager` | orion-lib | Thread-safe secret storage |
+| `ConfigurationSenders` | orion-lib | Async configuration channels |
+| `ListenerConfigurationChange` | orion-lib | Listener update event |
+| `RouteConfigurationChange` | orion-lib | Route update event |
+
+### xDS Types
+
+| Type | Crate | Purpose |
+|------|-------|---------|
+| `XdsResourcePayload` | orion-xds | Enum of resource types |
+| `XdsResourceUpdate` | orion-xds | Update or Remove operations |
+| `DeltaDiscoveryClient` | orion-xds | Async client for updates |
+| `DeltaDiscoverySubscriptionManager` | orion-xds | Subscribe/unsubscribe interface |
+| `DeltaClientBackgroundWorker` | orion-xds | Background gRPC task |
+| `XdsError` |
orion-xds | xDS-specific errors |
+
+### Observability Types
+
+| Type | Crate | Purpose |
+|------|-------|---------|
+| `LogFormatter` | orion-format | Parse and execute format strings |
+| `LogFormatterLocal` | orion-format | Thread-local formatter state |
+| `FormattedMessage` | orion-format | Final formatted log output |
+| `Template` | orion-format | Format template tree |
+| `Metrics` | orion-metrics | Single metrics sink config |
+| `VecMetrics` | orion-metrics | Multiple metrics sinks |
+| `TracingConfig` | orion-tracing | Tracing provider configuration |
+| `SupportedTracingProvider` | orion-tracing | OpenTelemetry or others |
+
+### Key Enums
+
+```rust
+// orion-error
+pub enum ErrorImpl {
+    Error(BoxedErr),
+    Context(ErrorInfo, BoxedErr)
+}
+
+// orion-format
+pub enum Template {
+    Char(char),
+    Literal(SmolStr),
+    Placeholder(Operator, Category)
+}
+
+pub enum StringType {
+    Char(char),
+    Smol(SmolStr),
+    Bytes(Box<[u8]>),
+    None
+}
+
+// orion-lib
+pub enum ClusterType {
+    Static(StaticCluster),
+    Dynamic(DynamicCluster),
+    OriginalDst(OriginalDstCluster)
+}
+
+pub enum ListenerConfigurationChange {
+    AddOrUpdate(Listener),
+    Remove(String)
+}
+
+// orion-xds
+pub enum XdsResourcePayload {
+    Listener(...),
+    Cluster(...),
+    Endpoints(...),
+    RouteConfiguration(...),
+    Secret(...)
+}
+
+pub enum XdsResourceUpdate {
+    Update(ResourceId, Payload, Version),
+    Remove(ResourceId, TypeUrl)
+}
+
+// orion-configuration
+pub enum ClusterDiscoveryType {
+    Static(ClusterLoadAssignment),
+    StrictDns(ClusterLoadAssignment),
+    OriginalDst
+}
+
+pub enum MainFilter {
+    HttpConnectionManager(...),
+    TcpProxy(...)
+}
+```
+
+## Design Decisions
+
+### 1. Clean Separation of Concerns
+
+Each crate has a specific, well-defined responsibility with minimal overlap. This enables:
+- Independent testing and maintenance
+- Clear ownership boundaries
+- Easier onboarding for new developers
+- Parallel development
+
+### 2.
Type-Safe Configuration
+
+Extensive use of Rust's type system prevents configuration errors at compile time:
+- Strong typing for all configuration fields
+- `TryFrom` conversions with validation
+- Compile-time enforcement of required fields
+- No stringly-typed configuration
+
+### 3. Zero-Copy Data Passing
+
+Performance optimization through minimizing allocations:
+- `Arc<>` for shared ownership without copying
+- `&'static` references via string interning
+- Thread-local caching for per-request data
+- Copy-on-write patterns where needed
+
+### 4. Lock-Free Updates
+
+Concurrency optimizations to reduce contention:
+- `Arc-swap` for configuration updates without locks
+- Thread-local storage for per-thread state
+- Message passing instead of shared mutable state
+- Lock-free data structures where possible
+
+### 5. Graceful Shutdown
+
+Clean resource cleanup on shutdown:
+- `CancellationToken` for cooperative cancellation
+- Async task coordination
+- Connection draining before termination
+- Timeout-based forced shutdown
+
+### 6. Pluggable Error Context
+
+`WithContext` pattern allows adding context without breaking APIs:
+- Error chaining without wrapping types
+- Contextual information attached to errors
+- Preserves original error type information
+- No performance overhead when not used
+
+### 7. Format String Safety
+
+Grammar-based parsing prevents runtime format errors:
+- Parse-time validation of format strings
+- Type-safe operator evaluation
+- Compile-time template structure
+- No string interpolation vulnerabilities
+
+### 8. Observable by Default
+
+First-class support for observability:
+- OpenTelemetry metrics and tracing integration
+- Structured access logging
+- Admin API for runtime introspection
+- Configurable observability backends
+
+### 9.
Multi-Threaded Architecture
+
+Per-worker runtimes prevent shared state contention:
+- One Tokio runtime per worker thread
+- CPU core affinity for cache locality
+- Minimal cross-thread communication
+- Thread-local caching for performance
+
+### 10. Envoy Compatibility
+
+Full protobuf compatibility ensures ecosystem integration:
+- Support for Envoy's xDS protocol
+- Compatible configuration format
+- Interoperability with Istio and other control planes
+- Standard metrics and trace formats
+
+## Conclusion
+
+The Orion-Kmesh architecture takes a layered approach to building a high-performance service mesh proxy. Separation of concerns across crates, type-safe configuration, and per-worker performance optimizations keep it maintainable and efficient. The modular crate structure lets components evolve independently while remaining a cohesive system.
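
As a rough illustration of the channel-based configuration propagation described under Key Architectural Patterns, the sketch below sends `ListenerConfigurationChange` events to a worker thread that applies them to its own listener table. It is a simplified stand-in, not the orion-lib implementation: it uses `std::sync::mpsc` and a plain thread where the document describes async channels and per-worker Tokio runtimes, and the `Listener` fields, `apply_change`, and `run_worker` are hypothetical names introduced only for this example.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

/// Simplified stand-in for the real listener configuration (fields are assumptions).
#[derive(Debug, Clone, PartialEq)]
pub struct Listener {
    pub name: String,
    pub address: String,
}

/// Same shape as the enum listed under "Key Enums".
#[derive(Debug)]
pub enum ListenerConfigurationChange {
    AddOrUpdate(Listener),
    Remove(String),
}

/// Applies a single change to one worker's listener table (hypothetical helper).
pub fn apply_change(table: &mut HashMap<String, Listener>, change: ListenerConfigurationChange) {
    match change {
        ListenerConfigurationChange::AddOrUpdate(listener) => {
            // Inserting under an existing name replaces the previous definition.
            table.insert(listener.name.clone(), listener);
        }
        ListenerConfigurationChange::Remove(name) => {
            table.remove(&name);
        }
    }
}

/// Spawns one worker, feeds it the changes in order, and returns its final table.
pub fn run_worker(changes: Vec<ListenerConfigurationChange>) -> HashMap<String, Listener> {
    let (tx, rx) = mpsc::channel();
    let worker = thread::spawn(move || {
        let mut table = HashMap::new();
        // The loop ends once every sender has been dropped.
        for change in rx {
            apply_change(&mut table, change);
        }
        table
    });
    for change in changes {
        tx.send(change).expect("worker should still be receiving");
    }
    drop(tx); // Close the channel so the worker can finish.
    worker.join().expect("worker thread panicked")
}

fn main() {
    let table = run_worker(vec![
        ListenerConfigurationChange::AddOrUpdate(Listener {
            name: "http".to_string(),
            address: "0.0.0.0:8080".to_string(),
        }),
        ListenerConfigurationChange::Remove("http".to_string()),
    ]);
    println!("{} listener(s) active", table.len());
}
```

Because the worker owns its table and only receives messages, no lock is needed around listener state, which is the property the message-passing design decision relies on.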