|
| 1 | +# Service Providers |
| 2 | + |
| 3 | +This document outlines the `ServiceProvider` domain and its technical considerations within the context of the [openMCP project](https://github.com/openmcp-project/), providing a foundation for understanding its architecture and operational aspects. |
| 4 | + |
| 5 | +## Goals |
| 6 | + |
| 7 | +- Define clear terminology around `ServiceProvider` within the openMCP project |
| 8 | +- Establish the scope of a `ServiceProvider`, including its responsibilities and boundaries |
| 9 | +- Define a `ServiceProvider` implementation layer to implement common features and ensure consistency across `ServiceProvider` instances |
| 10 | +- Outline how a `ServiceProvider` can be validated |
| 11 | + |
| 12 | +## Non-Goals |
| 13 | + |
| 14 | +- `ServiceProviders` are not required to deploy their `DomainService` on `WorkloadClusters`. For now, a `DomainService` can be deployed on either a `WorkloadCluster` or `MCPCluster`. However, newly developed services should prioritize deploying their workloads on `WorkloadClusters`. |
| 15 | +- Define a `ServiceProvider` model that implements a higher-level `API`/`Run` platform concept (e.g., to allow flexible deployment models, such as with `ClusterProvider` [kcp](https://github.com/kcp-dev/kcp)) |
| 16 | + |
| 17 | +## Terminology |
| 18 | + |
| 19 | +- `End Users`: These are the consumers of services provided by an openMCP platform installation. They operate on the `OnboardingCluster` and `MCPCluster` (see [deployment model](#deployment-model)). |
| 20 | +- `Platform Operators`: These are either human users or technical systems that are responsible for managing an openMCP platform installation. While they may operate on any cluster, their primary focus is on the `PlatformCluster` and `WorkloadCluster`. |
| 21 | + |
| 22 | +## Domain |
| 23 | + |
| 24 | +A `ServiceProvider` enables platform operators to offer managed `DomainServices` to end users. A `DomainService` is a third-party service that delivers its functionality to end users through a `DomainServiceAPI`. |
| 25 | + |
| 26 | +For example, consider an openMCP installation that aims to provide [Crossplane](https://www.crossplane.io/) as a managed service to its end users. Let's assume that end users specifically want to use the `Object` API of [provider-kubernetes](https://github.com/crossplane-contrib/provider-kubernetes) to create Kubernetes objects on their own Kubernetes clusters without the need to manage Crossplane themselves. |
| 27 | + |
| 28 | +If we map this to the terminology of a `DomainService` and `DomainServiceAPI`: |
| 29 | + |
| 30 | +- The `DomainService` is `Crossplane`. |
| 31 | +- The `DomainServiceAPI` is `Object`. |
| 32 | + |
| 33 | +:::info |
| 34 | +Note that `provider-kubernetes` depends on a running Crossplane installation to function properly. Therefore, `provider-kubernetes` itself cannot be considered a `DomainService`. |
| 35 | +::: |
| 36 | + |
| 37 | +The following subsections describe the objects that a `ServiceProvider` introduces. |
| 38 | + |
| 39 | +### API |
| 40 | + |
| 41 | +A `ServiceProvider` defines a `ServiceProviderAPI` to allow end users to request a managed service. It is important to distinguish between the `ServiceProviderAPI` and the `DomainServiceAPI`. |
| 42 | + |
| 43 | +```mermaid |
| 44 | +graph LR |
| 45 | + USR[EndUser] |
| 46 | + SPA[ServiceProviderAPI] |
| 47 | + DSA[DomainServiceAPI] |
| 48 | + USR -->|manages instances|SPA |
| 49 | + USR -->|manages instances|DSA |
| 50 | +``` |
| 51 | + |
| 52 | +While both are end user facing, they serve different purposes: |
| 53 | + |
| 54 | +- The `ServiceProviderAPI` allows end users to request a `DomainService` and gain access to its `DomainServiceAPI`. |
| 55 | +- The `DomainServiceAPI` delivers direct value to end users by providing the functionality of a `DomainService`. |
| 56 | + |
| 57 | +```mermaid |
| 58 | +sequenceDiagram |
| 59 | + %% participants |
| 60 | + participant usr as EndUser |
| 61 | + participant sp as ServiceProvider |
| 62 | + participant ds as DomainService |
| 63 | +
|
| 64 | + %% messages |
| 65 | + usr->>sp: request domain service through ServiceProviderAPI |
| 66 | + sp->>ds: deploys |
| 67 | + sp-->>usr: installs DomainServiceAPI in user MCP |
| 68 | + usr->>ds: uses domain service through DomainServiceAPI |
| 69 | +``` |
| 70 | + |
| 71 | +### Config |
| 72 | + |
| 73 | +A `ServiceProvider` defines a `ServiceProviderConfig` that contains provider-specific options for platform operators to specify a managed service offering. For example, [service-provider-crossplane](https://github.com/openmcp-project/service-provider-crossplane/) allows platform operators to decide which Crossplane providers can be installed by end users as part of the managed service. |
| 74 | + |
| 75 | +```mermaid |
| 76 | +graph LR |
| 77 | + %% PlatformOperator |
| 78 | + OP[PlatformOperator] |
| 79 | + SP[ServiceProvider] |
| 80 | + SPA[ServiceProviderAPI] |
| 81 | + SPC[ServiceProviderConfig] |
| 82 | + OP -->|manages instances|SP |
| 83 | + OP -->|manages instances|SPC |
| 84 | + OP -. installs .-> SPA |
| 85 | +``` |
| 86 | + |
| 87 | +All operator tasks may be partially or fully automated. |
| 88 | + |
| 89 | +:::info |
| 90 | +The `ServiceProvider` object itself is a higher-level platform concept that is described in the corresponding `PlatformService`, i.e. [openmcp-operator](https://github.com/openmcp-project/openmcp-operator). |
| 91 | +::: |
| 92 | + |
| 93 | +### Service Discovery and Access Management |
| 94 | + |
| 95 | +End users need to be aware of a) the available managed services, and b) valid input values to consume a service offering. |
| 96 | + |
| 97 | +A) The available service offerings are made visible by installing the `ServiceProviderAPI` on the `OnboardingCluster` (see [deployment model](#deployment-model)). This ensures that any platform tenant is aware of all available `ServiceProviderAPIs`. In other words, the platform does not hide its end-user-facing feature set, even if a user belongs to a tenant that cannot successfully consume a specific `ServiceProviderAPI`. |
| 98 | + |
| 99 | +B) Valid input values are communicated through a yet-to-be-defined 'Marketplace'-like API provided by a `PlatformService`. Note: This is still work in progress and outside the scope of this document. |
| 100 | + |
| 101 | +### Deployment Model |
| 102 | + |
| 103 | +A `ServiceProvider` runs on the `PlatformCluster` and reconciles its `ServiceProviderAPI` on the `OnboardingCluster`. It deploys a `DomainService` on either a `WorkloadCluster` or `MCPCluster`, which then reconciles the `DomainServiceAPI`. |
| 104 | + |
| 105 | +```mermaid |
| 106 | +graph TD |
| 107 | + %% Onboarding Cluster |
| 108 | + subgraph OnboardingCluster |
| 109 | + SPAPI[ServiceProviderAPI] |
| 110 | + end |
| 111 | +
|
| 112 | + %% Platform Cluster |
| 113 | + subgraph PlatformCluster |
| 114 | + SPO[openMCP-operator] |
| 115 | + SP[ServiceProvider] |
| 116 | + SPC[ServiceProviderConfig] |
| 117 | + end |
| 118 | +
|
| 119 | + %% MCP Cluster |
| 120 | + subgraph MCPCluster |
| 121 | + DS[DomainService] |
| 122 | + DSAPI[DomainServiceAPI] |
| 123 | + end |
| 124 | +
|
| 125 | + %% edges |
| 126 | + SP -->|reconciles|SPAPI |
| 127 | + SP -->|uses|SPC |
| 128 | + SP -->|manages|DSAPI |
| 129 | + SP -->|manages|DS |
| 130 | + DS -->|reconciles|DSAPI |
| 131 | + SPO-->|reconciles|SP |
| 132 | +``` |
| 133 | + |
| 134 | +The `DomainServiceAPI` is reconciled by a `DomainService` that runs on either the `MCPCluster` or a `WorkloadCluster`. The following diagram illustrates two simplified `DomainService` examples, `Landscaper` and `Crossplane`, along with their corresponding `DomainServiceAPIs`, `Installation` and `Bucket`. |
| 135 | + |
| 136 | +```mermaid |
| 137 | +graph TD |
| 138 | + %% Workload Cluster |
| 139 | + subgraph WorkloadCluster |
| 140 | + Landscaper |
| 141 | + end |
| 142 | +
|
| 143 | + %% MCP Cluster |
| 144 | + subgraph MCPCluster |
| 145 | + Crossplane |
| 146 | + Installation |
| 147 | + Bucket |
| 148 | + end |
| 149 | +
|
| 150 | + %% edges |
| 151 | + Landscaper -->|reconciles|Installation |
| 152 | + Crossplane -->|reconciles|Bucket |
| 153 | +``` |
| 154 | + |
| 155 | +:::info |
| 156 | +In the long term, the goal is to deploy every `DomainService` on `WorkloadClusters`. Newly developed services should prioritize deploying their workloads on `WorkloadClusters` rather than `MCPClusters`. |
| 157 | +::: |
| 158 | + |
| 159 | +## Validation |
| 160 | + |
| 161 | +A `ServiceProvider` is considered healthy if both its `API` and `Run` components have been successfully synced and are ready for consumption. |
| 162 | + |
| 163 | +The following flow verifies that a `ServiceProvider` is functioning as expected: |
| 164 | + |
| 165 | +0. SETUP: Create the test environment by installing all `ServiceProvider` prerequisites: a) create a `PlatformCluster` with kind, b) install [openmcp-operator](https://github.com/openmcp-project/openmcp-operator) and [cluster-provider-kind](https://github.com/openmcp-project/cluster-provider-kind) and wait for everything to become available |
| 166 | +1. ASSESS: Request a `ServiceProvider` and wait for the `ServiceProvider` deployment and `ServiceProviderAPI` to become available |
| 167 | +2. ASSESS: Consume `ServiceProviderAPI` to provision a `DomainService` and wait for the `DomainService` and `DomainServiceAPI` to become available |
| 168 | +3. ASSESS: Consume the `DomainServiceAPI` and validate that the `DomainService` is functioning as expected |
| 169 | +4. ASSESS: Delete the `ServiceProviderAPI` object and wait for the `DomainService` deployment and `DomainServiceAPI` to be successfully removed |
| 170 | +5. TEARDOWN: Delete the `ServiceProvider` and clean up by deleting the test environment components |
| 171 | + |
| 172 | +## Runtime |
| 173 | + |
| 174 | +A runtime is a collection of abstractions and contracts that provides an environment for executing user-defined logic. This establishes a clear separation between the `ServiceProvider` developer domain and the platform developer domain. |
| 175 | + |
| 176 | +The `service-provider-runtime` is built on top of `controller-runtime` and introduces a service-provider-specific reconciliation loop. This design lets the platform implement platform-specific features around service providers, while `ServiceProvider` developers focus solely on `DomainService`-specific logic without needing to understand platform internals. This approach ensures a consistent experience for both end users and developers when working with `ServiceProviders`. |
| 177 | + |
| 178 | +The following table provides a simplified overview of the layers within a `ServiceProvider` controller: |
| 179 | + |
| 180 | +| Layer | Description | Target Audience | |
| 181 | +| :--- | :--- | :--- | |
| 182 | +| Service Provider | Defines `ServiceProviderAPI`/`ServiceProviderConfig` and implements service-provider-runtime operations | Service provider developers | |
| 183 | +| service-provider-runtime | Defines ServiceProvider reconciliation semantics | Platform developers | |
| 184 | +| multicluster/controller-runtime | Defines generic reconciliation semantics | Out of scope | |
| 185 | +| Kubernetes API machinery | Kubernetes essentials | Out of scope | |
| 186 | + |
| 187 | +### Functionality |
| 188 | + |
| 189 | +This section outlines the main functionality implemented within the runtime. Currently, the focus is on establishing consistency across `ServiceProvider` implementations. However, this section can be extended in the future to include more generic `ServiceProvider` concepts that are handled within the runtime. |
| 190 | + |
| 191 | +Main tasks towards MCP/Workload Clusters (based on watching the `ServiceProviderAPI`): |
| 192 | + |
| 193 | +- Observe Service Deployment (Drift Detection) -> IN: context, apiObject, reconcileScope; OUT: bool[exists, drift], error |
| 194 | +- Create Service Deployment (Init Lifecycle) -> IN: context, apiObject, reconcileScope; OUT: error |
| 195 | +- Update Service Deployment (Reconcile Drift) -> IN: context, apiObject, reconcileScope; OUT: error |
| 196 | +- Delete Service Deployment (End Lifecycle) -> IN: context, apiObject, reconcileScope; OUT: error |
| 197 | + |
| 198 | +In this context, `reconcileScope` holds the `ServiceProviderConfig` and provides clients to access the onboarding, MCP and workload clusters. |
| 199 | + |
| 200 | +Main tasks towards Platform Cluster: |
| 201 | + |
| 202 | +- Resolve `ServiceProviderConfig`. If no `ServiceProviderConfig` can be resolved, the service request will fail. |
| 203 | + |
| 204 | +### Reconcile Sequence |
| 205 | + |
| 206 | +```mermaid |
| 207 | +sequenceDiagram |
| 208 | + %% participants |
| 209 | + participant cr as controller-runtime |
| 210 | + participant spr as service-provider-runtime |
| 211 | + participant sp as service-provider |
| 212 | +
|
| 213 | + %% messages |
| 214 | + cr->>spr: reconcile |
| 215 | + spr->>spr: set up reconcileScope |
| 216 | + spr-->>cr: end if no service provider config exists |
| 217 | + spr->>spr: fetch API object from onboarding cluster |
| 218 | + spr->>sp: observe(apiObject, reconcileScope) |
| 219 | + sp-->>spr: exists/drift |
| 220 | + spr->>sp: create/update/delete(apiObject, reconcileScope) |
| 221 | + spr-->>cr: requeueAfter |
| 222 | +``` |
| 223 | + |
| 224 | +:::info |
| 225 | +The validation of a `ServiceProviderConfig`, if required, is part of the `ServiceProvider` layer and not the runtime layer. |
| 226 | +::: |
| 227 | + |
| 228 | +## Related Artifacts |
| 229 | + |
| 230 | +The following artifacts are derived from this document and must be continuously updated to maintain consistency: |
| 231 | + |
| 232 | +- Service Provider Template |
| 233 | +- Service Provider Runtime |
| 234 | +- Service Provider Development Guide |
| 235 | + |
| 236 | +## Out of Scope |
| 237 | + |
| 238 | +The remainder of this document contains topics that are out of scope for now. |
| 239 | + |
| 240 | +### Multicluster Execution Model |
| 241 | + |
| 242 | +Multi-cluster functionality for `ServiceProvider` is a design goal for future iterations and might get integrated into `service-provider-runtime`. This would generally make it possible to run any `DomainService` on a shared `WorkloadCluster`. |
| 243 | + |
| 244 | +One approach could be to sync API objects between `API` and `Run` clusters as a feature of `service-provider-runtime`. |
| 245 | + |
| 246 | +```mermaid |
| 247 | +graph TD |
| 248 | + %% WorkloadCluster |
| 249 | + subgraph WorkloadCluster/RUN |
| 250 | + DS[DomainService/RUN] |
| 251 | + DSAACopy[DomainServiceAPICopyA] |
| 252 | + DSABCopy[DomainServiceAPICopyB] |
| 253 | + DSACCopy[DomainServiceAPICopyC] |
| 254 | + subgraph ServiceProviderInstance |
| 255 | + SPR[service-provider-runtime] |
| 256 | + end |
| 257 | + end |
| 258 | +
|
| 259 | + %% MCPClusterA |
| 260 | + subgraph MCPClusterA/API |
| 261 | + DSAA[DomainServiceAPI] |
| 262 | + end |
| 263 | +
|
| 264 | + %% MCPClusterB |
| 265 | + subgraph MCPClusterB/API |
| 266 | + DSAB[DomainServiceAPI] |
| 267 | + end |
| 268 | +
|
| 269 | + %% MCPClusterC |
| 270 | + subgraph MCPClusterC/API |
| 271 | + DSAC[DomainServiceAPI] |
| 272 | + end |
| 273 | +
|
| 274 | + %% edges |
| 275 | + DS -->|reconciles| DSAACopy |
| 276 | + DS -->|reconciles| DSABCopy |
| 277 | + DS -->|reconciles| DSACCopy |
| 278 | + DSAACopy ---|sync| SPR |
| 279 | + DSABCopy ---|sync| SPR |
| 280 | + DSACCopy ---|sync| SPR |
| 281 | + SPR -->|sync|DSAA |
| 282 | + SPR -->|sync|DSAB |
| 283 | + SPR -->|sync|DSAC |
| 284 | +``` |
| 285 | + |
| 286 | +### Ideas |
| 287 | + |
| 288 | +- `SoftDelete` platform concept. A `managed` service can transition to an `unmanaged` service by soft deleting its corresponding `ServiceProviderConfig` without losing the `DomainService`. This way a tenant could offboard itself partially or entirely from the platform without losing the provisioned infrastructure. This depends on the ownership model of the infrastructure. |
| 289 | +- Distinguish between `Run` and `API` artifacts on all platform layers |
| 290 | + |
| 291 | +### Terminology |
| 292 | + |
| 293 | +- `Run` clusters support scheduling workloads. A `Run` cluster may or may not also serve as an `API` cluster. |
| 294 | +- `API` clusters serve APIs but do not support scheduling workloads (note that `API`/`Run` is a higher-level platform concept) |
| 295 | + |
| 296 | +### References |
| 297 | + |
| 298 | +Projects with similar concepts: |
| 299 | + |
| 300 | +- [Crossplane](https://www.crossplane.io/) |
| 301 | +- [kube-bind](https://github.com/kube-bind/kube-bind) |
| 302 | +- [multicluster-runtime](https://github.com/kubernetes-sigs/multicluster-runtime) |