Commit 6a56ab5

docs(design): service provider
On-behalf-of: @SAP christopher.junk@sap.com Signed-off-by: Christopher Junk <christopher.junk@sap.com>
1 parent 58c27ba commit 6a56ab5

2 files changed

+154
-0
lines changed

docs/about/design/_category_.yml

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
label: Design
Lines changed: 153 additions & 0 deletions
@@ -0,0 +1,153 @@
# Service Provider Design

## Goals

- Define clear terminology around `ServiceProvider` in the openMCP space
- Define the `ServiceProvider` scope: the responsibilities and boundaries of a `ServiceProvider`
- Define a `ServiceProvider` model that implements the higher-level `API`/`Run` platform concept (to allow flexible deployment models, e.g. with `ClusterProvider` kcp)
- Define the `ServiceProvider` contract so that a `ServiceProvider` can be implemented as a loosely coupled component in the openMCP context
- Define how a `ServiceProvider` can be validated

## Non-Goals

tbd

## Object Model

```mermaid
graph TD
    %% Onboarding Cluster
    subgraph OnboardingCluster/API
        SC[ServiceConfig]
    end

    %% Platform Cluster
    subgraph PlatformCluster/RUN
        SPO[service-provider-operator]
        SP[ServiceProvider]
        SPC[ServiceProviderConfig]
    end

    %% MCP Cluster
    subgraph MCPCluster/RUN
        DS[DomainService]
        DSAPI[DomainServiceAPI]
    end

    %% WorkloadCluster
    subgraph WorkloadCluster/RUN
        SDS[SharedDomainService]
    end

    %% edges
    SP -->|installs/reconciles|SC
    SP -->|uses|SPC
    SP -->|creates/updates/deletes|DS
    SP -->|creates/updates/deletes|SDS
    DS -->|installs/reconciles|DSAPI
    SDS --->|reconciles/XOR|DSAPI
    SPO-->|installs/reconciles|SP
```

Open Points:

- Does the `openmcp-operator` manage `ServiceProviders`, or do we introduce a new operator for `ServiceProviders`? A new component would give a clear separation of concerns. The `openmcp-operator` already does a lot and we don't want it to become the next `control-plane-operator`.
- In the above model, the `OnboardingCluster` is a continuous `API` cluster. We might want to provision dedicated or shared tenant `API` servers (e.g. with `ClusterProvider` kcp) based on some kind of component discovery that lets the tenant pick its feature/component set. This way, the `OnboardingCluster` is only used to onboard new tenants, and we don't run into CRD management hell/bottlenecks.
- Another thought regarding the `OnboardingCluster`: if we introduce tenant `API` clusters, they could be used to create `MCPs`. This in turn implies that, instead of having the `OnboardingCluster` create `MCPs`, we might want it to create `Tenants` as the entry point for users, i.e. start with an identity object like `Tenant` or `Account` instead of a usage artifact like `MCP`.

TODO:

- Illustrate different deployment models with the `Run`/`API` concept
- Visually distinguish between `Run` and `API` artifacts

## Terminology

Defines the objects of the [object model](#object-model).

- `ServiceProvider` provides a service in tenant space
- `PlatformService` provides a service in platform space
- `Run` clusters support scheduling workloads. A `Run` cluster may or may not also serve as an `API` cluster.
- `API` clusters serve APIs but do not support scheduling workloads (note that `API`/`Run` is a higher-level platform concept)
- `OnboardingCluster` is part of the platform domain and is the config/setup part from a tenant perspective. It serves the `API` of a `ServiceProvider`.
- `MCPCluster` is part of the tenant domain and is the application/functional part from a tenant perspective. It may or may not run the `Run` of a `ServiceProvider`.
- `PlatformCluster` is part of the platform domain and is a black box from a tenant perspective. It may or may not run the `Run` of a `ServiceProvider`.
- A `ServiceConfig` defines how a service is provisioned in terms of the `DomainService` `API` and `Run`. For example, Crossplane could be provisioned for a tenant by installing the `API` on the tenant MCP but the `Run` on a shared worker pool (`WorkloadCluster`) (clarify tenant IAM). A tenant can use this mechanism to decide how to consume a service.
- A `ServiceProviderConfig` defines the config parts that are used in a reconcile run, e.g. to define tenant boundaries

## Boundaries

- A `PlatformService` (e.g. `service-provider-operator`) watches platform `API` clusters, e.g. the `OnboardingCluster`, and acts on platform `Run` clusters, e.g. itself or shared `WorkloadClusters`. It does not act on tenant clusters, e.g. MCPs.
- A `ServiceProvider` watches tenant `API` clusters, e.g. the `OnboardingCluster`, and acts on `Run` clusters, e.g. MCPs.

tbc: platform space vs tenant space

## Lifecycle

- A `PlatformService` is installed by a platform team and/or a bootstrapping mechanism (out of scope)
- A `ServiceProvider` is installed by creating `ServiceProvider` objects; the `service-provider-operator` manages the lifecycle of `ServiceProviders` (advantages/disadvantages to be discussed)

## Validation

A `ServiceProvider` is considered healthy if both its `API` and `Run` parts have been successfully synced and are ready for consumption.

The following flow validates that a `ServiceProvider` is working as expected:

0. SETUP: Create the test environment by installing all `ServiceProvider` prerequisites: a) a k8s cluster, e.g. kind, b) the `service-provider-operator` -> wait for the operator to be available
1. ASSESS: Request the `ServiceProvider` -> wait for the `API` and `Run` components to be `synced` and `ready`
2. ASSESS: Consume the `API` to provision a `DomainService` -> wait for the `DomainService` to be `synced` and `ready`
3. ASSESS: (optional) Consume the `DomainServiceAPI`; depending on the provider/domain context, this may or may not be required
4. ASSESS: Delete the `ServiceProvider` -> wait for the `API`, `Run` and `ServiceProvider` to be successfully removed
5. TEARDOWN: Delete the test environment components

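The health rule and the wait-based flow above can be sketched in plain Go. This is only an illustration under stated assumptions: `Status`, `Healthy`, `WaitFor`, `Step` and `RunFlow` are hypothetical names that do not exist in openMCP, and the cluster is replaced by an in-memory fake that becomes healthy on the third poll.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Status is a hypothetical condition summary for one part of a ServiceProvider.
type Status struct {
	Synced bool
	Ready  bool
}

// Healthy mirrors the rule above: both the API and the Run part must be
// synced and ready.
func Healthy(api, run Status) bool {
	return api.Synced && api.Ready && run.Synced && run.Ready
}

// WaitFor polls check until it returns true or the attempts are used up.
func WaitFor(check func() bool, attempts int, interval time.Duration) error {
	for i := 0; i < attempts; i++ {
		if check() {
			return nil
		}
		time.Sleep(interval)
	}
	return errors.New("condition not met in time")
}

// Step is one named stage of the validation flow (SETUP, ASSESS, TEARDOWN).
type Step struct {
	Name string
	Run  func() error
}

// RunFlow executes the steps in order and stops at the first failure.
// It returns the names of the steps that completed.
func RunFlow(steps []Step) ([]string, error) {
	var done []string
	for _, s := range steps {
		if err := s.Run(); err != nil {
			return done, fmt.Errorf("%s: %w", s.Name, err)
		}
		done = append(done, s.Name)
	}
	return done, nil
}

func main() {
	// Fake environment: the API part becomes synced and ready on the third poll.
	polls := 0
	providerHealthy := func() bool {
		polls++
		ok := polls >= 3
		return Healthy(Status{ok, ok}, Status{true, true})
	}
	steps := []Step{
		{"SETUP", func() error { return nil }},
		{"ASSESS: request ServiceProvider", func() error {
			return WaitFor(providerHealthy, 5, time.Millisecond)
		}},
		{"TEARDOWN", func() error { return nil }},
	}
	done, err := RunFlow(steps)
	fmt.Println(done, err)
}
```

Note that this sketch stops at the first failing step; a real test harness would most likely run TEARDOWN regardless of earlier failures.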
## Runtime

What is a runtime? A runtime is a collection of abstractions and contracts that provides an environment in which user-defined logic is executed.

The service provider runtime is built on top of controller-runtime and provides a service-provider-specific reconciliation loop.

It provides:

- client abstractions (in Crossplane: external clients; in openMCP: e.g. reuse common juggler reconcilers like flux?)
- lifecycle management abstractions for `ServiceProviderAPI` objects (the reconcile loop)
- platform-specific features (in Crossplane e.g. late initialization, external-name and pause annotations). This enables us to implement platform features once for all service providers (a `ServiceProvider` only needs to update its runtime dependency)
- handling of cross-cutting concerns like event recording, logging, metrics, rate limits

The following overview illustrates the layers in a simplified way:

| Layer | Description |
| :--- | :--- |
| Service Provider | defines the `ServiceProviderAPI` and implements the service-provider-runtime operations |
| service-provider-runtime | defines `ServiceProvider` reconciliation semantics |
| controller-runtime | defines generic reconciliation semantics |
| Kubernetes API machinery | k8s essentials |
| Go runtime / OS kernel | process/thread execution, memory management |

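To illustrate how a runtime layer could take over cross-cutting concerns for every provider, here is a minimal sketch in plain Go. It does not use controller-runtime, and `ReconcileFunc`, `Metrics` and `WithObservability` are hypothetical names invented for this example. The in-memory counters stand in for real metrics and event recording.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTransient is a sample error used by the demo provider below.
var errTransient = errors.New("transient failure")

// ReconcileFunc is the hypothetical provider-specific logic for one object.
type ReconcileFunc func(name string) error

// Metrics is a minimal in-memory recorder standing in for real metrics.
type Metrics struct {
	Runs   int
	Errors int
	Total  time.Duration
}

// WithObservability wraps provider logic so that counting, timing and error
// logging happen in the runtime layer instead of in each provider.
func WithObservability(m *Metrics, logf func(format string, args ...any), next ReconcileFunc) ReconcileFunc {
	return func(name string) error {
		start := time.Now()
		err := next(name)
		m.Runs++
		m.Total += time.Since(start)
		if err != nil {
			m.Errors++
			logf("reconcile %s failed: %v", name, err)
		}
		return err
	}
}

func main() {
	m := &Metrics{}
	calls := 0
	// Demo provider logic: fails on the second call only.
	provider := func(name string) error {
		calls++
		if calls == 2 {
			return errTransient
		}
		return nil
	}
	logf := func(format string, args ...any) { fmt.Printf(format+"\n", args...) }
	reconcile := WithObservability(m, logf, provider)
	for i := 0; i < 3; i++ {
		_ = reconcile("example")
	}
	fmt.Println("runs:", m.Runs, "errors:", m.Errors)
}
```

With this shape, a provider only supplies the inner function, and a change to the wrapper reaches all providers through a runtime dependency update.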
### Execution Model

Here we define what a run/reconcile cycle means, e.g. an observe step followed by an orchestration of actions like create, update and delete.

This may include special domain semantics similar to `ManagementPolicies` or the `pause` state/mechanism in Crossplane.

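One possible reading of such a cycle is a pure decision step that maps an observation to the next action. The sketch below assumes hypothetical types (`Action`, `Observation`, `Desired`) and a simple pause flag; the actual semantics are still to be defined in this section.

```go
package main

import "fmt"

// Action is the single step a reconcile cycle decides to take.
type Action string

const (
	ActionNone   Action = "none"
	ActionCreate Action = "create"
	ActionUpdate Action = "update"
	ActionDelete Action = "delete"
)

// Observation is the hypothetical result of the observe phase.
type Observation struct {
	Exists   bool // the DomainService exists on the target cluster
	UpToDate bool // it matches the desired state
}

// Desired captures the parts of the API object that drive the decision.
type Desired struct {
	Deleting bool // the API object is being deleted
	Paused   bool // reconciliation is paused
}

// Decide maps the desired state and an observation to the next action.
func Decide(d Desired, o Observation) Action {
	switch {
	case d.Paused:
		return ActionNone // a paused object is observed but never changed
	case d.Deleting && o.Exists:
		return ActionDelete
	case d.Deleting:
		return ActionNone // already gone, nothing left to do
	case !o.Exists:
		return ActionCreate
	case !o.UpToDate:
		return ActionUpdate
	default:
		return ActionNone
	}
}

func main() {
	fmt.Println(Decide(Desired{}, Observation{}))
	fmt.Println(Decide(Desired{}, Observation{Exists: true}))
	fmt.Println(Decide(Desired{Deleting: true}, Observation{Exists: true}))
	fmt.Println(Decide(Desired{Paused: true}, Observation{}))
}
```

Keeping the decision free of side effects makes it easy to unit test independently of any cluster access.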
### Abstractions and Contracts

Here we define the core interfaces that a consumer (`ServiceProvider` developer) has to implement. In Crossplane, for example, an `ExternalConnector` creates an `ExternalClient`, which implements CRUD operations with `ExternalObservation`, `ExternalCreation`, etc. The `Managed` interface defines what makes a k8s object a managed Crossplane resource, e.g. by referencing a `ProviderConfig`, specifying `ManagementPolicies`, `ConnectionSecrets`, etc.

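A contract in this spirit might look like the following sketch. The `Connector` and `Client` interfaces and the `Object` type are assumptions made for illustration; they are loosely inspired by the Crossplane pattern described above but are neither Crossplane's nor openMCP's actual API. An in-memory implementation is included only to show that the contract is implementable.

```go
package main

import (
	"context"
	"fmt"
)

// Object is a hypothetical stand-in for a ServiceProviderAPI object.
type Object struct {
	Name    string
	Version string
}

// Observation reports what Observe found on the target cluster.
type Observation struct {
	Exists   bool
	UpToDate bool
}

// Client is the hypothetical contract a ServiceProvider developer implements.
type Client interface {
	Observe(ctx context.Context, obj Object) (Observation, error)
	Create(ctx context.Context, obj Object) error
	Update(ctx context.Context, obj Object) error
	Delete(ctx context.Context, obj Object) error
}

// Connector produces a Client for one object, e.g. after resolving config.
type Connector interface {
	Connect(ctx context.Context, obj Object) (Client, error)
}

// memoryClient is a toy implementation that stores versions in a map.
type memoryClient struct{ installed map[string]string }

// Compile-time check that memoryClient satisfies the contract.
var _ Client = (*memoryClient)(nil)

func (c *memoryClient) Observe(_ context.Context, obj Object) (Observation, error) {
	v, ok := c.installed[obj.Name]
	return Observation{Exists: ok, UpToDate: ok && v == obj.Version}, nil
}

func (c *memoryClient) Create(_ context.Context, obj Object) error {
	c.installed[obj.Name] = obj.Version
	return nil
}

func (c *memoryClient) Update(ctx context.Context, obj Object) error {
	return c.Create(ctx, obj)
}

func (c *memoryClient) Delete(_ context.Context, obj Object) error {
	delete(c.installed, obj.Name)
	return nil
}

// memoryConnector hands out clients that share one in-memory store.
type memoryConnector struct{ store map[string]string }

func (m memoryConnector) Connect(context.Context, Object) (Client, error) {
	return &memoryClient{installed: m.store}, nil
}

// demo connects, observes, creates and observes again.
func demo() (before, after Observation) {
	var conn Connector = memoryConnector{store: map[string]string{}}
	ctx := context.Background()
	obj := Object{Name: "example-service", Version: "v1"}
	c, _ := conn.Connect(ctx, obj)
	before, _ = c.Observe(ctx, obj)
	_ = c.Create(ctx, obj)
	after, _ = c.Observe(ctx, obj)
	return before, after
}

func main() {
	before, after := demo()
	fmt.Println(before.Exists, after.Exists, after.UpToDate)
}
```

Splitting connection setup from the CRUD operations keeps credential and config resolution out of the per-operation code, which is the main reason the Crossplane-style split is attractive here.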
### Observability

Logging, metrics, traces?

## Domain

This is the actual domain layer of a `ServiceProvider` (the layer on top of the [runtime](#runtime)) and the foundation for building a `ServiceProvider` template.

### RBAC

What permissions does a service provider need?

## Service Provider Manager

The component that manages the lifecycle of `ServiceProviders` and provides service discovery to platform `API` clusters, e.g. the `OnboardingCluster`.

Candidates: e.g. `openmcp-operator` or `service-provider-operator`.

Out of scope?
