Commit 95c5526

fixup! Merge pull request kubernetes#5662 from kannon92/nominate-kannon92-prr-approver
1 parent 2a881c6 commit 95c5526

keps/sig-instrumentation/4785-resource-state-metrics/README.md

Lines changed: 167 additions & 0 deletions
@@ -547,6 +547,173 @@ status:
[3x faster]: https://github.com/rexagod/resource-state-metrics/blob/main/tests/bench/bench.sh
[plural ambiguities]: https://github.com/kubernetes-sigs/kubebuilder/issues/3402

For BETA graduation, we plan to move away from the extensible,
resolvers-based architecture to a single stub-based configuration.
The resolvers-based architecture (introduced in ALPHA) cannot be
sustainably maintained (from the contributors' perspective) or used
(from the users' perspective) without running into the same issues
that Kube State Metrics' Custom Resource State API faces today:
passing context between fields, relying on workarounds to achieve
relatively simple metric generation use-cases, having to learn the
steep semantics and side-effects of that API's declarations to avoid
pitfalls, and so on.

Possibly the **only** way to achieve sustainable stability for both
of these audiences is to not rely on multiple resolvers that
compensate for each other's shortcomings, but to have a single,
well-defined way of declaring metric generation configurations. This
approach **must be** Turing-complete, since expression-based
languages, such as CEL or `expr`, are not sufficient for the task at
hand. Furthermore, they still introduce a steep learning curve for
users unfamiliar with these DSLs. Even promising DSLs lack the
constructs and principles needed to express complex metric generation
configurations properly: they may work for simple, and even some
complex, use-cases, but they end up barely maintainable or readable
as the use-cases grow more complex.

The stub-based configuration, on the other hand, relies on Golang
itself, [interpreted] at runtime rather than compiled. Golang is
widely known and used in the Kubernetes ecosystem, which should
significantly lower the learning curve for users while also providing
the constructs needed to express complex use-cases properly and
cleanly. Users can import shared code into stubs for reuse, and can
rely on the rich ecosystem of Golang libraries to achieve their goals.
This will also make it easier for contributors to add new features to
the controller, as they can focus on implementing new stubs rather
than dealing with the complexities of the resolvers-based
architecture. The symbols and libraries made available to the stub
sandboxes will be carefully curated to avoid security issues, and
stubs will run with a timed context to prevent leaks. Users may inject
symbols during initialization so that stubs can use them at runtime.
Furthermore, additional constraints, such as limiting stub execution
to objects matching certain label or field selectors, can still be
added.

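To make the selector-based constraint mentioned above more concrete, here is a minimal sketch of gating stub execution on an object's labels. It is an illustration only: `matchesLabels` is a hypothetical helper, not part of the controller, and it supports equality matching only, whereas real Kubernetes label selectors are richer.

```go
package main

import "fmt"

// matchesLabels reports whether every key/value pair in selector is present
// in labels. An empty selector matches everything. This is a simplified,
// equality-only stand-in for Kubernetes label selectors (hypothetical helper).
func matchesLabels(labels, selector map[string]string) bool {
	for key, want := range selector {
		if got, ok := labels[key]; !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	selector := map[string]string{"tier": "prod"}
	objects := []map[string]string{
		{"tier": "prod", "team": "a"},
		{"tier": "dev"},
	}
	for i, labels := range objects {
		// Only objects passing the selector would be handed to a stub.
		fmt.Printf("object %d: run stub = %t\n", i, matchesLabels(labels, selector))
	}
}
```

A check like this would run before the stub is invoked, so objects that do not match never reach the interpreter.
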
A sample configuration that follows this idea:

```yaml
apiVersion: resource-state-metrics.instrumentation.k8s-sigs.io/v1alpha1
kind: ResourceMetricsMonitor
metadata:
  name: prefilled
  namespace: default
spec:
  configuration: |-
    stores:
      - group: "contoso.com"
        version: "v1alpha1"
        kind: "MyPlatform"
        resource: "myplatforms"
        families:
          - name: "test_metric"
            help: "helpless"
            metrics:
              - stubs:
                  - |
                    package foo
                    import (
                      "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
                      "github.com/kubernetes-sigs/resource-state-metrics/pkg/utils"
                      klog "k8s.io/klog/v2"
                    )
                    func samples(o *unstructured.Unstructured) []utils.SampleType {
                      klog.InfoS("Generating samples for resource",
                        "name", o.GetName(),
                        "namespace", o.GetNamespace(),
                        "kind", o.GetKind(),
                        "apiVersion", o.GetAPIVersion(),
                        "uid", o.GetUID(),
                      )
                      return []utils.SampleType{
                        {
                          LabelKeys: []string{"name"},
                          LabelValues: []string{o.GetName()},
                          Value: 1,
                        },
                      }
                    }
```

Notice the standard as well as custom symbols used in the stub
above. Because a single stub replaces multiple fields, context can
easily be passed between label-sets and the metric value generation
logic whenever necessary. In addition, owing to Golang's widespread
adoption within the cloud-native ecosystem, users can get started
quickly.

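As an illustration of passing context between label-sets and the value, the sketch below derives both from the same intermediate variable. To stay self-contained it makes two assumptions: `Sample` is a hypothetical stand-in for the controller's sample type, and the object is a plain `map[string]interface{}` rather than `*unstructured.Unstructured`. The `status.phase` field is likewise invented for the example.

```go
package main

import "fmt"

// Sample is a hypothetical stand-in for the controller's sample type.
type Sample struct {
	LabelKeys   []string
	LabelValues []string
	Value       float64
}

// samples derives the label value and the metric value from the same
// intermediate state (the "phase" field), so nothing has to be passed
// between separate configuration fields.
func samples(obj map[string]interface{}) []Sample {
	phase := "Unknown"
	if status, ok := obj["status"].(map[string]interface{}); ok {
		if p, ok := status["phase"].(string); ok {
			phase = p
		}
	}

	value := 0.0
	if phase == "Ready" {
		value = 1
	}

	return []Sample{{
		LabelKeys:   []string{"phase"},
		LabelValues: []string{phase},
		Value:       value,
	}}
}

func main() {
	obj := map[string]interface{}{
		"status": map[string]interface{}{"phase": "Ready"},
	}
	fmt.Printf("%+v\n", samples(obj))
}
```

Here `phase` is computed once and feeds both the label and the value, which is the kind of shared context the paragraph above refers to.
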
The recommended practice is to define several stubs, each coherent
in itself, rather than a single monolithic stub that does everything.
This improves the readability and maintainability of the stubs, and
allows code to be reused between them. Users can also leverage
Golang's package management capabilities to import and use existing
libraries, whenever possible.

The `samples` function defined in the stub above will be invoked
for each object of the managed resource, and the returned samples
will be collected and exposed as Prometheus metrics.

Below is the code snippet that executes the stub and extracts the
samples from it:

```go
import (
	"context"
	"fmt"
	"reflect"
	"time"

	"github.com/traefik/yaegi/interp"
	"github.com/traefik/yaegi/stdlib"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	klog "k8s.io/klog/v2"
)

// executeStub evaluates the stub in a fresh interpreter, looks up its
// `samples` function, and invokes it on the given object.
func executeStub(stub string, unstructuredTyped *unstructured.Unstructured) ([]SampleType, error) {
	// Bound stub evaluation so that it cannot block indefinitely.
	timeout := 5 * time.Second
	ctx, cancelFn := context.WithTimeout(context.WithValue(context.Background(), "timeout", timeout), timeout)
	defer cancelFn()

	interpreter := interp.New(interp.Options{})
	if err := interpreter.Use(stdlib.Symbols); err != nil {
		return nil, fmt.Errorf("error loading stdlib symbols: %w", err)
	}
	// Expose the non-stdlib symbols that stubs are allowed to import.
	err := interpreter.Use(interp.Exports{
		// Yaegi uses "path/packagename" format.
		"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured/unstructured": map[string]reflect.Value{
			"Unstructured": reflect.ValueOf((*unstructured.Unstructured)(nil)),
		},
		"github.com/kubernetes-sigs/resource-state-metrics/pkg/utils/utils": map[string]reflect.Value{
			"SampleType": reflect.ValueOf((*SampleType)(nil)),
		},
		"k8s.io/klog/v2/v2": map[string]reflect.Value{
			"InfoS":  reflect.ValueOf(klog.InfoS),
			"Error":  reflect.ValueOf(klog.Error),
			"ErrorS": reflect.ValueOf(klog.ErrorS),
		},
	})
	if err != nil {
		return nil, fmt.Errorf("error loading custom symbols: %w", err)
	}

	// Compile the stub, then resolve its `samples` function.
	if _, err = interpreter.EvalWithContext(ctx, stub); err != nil {
		return nil, fmt.Errorf("error evaluating stub: %w", err)
	}
	samples, err := interpreter.EvalWithContext(ctx, "foo.samples")
	if err != nil {
		return nil, fmt.Errorf("error extracting samples from stub: %w", err)
	}
	if !samples.CanInterface() {
		return nil, fmt.Errorf("unable to interface stub result")
	}
	samplesInterface := samples.Interface()
	samplesFn, ok := samplesInterface.(func(*unstructured.Unstructured) []SampleType)
	if !ok {
		return nil, fmt.Errorf("expected stub result to be of type func(*unstructured.Unstructured) []SampleType but got %T", samplesInterface)
	}
	resolvedSamples := samplesFn(unstructuredTyped)

	return resolvedSamples, nil
}
```

This approach ensures that there's never a shortage of expressiveness when it
comes to defining metric generation configurations, while also lowering the
learning curve for users and the maintenance burden for maintainers.

[interpreted]: https://github.com/traefik/yaegi

### Test Plan

<!--
