Commit f3979b1

updates

Signed-off-by: Daman Arora <aroradaman@gmail.com>
1 parent f5520da

File tree: 2 files changed (+24 −11 lines)

keps/sig-network/4963-kube-proxy-services-acceleration/README.md

Lines changed: 19 additions & 7 deletions
@@ -74,15 +74,20 @@ This KEP proposes utilizing the flowtable infrastructure within the Linux kernel
 
 ## Motivation
 
-Kube-proxy manages Service traffic by manipulating iptables/nftables rules. This approach can introduce performance overhead, particularly for services with high throughput or a large number of connections. The kernel's flowtables offer a more efficient alternative for handling established connections, bypassing the standard netfilter processing pipeline.
+Every packet entering the Linux kernel is evaluated against all rules attached to the netfilter hooks, even for established connections. These rules may be added by CNIs, system administrators, firewalls, or kube-proxy, and together they define how packets are filtered, routed, or rewritten. As a result, packets continue to traverse the full netfilter processing path, which can add unnecessary overhead for long-lived or high-throughput connections.
+
+A connection becomes established only after the initial packets successfully pass through all applicable rules without being dropped or rejected. Once established, packets associated with a Kubernetes service can be offloaded to the kernel fast path using flowtables. This allows subsequent service packets to bypass the full netfilter stack, accelerating kube-proxy traffic and reducing CPU usage.
+
 
 ### Goals
 
-- Provide an option for kube-proxy users to enable Service traffic acceleration.
+- Provide an option for kube-proxy users to enable traffic acceleration for TCP and UDP services.
 
 ### Non-Goals
 
 - Separation of Concerns: Kube-proxy's primary responsibility is to manage Service traffic. Extending the flowtable offloading functionality to non-Service traffic will potentially introduce unintended side effects. It's better to keep the feature focused on its core purpose.
+- Supporting service traffic acceleration for iptables and ipvs backends.
+- Supporting service traffic acceleration for SCTP services.
 
 ## Proposal
 
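For context on the flowtable mechanism the new Motivation text relies on, here is a minimal standalone nft ruleset sketch, not taken from this KEP: the table, flowtable, and chain names and the `eth0` device are all hypothetical.

```
# Sketch only: a flowtable hooks packet ingress on the listed devices;
# the "flow offload" statement in the forward chain adds established
# connections to it, so their later packets skip the rest of the ruleset.
table inet example {
	flowtable ft {
		hook ingress priority filter
		devices = { eth0 }
	}
	chain forward {
		type filter hook forward priority filter; policy accept;
		meta l4proto { tcp, udp } ct state established flow offload @ft
	}
}
```

Only TCP and UDP flows are matched here, mirroring the KEP's stated scope; SCTP is excluded.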

@@ -155,14 +160,13 @@ As a cluster administrator managing a cluster where services typically handle sm
 
 ### Risks and Mitigations
 
-Once the network traffic moves to the fastpath it completely bypass the kernel stack, so
-any other network applications that depend on the packets going through the network stack (monitoring per example) will not be able to see the connection details. The feature will only apply the fastpath based on a defined threshold, that will also allow to disable the feature.
+Moving network traffic to the fastpath causes packets to bypass the standard netfilter hooks after the ingress hook. Flowtables operate at the ingress hook, and packets still traverse taps, so tools like tcpdump and Wireshark will continue to observe traffic. However, any applications that rely on hooks or rules evaluated after the ingress hook may not observe or process these packets as expected. To mitigate this, fastpath offload will be applied selectively based on a configurable threshold, and users will have the option to disable the feature entirely.
 
 Flowtables netfilter infrastructure is not well documented and we need to validate assumptions to avoid unsupported or suboptimal configurations. Establishing good relations and involve netfilter maintainers in the design will mitigate these possible problems.
 
 ## Design Details
 
-This feature will only work with kube-proxy nftables mode. We will add a new configuration option to kube-proxy to enable Service traffic offload based on a number of packets threshold per connection.
+This feature will only work with kube-proxy nftables mode for TCP and UDP traffic. We will add a new configuration option to kube-proxy to enable Service traffic offload based on a number of packets threshold per connection.
 
 The packet threshold approach offers several advantages over the [alternatives](#alternatives):
 

@@ -299,11 +303,19 @@ Kube-proxy will insert a rule to offload all Services established traffic in the
 	tx.Add(&knftables.Rule{
 		Chain: filterForwardChain,
 		Rule: knftables.Concat(
-			"ct original", ipX, "daddr", "@", clusterIPsSet,
-			"ct packets >", proxier.fastpathPacketThreshold,
+			"ct packets >", proxier.fastpathPacketThreshold,
+			"ct original", ipX, "daddr", "@", serviceIPsSet,
 			"flow offload", "@", serviceFlowTable,
 		),
 	})
+	tx.Add(&knftables.Rule{
+		Chain: filterForwardChain,
+		Rule: knftables.Concat(
+			"ct packets >", proxier.fastpathPacketThreshold,
+			"ct original", "th dport", "@", servicePortsSet,
+			"flow offload", "@", serviceFlowTable,
+		),
+	})
 }
 ```
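To make the reordered rules concrete: assuming a packet threshold of 20 (illustrative value) and guessing at how the Go identifiers above might render as set and flowtable names, the two knftables rules would emit nft rules along these lines.

```
# Sketch of the rendered rules (names and threshold are assumptions):
# after 20 packets, offload connections whose original destination is a
# service IP, or whose original destination port is in the service-ports
# set, to the service flowtable.
ct packets > 20 ct original ip daddr @service-ips flow offload @service-flows
ct packets > 20 ct original th dport @service-ports flow offload @service-flows
```

Putting the cheap `ct packets` comparison first lets most packets fail the match before the set lookups are evaluated.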

keps/sig-network/4963-kube-proxy-services-acceleration/kep.yaml

Lines changed: 5 additions & 4 deletions
@@ -2,6 +2,7 @@ title: Kube-proxy Services Acceleration
 kep-number: 4963
 authors:
   - "@aojea"
+  - "@aroradaman"
 owning-sig: sig-network
 status: implementable
 creation-date: 2024-11-14
@@ -22,9 +23,9 @@ latest-milestone: "v1.33"
 
 # The milestone at which this feature was, or is targeted to be, at each stage.
 milestone:
-  alpha: "v1.33"
-  beta: "v1.34"
-  stable: "v1.35"
+  alpha: "v1.35"
+  beta: "v1.36"
+  stable: "v1.37"
 
 # The following PRR answers are required at alpha release
 # List the feature gate name and the components for which it must be enabled
@@ -36,4 +37,4 @@ disable-supported: true
 
 # The following PRR answers are required at beta release
 metrics:
-  - count_accelerated_connections_total
+  - kubeproxy_accelerated_connections_count_total
