Skip to content

Commit 80d74a1

Browse files
author
Herton R. Krzesinski
committed
Merge: net: Fix return value of qdisc ingress handling on success
MR: https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/1926 Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2162711 Tested: vs bz reproducer Upstream commit: commit 672e97e Author: Paul Blakey <paulb@nvidia.com> Date: Tue Oct 18 10:34:38 2022 +0300 net: Fix return value of qdisc ingress handling on success Currently qdisc ingress handling (sch_handle_ingress()) doesn't set a return value and it is left to the old return value of the caller (__netif_receive_skb_core()) which is RX drop, so if the packet is consumed, caller will stop and return this value as if the packet was dropped. This causes a problem in the kernel tcp stack when having a egress tc rule forwarding to a ingress tc rule. The tcp stack sending packets on the device having the egress rule will see the packets as not successfully transmitted (although they actually were), will not advance it's internal state of sent data, and packets returning on such tcp stream will be dropped by the tcp stack with reason ack-of-unsent-data. See reproduction in [0] below. Fix that by setting the return value to RX success if the packet was handled successfully. [0] Reproduction steps: $ ip link add veth1 type veth peer name peer1 $ ip link add veth2 type veth peer name peer2 $ ifconfig peer1 5.5.5.6/24 up $ ip netns add ns0 $ ip link set dev peer2 netns ns0 $ ip netns exec ns0 ifconfig peer2 5.5.5.5/24 up $ ifconfig veth2 0 up $ ifconfig veth1 0 up $ tc qdisc add dev veth2 ingress $ tc qdisc add dev veth1 ingress $ tc filter add dev veth2 ingress prio 1 proto all flower \ action mirred egress redirect dev veth1 $ tc filter add dev veth1 ingress prio 1 proto all flower \ action mirred egress redirect dev veth2 $ tc qdisc add dev peer1 clsact $ tc filter add dev peer1 egress prio 20 proto ip flower \ action mirred ingress redirect dev veth1 $ iperf3 -s& $ ip netns exec ns0 iperf3 -c 5.5.5.6 -i 1 $ tc filter del dev peer1 egress $ ip netns exec ns0 iperf3 -c 5.5.5.6 -i 1 Fixes: f697c3e ("[NET]: Avoid unnecessary cloning for ingress filtering") Signed-off-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Approved-by: Davide Caratti <dcaratti@redhat.com> Approved-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Approved-by: Andrea Claudi <aclaudi@redhat.com> Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
2 parents c093875 + af86e36 commit 80d74a1

File tree

1 file changed

+4
-0
lines changed

1 file changed

+4
-0
lines changed

net/core/dev.c

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5111,11 +5111,13 @@ sch_handle_ingress(struct sk_buff *skb, struct packet_type **pt_prev, int *ret,
51115111
case TC_ACT_SHOT:
51125112
mini_qdisc_qstats_cpu_drop(miniq);
51135113
kfree_skb_reason(skb, SKB_DROP_REASON_TC_INGRESS);
5114+
*ret = NET_RX_DROP;
51145115
return NULL;
51155116
case TC_ACT_STOLEN:
51165117
case TC_ACT_QUEUED:
51175118
case TC_ACT_TRAP:
51185119
consume_skb(skb);
5120+
*ret = NET_RX_SUCCESS;
51195121
return NULL;
51205122
case TC_ACT_REDIRECT:
51215123
/* skb_mac_header check was done by cls/act_bpf, so
@@ -5128,8 +5130,10 @@ sch_handle_ingress(struct sk_buff *skb, struct packet_type **pt_prev, int *ret,
51285130
*another = true;
51295131
break;
51305132
}
5133+
*ret = NET_RX_SUCCESS;
51315134
return NULL;
51325135
case TC_ACT_CONSUMED:
5136+
*ret = NET_RX_SUCCESS;
51335137
return NULL;
51345138
default:
51355139
break;

0 commit comments

Comments
 (0)