in-kernel PM: Server is only using single IP to establish subflows #587

@roman-shuhov

Pre-requisites

  • A similar issue has not been reported before.
  • mptcp.dev website does not cover my case.
  • An up-to-date kernel is being used.
  • This case is not fixed with the latest stable (or LTS) version listed on kernel.org

What did you do?

I'm trying to make a simple MPTCP connection between 2 hosts. Each machine has 4 physical network cards (eth0..eth3) with unique IPv6 /128 subnets.

# zcat /proc/config.gz | grep -i mptcp
CONFIG_MPTCP=y
CONFIG_INET_MPTCP_DIAG=m
CONFIG_MPTCP_IPV6=y
# sysctl -a | grep mptcp
net.ipv4.tcp_available_ulp = mptcp tls
net.mptcp.add_addr_timeout = 120
net.mptcp.allow_join_initial_addr_port = 1
net.mptcp.checksum_enabled = 0
net.mptcp.close_timeout = 60
net.mptcp.enabled = 1
net.mptcp.pm_type = 0
net.mptcp.scheduler = default
net.mptcp.stale_loss_cnt = 4
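
For anyone who wants to compare these settings across the two hosts from a script, here is a minimal sketch that reads the same net.mptcp.* values from procfs. The helper name `read_mptcp_sysctls` is hypothetical, and the `/proc/sys/net/mptcp` path is assumed to be the standard sysctl location; it returns an empty dict when that directory is missing.

```python
from pathlib import Path


def read_mptcp_sysctls(base="/proc/sys/net/mptcp"):
    """Return {"net.mptcp.<name>": value} for every readable entry under base.

    Hypothetical helper for illustration; `base` is assumed to be the
    procfs directory that backs the net.mptcp.* sysctls.
    """
    root = Path(base)
    settings = {}
    if not root.is_dir():
        # MPTCP not built in, or not running on Linux.
        return settings
    for entry in sorted(root.iterdir()):
        if not entry.is_file():
            continue
        try:
            settings[f"net.mptcp.{entry.name}"] = entry.read_text().strip()
        except OSError:
            # Some entries may be unreadable without privileges; skip them.
            continue
    return settings
```

Calling `read_mptcp_sysctls()` on both hosts and diffing the two dicts is one way to rule out a sysctl mismatch.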

To make the connection I'm using simple Python client/server scripts. The server listens on :8080 (on eth3) with an MPTCP socket and the client connects from :8086 (the ports are arbitrary in this case). Each host also has a configured list of endpoints (as a side note, I tried different subflow configurations, with and without fullmesh, but the final behavior was the same):

[server] # ip mptcp endpoint
<srv-ip1> id 1 signal subflow dev eth0 
<srv-ip2> id 2 subflow fullmesh dev eth1 
<srv-ip3> id 3 subflow fullmesh dev eth2 
<srv-ip4> id 4 subflow fullmesh dev eth3 
[client] # ip mptcp endpoint
<cl-ip1> id 1 signal subflow dev eth0 
<cl-ip2> id 2 subflow fullmesh dev eth1 
<cl-ip3> id 3 subflow fullmesh dev eth2 
<cl-ip4> id 4 subflow fullmesh dev eth3
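
For reference, the MPTCP socket used by the client/server scripts mentioned above can be opened roughly like this. This is a minimal sketch, not the exact scripts: the helper name `open_mptcp_socket` and the fallback to plain TCP are assumptions for illustration. `socket.IPPROTO_MPTCP` is available in Python 3.10+ on Linux; 262 is the Linux protocol number used as a fallback.

```python
import socket

# Python 3.10+ exposes IPPROTO_MPTCP on Linux; 262 is the Linux value.
IPPROTO_MPTCP = getattr(socket, "IPPROTO_MPTCP", 262)


def open_mptcp_socket(family=socket.AF_INET6):
    """Return (sock, is_mptcp).

    Hypothetical helper: tries an MPTCP stream socket first and falls
    back to plain TCP if the kernel rejects the MPTCP protocol.
    """
    try:
        sock = socket.socket(family, socket.SOCK_STREAM, IPPROTO_MPTCP)
        return sock, True
    except OSError:
        sock = socket.socket(family, socket.SOCK_STREAM, socket.IPPROTO_TCP)
        return sock, False


# Server side (sketch): bind to the eth3 address and listen on 8080.
#   srv, is_mptcp = open_mptcp_socket()
#   srv.bind(("<srv-ip4>", 8080))
#   srv.listen(5)
#
# Client side (sketch): bind to a fixed source port, then connect.
#   cli, is_mptcp = open_mptcp_socket()
#   cli.bind(("<cl-ip4>", 8086))
#   cli.connect(("<srv-ip4>", 8080))
```

The `is_mptcp` flag only makes it easy to notice when a script silently ended up on plain TCP.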

ss shows the following connections/subflows:

[server] # ss -tunlap | grep python
tcp   LISTEN     0      5   [<srv-ip4>]:8080  [::]:*     users:(("python3.12",pid=1678010,fd=3))
tcp   ESTAB      0      0   [<srv-ip4>]:8080  [<cl-ip2>]:8086  users:(("python3.12",pid=1678010,fd=4))
tcp   ESTAB      0      0   [<srv-ip4>]:8080  [<cl-ip1>]:43913 users:(("python3.12",pid=1678010,fd=4))
tcp   ESTAB      0      0   [<srv-ip4>]:8080  [<cl-ip4>]:34685 users:(("python3.12",pid=1678010,fd=4))
# ss -Mn
State       Recv-Q        Send-Q                          Local Address:Port                           Peer Address:Port
ESTAB       0             0                    [<srv-ip4>]:8080               [<cl-ip4>]:8087

The odd thing is that the server uses the same <srv-ip4> in all subflows, even though fullmesh and several endpoints are configured. From my understanding, the server should use different endpoints in this case.

The MPTCP monitor on the server shows that a single IP was used on the server side:

[server] # ip mptcp monitor
[LISTENER_CREATED] saddr6=<srv-ip4> sport=8080
[         CREATED] token=f2516375 remid=0 locid=0 saddr6=<srv-ip4> daddr6=<cl-ip4> sport=8080 dport=8087
[     ESTABLISHED] token=f2516375 remid=0 locid=0 saddr6=<srv-ip4> daddr6=<cl-ip4> sport=8080 dport=8087
[       ANNOUNCED] token=f2516375 remid=1 daddr6=<cl-ip1> dport=8087
[  SF_ESTABLISHED] token=f2516375 remid=4 locid=0 saddr6=<srv-ip4> daddr6=<cl-ip2> sport=8080 dport=42225 backup=0
[  SF_ESTABLISHED] token=f2516375 remid=2 locid=0 saddr6=<srv-ip4> daddr6=<cl-ip3> sport=8080 dport=37795 backup=0
[          CLOSED] token=f2516375
[ LISTENER_CLOSED] saddr6=<srv-ip4> sport=8080
[client] # ip mptcp monitor 
[         CREATED] token=8e8cde69 remid=0 locid=0 saddr6=<cl-ip4> daddr6=<srv-ip4> sport=8087 dport=8080
[     ESTABLISHED] token=8e8cde69 remid=0 locid=0 saddr6=<cl-ip4> daddr6=<srv-ip4> sport=8087 dport=8080
[       ANNOUNCED] token=8e8cde69 remid=1 daddr6=<srv-ip1> dport=8080
[       SF_CLOSED] token=8e8cde69 remid=1 locid=4 saddr6=<cl-ip1> daddr6=<srv-ip1> sport=38413 dport=8080 backup=0 ifindex=3
[  SF_ESTABLISHED] token=8e8cde69 remid=0 locid=4 saddr6=<cl-ip1> daddr6=<srv-ip4> sport=42225 dport=8080 backup=0 ifindex=3
[  SF_ESTABLISHED] token=8e8cde69 remid=0 locid=2 saddr6=<cl-ip2> daddr6=<srv-ip4> sport=37795 dport=8080 backup=0 ifindex=10
[       SF_CLOSED] token=8e8cde69 remid=1 locid=2 saddr6=<cl-ip2> daddr6=<srv-ip1> sport=60757 dport=8080 backup=0 ifindex=10
[       SF_CLOSED] token=8e8cde69 remid=1 locid=3 saddr6=<cl-ip4> daddr6=<srv-ip1> sport=45561 dport=8080 backup=0 ifindex=4
[          CLOSED] token=8e8cde69
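
To make the "single local address" observation easier to check on longer captures, the monitor output above can be summarized with a small script. This is a minimal sketch: the function names are hypothetical, and the parsing assumes the `[EVENT] key=value ...` layout shown in the logs above (with `saddr4` assumed as the IPv4 counterpart of `saddr6`).

```python
import re
from collections import defaultdict

# Matches lines such as "[  SF_ESTABLISHED] token=... saddr6=...".
EVENT_RE = re.compile(r"^\[\s*(?P<event>[A-Z_]+)\]\s*(?P<fields>.*)$")


def parse_monitor_line(line):
    """Parse one `ip mptcp monitor` line into a dict, or None if it does not match."""
    match = EVENT_RE.match(line.strip())
    if not match:
        return None
    fields = dict(
        kv.split("=", 1) for kv in match.group("fields").split() if "=" in kv
    )
    return {"event": match.group("event"), **fields}


def local_addrs_per_token(lines):
    """Map each connection token to the set of local addresses of its established subflows."""
    addrs = defaultdict(set)
    for line in lines:
        event = parse_monitor_line(line)
        if event is None or event["event"] not in ("ESTABLISHED", "SF_ESTABLISHED"):
            continue
        local = event.get("saddr6") or event.get("saddr4")
        if local and "token" in event:
            addrs[event["token"]].add(local)
    return dict(addrs)


# Server-side events copied from the log above.
SERVER_EVENTS = [
    "[LISTENER_CREATED] saddr6=<srv-ip4> sport=8080",
    "[     ESTABLISHED] token=f2516375 remid=0 locid=0 saddr6=<srv-ip4> daddr6=<cl-ip4> sport=8080 dport=8087",
    "[  SF_ESTABLISHED] token=f2516375 remid=4 locid=0 saddr6=<srv-ip4> daddr6=<cl-ip2> sport=8080 dport=42225 backup=0",
    "[  SF_ESTABLISHED] token=f2516375 remid=2 locid=0 saddr6=<srv-ip4> daddr6=<cl-ip3> sport=8080 dport=37795 backup=0",
    "[          CLOSED] token=f2516375",
]

if __name__ == "__main__":
    for token, addrs in local_addrs_per_token(SERVER_EVENTS).items():
        print(token, sorted(addrs))
```

A token whose set contains a single address is a connection where every established subflow used the same local IP, which is the situation reported here on the server.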

But on a closer look with bpftrace, I found that the server is actually trying to use all 3 "local" IP addresses and gets an EINVAL error coming from this line: https://elixir.bootlin.com/linux/v6.9/source/net/mptcp/subflow.c#L1663 (I'm using a 6.9 kernel):

[server] # 
14:59:04.557572 -> 14:59:04.557585 TID/PID 837141/837141 (kworker/23:1/kworker/23:1):

                    ret_from_fork_asm+0x11                        (arch/x86/entry/entry_64.S:257)
                    ret_from_fork+0x2f                            (arch/x86/kernel/process.c:153)
                    kthread+0xae                                  (kernel/kthread.c:389)
                    worker_thread+0xc6                            (kernel/workqueue.c:3430)
                    . keep_working                                (kernel/workqueue.c:955)
                    . list_empty                                  (include/linux/list.h:373)
                    process_scheduled_works+0x184                 (kernel/workqueue.c:3348)
                    . process_one_work                            (kernel/workqueue.c:3272)
                    mptcp_worker+0x4d                             (net/mptcp/protocol.c:2745)
                    mptcp_pm_nl_work+0x1ed                        (net/mptcp/pm_netlink.c:874)
                    mptcp_pm_create_subflow_or_signal_addr+0x2b0  (net/mptcp/pm_netlink.c:600)
                    __mptcp_subflow_connect+0x55                  (net/mptcp/subflow.c:1537)
!    3us [-EINVAL]  mptcp_subflow_create_socket

Is it right to expect that the server should use different source IP addresses? Are there any obvious errors in my setup?

Any help/advice is highly appreciated!

What happened?

All subflows are established with the same server IP, even when I configure multiple subflow endpoints on the server (tried with and without fullmesh). The server also reports EINVAL errors when it attempts to use its local addresses.

What did you expect to have?

I expected the server to pick multiple addresses from the configured endpoints.

System info: Client

6.9

System info: Server

6.9

Additional context

I see this -EINVAL error from mptcp_subflow_create_socket for each local endpoint the server tries to use, and it still somehow falls back to a single IP address.
