Commit 4eedf85
Reduce the overhead of UCX add_procs with intercommunicators
* When creating a large number of intercommunicators with `MPI_Intercomm_create`,
the UCX PML `add_procs` routine is called for each "new" process. Each call
invokes `ucp_ep_create` and overwrites the old endpoint at the PML level
if there was already one in place. However, it adds the new endpoint to the UCX
instance below without removing the old endpoint, so a large number of
endpoints accumulate on the UCX worker. Creating the endpoints
has overhead, which contributes to the slowdown of the `MPI_Intercomm_create`
function.
* On Finalize, these endpoints are cleaned up in the `ucp_worker_destroy`
function. Since there are a significant number of endpoints, it takes quite a
while to clean up.
* In this patch, we first check whether an endpoint has already been created
for this process. If so, we skip adding it again. Otherwise, we create a
new endpoint.
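
The check described above can be sketched roughly as follows. This is a minimal illustration rather than the actual patch: it does not link against UCX, and every name in it (`mock_ep_t`, `mock_worker_t`, `mock_proc_t`, `mock_ep_create`, `mock_add_procs`) is a hypothetical stand-in. It assumes the PML keeps one cached endpoint pointer per process and that a `NULL` pointer means no endpoint exists yet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a UCX endpoint handle. */
typedef struct mock_ep {
    int peer_rank;
} mock_ep_t;

/* Hypothetical stand-in for a UCX worker: it only counts the endpoints
 * that would have to be torn down when the worker is destroyed. */
typedef struct mock_worker {
    int live_endpoints;
} mock_worker_t;

/* Hypothetical per-process record with the PML-level cached endpoint. */
typedef struct mock_proc {
    int rank;
    mock_ep_t *ep; /* NULL until an endpoint has been created */
} mock_proc_t;

/* Stand-in for endpoint creation: allocates an endpoint and registers
 * it with the worker. Returns NULL on allocation failure. */
static mock_ep_t *mock_ep_create(mock_worker_t *worker, int peer_rank)
{
    mock_ep_t *ep = malloc(sizeof(*ep));
    if (ep == NULL) {
        return NULL;
    }
    ep->peer_rank = peer_rank;
    worker->live_endpoints++;
    return ep;
}

/* Creates endpoints only for processes that do not have one yet.
 * Returns the number of endpoints created, or -1 on failure. */
static int mock_add_procs(mock_worker_t *worker, mock_proc_t *procs,
                          size_t nprocs)
{
    int created = 0;

    assert(worker != NULL && procs != NULL);

    for (size_t i = 0; i < nprocs; i++) {
        if (procs[i].ep != NULL) {
            /* Endpoint already in place: skip instead of creating a
             * duplicate that would pile up on the worker. */
            continue;
        }
        mock_ep_t *ep = mock_ep_create(worker, procs[i].rank);
        if (ep == NULL) {
            return -1;
        }
        procs[i].ep = ep;
        created++;
    }
    return created;
}
```

In this sketch, calling `mock_add_procs` a second time on the same process array creates nothing and leaves `live_endpoints` unchanged, which is the behavior the patch aims for when `add_procs` is invoked repeatedly for the same processes.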
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>

1 parent fb66617 commit 4eedf85
1 file changed, +5 -0 lines changed. The five added lines are new lines 413-417, inserted after original line 412.