Commit 23fceb4

Add NCCL_NCHANNELS_PER_NET_PEER to NCCL documentation (#295)
1 parent 1774e6b commit 23fceb4

2 files changed: +11 -0 lines changed

.github/actions/spelling/allow.txt

Lines changed: 1 addition & 0 deletions
@@ -273,6 +273,7 @@ pytorch
 quantumespresso
 quasiparticles
 quickstart
+recv
 rgw
 ripgrep
 rocm

docs/software/communication/nccl.md

Lines changed: 10 additions & 0 deletions
@@ -22,6 +22,16 @@ While the container engine sets these automatically when using the NCCL hook, th
 
 [_Demystifying NCCL: An In-depth Analysis of GPU Communication Protocols and Algorithms_](https://arxiv.org/abs/2507.04786v2) contains detailed information about NCCL algorithms and protocols, which can be helpful for deciding if your application could benefit from an alternative configuration.
 
+In addition to the above variables, setting `NCCL_NCHANNELS_PER_NET_PEER` can improve point-to-point performance (operations based directly on send/recv):
+
+```bash
+export NCCL_NCHANNELS_PER_NET_PEER=4
+```
+
+A value of 4 is generally a good compromise to improve point-to-point performance without affecting collectives performance.
+Setting it to a higher value such as 16 or 32 can still further improve send/recv performance, but may degrade collectives performance, so the optimal value depends on the mix of operations used in an application.
+The option is undocumented, but [this issue](https://github.com/NVIDIA/nccl/issues/1272) and the paper linked above contain additional details.
+
 !!! warning "NCCL watchdog timeout or hanging process"
     In some cases, still under investigation, NCCL may hang resulting in a stuck process or a watchdog timeout error.
     In this scenario, we recommend disabling Slingshot eager messages with the following workaround:
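
Note on the added paragraph: because the best value of `NCCL_NCHANNELS_PER_NET_PEER` depends on the application's mix of point-to-point and collective operations, one way to choose it is to run the same workload once per candidate value and compare timings. The following is only a minimal sketch of that idea, not part of the commit. The helper name `sweep_nchannels` is hypothetical, and the candidate values 4, 16 and 32 are simply the ones mentioned in the text above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run one command once per candidate value of
# NCCL_NCHANNELS_PER_NET_PEER so the timings can be compared afterwards.
# Usage: sweep_nchannels <command> [args...]
sweep_nchannels() {
    local value
    # Candidate values taken from the documentation text (4, 16, 32).
    for value in 4 16 32; do
        echo "NCCL_NCHANNELS_PER_NET_PEER=${value}"
        # Set the variable only for this invocation; the calling shell's
        # environment is left unchanged.
        NCCL_NCHANNELS_PER_NET_PEER="${value}" "$@"
    done
}
```

It could be invoked as, for example, `sweep_nchannels srun ./my_benchmark`, where `./my_benchmark` stands in for whatever workload is being tuned and is assumed to report its own timing.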
