Commit 1034f3a

dmwu authored and meta-codesync[bot] committed
Adjust nChannels for mi350 for single host use cases
Summary: MI350 uses fewer channels than MI300X. Single-node communicators that do not use all the xGMI links per GPU (i.e., nranks < 8) need more channels. For such cases, we increase nChannels to achieve higher bandwidth. nChannels stays the same for multi-node communicators and for single-node communicators with nranks = 8.

Update: Upstream PR merged: ROCm/rccl#2027

Reviewed By: haoyuz

Differential Revision: D86027843

fbshipit-source-id: 6d8e8024a013d63db251b9de4e69d5f0027914a1
1 parent e0a1f99 commit 1034f3a

1 file changed: +7 -1 lines changed

comms/rcclx/develop/src/init.cc

Lines changed: 7 additions & 1 deletion
@@ -1340,7 +1340,13 @@ static ncclResult_t initTransportsRank(struct ncclComm* comm, struct ncclComm* p
     }
   }
   if (IsArchMatch(comm->topo->nodes[GPU].nodes[idx].gpu.gcn, "gfx950")) {
-    allGather3Data[rank].nc = 4;
+    if (nranks == 2 && nNodes == 1){
+      allGather3Data[rank].nc = 16;
+    } else if (nranks == 4 && nNodes == 1){
+      allGather3Data[rank].nc = 8;
+    } else {
+      allGather3Data[rank].nc = 4;
+    }
   }
 
   allGather3Data[rank].pivotA2AEnabled = comm->topo->pivotA2AEnabled && rcclParamPivotAlltoallEnable();
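
The gfx950 branch in this diff can be read as a small pure function of the communicator size and node count. The sketch below is illustrative only: `gfx950ChannelCount` is a hypothetical helper name that does not exist in RCCL, and it assumes `nranks` and `nNodes` carry the same meaning as in `initTransportsRank`. Although the summary speaks of nranks < 8 in general, the diff raises the channel count only for exactly 2 and 4 ranks on a single node; every other size keeps the previous value of 4.

```cpp
// Hypothetical helper (not part of RCCL): mirrors the gfx950 branch in the diff.
//   nranks: number of ranks in the communicator
//   nNodes: number of hosts the communicator spans
static int gfx950ChannelCount(int nranks, int nNodes) {
  const bool singleNode = (nNodes == 1);
  if (singleNode && nranks == 2) {
    return 16;  // 2-rank single-node communicator: most channels
  }
  if (singleNode && nranks == 4) {
    return 8;   // 4-rank single-node communicator
  }
  // Multi-node, 8-rank single-node, and all other sizes keep the old value.
  return 4;
}
```

Under these assumptions, `gfx950ChannelCount(2, 1)` gives 16 and `gfx950ChannelCount(4, 1)` gives 8, while `gfx950ChannelCount(8, 1)` and `gfx950ChannelCount(4, 2)` both stay at 4. Keeping the rule in one function like this would make it easy to unit-test separately from topology detection.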
