Handling sequence embedding table-wise sharding onto subset of world size
Summary:
With table-wise sharding, there may not be enough tables to shard across all the ranks. In those cases, an embedding module may have no embeddings placed on some ranks. Table-wise sequence sharding using the usharding approach currently fails in this situation, because we modified the split boundary for usharding.
Handle empty ranks for those embedding modules by skipping those ranks while collecting the results from all the shards.
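
As a rough illustration of the idea only (not the code in this diff), the sketch below shows results being collected from just the ranks that host at least one table. The names `ranks_with_tables`, `collect_outputs`, and `placement` are hypothetical and do not come from the actual change; plain Python lists stand in for the per-rank tensors.

```python
from typing import Dict, List


def ranks_with_tables(placement: Dict[str, int], world_size: int) -> List[int]:
    """Return, in ascending order, the ranks that host at least one table.

    `placement` is a hypothetical table-name -> rank mapping.
    """
    used = set(placement.values())
    return [rank for rank in range(world_size) if rank in used]


def collect_outputs(
    per_rank_outputs: List[List[float]], active_ranks: List[int]
) -> List[float]:
    """Concatenate outputs from active ranks only, skipping empty ranks."""
    merged: List[float] = []
    for rank in active_ranks:
        merged.extend(per_rank_outputs[rank])
    return merged


# Two tables on a world size of 4: ranks 1 and 3 hold no embeddings.
placement = {"table_a": 0, "table_b": 2}
world_size = 4
active = ranks_with_tables(placement, world_size)

# Stand-in outputs; empty ranks contribute nothing.
outputs = [[1.0, 2.0], [], [3.0], []]
merged = collect_outputs(outputs, active)
```

Here the empty ranks are filtered out before the results are merged, so they never take part in the split bookkeeping.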
Differential Revision: D80360860
fbshipit-source-id: 50fd076b194e3426f1ecdcb6e0e8cc5e9ddab43c