1 file changed, +13 -1 lines changed
@@ -722,7 +722,19 @@ def finalize(self) -> tuple[list[int], int]:
 
 
 class NoShuffleBeamWriter:
-  """Shuffles / writes Examples beam collection to sharded files."""
+  """Writes examples to sharded files using Beam in a non-deterministic way.
+
+  The number of shards and the shard to which an example is written are
+  non-deterministic. This means that there may be shards with few examples
+  and other shards with many examples.
+
+  This writer class should only be used when the ordering of the examples is not
+  important, e.g., when a file format that supports random access is used.
+
+  Writing is faster than with the Writer class because it does not need
+  to shuffle the examples and make sure that the examples are written to the
+  correct shard.
+  """
 
   _OUTPUT_TAG_BUCKETS_LEN_SIZE = "tag_buckets_len_size"
 
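The trade-off described in the new docstring can be illustrated with a small standalone sketch. This is not how `NoShuffleBeamWriter` or `Writer` is implemented; the function names `deterministic_shard` and `shards_from_bundles` are hypothetical, and the "bundles" are plain Python lists standing in for whatever grouping a Beam runner happens to produce.

```python
import hashlib
from collections import Counter


def deterministic_shard(key: str, num_shards: int) -> int:
    """Maps a key to a fixed shard, so every run gives the same layout."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def shards_from_bundles(bundles: list[list[str]]) -> dict[str, int]:
    """Writes each bundle to its own shard, in whatever grouping arrives.

    The grouping is assumed to be chosen by the runner, so neither the
    number of shards nor their sizes are under the writer's control.
    """
    return {key: shard for shard, bundle in enumerate(bundles) for key in bundle}


keys = [f"example_{i}" for i in range(10)]

# Deterministic: the same key always lands in the same shard.
fixed = [deterministic_shard(k, 4) for k in keys]
assert fixed == [deterministic_shard(k, 4) for k in keys]

# Non-deterministic: shard sizes simply follow the bundle sizes (1, 2 and 7).
uneven = shards_from_bundles([keys[:1], keys[1:3], keys[3:]])
print(Counter(uneven.values()))
```

The second approach skips the work of routing every example to a predetermined shard, which is the source of the speed-up the docstring mentions, at the cost of uneven shards and no stable ordering.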