-
Notifications
You must be signed in to change notification settings - Fork 4k
sql/bulksst: coordinate distributed SST metadata merge with CombineFileInfo #157004
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql/bulksst: coordinate distributed SST metadata merge with CombineFileInfo #157004
Conversation
f3ed057 to
51fed3f
Compare
mw5h
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mw5h reviewed 1 of 4 files at r1, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @jeffswenson and @kev-cao)
pkg/sql/bulksst/combine_file_info.go line 17 at r1 (raw file):
) // CombineFileInfo combines SST file metadata and determines merge task spans based on key samples.
Perhaps we should describe what we're producing here and why? There's not really an in-code description of what the samples are, so it's not necessarily clear why we want to use them as break points for our merge spans.
…leInfo Implements CombineFileInfo(), a coordinator utility that aggregates SST metadata from distributed workers and determines merge task spans based on sampled keys. The function combines SST file metadata from multiple workers and uses their row samples to split schema spans into merge task spans. This will be used by the new distributed merge pipeline. Resolves: cockroachdb#156662 Epic: CRDB-48845 Release note: none Co-authored by: @jeffswenson
51fed3f to
1fc8f79
Compare
spilchen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @jeffswenson, @kev-cao, and @mw5h)
pkg/sql/bulksst/combine_file_info.go line 17 at r1 (raw file):
Previously, mw5h (Matt White) wrote…
Perhaps we should describe what we're producing here and why? There's not really an in-code description of what the samples are, so it's not necessarily clear why we want to use them as break points for our merge spans.
Good idea. I beefed up the function comment.
jeffswenson
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
@mw5h ready for another look |
mw5h
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
@mw5h reviewed 2 of 4 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @kev-cao)
|
TFTRs! bors r+ |
Implements CombineFileInfo(), a coordinator utility that aggregates SST metadata from distributed workers and determines merge task spans based on sampled keys.
The function combines SST file metadata from multiple workers and uses their row samples to split schema spans into merge task spans. This will be used by the new distributed merge pipeline.
Resolves: #156662
Epic: CRDB-48845
Release note: none
Co-authored by: @jeffswenson