objstorageprovider: add dual-write for cold metadata #5612
Conversation
sumeerbhola left a comment:
@sumeerbhola reviewed 2 of 2 files at r1, 11 of 11 files at r2, 12 of 13 files at r3.
Reviewable status: 24 of 25 files reviewed, 7 unresolved discussions (waiting on @jbowens and @RaduBerinde)
objstorage/objstorageprovider/cold_readable.go line 87 at r3 (raw file):
```go
// NewReadHandle is part of the objstorage.Readable interface.
func (r *coldReadable) NewReadHandle(
	readBeforeSize objstorage.ReadBeforeSize,
```
This readBeforeSize was for reading metadata for files stored in object storage. So there is some relation to the fact that for these cold files the metadata is already in a local hot tier. A code comment here would be helpful. Perhaps something like:
```go
// The readBeforeSize was intended to optimize reading the metadata suffix of a
// file, for cases where small reads are expensive. In our case, the metadata is
// specifically in a hot tier file, for which small reads are cheap.
//
// We continue to pass on the readBeforeSize to the cold file, since the
// readBeforeSize is ignored by callees except when the file is in remote
// storage. Currently cold files are in local storage. We may need to revisit
// this when cold files are in remote storage.
```
objstorage/objstorageprovider/cold_writable.go line 82 at r3 (raw file):
```go
		return w.err
	}
}
```
It's a bit odd that we don't necessarily call flushMeta in this coldWritable.Write call and accumulate in this buf explicitly. I suppose this is explicitly trying to account for the fact that any normal user of objstorage.Writable has wrapped it in a fileBufferedWritable, so we are doing the same for these metadata writes.
The decision to do this buffering without being asked to is fine, but can we use a bufio.Writer instead of rolling our own logic?
internal/base/filenames.go line 136 at r3 (raw file):
```go
	// <offset> indicates that the file mirrors the contents of the corresponding
	// blob file starting at this offset.
	FileTypeBlobMeta
```
is the lifetime of these files fully hidden inside the provider?
objstorage/objstorage.go line 112 at r3 (raw file):
```go
	//
	// If Finish fails, it is expected that the caller will delete the created
	// object. If the process crashes during Finish, it is expected that the file
```
"the caller will delete the created object". It isn't clear to me how, given the Writable interface itself does not have a delete method.
I see two paths that have to handle errors on Finish: in layout.go (during sstable writing) and blob.FileWriter.Close (during blob file writing). The latter calls Abort when Finish returns an error (and fileBufferedWritable.Abort doesn't delete anything), and the former just returns it upstream (I don't see anything that is necessarily cleaning up files).
Would be good to make this clearer. Should the caller call Provider.Remove?
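The cleanup contract being asked about could be made explicit with a pattern like the following. This is a hedged sketch with invented stand-in types (`writable`, `provider`, `writeObject`), not the actual objstorage interfaces:

```go
package main

import (
	"errors"
	"fmt"
)

// writable is a stand-in for objstorage.Writable; invented for illustration.
type writable struct{ failFinish bool }

func (w *writable) Finish() error {
	if w.failFinish {
		return errors.New("finish failed")
	}
	return nil
}

// provider is a stand-in with the Remove method the review mentions.
type provider struct{ removed []string }

func (p *provider) Remove(name string) { p.removed = append(p.removed, name) }

// writeObject shows the contract the review suggests documenting: if
// Finish fails, the caller removes the partially created object.
func writeObject(p *provider, w *writable, name string) error {
	if err := w.Finish(); err != nil {
		p.Remove(name) // caller cleans up the created object
		return err
	}
	return nil
}

func main() {
	p := &provider{}
	err := writeObject(p, &writable{failFinish: true}, "000001.blob")
	fmt.Println(err, p.removed)
}
```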
objstorage/objstorage.go line 123 at r3 (raw file):
```go
	//
	// An error means that we won't be able to successfully finish this object.
	// - Any constraints on when this can be called relative to Write()
```
Is the "Any constraints ..." a forgotten comment?
objstorage/objstorageprovider/vfs.go line 135 at r3 (raw file):
```go
		return nil, err
	}
	r, err := newFileReadable(file, p.st.Local.FS, p.st.Local.ReadaheadConfig, filename)
```
why is this using p.st.Local.FS as a parameter, given this could be a cold tier file?
objstorage/objstorageprovider/vfs.go line 142 at r3 (raw file):
```go
	if startOffset, ok := p.getColdObjectMetaFile(fileType, fileNum); ok {
		metaPath := p.metaPath(fileType, fileNum, startOffset)
		return newColdReadable(r, p.st.Local.FS, metaPath, startOffset), nil
```
should this be called newColdReadableWithHotMeta since if there isn't a hot metadata file it will not do this wrapping?
go.mod: update datadriven
objstorageprovider: clean up datadriven tests
Change TestProvider to use key=value format for arguments, making the
test easier to read and simplifying the test code.
objstorageprovider: add dual-write for cold metadata
This commit adds logic to write the metadata portion of cold blob
files on both tiers, and to transparently read the metadata from the
hot tier.
We assume that the metadata is a suffix of the file; the metadata
filenames encode the start offset, for example "000001.blobmeta.1234"
means that the file contains the data from "000001.blob" starting at
offset 1234.
We add a `Writable` implementation which writes the metadata portion
of a file to both cold and hot storage and a `Readable` implementation
which reads data from a cold tier Readable but uses a `vfs.File` for
the metadata.
The provider keeps track internally of which hot metadata files
exist and their start offsets.
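The offset-in-filename scheme described above can be sketched as follows; `parseBlobMetaOffset` is a hypothetical helper for illustration, not the PR's actual parsing code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseBlobMetaOffset extracts the start offset encoded as the suffix of a
// metadata filename such as "000001.blobmeta.1234", which mirrors the
// contents of the corresponding blob file starting at offset 1234.
// Hypothetical helper, invented for illustration.
func parseBlobMetaOffset(name string) (uint64, error) {
	i := strings.LastIndexByte(name, '.')
	if i < 0 || !strings.HasSuffix(name[:i], ".blobmeta") {
		return 0, fmt.Errorf("not a blobmeta filename: %q", name)
	}
	return strconv.ParseUint(name[i+1:], 10, 64)
}

func main() {
	off, err := parseBlobMetaOffset("000001.blobmeta.1234")
	fmt.Println(off, err) // 1234 <nil>
}
```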