
Conversation

@RaduBerinde

go.mod: update datadriven

objstorageprovider: clean up datadriven tests

Change TestProvider to use key=value format for arguments, making the
test easier to read and simplifying the test code.
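
For illustration, a handler consuming key=value arguments with cockroachdb/datadriven might look like the sketch below. The command and key names are hypothetical, not taken from the actual TestProvider:

```go
package example

import (
	"fmt"
	"testing"

	"github.com/cockroachdb/datadriven"
)

func TestProviderArgs(t *testing.T) {
	datadriven.RunTest(t, "testdata/provider", func(t *testing.T, d *datadriven.TestData) string {
		switch d.Cmd {
		case "create":
			// key=value arguments, e.g. a test case line "create size=1024 cold".
			var size int
			d.ScanArgs(t, "size", &size)
			cold := d.HasArg("cold")
			return fmt.Sprintf("created size=%d cold=%t", size, cold)
		default:
			d.Fatalf(t, "unknown command %q", d.Cmd)
			return ""
		}
	})
}
```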

objstorageprovider: add dual-write for cold metadata

This commit adds logic to write the metadata portion of cold blob
files on both tiers, and to transparently read the metadata from the
hot tier.

We assume that the metadata is a suffix of the file; the metadata
filenames encode the start offset, for example "000001.blobmeta.1234"
means that the file contains the data from "000001.blob" starting at
offset 1234.
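
For illustration only, a parser for this naming scheme could look like the following sketch (parseBlobMetaOffset is a hypothetical helper, not code from this PR):

```go
package example

import (
	"strconv"
	"strings"
)

// parseBlobMetaOffset illustrates the naming scheme above:
// "000001.blobmeta.1234" mirrors "000001.blob" starting at offset 1234.
func parseBlobMetaOffset(filename string) (offset uint64, ok bool) {
	i := strings.LastIndexByte(filename, '.')
	if i < 0 || !strings.HasSuffix(filename[:i], ".blobmeta") {
		return 0, false
	}
	n, err := strconv.ParseUint(filename[i+1:], 10, 64)
	if err != nil {
		return 0, false
	}
	return n, true
}
```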

We add a Writable implementation which writes the metadata portion
of a file to both cold and hot storage and a Readable implementation
which reads data from a cold tier Readable but uses a vfs.File for
the metadata.
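
A minimal sketch of the dual-write idea, assuming a simplified writer shape (this is not the PR's actual Writable implementation):

```go
package example

import "io"

// dualWriter is illustrative only: every Write goes to the cold tier,
// and any bytes at or beyond metaOffset (the start of the metadata
// suffix) are also mirrored to a hot-tier file.
type dualWriter struct {
	cold       io.Writer // cold-tier object
	hot        io.Writer // hot-tier metadata mirror
	off        uint64    // total bytes written so far
	metaOffset uint64    // offset where the metadata suffix begins
}

func (w *dualWriter) Write(p []byte) error {
	if _, err := w.cold.Write(p); err != nil {
		return err
	}
	end := w.off + uint64(len(p))
	if end > w.metaOffset {
		// Mirror only the part of p that falls inside the metadata suffix.
		var start uint64
		if w.off < w.metaOffset {
			start = w.metaOffset - w.off
		}
		if _, err := w.hot.Write(p[start:]); err != nil {
			return err
		}
	}
	w.off = end
	return nil
}
```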

The provider keeps track internally of which hot metadata files
exist and their start offsets.
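
That bookkeeping could be as simple as the following hypothetical sketch:

```go
package example

import "sync"

// hotMetaState sketches the provider-side bookkeeping described above:
// which blob files have a hot-tier metadata mirror, and at which offset
// that mirror starts. Names are made up for illustration.
type hotMetaState struct {
	mu sync.Mutex
	// startOffset maps a blob file number to the offset encoded in the
	// "<fileNum>.blobmeta.<offset>" filename.
	startOffset map[uint64]uint64
}

func (s *hotMetaState) lookup(fileNum uint64) (offset uint64, ok bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	offset, ok = s.startOffset[fileNum]
	return offset, ok
}
```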

@RaduBerinde requested a review from a team as a code owner on November 24, 2025 20:15

@RaduBerinde force-pushed the cold-meta-on-hot branch 2 times, most recently from a52a2d6 to 6ea2777 on November 25, 2025 01:37
@github-actions

Potential Bug(s) Detected

The three-stage Claude Code analysis has identified potential bug(s) in this PR that may warrant investigation.

Next Steps:
Please review the detailed findings in the workflow run.

Note: When viewing the workflow output, scroll to the bottom to find the Final Analysis Summary.

After you review the findings, please tag the issue as follows:

  • If the detected issue is real or was helpful in any way, please tag the issue with O-AI-Review-Real-Issue-Found
  • If the detected issue was not helpful in any way, please tag the issue with O-AI-Review-Not-Helpful

@sumeerbhola left a comment

@sumeerbhola reviewed 2 of 2 files at r1, 11 of 11 files at r2, 12 of 13 files at r3.
Reviewable status: 24 of 25 files reviewed, 7 unresolved discussions (waiting on @jbowens and @RaduBerinde)


objstorage/objstorageprovider/cold_readable.go line 87 at r3 (raw file):

// NewReadHandle is part of the objstorage.Readable interface.
func (r *coldReadable) NewReadHandle(
	readBeforeSize objstorage.ReadBeforeSize,

This readBeforeSize was introduced for reading the metadata of files stored in object storage, so it relates to the fact that for these cold files the metadata is already in a local hot tier. A code comment here would be helpful. Perhaps something like:

// The readBeforeSize was designed to optimize reading the metadata suffix of a
// file, for cases where small reads are expensive. In our case, the metadata is
// specifically in a hot tier file, for which small reads are cheap.
//
// We continue to pass the readBeforeSize on to the cold file, since it is
// ignored by callees except when the file is in remote storage. Currently cold
// files are in local storage; we may need to revisit this when cold files are
// in remote storage.


objstorage/objstorageprovider/cold_writable.go line 82 at r3 (raw file):

					return w.err
				}
			}

It's a bit odd that this coldWritable.Write call doesn't necessarily call flushMeta and instead accumulates in this buf explicitly. I suppose this is deliberately accounting for the fact that any normal user of objstorage.Writable has wrapped it in a fileBufferedWritable, so we are doing the same for these metadata writes.
The decision to do this buffering without being asked to is fine, but can we use a bufio.Writer instead of rolling our own logic?
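
For concreteness, the suggestion might look roughly like this sketch (names are hypothetical; this is not the PR's coldWritable):

```go
package example

import (
	"bufio"
	"io"
)

// metaWriter wraps the hot-tier metadata file in a bufio.Writer instead
// of accumulating into a hand-rolled buf, flushing on Finish.
type metaWriter struct {
	bw *bufio.Writer
}

func newMetaWriter(hot io.Writer) *metaWriter {
	return &metaWriter{bw: bufio.NewWriterSize(hot, 64<<10)}
}

// Write buffers the metadata bytes; bufio flushes to the underlying
// file automatically whenever the buffer fills.
func (w *metaWriter) Write(p []byte) error {
	_, err := w.bw.Write(p)
	return err
}

// Finish flushes any remaining buffered bytes to the hot-tier file.
func (w *metaWriter) Finish() error {
	return w.bw.Flush()
}
```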


internal/base/filenames.go line 136 at r3 (raw file):

	// <offset> indicates that the file mirrors the contents of the corresponding
	// blob file starting at this offset.
	FileTypeBlobMeta

Is the lifetime of these files fully hidden inside the provider?


objstorage/objstorage.go line 112 at r3 (raw file):

	//
	// If Finish fails, it is expected that the caller will delete the created
	// object. If the process crashes during Finish, it is expected that the file

"the caller will delete the created object". It isn't clear to me how, given the Writable interface itself does not have a delete method.
I see two paths that have to handle errors on Finish: in layout.go (during sstable writing) and blob.FileWriter.Close (during blob file writing). The latter calls Abort when Finish returns an error (and fileBufferedWritable.Abort doesn't delete anything), and the former just returns it upstream (I don't see anything that is necessarily cleaning up files).

It would be good to make this clearer. Should the caller call Provider.Remove?
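
If so, the caller-side contract might look roughly like this sketch (whether Provider.Remove is the intended cleanup path is exactly the question above; this is illustrative, not actual Pebble code):

```go
// finishOrRemove sketches a possible contract: on Finish failure, the
// caller removes the partially created object via the provider.
func finishOrRemove(
	w objstorage.Writable,
	provider objstorage.Provider,
	fileType base.FileType,
	fileNum base.DiskFileNum,
) error {
	if err := w.Finish(); err != nil {
		// Best-effort removal; the Finish error takes precedence.
		_ = provider.Remove(fileType, fileNum)
		return err
	}
	return nil
}
```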


objstorage/objstorage.go line 123 at r3 (raw file):

	//
	// An error means that we won't be able to successfully finish this object.
	// - Any constraints on when this can be called relative to Write()

Is the "Any constraints ..." a forgotten comment?


objstorage/objstorageprovider/vfs.go line 135 at r3 (raw file):

		return nil, err
	}
	r, err := newFileReadable(file, p.st.Local.FS, p.st.Local.ReadaheadConfig, filename)

Why is this using p.st.Local.FS as a parameter, given this could be a cold tier file?


objstorage/objstorageprovider/vfs.go line 142 at r3 (raw file):

		if startOffset, ok := p.getColdObjectMetaFile(fileType, fileNum); ok {
			metaPath := p.metaPath(fileType, fileNum, startOffset)
			return newColdReadable(r, p.st.Local.FS, metaPath, startOffset), nil

Should this be called newColdReadableWithHotMeta, since if there isn't a hot metadata file it will not do this wrapping?
