Skip to content

Commit 7048e74

Browse files
pks-tgitster
authored andcommitted
object: fix performance regression when peeling tags
Our Bencher dashboards [1] have recently alerted us about a bunch of performance regressions when writing references, specifically with the reftable backend. There is a 3x regression when writing many refs with preexisting refs in the reftable format, and a 10x regression when migrating refs between backends in either of the formats. Bisecting the issue lands us at 6ec4c0b (refs: don't store peeled object IDs for invalid tags, 2025-10-23). The gist of the commit is that we may end up storing peeled objects in both reftables and packed-refs for corrupted tags, where the claimed tagged object type is different than the actual tagged object type. This will then cause us to create the `struct object *` with a wrong type, as well, and obviously nothing good comes out of that. The fix for this issue was to introduce a new flag to `peel_object()` that causes us to verify the tagged object's type before writing it into the refdb -- if the tag is corrupt, we skip writing the peeled value. To verify whether the peeled value is correct we have to look up the object type via the ODB and compare the actual type with the claimed type, and that additional object lookup is costly. This also explains why we see the regression only when writing refs with the reftable backend, but we see the regression with both backends when migrating refs: - The reftable backend knows to store peeled values in the new table immediately, so it has to try and peel each ref it's about to write to the transaction. So the performance regression is visible for all writes. - The files backend only stores peeled values when writing the packed-refs file, so it wouldn't hit the performance regression for normal writes. But on ref migrations we know to write all new values into the packed-refs file immediately, and that's why we see the regression for both backends there. Taking a step back though reveals an oddity in the new verification logic: we not only verify the _tagged_ object's type, but we also verify the type of the tag itself. But this isn't really needed, as we wouldn't hit the bug in such a case anyway, as we only hit the issue with corrupt tags claiming an invalid type for the tagged object. The consequence of this is that we now started to look up the target object of every single reference we're about to write, regardless of whether it even is a tag or not. And that is of course quite costly. Fix the issue by only verifying the type of the tagged objects. This means that we of course still have a performance hit for actual tags. But this only happens for writes anyway, and I'd claim it's preferable to not store corrupted data in the refdb than to be fast here. Rename the flag accordingly to clarify that we only verify the tagged object's type. This fix brings performance back to previous levels: Benchmark 1: baseline Time (mean ± σ): 46.0 ms ± 0.4 ms [User: 40.0 ms, System: 5.7 ms] Range (min … max): 45.0 ms … 47.1 ms 54 runs Benchmark 2: regression Time (mean ± σ): 140.2 ms ± 1.3 ms [User: 77.5 ms, System: 60.5 ms] Range (min … max): 138.0 ms … 142.7 ms 20 runs Benchmark 3: fix Time (mean ± σ): 46.2 ms ± 0.4 ms [User: 40.2 ms, System: 5.7 ms] Range (min … max): 45.0 ms … 47.3 ms 55 runs Summary update-ref: baseline 1.00 ± 0.01 times faster than fix 3.05 ± 0.04 times faster than regression [1]: https://bencher.dev/perf/git/plots Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
1 parent 994869e commit 7048e74

File tree

5 files changed

+11
-11
lines changed

5 files changed

+11
-11
lines changed

object.c

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -214,7 +214,7 @@ enum peel_status peel_object(struct repository *r,
214214
{
215215
struct object *o = lookup_unknown_object(r, name);
216216

217-
if (o->type == OBJ_NONE || flags & PEEL_OBJECT_VERIFY_OBJECT_TYPE) {
217+
if (o->type == OBJ_NONE) {
218218
int type = odb_read_object_info(r->objects, name, NULL);
219219
if (type < 0 || !object_as_type(o, type, 0))
220220
return PEEL_INVALID;
@@ -228,7 +228,7 @@ enum peel_status peel_object(struct repository *r,
228228
if (o && o->type == OBJ_TAG && ((struct tag *)o)->tagged) {
229229
o = ((struct tag *)o)->tagged;
230230

231-
if (flags & PEEL_OBJECT_VERIFY_OBJECT_TYPE) {
231+
if (flags & PEEL_OBJECT_VERIFY_TAGGED_OBJECT_TYPE) {
232232
int type = odb_read_object_info(r->objects, &o->oid, NULL);
233233
if (type < 0 || !object_as_type(o, type, 0))
234234
return PEEL_INVALID;

object.h

Lines changed: 6 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -289,13 +289,13 @@ enum peel_status {
289289

290290
enum peel_object_flags {
291291
/*
292-
* Always verify the object type, even in the case where the looked-up
293-
* object already has an object type. This can be useful when the
294-
* stored object type may be invalid. One such case is when looking up
295-
* objects via tags, where we blindly trust the object type declared by
296-
* the tag.
292+
* Always verify the object type of the tagged object, even in the case
293+
* where the looked-up object already has an object type. This can be
294+
* useful when the tagged object type may be invalid. One such case is
295+
* when looking up objects via tags, where we blindly trust the object
296+
* type declared by the tag.
297297
*/
298-
PEEL_OBJECT_VERIFY_OBJECT_TYPE = (1 << 0),
298+
PEEL_OBJECT_VERIFY_TAGGED_OBJECT_TYPE = (1 << 0),
299299
};
300300

301301
/*

ref-filter.c

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2654,7 +2654,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
26542654
if (!is_null_oid(&ref->peeled_oid)) {
26552655
oidcpy(&oi_deref.oid, &ref->peeled_oid);
26562656
} else if (!peel_object(the_repository, &oi.oid, &oi_deref.oid,
2657-
PEEL_OBJECT_VERIFY_OBJECT_TYPE)) {
2657+
PEEL_OBJECT_VERIFY_TAGGED_OBJECT_TYPE)) {
26582658
/* We managed to peel the object ourselves. */
26592659
} else {
26602660
die("bad tag");

refs/packed-backend.c

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1528,7 +1528,7 @@ static enum ref_transaction_error write_with_updates(struct packed_ref_store *re
15281528
} else {
15291529
struct object_id peeled;
15301530
int peel_error = peel_object(refs->base.repo, &update->new_oid,
1531-
&peeled, PEEL_OBJECT_VERIFY_OBJECT_TYPE);
1531+
&peeled, PEEL_OBJECT_VERIFY_TAGGED_OBJECT_TYPE);
15321532

15331533
if (write_packed_entry(out, update->refname,
15341534
&update->new_oid,

refs/reftable-backend.c

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1633,7 +1633,7 @@ static int write_transaction_table(struct reftable_writer *writer, void *cb_data
16331633
ref.update_index = ts;
16341634

16351635
peel_error = peel_object(arg->refs->base.repo, &u->new_oid, &peeled,
1636-
PEEL_OBJECT_VERIFY_OBJECT_TYPE);
1636+
PEEL_OBJECT_VERIFY_TAGGED_OBJECT_TYPE);
16371637
if (!peel_error) {
16381638
ref.value_type = REFTABLE_REF_VAL2;
16391639
memcpy(ref.value.val2.target_value, peeled.hash, GIT_MAX_RAWSZ);

0 commit comments

Comments
 (0)