@@ -2461,44 +2461,18 @@ func shouldWaitForPendingMerge(
24612461 return & kvpb.MergeInProgressError {}
24622462}
24632463
2464- // isNewerThanSplit is a helper used in split(Pre|Post)Apply to
2465- // determine whether the Replica on the right hand side of the split must
2466- // have been removed from this store after the split .
2464+ // isNewerThanSplit is a helper used in split(Pre|Post)Apply to determine
2465+ // whether the RHS replica of the split must have been removed from this store
2466+ // after the split, and the given Replica has a higher ID .
24672467//
2468- // TODO(tbg): the below is true as of 22.2: we persist any Replica's ReplicaID
2469- // under RaftReplicaIDKey, so the below caveats should be addressed now.
2470- //
2471- // TODO(ajwerner): There is one false negative where false will be returned but
2472- // the hard state may be due to a newer replica which is outlined below. It
2473- // should be safe.
2474- // Ideally if this store had ever learned that the replica created by the split
2475- // were removed it would not forget that fact. There exists one edge case where
2476- // the store may learn that it should house a replica of the same range with a
2477- // higher replica ID and then forget. If the first raft message this store ever
2478- // receives for the this range contains a replica ID higher than the replica ID
2479- // in the split trigger then an in-memory replica at that higher replica ID will
2480- // be created and no tombstone at a lower replica ID will be written. If the
2481- // server then crashes it will forget that it had ever been the higher replica
2482- // ID. The server may then proceed to process the split and initialize a replica
2483- // at the replica ID implied by the split. This is potentially problematic as
2484- // the replica may have voted as this higher replica ID and when it rediscovers
2485- // the higher replica ID it will delete all of the state corresponding to the
2486- // older replica ID including its hard state which may have been synthesized
2487- // with votes as the newer replica ID. This case tends to be handled safely in
2488- // practice because the replica should only be receiving messages as the newer
2489- // replica ID after it has been added to the range as a learner.
2490- //
2491- // Despite the safety due to the change replicas protocol explained above it'd
2492- // be good to know for sure that a replica ID for a range on a store is always
2493- // monotonically increasing, even across restarts.
2468+ // NB: from v22.2, we persist any Replica's ID under RaftReplicaIDKey. A
2469+ // complementary mechanism, RangeTombstone, ensures that replica deletions are
2470+ // persistent as well. As a result, the ReplicaID existence and non-existence is
2471+ // monotonic and survives restarts.
24942472//
24952473// See TestProcessSplitAfterRightHandSideHasBeenRemoved.
24962474func (r * Replica ) isNewerThanSplit (split * roachpb.SplitTrigger ) bool {
24972475 rightDesc , _ := split .RightDesc .GetReplicaDescriptor (r .StoreID ())
2498- // If the first raft message we received for the RHS range was for a replica
2499- // ID which is above the replica ID of the split then we would not have
2500- // written a tombstone but we will have a replica ID that will exceed the
2501- // split replica ID.
25022476 return r .replicaID > rightDesc .ReplicaID
25032477}
25042478
0 commit comments