Member

@max-charlamb max-charlamb commented Nov 6, 2025

See context for changes: #120303 (comment)

The markdown lint failure is unrelated and is fixed in #121421.

@dotnet-policy-service
Contributor

Tagging subscribers to this area: @steveisok, @dotnet/dotnet-diag
See info in area-owners.md if you want to be subscribed.

Contributor

Copilot AI left a comment

Pull Request Overview

This pull request introduces version 2 of the DebugInfo contract with a unified header format, while refactoring common code into a helper class. The new version replaces the flag byte approach of version 1 with a fat/slim chunk table for better extensibility.

  • Adds DebugInfo_2 implementing a new header format that encodes chunk sizes in nibble format
  • Extracts the bounds decoding logic into DebugInfoHelpers.DoBounds() for code reuse between versions
  • Updates documentation to describe the version 2 header encoding format
  • Removes PatchpointInfo data descriptors and related CDAC infrastructure

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/DebugInfo/DebugInfo_2.cs | Implements version 2 of the DebugInfo contract with unified fat/slim header format for chunk size encoding |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/DebugInfo/DebugInfo_1.cs | Refactored to use shared DebugInfoHelpers.DoBounds() method |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/DebugInfo/DebugInfoHelpers.cs | New helper class containing shared bounds decoding logic with parameterized IL offset bias |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/DebugInfo/DebugInfoFactory.cs | Registers version 2 implementation in the factory |
| src/coreclr/vm/datadescriptor/datadescriptor.inc | Removes PatchpointInfo type definition from CDAC data descriptors |
| src/coreclr/inc/patchpointinfo.h | Removes CDAC-related template specialization and friend declarations |
| docs/design/datacontracts/DebugInfo.md | Documents version 2 header encoding and removes incorrect CodeVersions contract reference from version 1 |

Comments suppressed due to low confidence (1)

src/coreclr/vm/datadescriptor/datadescriptor.inc:667

  • The removal of the PatchpointInfo type will break DebugInfo_1 which still depends on it. The DebugInfo_1.cs implementation (lines 51-55) uses DataType.PatchpointInfo to read patchpoint information when the EXTRA_DEBUG_INFO_PATCHPOINT flag is set. This type definition should not be removed unless DebugInfo_1 is also updated to not rely on it, or Version 1 is being completely removed.
CDAC_TYPE_BEGIN(CodeHeapListNode)
CDAC_TYPE_FIELD(CodeHeapListNode, /*pointer*/, Next, offsetof(HeapList, hpNext))
CDAC_TYPE_FIELD(CodeHeapListNode, /*pointer*/, StartAddress, offsetof(HeapList, startAddress))
CDAC_TYPE_FIELD(CodeHeapListNode, /*pointer*/, EndAddress, offsetof(HeapList, endAddress))
CDAC_TYPE_FIELD(CodeHeapListNode, /*pointer*/, MapBase, offsetof(HeapList, mapBase))
CDAC_TYPE_FIELD(CodeHeapListNode, /*pointer*/, HeaderMap, offsetof(HeapList, pHdrMap))
CDAC_TYPE_END(CodeHeapListNode)

if ((mappingDataEncoded & 0x4) != 0)
    sourceType |= SourceTypes.Async;

mappingDataEncoded >>= 2;
Member

Suggested change
- mappingDataEncoded >>= 2;
+ mappingDataEncoded >>= 3;

(Suggest introducing a constant BitsForSourceType)

Member Author

Fixed and added constant
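
The fix discussed in this thread can be sketched as follows. This is a minimal illustration, not the PR's code: `SketchSourceTypes`, `BoundsEntrySketch`, and `Split` are hypothetical names, the enum values are simply the encoded bit masks (0x1 for `CallInstruction` is an assumption), and what the remaining high bits mean is left open.

```csharp
using System;

// Hypothetical flags enum whose values equal the encoded bit masks.
[Flags]
public enum SketchSourceTypes : uint
{
    Default = 0x0,
    CallInstruction = 0x1, // assumed mask for bit 0
    StackEmpty = 0x2,
    Async = 0x4,           // the bit added in Version 2
}

public static class BoundsEntrySketch
{
    // One named constant drives both the mask and the shift,
    // so the two cannot drift apart when a bit is added.
    public const int BitsForSourceType = 3;
    private const uint SourceTypeMask = (1u << BitsForSourceType) - 1;

    // Splits an encoded value into its source type flags (low bits)
    // and whatever is stored above them.
    public static (SketchSourceTypes SourceType, uint Remainder) Split(uint mappingDataEncoded)
    {
        var sourceType = (SketchSourceTypes)(mappingDataEncoded & SourceTypeMask);
        uint remainder = mappingDataEncoded >> BitsForSourceType;
        return (sourceType, remainder);
    }
}
```

For the input `0b101100`, `Split` yields `Async` and a remainder of 5. Shifting by 2 instead would leave the `Async` bit in the remainder and produce 11.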

[Flags]
public enum SourceTypes : uint
{
    SourceTypeInvalid = 0x00, // To indicate that nothing else applies
Member

This is a pre-existing issue, but calling this constant SourceTypeInvalid doesn't sound right. I believe we do emit mappings that don't have any bits set. "Default" might be a better name.

| --- | --- | --- |
| IL_OFFSET_BIAS | IL offsets bias (unchanged from Version 1) | `0xfffffffd` (-3) |
| DEBUG_INFO_FAT | Marker value in first nibble-coded integer indicating a fat header follows | `0x0` |
| SOURCE_TYPE_BITS | Number of bits per bounds entry used for source type flags | 3 |
Member

@davidwr @jakobbotsch - I noticed you were chatting about the extra bit on the other PR. These four source types are all mutually exclusive, so they are encodable in only two bits if you'd like to update the runtime implementation. The interface defines them as a bit field, implying they could be combined, but they can't in practice. (Technically the 'Invalid' value and the 'StackEmpty' value are combinable, but it's fine to encode that combination as 'StackEmpty'.)

| 1 | 0x2 | `StackEmpty` |
| 2 | 0x4 | `Async` (new in Version 2) |

`SourceTypeInvalid` is represented by all three bits clear (0). Combinations are produced by OR-ing masks (e.g., `StackEmpty | CallInstruction`).
Member

Although the encoding allows this combination to be represented, I'd never expect to see it. The bit patterns I'd expect to see are 0, 1, 2, and 4. We can either change the runtime implementation to make the combinations unrepresentable (a four-value enumeration in two bits), or we could document it as-is.

Member

@jakobbotsch jakobbotsch Nov 8, 2025

STACK_EMPTY | CALL_INSTRUCTION is produced by the JIT today -- in debug codegen when a call also happens to be the stack empty position.
What mappings would you expect to see in that case? It is odd to me that the source types are flags in the first place if this is not an expected possibility.
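
To make the trade-off in this thread concrete, here is a small sketch of the two-bit enumeration idea. The packed values (0 to 3) and every name in it are hypothetical, and the 0x1 mask for `CallInstruction` is an assumption rather than something quoted from the documentation. It shows that a four-value enumeration has no slot for `StackEmpty | CallInstruction`, the combination reported above as produced in debug codegen.

```csharp
using System;

public static class SourceTypePackingSketch
{
    // Three-bit flag masks (0x1 is an assumed value).
    public const uint CallInstruction = 0x1;
    public const uint StackEmpty = 0x2;
    public const uint Async = 0x4;

    // Hypothetical two-bit enumeration:
    // 0 = no flags, 1 = CallInstruction, 2 = StackEmpty, 3 = Async.
    // Returns false when the flags cannot be expressed as a single value.
    public static bool TryPackAsTwoBitEnum(uint flags, out uint packed)
    {
        switch (flags)
        {
            case 0: packed = 0; return true;
            case CallInstruction: packed = 1; return true;
            case StackEmpty: packed = 2; return true;
            case Async: packed = 3; return true;
            default: packed = 0; return false; // combined flags do not fit
        }
    }
}
```

With this scheme, `TryPackAsTwoBitEnum(StackEmpty | CallInstruction, out _)` returns false, so a two-bit encoding would need either a rule for collapsing that combination or a guarantee that it is never emitted.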


namespace Microsoft.Diagnostics.DataContractReader.Contracts;

internal sealed class DebugInfo_2(Target target) : IDebugInfo
Member

Rather than defining a new implementation that replicates most of the code from V1, what if we modify the existing DebugInfo_1 into DebugInfo_1_To_2(Target target, int version) and give it a little bit of conditional behavior in the right places? While I don't think this example is bad on its own, I'm hoping we can avoid having cDAC grow into something that has lots of duplicated code as contracts keep versioning. We always have the option to create a new type if we need it, but I hope we can reserve that for situations where the new version is substantially different.
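
A rough sketch of what "a little bit of conditional behavior" might look like for the bounds entries, assuming the only difference that matters here is the number of source type bits. The class name, its members, and the value 2 for Version 1 are assumptions for illustration; 3 is the `SOURCE_TYPE_BITS` value from the Version 2 documentation.

```csharp
using System;

// Hypothetical stand-in for a single implementation shared by both versions.
public sealed class VersionedBoundsDecoderSketch
{
    private readonly int _sourceTypeBits;

    public VersionedBoundsDecoderSketch(int version)
    {
        // The version only selects a parameter; the decoding path is shared.
        _sourceTypeBits = version switch
        {
            1 => 2, // assumption: Version 1 has no Async bit
            2 => 3, // SOURCE_TYPE_BITS in the Version 2 documentation
            _ => throw new ArgumentOutOfRangeException(nameof(version)),
        };
    }

    // Low bits of the encoded value that carry the source type flags.
    public uint SourceTypeBits(uint mappingDataEncoded) =>
        mappingDataEncoded & ((1u << _sourceTypeBits) - 1);

    // Bits stored above the source type flags.
    public uint Remainder(uint mappingDataEncoded) =>
        mappingDataEncoded >> _sourceTypeBits;
}
```

A separate `DebugInfo_2` type would still be an option if a later version diverged by more than a parameter like this.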
