draft: feat(backend/sdoc_source_code): Support merge by MID #2549

haxtibal · 2025-11-07T13:59:41Z

Until now, finding a static merge candidate for source nodes relied on having a UID field set on the source code side. Now we can also use MIDs for this purpose. Merge by MID takes priority over merge by UID.

This requires thinking about several edge cases. See test descriptions for some of them.
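The priority rule can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in this PR: `Node`, `find_merge_candidate`, and the two lookup dictionaries are hypothetical names, and falling back to the UID when the MID is not found is an assumed behavior.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Node:
    # Hypothetical stand-in for a node; not a StrictDoc class.
    uid: Optional[str] = None
    mid: Optional[str] = None


def find_merge_candidate(
    source_node: Node,
    nodes_by_mid: Dict[str, Node],
    nodes_by_uid: Dict[str, Node],
) -> Optional[Node]:
    """Return the SDoc node that a source node should merge into, if any."""
    # A MID match takes priority over a UID match.
    if source_node.mid is not None and source_node.mid in nodes_by_mid:
        return nodes_by_mid[source_node.mid]
    # Assumed fallback: try the UID when no MID match exists.
    if source_node.uid is not None and source_node.uid in nodes_by_uid:
        return nodes_by_uid[source_node.uid]
    return None
```

A source node carrying both fields that point at different SDoc nodes is one of the edge cases: in this sketch the MID silently wins.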

strictdoc/core/file_traceability_index.py

haxtibal · 2025-11-08T22:38:31Z


+                # TODO:
+                # If we really want to support changing the auto-assigned MID,
+                # at least the graph database and the document search index need an update (remove old MID, add new MID).
+                # I currently struggle to update the search index.


I think I've cleaned up a bit now. Here is my remaining problem. I am trying to allow and handle this case:

# sdoc
[SRC_NODE]
UID: SREQ-1

# example.c
/*
 * UID: SREQ-1
 * MID: 12345678
 */
example_1() { }

meaning a MID from source code would overwrite the earlier auto-assigned MID. However, the auto-assigned MID has already been entered into the graph and the search index early on (at least there, maybe also in other places?), and I would need to update them. I know how to update the graph, but couldn't figure out how to consistently update the search index.

What do you think, would it be easy enough to do these updates, or should I simply not allow that edge case (exit with error)?

EDIT: Personally, I tend towards not allowing it. That edge case is not needed for the Linux showcase. I don't like the idea of having to modify already established graph connections while we're still in the traceability construction phase. Rather, it should be possible to conceptually separate things into a "compile" and a "link" phase, as you already suggested. Source node parsing and merging would be part of the compile phase. At its end we would know all nodes along with some "I would like to link to..." information, but nothing would actually be linked yet. Only the final link stage would add links to the graph DB and create the search index.
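To make the compile/link idea a bit more concrete, here is a minimal sketch. It is not how StrictDoc is structured today; `CompiledNode`, `compile_phase`, and `link_phase` are hypothetical names, and nodes are reduced to a MID plus a list of link intents.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class CompiledNode:
    # Hypothetical record produced by the compile phase.
    mid: str
    wants_links_to: List[str] = field(default_factory=list)


def compile_phase(
    raw_nodes: List[Tuple[str, List[str]]],
) -> Dict[str, CompiledNode]:
    """Parse and merge nodes; record link intents but create no links."""
    compiled: Dict[str, CompiledNode] = {}
    for mid, targets in raw_nodes:
        # Entries with the same MID are merged into one node.
        node = compiled.setdefault(mid, CompiledNode(mid))
        node.wants_links_to.extend(targets)
    return compiled


def link_phase(compiled: Dict[str, CompiledNode]) -> Set[Tuple[str, str]]:
    """Resolve link intents into edges once all MIDs are final."""
    edges: Set[Tuple[str, str]] = set()
    for node in compiled.values():
        for target in node.wants_links_to:
            if target not in compiled:
                raise KeyError(f"unresolved link target: {target}")
            edges.add((node.mid, target))
    return edges
```

Because no edge exists before `link_phase` runs, a MID can still change during merging without any graph or search index entry having to be rewritten.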

I need to think about your comment, but something that has been unclear to me from the start, and in general, is how we want to auto-generate the UUID for both source code and sidecar when the MID/UUID does not exist yet in the SDoc node, in the source file, or in both. Is my understanding correct that we will not have the human-readable UID at all in the Linux context?

# sdoc
[SRC_NODE]
* Has no MID or UID, or MID only but the source node may not have it initially?

# example.c
/*
 * Has no MID or UID, or MID only but the SDoc document node may not have it initially?
 */
example_1() { }

In other words, it is some sort of a chicken-and-egg problem. How are we imagining the workflow of auto-generating MID/UUID between source code and sidecars?

something that is not clear to me even before and in general is how we want to auto-generate the UUID for both source code and sidecar when both or either of SDoc node or source file do not exist yet

We can look at ELISA's trace_events.c annotations and their idgen.py script.

Source code starts off like

/*
 * SPDX-Req-ID: [TODO: automatically generate it]
 * ...
 */

Then one shall call idgen.py generate trace_events.c to calculate sha256sum("linux" + "trace_events.c" + instance + code), where instance is the text after SPDX-Text: in the comment, and code is the full C-function definition without comment.
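The hash calculation described above might look roughly like this in Python. This is a sketch, not the actual idgen.py code: the function name `generate_req_id`, the UTF-8 encoding, and plain concatenation without separators or whitespace normalization are all assumptions.

```python
import hashlib


def generate_req_id(project: str, file_name: str, instance: str, code: str) -> str:
    """Return the hex SHA-256 digest of the concatenated inputs."""
    # Assumption: inputs are joined as-is and encoded as UTF-8.
    payload = project + file_name + instance + code
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Made-up example inputs, for illustration only.
req_id = generate_req_id(
    "linux",
    "trace_events.c",
    "text following SPDX-Text:",
    "int example_1(void) { return 0; }",
)
print(req_id)
```

Since the code text is part of the hash input, any change to the function body yields a different ID, which is why the sidecar content question matters for stability.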

The script acknowledges the problem you have mentioned

# TODO: since sidecar is not yet defined, this script doesn't consider the sidecar added content to the instance.

I see a few options:

1. For initial UUID generation, only hash over content in source code but neglect the sdoc part (that's what idgen.py currently does). Copying the generated UUID to sdoc is a manual step. Only the second run will have the nodes merged.

2. Start off with SPDX-Req-ID: UUID-TICKET-123, and MID: UUID-TICKET-123 in related sdoc. When StrictDoc sees such a preliminary UID, it will replace it with a proper calculated hash value.

3. Start off with SPDX-Req-ID: [TODO: automatically generate it], and MID: tracing.c/__ftrace_event_enable_disable in related sdoc. Let StrictDoc merge by conventional MID and replace conventional UID with proper calculated hash value.

4. Use UID (manually assigned) + MID, and merge by UID.
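For the option that starts from a preliminary ID such as UUID-TICKET-123, the replacement step could be sketched as follows. Nothing like this exists in StrictDoc as far as this thread shows: `finalize_id` is a hypothetical helper, and the `UUID-TICKET-<number>` pattern is an assumed convention derived from the example above.

```python
import hashlib
import re

# Assumed convention for preliminary IDs, modeled on "UUID-TICKET-123".
PRELIMINARY_ID = re.compile(r"UUID-TICKET-\d+")


def finalize_id(current_id: str, hashed_content: str) -> str:
    """Replace a preliminary ID with a content hash; keep any other ID."""
    if PRELIMINARY_ID.fullmatch(current_id):
        return hashlib.sha256(hashed_content.encode("utf-8")).hexdigest()
    return current_id
```

The same replacement would have to be written back to both the source file and the sidecar so that the two sides still match on the next run.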

I have no clear favorite among these options right now. Maybe we should ask Gabriele?

Is my understanding correct that we will not have the UID at all in the Linux context?

Yes, that's also my understanding. The pilot work nowhere mentions a UID. If we wanted one, it's up to us to propose it.


stanislaw reviewed Nov 8, 2025


haxtibal force-pushed the tdmg/source_node_mid branch from 7b45145 to cdf5a9e · November 8, 2025 22:28


haxtibal force-pushed the tdmg/source_node_mid branch from cdf5a9e to 97af3e1 · November 9, 2025 12:48

haxtibal mentioned this pull request Nov 10, 2025

draft: feat(backend/sdoc_source_code): Allow empty lines in source node fields #2554
