Reintroduce translation of identifiers to Protobuf compliant names #3736

alecgrieser · 2025-11-10T16:57:13Z

This reintroduces the work that was reverted (namely #3696 and #3706; see also #3726). It mainly copies the original PR for the translation work itself, though it takes a different approach to actually performing the translation.

Perhaps most notably, this purposefully omits translation of function, view, and index names--essentially, any item that is not serialized into a Protobuf identifier is skipped. That means that we only worry about type (including table, struct, and enum names), enum values, and struct/table fields.

From an implementation this perspective, this moves the translation of the names into the areas that are designed to interface between the Type system of the planner and the Protobuf-based system of Record Layer core. This is then stored in the Type information that is associated with the plan. At the moment, we cheat in a few places and require that the type's fields are always the result of applying the translation function to the display name, or vice versa. However, it allows us to think about allowing custom field names in the future. In a world where we only use Protobuf for storage but have a different mechanism for keeping track of fields in the runtime, we'd need a similar mechanism, though it may take the form of something like a field name to ordinal position mapping, say.

This borrows and expands the tests from #3696 and #3706. In particular, it adds additional tests around enums and view and function definitions with non-compliant names, which now work.

This reintroduces the work that was reverted (namely FoundationDB#3696 and FoundationDB#3706; see also FoundationDB#3726). It mainly copies the original PR for the translation work itself, though it takes a different approach to actually performing the translation. Perhaps most notably, this purposefully omits translation of function, view, and index names--essentially, any item that is not serialized into a Protobuf identifier is skipped. That means that we only worry about type (including table, struct, and enum names), enum values, and struct/table fields. From an implementation this perspective, this moves the translation of the names into the areas that are designed to interface between the `Type` system of the planner and the Protobuf-based system of Record Layer core. This is then stored in the `Type` information that is associated with the plan. At the moment, we cheat in a few places and require that the type's fields are always the result of applying the translation function to the display name, or vice versa. However, it allows us to think about allowing custom field names in the future. In a world where we only use Protobuf for storage but have a different mechanism for keeping track of fields in the runtime, we'd need a similar mechanism, though it may take the form of something like a field name to ordinal position mapping, say. This borrows and expands the tests from #3969 and FoundationDB#3706. In particular, it adds additional tests around enums and view and function definitions with non-compliant names, which now work.

…using trouble

…ined

github-actions · 2025-11-13T10:39:40Z

📊 Metrics Diff Analysis Report

Summary

New queries: 50
Dropped queries: 0
Plan changed + metrics changed: 0
Plan unchanged + metrics changed: 0

ℹ️ About this analysis

This automated analysis compares query planner metrics between the base branch and this PR. It categorizes changes into:

New queries: Queries added in this PR
Dropped queries: Queries removed in this PR. These should be reviewed to ensure we are not losing coverage.
Plan changed + metrics changed: The query plan has changed along with planner metrics.
Metrics only changed: Same plan but different metrics

The last category in particular may indicate planner regressions that should be investigated.

New Queries

Count of new queries by file:

yaml-tests/src/test/resources/valid-identifiers.metrics.yaml: 50

hatyo · 2025-11-13T12:36:41Z

The inconsistency between table names, and the corresponding (qualified) field names seems confusing. Would it be possible to fix it? (see below, foo__2tableA when used a type in the type filter vs. foo.tableA when used as a field qualifier. Although I am not sure this is what we want, considering this makes the plan diverge from the actual user SQL, we could do it the other way around as well, i.e. printing the user-defined type names instead of storage names in the TypeFilter and so on, which seems better I guess?

If I am not mistaken, that shouldn't be difficult, maybe we can just print the storage name of the FieldValue's FieldAccessor(s) in FieldValue#toString.

alecgrieser · 2025-11-13T12:43:30Z

Yes, we could address that. There are two ways you could address that:

Use the storage name in the FieldAccessor's path, so that you get storage names everywhere, as you said
Use the user-visible names everywhere, in which case we'd want to adjust the way we handle the inputs in the FUSE and LogicalTypeFilterExpressions (and also, while we're there, InsertExpressions, UpdateExpressions, and DeleteExpressions).

I actually kind of prefer the latter. One thing I'm not super happy about is that the user still needs to be a bit careful about which String they supply when things are a string. If the various expressions that cared about type names took, like, a TypeIdentifier that had both, then they could use one for everything except interfacing with the Record Layer. And that would include toString.

One reason to keep the user-visible names during FieldValue's toString is that it makes it easier for the user to make sense of a plan. You could imagine in a future where we completely separate the type name from the internal storage name that it would be hard for a user to understand _._uuid1._uuid2 instead of _.parent.child. Which is why I'd like to standardize the other way

hatyo

LGTM.

alecgrieser · 2025-11-13T12:56:11Z

I'll go ahead and address the table name display inconsistency in a follow up PR.

alecgrieser added the enhancement New feature or request label Nov 10, 2025

alecgrieser added 6 commits November 11, 2025 10:19

address teamscale warnings ; remove change to dereference that was ca…

e4a6db4

…using trouble

fix spotbugs and PMD failures

2c2557a

remove unused method and switch functions around so they are more cha…

ce372a4

…ined

add additional tests to cover test gaps from teamscale

1c5d76e

add unit tests for proto utils methods

4327f99

address Field::withNullability test gap

dee3b95

alecgrieser requested a review from hatyo November 13, 2025 10:06

alecgrieser marked this pull request as ready for review November 13, 2025 10:06

make DeleteExpression and UpdateExpression consistent

6df9fcb

hatyo approved these changes Nov 13, 2025

View reviewed changes

alecgrieser merged commit 4873b51 into FoundationDB:main Nov 13, 2025
13 of 14 checks passed

alecgrieser deleted the reintroduce-protobuf-escaping branch November 13, 2025 12:56

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Reintroduce translation of identifiers to Protobuf compliant names #3736

Reintroduce translation of identifiers to Protobuf compliant names #3736

alecgrieser commented Nov 10, 2025 •

edited

Loading

Uh oh!

github-actions bot commented Nov 13, 2025

Uh oh!

hatyo commented Nov 13, 2025 •

edited

Loading

Uh oh!

alecgrieser commented Nov 13, 2025

Uh oh!

hatyo left a comment

Uh oh!

alecgrieser commented Nov 13, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Reintroduce translation of identifiers to Protobuf compliant names #3736

Reintroduce translation of identifiers to Protobuf compliant names #3736

Conversation

alecgrieser commented Nov 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Nov 13, 2025

📊 Metrics Diff Analysis Report

Summary

New Queries

Uh oh!

hatyo commented Nov 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

alecgrieser commented Nov 13, 2025

Uh oh!

hatyo left a comment

Choose a reason for hiding this comment

Uh oh!

alecgrieser commented Nov 13, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

alecgrieser commented Nov 10, 2025 •

edited

Loading

hatyo commented Nov 13, 2025 •

edited

Loading