Skip to content

Commit 9a18447

Browse files
authored
Protobuf compliant translation for supporting richer identifiers for tables and columns (#3696)
This PR makes it possible to have some special characters in the identifiers of most of the constructs in Relational DDL and enables querying on them. Since the tables are defined as Descriptor and stored in DynamicMessage, it is imperative for table and column names to be a valid symbol in protoBuf, which is limited to a-z, A-Z, 0-9 and _. Hence, it support (slightly) richer character set, this PR mainly uses surgical translations over names to make the RecordLayer Type system and Protobuf types happy. However, since the Relational Type and RecordLayer type themselves honour the fact that the Relational name and RecordLayer name should be the same, the translation spills to the topmost layer and is done quite preemptively. To alleviate the issue, we might want to have a more intrinsic understanding of "display" name and "underlying" name. Another issue, comes from the fact that data flows through the execution plan in protobuf Dynamic Messages, which makes it imperative for the query aliases to be protobuf compliant as well!. We could potentially break out of this by doing some schenanigans around what the recordLayer type puts into the TypeRepository, but the work here takes a relatively simpler approach of just making all aliases and user names protobuf-friendly. Note that, not all contructs and names are affecting. For example, index names, schema template names, schema and database names should support unicode charset out of the box. However, this PR does not really test around that. I tried testing some part of it - while index names and schema template names do work, databases naming is restrictive currently and does not work. So, the scope of this PR is (or, should be) everything else other than what is mentioned above. From the implementation POV, the work takes a longer path of doing point translations in many places rather than just one translation in IdentifierVisitor to preemptively translate all identifiers. This is owing to the following reasons: - We would still want to leave some constructs from this translation - those that do not require one. We could rather argue that translation could be done even if not needed. However, this would have complicated things around customer facing operations like `show`, `describe` items. - Theoretically, we could very soon drift apart from using protobuf for runtime data types flowing through plans. Once on it, the scope of translation would reduce to barely to storage access operators - supporting the fact that we should not do translations blindly.
1 parent 8c089e6 commit 9a18447

File tree

15 files changed

+380
-82
lines changed

15 files changed

+380
-82
lines changed

fdb-relational-core/src/main/antlr/RelationalParser.g4

Lines changed: 1 addition & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -876,7 +876,7 @@ recordConstructorForInlineTable
876876
;
877877

878878
recordConstructor
879-
: ofTypeClause? '(' (uid DOT STAR | STAR | expressionWithName /* this can be removed */ | expressionWithOptionalName (',' expressionWithOptionalName)*) ')'
879+
: ofTypeClause? '(' (uid DOT STAR | STAR | expressionWithOptionalName (',' expressionWithOptionalName)*) ')'
880880
;
881881

882882
ofTypeClause
@@ -910,10 +910,6 @@ expressionOrDefault
910910
: expression | DEFAULT
911911
;
912912

913-
expressionWithName
914-
: expression AS uid
915-
;
916-
917913
expressionWithOptionalName
918914
: expression (AS uid)?
919915
;

fdb-relational-core/src/main/java/com/apple/foundationdb/relational/recordlayer/ddl/RecordLayerCatalogQueryFactory.java

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -37,6 +37,7 @@
3737
import com.apple.foundationdb.relational.api.metadata.SchemaTemplate;
3838
import com.apple.foundationdb.relational.recordlayer.ArrayRow;
3939
import com.apple.foundationdb.relational.recordlayer.IteratorResultSet;
40+
import com.apple.foundationdb.relational.recordlayer.metadata.DataTypeUtils;
4041

4142
import javax.annotation.Nonnull;
4243
import java.net.URI;
@@ -64,7 +65,7 @@ public Type getResultSetMetadata() {
6465
public RelationalResultSet executeAction(Transaction txn) throws RelationalException {
6566
final Schema schema = catalog.loadSchema(txn, dbId, schemaId);
6667

67-
final List<String> tableNames = schema.getTables().stream().map(Metadata::getName)
68+
final List<String> tableNames = schema.getTables().stream().map(Metadata::getName).map(DataTypeUtils::toUserIdentifier)
6869
.collect(Collectors.toList());
6970

7071
final List<String> indexNames = schema.getTables().stream().flatMap(t -> t.getIndexes().stream()).map(Metadata::getName)

fdb-relational-core/src/main/java/com/apple/foundationdb/relational/recordlayer/metadata/DataTypeUtils.java

Lines changed: 37 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -21,13 +21,11 @@
2121
package com.apple.foundationdb.relational.recordlayer.metadata;
2222

2323
import com.apple.foundationdb.annotation.API;
24-
2524
import com.apple.foundationdb.record.query.plan.cascades.typing.Type;
2625
import com.apple.foundationdb.relational.api.exceptions.ErrorCode;
2726
import com.apple.foundationdb.relational.api.metadata.DataType;
2827
import com.apple.foundationdb.relational.util.Assert;
2928
import com.apple.foundationdb.relational.util.SpotBugsSuppressWarnings;
30-
3129
import com.google.common.collect.BiMap;
3230
import com.google.common.collect.HashBiMap;
3331

@@ -41,6 +39,10 @@
4139
@API(API.Status.EXPERIMENTAL)
4240
public class DataTypeUtils {
4341

42+
private static final String DOUBLE_UNDERSCORE_ESCAPE = "__0";
43+
private static final String DOLLAR_ESCAPE = "__1";
44+
private static final String DOT_ESCAPE = "__2";
45+
4446
@Nonnull
4547
private static final BiMap<DataType, Type> primitivesMap;
4648

@@ -55,6 +57,20 @@ public class DataTypeUtils {
5557
@SpotBugsSuppressWarnings(value = "NP_NONNULL_RETURN_VIOLATION", justification = "should never happen, there is failUnchecked directly before that.")
5658
@Nonnull
5759
public static DataType toRelationalType(@Nonnull final Type type) {
60+
return toRelationalType(type, false);
61+
}
62+
63+
/**
64+
* Converts a Record Layer {@link Type} into a Relational {@link DataType}.
65+
*
66+
* Note: This method is expensive, use with care, i.e. try to cache its result as much as possible.
67+
*
68+
* @param type The Relational data type.
69+
* @return The corresponding Record Layer type.
70+
*/
71+
@SpotBugsSuppressWarnings(value = "NP_NONNULL_RETURN_VIOLATION", justification = "should never happen, there is failUnchecked directly before that.")
72+
@Nonnull
73+
public static DataType toRelationalType(@Nonnull final Type type, boolean toUserIdentifier) {
5874
if (primitivesMap.containsValue(type)) {
5975
return primitivesMap.inverse().get(type);
6076
}
@@ -70,32 +86,44 @@ public static DataType toRelationalType(@Nonnull final Type type) {
7086
switch (typeCode) {
7187
case RECORD:
7288
final var record = (Type.Record) type;
73-
final var columns = record.getFields().stream().map(field -> DataType.StructType.Field.from(field.getFieldName(), toRelationalType(field.getFieldType()), field.getFieldIndex())).collect(Collectors.toList());
74-
return DataType.StructType.from(record.getName() == null ? toProtoBufCompliantName(UUID.randomUUID().toString()) : record.getName(), columns, record.isNullable());
89+
final var columns = record.getFields().stream().map(field -> {
90+
final var fieldName = toUserIdentifier ? DataTypeUtils.toUserIdentifier(field.getFieldName()) : field.getFieldName();
91+
return DataType.StructType.Field.from(fieldName, toRelationalType(field.getFieldType(), toUserIdentifier), field.getFieldIndex());
92+
}).collect(Collectors.toList());
93+
final var name = record.getName() == null ? getUniqueName() : (toUserIdentifier ? DataTypeUtils.toUserIdentifier(record.getName()) : record.getName());
94+
return DataType.StructType.from(name, columns, record.isNullable());
7595
case ARRAY:
7696
final var asArray = (Type.Array) type;
7797
return DataType.ArrayType.from(toRelationalType(Assert.notNullUnchecked(asArray.getElementType())), asArray.isNullable());
7898
case ENUM:
7999
final var asEnum = (Type.Enum) type;
80100
final var enumValues = asEnum.getEnumValues().stream().map(v -> DataType.EnumType.EnumValue.of(v.getName(), v.getNumber())).collect(Collectors.toList());
81-
return DataType.EnumType.from(asEnum.getName() == null ? toProtoBufCompliantName(UUID.randomUUID().toString()) : asEnum.getName(), enumValues, asEnum.isNullable());
101+
return DataType.EnumType.from(asEnum.getName() == null ? getUniqueName() : asEnum.getName(), enumValues, asEnum.isNullable());
82102
default:
83103
Assert.failUnchecked(String.format(Locale.ROOT, "unexpected type %s", type));
84104
return null; // make compiler happy.
85105
}
86106
}
87107

88108
@Nonnull
89-
private static String toProtoBufCompliantName(@Nonnull final String input) {
90-
Assert.thatUnchecked(input.length() > 0);
91-
final var modified = input.replace("-", "_");
92-
final char c = input.charAt(0);
109+
private static String getUniqueName() {
110+
final var uuid = UUID.randomUUID().toString();
111+
final var modified = uuid.replace("-", "_");
112+
final char c = uuid.charAt(0);
93113
if (c == '_' || ('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z')) {
94114
return modified;
95115
}
96116
return "id" + modified;
97117
}
98118

119+
public static String toProtoBufCompliantName(String userIdentifier) {
120+
return userIdentifier.replace("__", DOUBLE_UNDERSCORE_ESCAPE).replace("$", DOLLAR_ESCAPE).replace(".", DOT_ESCAPE);
121+
}
122+
123+
public static String toUserIdentifier(String protoIdentifier) {
124+
return protoIdentifier.replace(DOT_ESCAPE, ".").replace(DOLLAR_ESCAPE, "$").replace(DOUBLE_UNDERSCORE_ESCAPE, "__");
125+
}
126+
99127
/**
100128
* Converts a given Relational {@link DataType} into a corresponding Record Layer {@link Type}.
101129
*

fdb-relational-core/src/main/java/com/apple/foundationdb/relational/recordlayer/query/Identifier.java

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -22,13 +22,15 @@
2222

2323
import com.apple.foundationdb.annotation.API;
2424

25+
import com.apple.foundationdb.relational.recordlayer.metadata.DataTypeUtils;
2526
import com.google.common.collect.ImmutableList;
2627

2728
import javax.annotation.Nonnull;
2829
import java.util.Collection;
2930
import java.util.List;
3031
import java.util.Objects;
3132
import java.util.function.Function;
33+
import java.util.stream.Collectors;
3234

3335
@API(API.Status.EXPERIMENTAL)
3436
public class Identifier {
@@ -142,6 +144,13 @@ public boolean qualifiedWith(@Nonnull Identifier identifier) {
142144
return true;
143145
}
144146

147+
@Nonnull
148+
public static Identifier toProtobufCompliant(@Nonnull final Identifier identifier) {
149+
final var qualifier = identifier.getQualifier().stream().map(DataTypeUtils::toProtoBufCompliantName).collect(Collectors.toList());
150+
final var name = DataTypeUtils.toProtoBufCompliantName(identifier.getName());
151+
return Identifier.of(name, qualifier);
152+
}
153+
145154
@Override
146155
public boolean equals(Object obj) {
147156
if (obj == this) {

fdb-relational-core/src/main/java/com/apple/foundationdb/relational/recordlayer/query/PseudoColumn.java

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -73,4 +73,13 @@ public static Optional<Expression> mapToExpressionMaybe(@Nonnull LogicalOperator
7373
}
7474
return Optional.empty();
7575
}
76+
77+
public static boolean isPseudoColumn(@Nonnull String name) {
78+
for (PseudoColumn pseudo : PseudoColumn.values()) {
79+
if (name.equals(pseudo.getColumnName())) {
80+
return true;
81+
}
82+
}
83+
return false;
84+
}
7685
}

fdb-relational-core/src/main/java/com/apple/foundationdb/relational/recordlayer/query/QueryPlan.java

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -404,7 +404,7 @@ private RelationalResultSet executePhysicalPlan(@Nonnull final RecordLayerSchema
404404
parsedContinuation.getExecutionState(),
405405
executeProperties));
406406
final var currentPlanHashMode = OptionsUtils.getCurrentPlanHashMode(options);
407-
final var dataType = (DataType.StructType) DataTypeUtils.toRelationalType(type);
407+
final var dataType = (DataType.StructType) DataTypeUtils.toRelationalType(type, true);
408408
return executionContext.metricCollector.clock(RelationalMetric.RelationalEvent.CREATE_RESULT_SET_ITERATOR, () -> {
409409
final ResumableIterator<Row> iterator = RecordLayerIterator.create(cursor, messageFDBQueriedRecord -> new MessageTuple(messageFDBQueriedRecord.getMessage()));
410410
return new RecordLayerResultSet(RelationalStructMetaData.of(dataType), iterator, connection,

fdb-relational-core/src/main/java/com/apple/foundationdb/relational/recordlayer/query/visitors/BaseVisitor.java

Lines changed: 0 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -1343,12 +1343,6 @@ public Object visitExpressionOrDefault(@Nonnull RelationalParser.ExpressionOrDef
13431343
return visitChildren(ctx);
13441344
}
13451345

1346-
@Nonnull
1347-
@Override
1348-
public Expression visitExpressionWithName(@Nonnull RelationalParser.ExpressionWithNameContext ctx) {
1349-
return expressionVisitor.visitExpressionWithName(ctx);
1350-
}
1351-
13521346
@Nonnull
13531347
@Override
13541348
public Expression visitExpressionWithOptionalName(@Nonnull RelationalParser.ExpressionWithOptionalNameContext ctx) {

fdb-relational-core/src/main/java/com/apple/foundationdb/relational/recordlayer/query/visitors/DdlVisitor.java

Lines changed: 10 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -106,7 +106,7 @@ public static DdlVisitor of(@Nonnull BaseVisitor delegate,
106106
public DataType visitFunctionColumnType(@Nonnull final RelationalParser.FunctionColumnTypeContext ctx) {
107107
final var semanticAnalyzer = getDelegate().getSemanticAnalyzer();
108108
if (ctx.customType != null) {
109-
final var columnType = visitUid(ctx.customType);
109+
final var columnType = Identifier.toProtobufCompliant(visitUid(ctx.customType));
110110
return semanticAnalyzer.lookupType(columnType, true, false, metadataBuilder::findType);
111111
}
112112
return visitPrimitiveType(ctx.primitiveType()).withNullable(true);
@@ -144,7 +144,7 @@ public DataType visitPrimitiveType(@Nonnull RelationalParser.PrimitiveTypeContex
144144
@Nonnull
145145
@Override
146146
public RecordLayerColumn visitColumnDefinition(@Nonnull RelationalParser.ColumnDefinitionContext ctx) {
147-
final var columnId = visitUid(ctx.colName);
147+
final var columnId = Identifier.toProtobufCompliant(visitUid(ctx.colName));
148148
final var isRepeated = ctx.ARRAY() != null;
149149
final var isNullable = ctx.columnConstraint() != null ? (Boolean) ctx.columnConstraint().accept(this) : true;
150150
// TODO: We currently do not support NOT NULL for any type other than ARRAY. This is because there is no way to
@@ -153,7 +153,7 @@ public RecordLayerColumn visitColumnDefinition(@Nonnull RelationalParser.ColumnD
153153
// but a way to represent it in RecordMetadata.
154154
Assert.thatUnchecked(isRepeated || isNullable, ErrorCode.UNSUPPORTED_OPERATION, "NOT NULL is only allowed for ARRAY column type");
155155
containsNullableArray = containsNullableArray || (isRepeated && isNullable);
156-
final var columnTypeId = ctx.columnType().customType != null ? visitUid(ctx.columnType().customType) : Identifier.of(ctx.columnType().getText());
156+
final var columnTypeId = ctx.columnType().customType != null ? Identifier.toProtobufCompliant(visitUid(ctx.columnType().customType)) : Identifier.of(ctx.columnType().getText());
157157
final var semanticAnalyzer = getDelegate().getSemanticAnalyzer();
158158
final var columnType = semanticAnalyzer.lookupType(columnTypeId, isNullable, isRepeated, metadataBuilder::findType);
159159
return RecordLayerColumn.newBuilder().setName(columnId.getName()).setDataType(columnType).build();
@@ -162,14 +162,15 @@ public RecordLayerColumn visitColumnDefinition(@Nonnull RelationalParser.ColumnD
162162
@Nonnull
163163
@Override
164164
public RecordLayerTable visitTableDefinition(@Nonnull RelationalParser.TableDefinitionContext ctx) {
165-
final var tableId = visitUid(ctx.uid());
165+
final var tableId = Identifier.toProtobufCompliant(visitUid(ctx.uid()));
166166
final var columns = ctx.columnDefinition().stream().map(this::visitColumnDefinition).collect(ImmutableList.toImmutableList());
167167
final var tableBuilder = RecordLayerTable.newBuilder(metadataBuilder.isIntermingleTables())
168168
.setName(tableId.getName())
169169
.addColumns(columns);
170170
if (ctx.primaryKeyDefinition().fullIdList() != null) {
171171
visitFullIdList(ctx.primaryKeyDefinition().fullIdList())
172172
.stream()
173+
.map(Identifier::toProtobufCompliant)
173174
.map(Identifier::fullyQualifiedName)
174175
.forEach(tableBuilder::addPrimaryKeyPart);
175176
}
@@ -179,7 +180,7 @@ public RecordLayerTable visitTableDefinition(@Nonnull RelationalParser.TableDefi
179180
@Nonnull
180181
@Override
181182
public RecordLayerTable visitStructDefinition(@Nonnull RelationalParser.StructDefinitionContext ctx) {
182-
final var structId = visitUid(ctx.uid());
183+
final var structId = Identifier.toProtobufCompliant(visitUid(ctx.uid()));
183184
final var columns = ctx.columnDefinition().stream().map(this::visitColumnDefinition).collect(ImmutableList.toImmutableList());
184185
final var structBuilder = RecordLayerTable.newBuilder(metadataBuilder.isIntermingleTables())
185186
.setName(structId.getName())
@@ -337,7 +338,7 @@ private RecordLayerInvokedRoutine getInvokedRoutineMetadata(@Nonnull final Parse
337338
final var isTemporary = functionCtx instanceof RelationalParser.CreateTempFunctionContext;
338339

339340
// 1. get the function name.
340-
final var functionName = visitFullId(functionSpecCtx.schemaQualifiedRoutineName).toString();
341+
final var functionName = Identifier.toProtobufCompliant(visitFullId(functionSpecCtx.schemaQualifiedRoutineName)).toString();
341342

342343
// 2. get the function SQL definition string.
343344
final var queryString = getDelegate().getPlanGenerationContext().getQuery();
@@ -409,7 +410,7 @@ public ProceduralPlan visitCreateTempFunction(@Nonnull RelationalParser.CreateTe
409410

410411
@Override
411412
public ProceduralPlan visitDropTempFunction(@Nonnull RelationalParser.DropTempFunctionContext ctx) {
412-
final var functionName = visitFullId(ctx.schemaQualifiedRoutineName).toString();
413+
final var functionName = Identifier.toProtobufCompliant(visitFullId(ctx.schemaQualifiedRoutineName)).toString();
413414
var throwIfNotExists = ctx.IF() == null && ctx.EXISTS() == null;
414415
return ProceduralPlan.of(metadataOperationsFactory.getDropTemporaryFunctionConstantAction(throwIfNotExists, functionName));
415416
}
@@ -428,7 +429,7 @@ private UserDefinedFunction visitSqlInvokedFunction(@Nonnull final RelationalPar
428429
@Nonnull final RelationalParser.RoutineBodyContext bodyCtx,
429430
boolean isTemporary) {
430431
// get the function name.
431-
final var functionName = visitFullId(functionSpecCtx.schemaQualifiedRoutineName).toString();
432+
final var functionName = Identifier.toProtobufCompliant(visitFullId(functionSpecCtx.schemaQualifiedRoutineName)).toString();
432433

433434
// run implementation-specific validations.
434435
final var props = functionSpecCtx.routineCharacteristics();
@@ -536,7 +537,7 @@ public Expressions visitSqlParameterDeclarations(final RelationalParser.SqlParam
536537
@Override
537538
public Expression visitSqlParameterDeclaration(@Nonnull RelationalParser.SqlParameterDeclarationContext ctx) {
538539
Assert.thatUnchecked(ctx.sqlParameterName != null, "unnamed parameters not supported");
539-
final var parameterName = visitUid(ctx.sqlParameterName);
540+
final var parameterName = Identifier.toProtobufCompliant(visitUid(ctx.sqlParameterName));
540541
final var parameterType = visitFunctionColumnType(ctx.parameterType);
541542
final var underlyingType = DataTypeUtils.toRecordLayerType(parameterType);
542543
Assert.thatUnchecked(parameterType.isResolved());

fdb-relational-core/src/main/java/com/apple/foundationdb/relational/recordlayer/query/visitors/DelegatingVisitor.java

Lines changed: 0 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -1191,12 +1191,6 @@ public Object visitExpressionOrDefault(@Nonnull RelationalParser.ExpressionOrDef
11911191
return getDelegate().visitExpressionOrDefault(ctx);
11921192
}
11931193

1194-
@Nonnull
1195-
@Override
1196-
public Expression visitExpressionWithName(@Nonnull RelationalParser.ExpressionWithNameContext ctx) {
1197-
return getDelegate().visitExpressionWithName(ctx);
1198-
}
1199-
12001194
@Nonnull
12011195
@Override
12021196
public Expression visitExpressionWithOptionalName(@Nonnull RelationalParser.ExpressionWithOptionalNameContext ctx) {

0 commit comments

Comments
 (0)