Skip to content

Commit 405a905

Browse files
kyleconroyclaude
andauthored
feat(ast): implement comprehensive SQL AST formatting (#4205)
* refactor(fmt_test): use config-based engine detection and parser for statement boundaries - Parse sqlc config file to determine database engine instead of hardcoding pgx/v5 path filter - Use parser's StmtLocation/StmtLen for proper statement boundaries instead of naive semicolon splitting - Handle both file and directory paths in queries config - Only test PostgreSQL for now (formatting support is PostgreSQL-only) This fixes issues with multi-query files containing semicolons in strings, PL/pgSQL functions, or DO blocks. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * feat(ast): add and improve Format methods for SQL AST nodes Add Format methods: - A_ArrayExpr: Format ARRAY[...] literals - NullIfExpr: Format NULLIF(arg1, arg2) function calls - OnConflictClause: Format ON CONFLICT ... DO UPDATE/NOTHING - InferClause: Format conflict target (columns) or ON CONSTRAINT - IndexElem: Format index elements for conflict targets - WindowDef: Format window definitions with PARTITION BY, ORDER BY, and frame clauses Improve existing Format methods: - A_Expr: Add BETWEEN, NOT BETWEEN, ILIKE, SIMILAR TO, IS DISTINCT FROM handling - A_Expr_Kind: Add all expression kind constants - CaseExpr: Handle CASE with test argument and optional ELSE - DeleteStmt: Add USING clause formatting - FuncCall: Add DISTINCT, ORDER BY, FILTER, and OVER clause support - InsertStmt: Delegate to OnConflictClause.Format - JoinExpr: Add RIGHT JOIN, FULL JOIN, NATURAL, and USING clause - LockingClause: Add OF clause, SKIP LOCKED, NOWAIT, and fix strength values - RangeFunction: Add LATERAL support and fix alias spacing - SelectStmt: Add HAVING clause formatting These changes reduce test failures from 135 to 102. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * feat(ast): add more Format methods for SQL AST nodes Add Format methods: - NullTest: Format IS NULL / IS NOT NULL expressions - ScalarArrayOpExpr: Format scalar op ANY/ALL (array) expressions - CommonTableExpr: Add column alias list support Improve existing Format methods: - WithClause: Fix spacing after WITH and RECURSIVE keywords These changes reduce test failures from 102 to 91. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * feat(postgresql): add custom Deparse wrapper with bug fixes - Switch fmt_test.go to use postgresql.Deparse instead of ast.Format - Add deparse.go and deparse_wasi.go with Deparse wrapper function - Fix pg_query_go bug: missing space before SKIP LOCKED - Skip tests with parse errors (e.g., syntax_errors test cases) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * feat(ast): complete SQL AST formatting implementation Fixes all ast.Format test failures by implementing comprehensive Format methods for SQL AST nodes. Key improvements include: - Named parameters (@param) formatting without space after @ - NULLIF expression support in A_Expr - NULLS FIRST/LAST in ORDER BY clauses - Type name mapping (int4→integer, timestamptz→timestamp with time zone) - Array type support (text[]) and type modifiers (varchar(32)) - CREATE FUNCTION with parameters, options (AS, LANGUAGE), and modes - CREATE EXTENSION statement formatting - DO $$ ... $$ anonymous code blocks - WITHIN GROUP clause for ordered-set aggregates - Automatic quoting for SQL reserved words and mixed-case identifiers - CROSS JOIN detection (JOIN without ON/USING clause) - LATERAL keyword in subselects and function calls - Array subscript access in UPDATE statements (names[$1]) - Proper AS keyword before aliases Also removes unused deparse files and cleans up fmt_test.go to use ast.Format directly. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * refactor(postgresql): use existing convert functions instead of translate helpers Replace custom translate functions (translateTypeNameFromPG, translateOptions, translateNode, translateDefElem) with existing convert.go functions (convertTypeName, convertSlice) to maintain architectural consistency. Both parse.go and convert.go import the same pg_query_go/v6 package, so the types are compatible and the existing convert functions can be used directly. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * refactor(format): add Formatter interface for SQL dialect-specific quoting - Create internal/sql/format package with Formatter interface - Add QuoteIdent method to TrackedBuffer that delegates to Formatter - Implement QuoteIdent on postgresql.Parser using existing IsReservedKeyword - Update all Format() methods to use buf.QuoteIdent() instead of local quoteIdent() - Remove duplicate reserved word logic from ast/column_ref.go - Update ast.Format() to accept a Formatter parameter This allows each SQL dialect to provide its own identifier quoting logic based on its reserved keywords and quoting rules. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * refactor(format): add TypeName method to Formatter interface - Add TypeName(ns, name string) string method to Formatter interface - Implement TypeName on postgresql.Parser with pg_catalog type mappings - Add TypeName method to TrackedBuffer that delegates to Formatter - Update ast.TypeName.Format to use buf.TypeName() - Remove mapTypeName from ast package (moved to postgresql package) This allows each SQL dialect to provide its own type name mappings (e.g., pg_catalog.int4 -> integer for PostgreSQL). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix(postgresql): restore parseRelationFromNodes for column type resolution The convertTypeName function populates extra fields (Names, ArrayBounds, Typmods) on the TypeName struct which breaks the catalog's type equality check used for ALTER TYPE RENAME operations. This change: - Reverts to using parseRelationFromNodes + rel.TypeName() which only populates Catalog, Schema, Name fields needed for type resolution - Updates ColumnDef.Format to use IsArray field for array formatting since TypeName.ArrayBounds is no longer set 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>
1 parent 3436b28 commit 405a905

40 files changed

+851
-107
lines changed

internal/endtoend/fmt_test.go

Lines changed: 98 additions & 43 deletions
Original file line numberDiff line numberDiff line change
@@ -8,66 +8,121 @@ import (
88
"strings"
99
"testing"
1010

11+
"github.com/sqlc-dev/sqlc/internal/config"
1112
"github.com/sqlc-dev/sqlc/internal/debug"
1213
"github.com/sqlc-dev/sqlc/internal/engine/postgresql"
1314
"github.com/sqlc-dev/sqlc/internal/sql/ast"
1415
)
1516

1617
func TestFormat(t *testing.T) {
1718
t.Parallel()
18-
parse := postgresql.NewParser()
1919
for _, tc := range FindTests(t, "testdata", "base") {
2020
tc := tc
21-
22-
if !strings.Contains(tc.Path, filepath.Join("pgx/v5")) {
23-
continue
24-
}
25-
26-
q := filepath.Join(tc.Path, "query.sql")
27-
if _, err := os.Stat(q); os.IsNotExist(err) {
28-
continue
29-
}
30-
3121
t.Run(tc.Name, func(t *testing.T) {
32-
contents, err := os.ReadFile(q)
22+
// Parse the config file to determine the engine
23+
configPath := filepath.Join(tc.Path, tc.ConfigName)
24+
configFile, err := os.Open(configPath)
3325
if err != nil {
3426
t.Fatal(err)
3527
}
36-
for i, query := range bytes.Split(bytes.TrimSpace(contents), []byte(";")) {
37-
if len(query) <= 1 {
38-
continue
39-
}
40-
query := query
41-
t.Run(fmt.Sprintf("%d", i), func(t *testing.T) {
42-
expected, err := postgresql.Fingerprint(string(query))
43-
if err != nil {
44-
t.Fatal(err)
45-
}
46-
stmts, err := parse.Parse(bytes.NewReader(query))
47-
if err != nil {
48-
t.Fatal(err)
49-
}
50-
if len(stmts) != 1 {
51-
t.Fatal("expected one statement")
52-
}
53-
if false {
54-
r, err := postgresql.Parse(string(query))
55-
debug.Dump(r, err)
56-
}
28+
conf, err := config.ParseConfig(configFile)
29+
configFile.Close()
30+
if err != nil {
31+
t.Fatal(err)
32+
}
33+
34+
// Skip if there are no SQL packages configured
35+
if len(conf.SQL) == 0 {
36+
return
37+
}
38+
39+
// For now, only test PostgreSQL since that's the only engine with Format support
40+
engine := conf.SQL[0].Engine
41+
if engine != config.EnginePostgreSQL {
42+
return
43+
}
5744

58-
out := ast.Format(stmts[0].Raw)
59-
actual, err := postgresql.Fingerprint(out)
45+
// Find query files from config
46+
var queryFiles []string
47+
for _, sql := range conf.SQL {
48+
for _, q := range sql.Queries {
49+
queryPath := filepath.Join(tc.Path, q)
50+
info, err := os.Stat(queryPath)
6051
if err != nil {
61-
t.Error(err)
52+
continue
6253
}
63-
if expected != actual {
64-
debug.Dump(stmts[0].Raw)
65-
t.Errorf("- %s", expected)
66-
t.Errorf("- %s", string(query))
67-
t.Errorf("+ %s", actual)
68-
t.Errorf("+ %s", out)
54+
if info.IsDir() {
55+
// If it's a directory, glob for .sql files
56+
matches, err := filepath.Glob(filepath.Join(queryPath, "*.sql"))
57+
if err != nil {
58+
continue
59+
}
60+
queryFiles = append(queryFiles, matches...)
61+
} else {
62+
queryFiles = append(queryFiles, queryPath)
6963
}
70-
})
64+
}
65+
}
66+
67+
if len(queryFiles) == 0 {
68+
return
69+
}
70+
71+
parse := postgresql.NewParser()
72+
73+
for _, queryFile := range queryFiles {
74+
if _, err := os.Stat(queryFile); os.IsNotExist(err) {
75+
continue
76+
}
77+
78+
contents, err := os.ReadFile(queryFile)
79+
if err != nil {
80+
t.Fatal(err)
81+
}
82+
83+
// Parse the entire file to get proper statement boundaries
84+
stmts, err := parse.Parse(bytes.NewReader(contents))
85+
if err != nil {
86+
// Skip files with parse errors (e.g., syntax_errors test cases)
87+
return
88+
}
89+
90+
for i, stmt := range stmts {
91+
stmt := stmt
92+
t.Run(fmt.Sprintf("%d", i), func(t *testing.T) {
93+
// Extract the original query text using statement location and length
94+
start := stmt.Raw.StmtLocation
95+
length := stmt.Raw.StmtLen
96+
if length == 0 {
97+
// If StmtLen is 0, it means the statement goes to the end of the input
98+
length = len(contents) - start
99+
}
100+
query := strings.TrimSpace(string(contents[start : start+length]))
101+
102+
expected, err := postgresql.Fingerprint(query)
103+
if err != nil {
104+
t.Fatal(err)
105+
}
106+
107+
if false {
108+
r, err := postgresql.Parse(query)
109+
debug.Dump(r, err)
110+
}
111+
112+
out := ast.Format(stmt.Raw, parse)
113+
actual, err := postgresql.Fingerprint(out)
114+
if err != nil {
115+
t.Error(err)
116+
}
117+
if expected != actual {
118+
debug.Dump(stmt.Raw)
119+
t.Errorf("- %s", expected)
120+
t.Errorf("- %s", query)
121+
t.Errorf("+ %s", actual)
122+
t.Errorf("+ %s", out)
123+
}
124+
})
125+
}
71126
}
72127
})
73128
}

internal/engine/postgresql/convert.go

Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1965,6 +1965,22 @@ func convertNullTest(n *pg.NullTest) *ast.NullTest {
19651965
}
19661966
}
19671967

1968+
func convertNullIfExpr(n *pg.NullIfExpr) *ast.NullIfExpr {
1969+
if n == nil {
1970+
return nil
1971+
}
1972+
return &ast.NullIfExpr{
1973+
Xpr: convertNode(n.Xpr),
1974+
Opno: ast.Oid(n.Opno),
1975+
Opresulttype: ast.Oid(n.Opresulttype),
1976+
Opretset: n.Opretset,
1977+
Opcollid: ast.Oid(n.Opcollid),
1978+
Inputcollid: ast.Oid(n.Inputcollid),
1979+
Args: convertSlice(n.Args),
1980+
Location: int(n.Location),
1981+
}
1982+
}
1983+
19681984
func convertObjectWithArgs(n *pg.ObjectWithArgs) *ast.ObjectWithArgs {
19691985
if n == nil {
19701986
return nil
@@ -3420,6 +3436,9 @@ func convertNode(node *pg.Node) ast.Node {
34203436
case *pg.Node_NullTest:
34213437
return convertNullTest(n.NullTest)
34223438

3439+
case *pg.Node_NullIfExpr:
3440+
return convertNullIfExpr(n.NullIfExpr)
3441+
34233442
case *pg.Node_ObjectWithArgs:
34243443
return convertObjectWithArgs(n.ObjectWithArgs)
34253444

internal/engine/postgresql/parse.go

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -494,6 +494,7 @@ func translate(node *nodes.Node) (ast.Node, error) {
494494
ReturnType: rt,
495495
Replace: n.Replace,
496496
Params: &ast.List{},
497+
Options: convertSlice(n.Options),
497498
}
498499
for _, item := range n.Parameters {
499500
arg := item.Node.(*nodes.Node_FunctionParameter).FunctionParameter

internal/engine/postgresql/reserved.go

Lines changed: 53 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2,6 +2,59 @@ package postgresql
22

33
import "strings"
44

5+
// hasMixedCase returns true if the string has any uppercase letters
6+
// (identifiers with mixed case need quoting in PostgreSQL)
7+
func hasMixedCase(s string) bool {
8+
for _, r := range s {
9+
if r >= 'A' && r <= 'Z' {
10+
return true
11+
}
12+
}
13+
return false
14+
}
15+
16+
// QuoteIdent returns a quoted identifier if it needs quoting.
17+
// This implements the format.Formatter interface.
18+
func (p *Parser) QuoteIdent(s string) string {
19+
if p.IsReservedKeyword(s) || hasMixedCase(s) {
20+
return `"` + s + `"`
21+
}
22+
return s
23+
}
24+
25+
// TypeName returns the SQL type name for the given namespace and name.
26+
// This implements the format.Formatter interface.
27+
func (p *Parser) TypeName(ns, name string) string {
28+
if ns == "pg_catalog" {
29+
switch name {
30+
case "int4":
31+
return "integer"
32+
case "int8":
33+
return "bigint"
34+
case "int2":
35+
return "smallint"
36+
case "float4":
37+
return "real"
38+
case "float8":
39+
return "double precision"
40+
case "bool":
41+
return "boolean"
42+
case "bpchar":
43+
return "character"
44+
case "timestamptz":
45+
return "timestamp with time zone"
46+
case "timetz":
47+
return "time with time zone"
48+
default:
49+
return name
50+
}
51+
}
52+
if ns != "" {
53+
return ns + "." + name
54+
}
55+
return name
56+
}
57+
558
// https://www.postgresql.org/docs/current/sql-keywords-appendix.html
659
func (p *Parser) IsReservedKeyword(s string) bool {
760
switch strings.ToLower(s) {

internal/sql/ast/a_array_expr.go

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -8,3 +8,12 @@ type A_ArrayExpr struct {
88
func (n *A_ArrayExpr) Pos() int {
99
return n.Location
1010
}
11+
12+
func (n *A_ArrayExpr) Format(buf *TrackedBuffer) {
13+
if n == nil {
14+
return
15+
}
16+
buf.WriteString("ARRAY[")
17+
buf.join(n.Elements, ", ")
18+
buf.WriteString("]")
19+
}

internal/sql/ast/a_expr.go

Lines changed: 73 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -16,19 +16,88 @@ func (n *A_Expr) Format(buf *TrackedBuffer) {
1616
if n == nil {
1717
return
1818
}
19-
buf.astFormat(n.Lexpr)
20-
buf.WriteString(" ")
2119
switch n.Kind {
2220
case A_Expr_Kind_IN:
21+
buf.astFormat(n.Lexpr)
2322
buf.WriteString(" IN (")
2423
buf.astFormat(n.Rexpr)
2524
buf.WriteString(")")
2625
case A_Expr_Kind_LIKE:
26+
buf.astFormat(n.Lexpr)
2727
buf.WriteString(" LIKE ")
2828
buf.astFormat(n.Rexpr)
29+
case A_Expr_Kind_ILIKE:
30+
buf.astFormat(n.Lexpr)
31+
buf.WriteString(" ILIKE ")
32+
buf.astFormat(n.Rexpr)
33+
case A_Expr_Kind_SIMILAR:
34+
buf.astFormat(n.Lexpr)
35+
buf.WriteString(" SIMILAR TO ")
36+
buf.astFormat(n.Rexpr)
37+
case A_Expr_Kind_BETWEEN:
38+
buf.astFormat(n.Lexpr)
39+
buf.WriteString(" BETWEEN ")
40+
if l, ok := n.Rexpr.(*List); ok && len(l.Items) == 2 {
41+
buf.astFormat(l.Items[0])
42+
buf.WriteString(" AND ")
43+
buf.astFormat(l.Items[1])
44+
}
45+
case A_Expr_Kind_NOT_BETWEEN:
46+
buf.astFormat(n.Lexpr)
47+
buf.WriteString(" NOT BETWEEN ")
48+
if l, ok := n.Rexpr.(*List); ok && len(l.Items) == 2 {
49+
buf.astFormat(l.Items[0])
50+
buf.WriteString(" AND ")
51+
buf.astFormat(l.Items[1])
52+
}
53+
case A_Expr_Kind_DISTINCT:
54+
buf.astFormat(n.Lexpr)
55+
buf.WriteString(" IS DISTINCT FROM ")
56+
buf.astFormat(n.Rexpr)
57+
case A_Expr_Kind_NOT_DISTINCT:
58+
buf.astFormat(n.Lexpr)
59+
buf.WriteString(" IS NOT DISTINCT FROM ")
60+
buf.astFormat(n.Rexpr)
61+
case A_Expr_Kind_NULLIF:
62+
buf.WriteString("NULLIF(")
63+
buf.astFormat(n.Lexpr)
64+
buf.WriteString(", ")
65+
buf.astFormat(n.Rexpr)
66+
buf.WriteString(")")
67+
case A_Expr_Kind_OP:
68+
// Check if this is a named parameter (@name)
69+
opName := ""
70+
if n.Name != nil && len(n.Name.Items) == 1 {
71+
if s, ok := n.Name.Items[0].(*String); ok {
72+
opName = s.Str
73+
}
74+
}
75+
if opName == "@" && !set(n.Lexpr) && set(n.Rexpr) {
76+
// Named parameter: @name (no space after @)
77+
buf.WriteString("@")
78+
buf.astFormat(n.Rexpr)
79+
} else {
80+
// Standard binary operator
81+
if set(n.Lexpr) {
82+
buf.astFormat(n.Lexpr)
83+
buf.WriteString(" ")
84+
}
85+
buf.astFormat(n.Name)
86+
if set(n.Rexpr) {
87+
buf.WriteString(" ")
88+
buf.astFormat(n.Rexpr)
89+
}
90+
}
2991
default:
92+
// Fallback for other cases
93+
if set(n.Lexpr) {
94+
buf.astFormat(n.Lexpr)
95+
buf.WriteString(" ")
96+
}
3097
buf.astFormat(n.Name)
31-
buf.WriteString(" ")
32-
buf.astFormat(n.Rexpr)
98+
if set(n.Rexpr) {
99+
buf.WriteString(" ")
100+
buf.astFormat(n.Rexpr)
101+
}
33102
}
34103
}

internal/sql/ast/a_expr_kind.go

Lines changed: 14 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -3,8 +3,20 @@ package ast
33
type A_Expr_Kind uint
44

55
const (
6-
A_Expr_Kind_IN A_Expr_Kind = 7
7-
A_Expr_Kind_LIKE A_Expr_Kind = 8
6+
A_Expr_Kind_OP A_Expr_Kind = 1
7+
A_Expr_Kind_OP_ANY A_Expr_Kind = 2
8+
A_Expr_Kind_OP_ALL A_Expr_Kind = 3
9+
A_Expr_Kind_DISTINCT A_Expr_Kind = 4
10+
A_Expr_Kind_NOT_DISTINCT A_Expr_Kind = 5
11+
A_Expr_Kind_NULLIF A_Expr_Kind = 6
12+
A_Expr_Kind_IN A_Expr_Kind = 7
13+
A_Expr_Kind_LIKE A_Expr_Kind = 8
14+
A_Expr_Kind_ILIKE A_Expr_Kind = 9
15+
A_Expr_Kind_SIMILAR A_Expr_Kind = 10
16+
A_Expr_Kind_BETWEEN A_Expr_Kind = 11
17+
A_Expr_Kind_NOT_BETWEEN A_Expr_Kind = 12
18+
A_Expr_Kind_BETWEEN_SYM A_Expr_Kind = 13
19+
A_Expr_Kind_NOT_BETWEEN_SYM A_Expr_Kind = 14
820
)
921

1022
func (n *A_Expr_Kind) Pos() int {

0 commit comments

Comments
 (0)