Commit dac7c8c

Initial commit
0 parents  commit dac7c8c

11 files changed: +917 -0 lines changed

.github/workflows/ci.yml

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
name: CI

on:
  push:
    branches:
      - master
  pull_request:
    branches:
      - master

jobs:
  test:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Install the latest version of rye
        uses: eifinger/setup-rye@v4

      - name: Install project dependencies
        run: rye sync

      - name: Run tests
        run: rye test

.gitignore

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
# python generated files
__pycache__/
*.py[oc]
build/
dist/
wheels/
*.egg-info

# venv
.venv

.python-version

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
3.12.4

LICENSE

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
Copyright 2024-2025 Rusty Conover <rusty@query.farm> - https://query.farm

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

README.md

Lines changed: 185 additions & 0 deletions
@@ -0,0 +1,185 @@
# Query.Farm SQL Manipulation

A Python library for intelligent SQL predicate manipulation using [SQLGlot](https://sqlglot.com/sqlglot.html). This library provides tools to safely remove specific predicates from SQL `WHERE` clauses and to filter SQL statements based on column availability.

## Features

- **Predicate Removal**: Safely remove specific predicates from complex SQL `WHERE` clauses while preserving logical structure
- **Column Filtering**: Filter SQL statements so that they include only predicates referencing allowed columns
- **Intelligent Logic Handling**: Properly handles `AND`/`OR` logic, nested expressions, `CASE` statements, and parentheses
- **SQLGlot Integration**: Built on top of [SQLGlot](https://sqlglot.com/sqlglot.html) for robust SQL parsing and manipulation
- **Multiple Dialect Support**: Works with various SQL dialects (default: DuckDB)

## Installation

```bash
pip install query-farm-sql-manipulation
```

## Requirements

- Python >= 3.12
- SQLGlot >= 26.33.0

## Quick Start

### Basic Predicate Removal

```python
import sqlglot
from query_farm_sql_manipulation import transforms

# Parse a SQL statement
sql = 'SELECT * FROM data WHERE x = 1 AND y = 2'
statement = sqlglot.parse_one(sql, dialect="duckdb")

# Find the predicate you want to remove
predicates = list(statement.find_all(sqlglot.expressions.Predicate))
target_predicate = predicates[0]  # x = 1

# Remove the predicate
transforms.remove_expression_part(target_predicate)

# Result: SELECT * FROM data WHERE y = 2
print(statement.sql())
```

### Column-Based Filtering

```python
from query_farm_sql_manipulation import transforms

# Filter SQL to only include predicates with allowed columns
# (string literals use single quotes; in DuckDB, double quotes denote identifiers)
sql = "SELECT * FROM data WHERE color = 'red' AND size > 10 AND type = 'car'"
allowed_columns = {"color", "type"}

filtered = transforms.filter_column_references_statement(
    sql=sql,
    allowed_column_names=allowed_columns,
    dialect="duckdb"
)

# Result: SELECT * FROM data WHERE color = 'red' AND type = 'car'
print(filtered.sql())
```
66+
## API Reference
67+
68+
### `remove_expression_part(child: sqlglot.Expression) -> None`
69+
70+
Removes the specified expression from its parent, respecting logical structure.
71+
72+
**Parameters:**
73+
- `child`: The SQLGlot expression to remove
74+
75+
**Raises:**
76+
- `ValueError`: If the expression cannot be safely removed
77+
78+
**Supported Parent Types:**
79+
- `AND`/`OR` expressions: Replaces parent with the remaining operand
80+
- `WHERE` clauses: Removes the entire WHERE clause if it becomes empty
81+
- `Parentheses`: Recursively removes the parent
82+
- `NOT` expressions: Removes the entire NOT expression
83+
- `CASE` statements: Removes conditional branches
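
The parent-type rules above can be illustrated with a small standalone sketch. It does not use SQLGlot and is not this library's implementation: `Node`, `remove`, and `render` are hypothetical names invented for the illustration, connectives are assumed to be binary, and `CASE` handling is left out.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """Hypothetical toy node (not SQLGlot's Expression class)."""

    kind: str  # "where", "and", "or", "not", "paren" or "pred"
    text: str = ""  # predicate text, used only when kind == "pred"
    children: list[Node] = field(default_factory=list)
    parent: Node | None = field(default=None, repr=False)

    def __post_init__(self) -> None:
        for child in self.children:
            child.parent = self


def remove(node: Node) -> None:
    """Detach `node`, collapsing its parents as the list above describes."""
    parent = node.parent
    if parent is None:
        raise ValueError("cannot remove the root node")
    if parent.kind in ("and", "or"):
        # The connective is replaced by its remaining operand.
        (sibling,) = [c for c in parent.children if c is not node]
        grandparent = parent.parent
        if grandparent is None:
            raise ValueError("connective has no parent to re-attach to")
        grandparent.children[grandparent.children.index(parent)] = sibling
        sibling.parent = grandparent
    elif parent.kind in ("not", "paren"):
        # Nothing would be left inside the wrapper, so remove it as well.
        remove(parent)
    elif parent.kind == "where":
        # The clause is now empty; render() emits nothing for it.
        parent.children.clear()
    else:
        raise ValueError(f"cannot remove a child of {parent.kind!r}")


def render(node: Node) -> str:
    """Turn the toy tree back into text."""
    if node.kind == "pred":
        return node.text
    if node.kind == "where":
        return ("WHERE " + render(node.children[0])) if node.children else ""
    if node.kind == "paren":
        return "(" + render(node.children[0]) + ")"
    if node.kind == "not":
        return "NOT " + render(node.children[0])
    return f" {node.kind.upper()} ".join(render(c) for c in node.children)


# WHERE x = 1 AND y = 2, removing x = 1
x, y = Node("pred", "x = 1"), Node("pred", "y = 2")
where_and = Node("where", children=[Node("and", children=[x, y])])
remove(x)
print(render(where_and))  # WHERE y = 2

# WHERE NOT (a = 1) OR z = 3, removing a = 1
a, z = Node("pred", "a = 1"), Node("pred", "z = 3")
negated = Node("not", children=[Node("paren", children=[a])])
where_or = Node("where", children=[Node("or", children=[negated, z])])
remove(a)
print(render(where_or))  # WHERE z = 3

# WHERE x = 1, removing the only predicate empties the clause
only = Node("pred", "x = 1")
where_single = Node("where", children=[only])
remove(only)
print(repr(render(where_single)))  # ''
```

In the second case the removal climbs through the parentheses and the `NOT` before the `OR` collapses to its other operand, which is the recursive behaviour the list describes.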

### `filter_column_references_statement(*, sql: str, allowed_column_names: Container[str], dialect: str = "duckdb") -> sqlglot.Expression`

Filters a SQL statement to remove predicates containing columns not in the allowed set.

**Parameters:**
- `sql`: The SQL statement to filter
- `allowed_column_names`: Container of column names that should be preserved
- `dialect`: SQL dialect for parsing (default: "duckdb")

**Returns:**
- Filtered SQLGlot expression with predicates on non-allowed columns removed

**Raises:**
- `ValueError`: If a column can't be cleanly removed due to interactions with allowed columns
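
As a rough illustration of the filtering rule, the sketch below applies it to a flat `AND` chain only. It is not the library's code: `Predicate`, `keep_allowed`, and `render_where` are hypothetical names, each predicate carries its column set explicitly instead of being parsed, and the `ValueError` case is not modelled.

```python
from collections.abc import Container
from dataclasses import dataclass


@dataclass(frozen=True)
class Predicate:
    """Hypothetical stand-in for one conjunct of a WHERE clause."""

    text: str
    columns: frozenset[str]


def keep_allowed(conjuncts: list[Predicate], allowed: Container[str]) -> list[Predicate]:
    """Keep only the conjuncts whose columns are all in `allowed`."""
    return [p for p in conjuncts if all(c in allowed for c in p.columns)]


def render_where(conjuncts: list[Predicate]) -> str:
    """Join the surviving conjuncts; an empty list yields no WHERE clause."""
    if not conjuncts:
        return ""
    return "WHERE " + " AND ".join(p.text for p in conjuncts)


conjuncts = [
    Predicate("color = 'red'", frozenset({"color"})),
    Predicate("size > 10", frozenset({"size"})),
    Predicate("type = 'car'", frozenset({"type"})),
]

kept = keep_allowed(conjuncts, {"color", "type"})
print(render_where(kept))  # WHERE color = 'red' AND type = 'car'
```

Because `allowed` is only tested with `in`, any `Container[str]` works here, for example a set, a frozenset, or the keys of a dict.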

## Examples

### Complex Logic Handling

The library intelligently handles complex logical expressions:

```python
# Original: (x = 1 AND y = 2) OR z = 3
# Remove y = 2: x = 1 OR z = 3

# Original: NOT (x = 1 AND y = 2)
# Remove x = 1: NOT y = 2 (which becomes y <> 2)

# Original: CASE WHEN x = 1 THEN 'yes' WHEN x = 2 THEN 'maybe' ELSE 'no' END
# Remove x = 1: CASE WHEN x = 2 THEN 'maybe' ELSE 'no' END
```

### Column Filtering with Complex Expressions

```python
sql = '''
SELECT * FROM users
WHERE age > 18
AND (status = 'active' OR role = 'admin')
AND department IN ('engineering', 'sales')
'''

# Only keep predicates involving 'age' and 'role'
allowed_columns = {'age', 'role'}

result = transforms.filter_column_references_statement(
    sql=sql,
    allowed_column_names=allowed_columns
)

# Result: SELECT * FROM users WHERE age > 18 AND role = 'admin'
```

### Error Handling

The library raises `ValueError` when a predicate cannot be safely removed:

```python
# This will raise ValueError because x = 1 is part of a larger expression
sql = "SELECT * FROM data WHERE result = (x = 1)"
# Cannot remove x = 1 because it's used as a value, not a predicate
```
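
One way a caller might deal with this error is to fall back to the unmodified SQL when a rewrite is refused. The sketch below shows that policy in isolation. `rewrite_or_keep` and `fake_rewrite` are hypothetical names. `fake_rewrite` is only a stand-in for a real call such as `transforms.filter_column_references_statement(...).sql()`, and its string matching is not how the library decides anything.

```python
from collections.abc import Callable


def rewrite_or_keep(sql: str, rewrite: Callable[[str], str]) -> tuple[str, bool]:
    """Apply `rewrite`; if it raises ValueError, keep the original SQL.

    Returns the SQL to use and a flag saying whether it was rewritten.
    """
    try:
        return rewrite(sql), True
    except ValueError:
        return sql, False


def fake_rewrite(sql: str) -> str:
    """Hypothetical stand-in for a library call that may raise ValueError."""
    if "= (" in sql:
        raise ValueError("predicate is used as a value and cannot be removed")
    return sql.replace(" AND y = 2", "")


print(rewrite_or_keep("SELECT * FROM data WHERE x = 1 AND y = 2", fake_rewrite))
# ('SELECT * FROM data WHERE x = 1', True)

print(rewrite_or_keep("SELECT * FROM data WHERE result = (x = 1)", fake_rewrite))
# ('SELECT * FROM data WHERE result = (x = 1)', False)
```

Whether falling back is acceptable depends on the use case. A caller that must never send disallowed columns downstream would more likely let the exception propagate.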

## Supported SQL Constructs

- **Logical Operators**: AND, OR, NOT
- **Comparison Operators**: =, <>, <, >, <=, >=, LIKE, IN, IS NULL, etc.
- **Complex Expressions**: CASE statements, subqueries, function calls
- **Nested Logic**: Parentheses and nested boolean expressions
- **Multiple Dialects**: DuckDB, PostgreSQL, MySQL, SQLite, and more via SQLGlot

## Testing

Run the test suite:

```bash
pytest src/query_farm_sql_manipulation/test_transforms.py
```

The test suite includes comprehensive examples of:
- Basic predicate removal scenarios
- Complex logical expression handling
- Error cases and edge conditions
- Column filtering with various SQL constructs

## Contributing

This project uses:
- **Rye** for dependency management
- **pytest** for testing
- **mypy** for type checking
- **ruff** for linting


## Author

This Python module was created by [Query.Farm](https://query.farm).

## License

MIT Licensed.

pyproject.toml

Lines changed: 96 additions & 0 deletions
@@ -0,0 +1,96 @@
[project]
name = "query-farm-sql-manipulation"
version = "0.1.0"
description = "Add your description here"
authors = [
    { name = "Rusty Conover", email = "rusty@query.farm" }
]
dependencies = [
    "sqlglot>=26.33.0",
]
readme = "README.md"
requires-python = ">= 3.12"
keywords = ["sql", "predicates", "predicate parsing", "sql parsing"]
classifiers = [
    "Development Status :: 4 - Beta",
    "Intended Audience :: Developers",
    "Topic :: Database",
    "Topic :: Database :: Database Engines/Servers",
    "Programming Language :: Python :: 3.12"
]

[project.urls]
Repository = "https://github.com/query-farm/python-sql-manipulation.git"
Issues = "https://github.com/query-farm/python-sql-manipulation/issues"


[build-system]
requires = ["hatchling==1.26.3", "hatch-vcs"]
build-backend = "hatchling.build"


dev-dependencies = [
    "pytest>=8.3.2",
    "pytest-mypy>=0.10.3",
    "pytest-env>=1.1.3",
    "pytest-cov>=5.0.0",
    "ruff>=0.6.2",
]

[tool.rye]
dev-dependencies = [
    "pytest>=8.4.1",
    "mypy>=1.16.1",
    "lxml>=6.0.0",
]

[tool.rye.include]
files = ["py.typed"]


[tool.hatch.metadata]
allow-direct-references = true

[tool.hatch.build.targets.wheel]
packages = ["src/query_farm_sql_manipulation"]


[tool.pytest]

[tool.pytest.ini_options]


[tool.mypy]
ignore_missing_imports = true


follow_imports = "silent"
warn_redundant_casts = true
warn_unused_ignores = true
disallow_any_generics = true
check_untyped_defs = true
no_implicit_reexport = true

# for strict mypy: (this is the tricky one :-))
disallow_untyped_defs = true


[tool.ruff]
line-length = 100

[tool.ruff.lint]
select = [
    # pycodestyle
    "E",
    # Pyflakes
    "F",
    # pyupgrade
    "UP",
    # flake8-bugbear
    "B",
    # flake8-simplify
    "SIM",
    # isort
    "I",
]
ignore = ['E501']

requirements-dev.lock

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
# generated by rye
# use `rye lock` or `rye sync` to update this lockfile
#
# last locked with the following flags:
#   pre: false
#   features: []
#   all-features: false
#   with-sources: false
#   generate-hashes: false
#   universal: false

-e file:.
iniconfig==2.1.0
    # via pytest
lxml==6.0.0
mypy==1.16.1
mypy-extensions==1.1.0
    # via mypy
packaging==25.0
    # via pytest
pathspec==0.12.1
    # via mypy
pluggy==1.6.0
    # via pytest
pygments==2.19.2
    # via pytest
pytest==8.4.1
sqlglot==26.33.0
    # via query-farm-sql-manipulation
typing-extensions==4.14.1
    # via mypy

requirements.lock

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
# generated by rye
# use `rye lock` or `rye sync` to update this lockfile
#
# last locked with the following flags:
#   pre: false
#   features: []
#   all-features: false
#   with-sources: false
#   generate-hashes: false
#   universal: false

-e file:.
sqlglot==26.33.0
    # via query-farm-sql-manipulation
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
def hello() -> str:
    return "Hello from query-farm-sql-manipulation!"
