block reader supports simple OR conditions on the PK column. #22911
Merged
heni02
approved these changes
Nov 19, 2025
XuPeng-SH
approved these changes
Nov 20, 2025
User description
What type of PR is this?
Which issue(s) this PR fixes:
issue #17994
What this PR does / why we need it:
PR Summary
expressions such as (pk = 1 AND pk BETWEEN 5 AND 9) OR pk IN (20, 30) to produce multiple atomic filters.
filters remain single-op.
outputs), and cover (a AND b) OR c plus long OR chains across all supported PK types.
Examples
block overlap is found.
multi-op filters invalid.
PR Type
Enhancement, Tests
Description
- Added OR-aware parsing in `BasePKFilter` that distributes AND over OR, builds disjunctive normal form, and invalidates invalid OR branches
- Introduced `Disjuncts` field to `BasePKFilter` to store multiple atomic filters from OR expressions
- Refactored `ConstructBlockPKFilter` to support disjuncts by extracting search function building into `buildBlockPKSearchFuncs` helper
- Added `combineOffsetFuncs` to merge offset results from multiple disjunctive predicates with deduplication and sorting
- Restricted memory path filters to reject disjuncts, as they only support single atomic predicates
- Added comprehensive unit tests (`TestConstructBasePKFilterWithOr`, `TestConstructBlockPKFilterWithOr`) covering composite OR shapes and multiple data types
- Added extensive BVT test suites for single PK, composite PK, and non-PK tables with OR conditions across 8+ data types (int, uint, double, decimal, date, timestamp, uuid, varchar)
- Tests cover mixed operators (equality, IN, BETWEEN, comparison) and complex OR chains with 8K-row datasets and fault injection
File Walkthrough
2 files
pk_filter.go
Refactor block PK filter to support OR conditions with disjuncts
pkg/vm/engine/readutil/pk_filter.go
- Refactored `ConstructBlockPKFilter` to support OR conditions by extracting search function building into a separate `buildBlockPKSearchFuncs` function
- Added `combineOffsetFuncs` helper to merge offset results from multiple disjunctive predicates using deduplication and sorting
- `Disjuncts` field in `BasePKFilter` to handle multiple atomic filters from OR expressions
- the new `buildBlockPKSearchFuncs` function for better code organization
pk_filter_base.go
Add disjunctive normal form support to base PK filter
pkg/vm/engine/readutil/pk_filter_base.go
- Added `Disjuncts` field to `BasePKFilter` struct to store OR-ed atomic filters
- `ConstructBasePKFilter` to distribute AND over OR and build disjunctive normal form
- `toDisjuncts` helper function to flatten filters into disjunct lists
- instead of failing immediately on invalid ones
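The offset merge described for `pk_filter.go` can be pictured roughly as follows. This is a sketch under stated assumptions, not the PR's `combineOffsetFuncs`: the `searchFunc` type, the `int64` key type, and the helpers `equalTo`, `between`, and `combineOffsets` are all hypothetical names chosen for illustration. Each atomic predicate yields the row offsets it matches within a block, and the combined function returns their sorted, de-duplicated union.

```go
package main

import "sort"

// searchFunc is a hypothetical stand-in for a per-predicate block search:
// given the PK values of a block, it returns the matching row offsets.
type searchFunc func(pks []int64) []int64

// equalTo builds a search function for "pk = v" (illustrative only).
func equalTo(v int64) searchFunc {
	return func(pks []int64) []int64 {
		var out []int64
		for i, pk := range pks {
			if pk == v {
				out = append(out, int64(i))
			}
		}
		return out
	}
}

// between builds a search function for "pk BETWEEN lo AND hi" (illustrative only).
func between(lo, hi int64) searchFunc {
	return func(pks []int64) []int64 {
		var out []int64
		for i, pk := range pks {
			if pk >= lo && pk <= hi {
				out = append(out, int64(i))
			}
		}
		return out
	}
}

// combineOffsets ORs several search functions together: it unions their
// offsets, drops duplicates, and returns the result in ascending order.
func combineOffsets(funcs ...searchFunc) searchFunc {
	return func(pks []int64) []int64 {
		seen := make(map[int64]struct{})
		var merged []int64
		for _, f := range funcs {
			for _, off := range f(pks) {
				if _, dup := seen[off]; !dup {
					seen[off] = struct{}{}
					merged = append(merged, off)
				}
			}
		}
		sort.Slice(merged, func(i, j int) bool { return merged[i] < merged[j] })
		return merged
	}
}
```

For a block holding the keys `1, 5, 7, 9, 20`, combining `equalTo(20)`, `between(5, 9)`, and `equalTo(7)` gives offsets `1, 2, 3, 4`: the row with key 7 matches two disjuncts but is reported once, and the result is sorted even though the disjuncts were evaluated in a different order.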
1 file
pk_filter_mem.go
Restrict memory filters to single atomic predicates
pkg/vm/engine/readutil/pk_filter_mem.go
- memory filters currently only support single atomic predicates
- memory-based filtering
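The restriction on the memory path amounts to a guard of roughly this shape. The `pkFilter` struct and `memFilterFor` function below are hypothetical and much simpler than the real `BasePKFilter`; in particular, treating a disjunctive filter as "no usable filter" (rather than returning an error) is an assumption of this sketch, not something the PR text confirms.

```go
package main

// pkFilter is a hypothetical, simplified filter shape for this sketch.
type pkFilter struct {
	Valid     bool       // false means "no usable filter"
	Op        string     // single operator, e.g. "=", "in", "between"
	Disjuncts []pkFilter // non-empty when the predicate was an OR
}

// memFilterFor returns the filter the in-memory path may use.
// It accepts only a valid, single atomic predicate; anything carrying
// disjuncts is rejected so the caller can skip PK-based pruning.
func memFilterFor(base pkFilter) (pkFilter, bool) {
	if !base.Valid || len(base.Disjuncts) > 0 {
		return pkFilter{}, false
	}
	return base, true
}
```

In this sketch a filter for `pk = 5` passes through unchanged, while a filter built from `pk = 5 OR pk IN (512, 1024)` is rejected and the memory path falls back to reading without a PK filter.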
7 files
filter_test.go
Add comprehensive unit tests for OR condition support
pkg/vm/engine/readutil/filter_test.go
- `TestConstructBasePKFilterWithOr` to test OR-aware parsing with composite OR shapes and multiple data types
- `TestConstructBlockPKFilterWithOr` to verify combined search functions correctly merge offsets from disjunctive predicates
- `Test_ConstructBasePKFilter` to skip OR expressions during validation
- `TestConstructBlockPKFilterWithBloomFilter` with helper functions for flexible result validation
block_or_single_pk.result
Add BVT test results for single PK OR queries
test/distributed/cases/disttae/disttae_filters/reader_filters/block_reader/block_or_single_pk.result
- conditions across multiple data types (int, uint, double, decimal, date, timestamp, uuid, varchar)
- operators, and complex OR chains
- `pk = 5 OR pk BETWEEN 100 AND 120 OR pk IN(512, 1024)` return correct results
block_or_composite_pk.result
Add BVT test results for composite PK OR queries
test/distributed/cases/disttae/disttae_filters/reader_filters/block_reader/block_or_composite_pk.result
- conditions on the first column
- complex OR chains
- with disjunctive predicates
block_or_no_pk.sql
Add BVT test suite for non-PK OR queries
test/distributed/cases/disttae/disttae_filters/reader_filters/block_reader/block_or_no_pk.sql
- filtering works correctly in non-PK scenarios
- mixed operators
- flushing
block_or_single_pk.sql
Single PK OR condition test suite with multi-type coverage
test/distributed/cases/disttae/disttae_filters/reader_filters/block_reader/block_or_single_pk.sql
- injection across 8 data types (int, bigint unsigned, double, decimal, date, timestamp, uuid, varchar)
- including equality, IN clauses, BETWEEN ranges, and comparison operators
- expressions and long OR chains with multiple operators
- block-level filter testing
block_or_no_pk.result
No-PK table OR condition test expected results
test/distributed/cases/disttae/disttae_filters/reader_filters/block_reader/block_or_no_pk.result
- with int and varchar columns
- BETWEEN, and comparison operators
- filtering behavior on non-PK tables
- key constraints
block_or_composite_pk.sql
Composite PK OR condition test suite with multi-column keys
test/distributed/cases/disttae/disttae_filters/reader_filters/block_reader/block_or_composite_pk.sql
- injection across 2 data type combinations (int pairs and varchar pairs)
- PK column with various operators
- long OR chains
- conditions across multiple blocks