fix: preserve byte-size statistics in AggregateExec (#18885)
Previously, AggregateExec discarded total_byte_size statistics
(returning Precision::Absent) across aggregation operations, preventing
the optimizer from making informed decisions about memory allocation and
execution strategies (join side selection -> dynamic filters).
This commit implements proportional byte-size scaling based on row count
ratios:
- Added calculate_scaled_byte_size helper with inline optimization
- Scales byte size for Final/FinalPartitioned without GROUP BY
- Scales byte size proportionally for all other aggregation modes
- Always returns Precision::Inexact for estimates (semantically correct)
- Returns Precision::Absent when input statistics are insufficient
Added test coverage for edge cases (absent statistics, zero rows).
## Which issue does this PR close?
- Closes #18850
## Rationale for this change
Without byte-size statistics, the optimizer cannot estimate memory
requirements for join-side selection, dynamic filter generation, and
memory allocation decisions. This preserves statistics using
proportional scaling (bytes_per_row × output_rows).
## What changes are included in this PR?
1. Modified `statistics_inner` to calculate proportional byte size
instead of returning `Precision::Absent`
2. Added `calculate_scaled_byte_size` helper (inline optimized, guards
against division by zero)
3. Updated test assertions and added edge case coverage
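
The proportional scaling described above can be sketched roughly as follows. This is an illustrative approximation, not the code from this commit: the local `Precision` enum is a simplified stand-in for DataFusion's own type, and the function signature, the `value` accessor, and the choice to round up with `ceil` are assumptions made for the example.

```rust
/// Simplified local stand-in for DataFusion's `Precision` type,
/// defined here only so the sketch compiles on its own.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Precision {
    Exact(usize),
    Inexact(usize),
    Absent,
}

impl Precision {
    /// Hypothetical accessor: the known value, if any.
    fn value(&self) -> Option<usize> {
        match self {
            Precision::Exact(v) | Precision::Inexact(v) => Some(*v),
            Precision::Absent => None,
        }
    }
}

/// Estimate output byte size as bytes_per_row * output_rows.
/// Hypothetical signature; the real helper may take different arguments.
#[inline]
fn calculate_scaled_byte_size(
    input_rows: Precision,
    input_bytes: Precision,
    output_rows: usize,
) -> Precision {
    match (input_rows.value(), input_bytes.value()) {
        // The `rows > 0` guard avoids dividing by zero.
        (Some(rows), Some(bytes)) if rows > 0 => {
            let bytes_per_row = bytes as f64 / rows as f64;
            // Rounding up is an assumption of this sketch.
            let scaled = (bytes_per_row * output_rows as f64).ceil() as usize;
            // The result is an estimate, so it is never reported as Exact.
            Precision::Inexact(scaled)
        }
        // Missing row count or byte size, or zero input rows.
        _ => Precision::Absent,
    }
}
```

Under these assumptions, an input of 100 rows and 1000 bytes aggregated down to 10 output rows yields `Precision::Inexact(100)`, while an absent row count or a zero-row input yields `Precision::Absent`.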
## Are these changes tested?
Yes:
- New `test_aggregate_statistics_edge_cases` covers edge-case scenarios
- Existing tests confirm stats propagate correctly through the
aggregation pipeline
## Are there any user-facing changes?
No breaking changes.
Internal optimization that may improve query planning and provide more
accurate memory estimates in EXPLAIN output.
Co-authored-by: Daniël Heres <danielheres@gmail.com>