Fix: Support nested struct field filtering (#2628)
Fixes #953
# Rationale for this change
Fixes filtering on nested struct fields when using PyArrow for scan
operations.
## Are these changes tested?
Yes, by the full test suite plus new tests.
## Are there any user-facing changes?
Yes. Filtering a scan on a nested struct field now works.
## Problem
When filtering on nested struct fields (e.g., `parentField.childField ==
'value'`), PyArrow would fail with:
```
ArrowInvalid: No match for FieldRef.Name(childField) in ...
```
The issue occurred because PyArrow requires nested field references as
tuples (e.g., `("parent", "child")`) rather than dotted strings (e.g.,
`"parent.child"`).
## Solution
1. Modified `_ConvertToArrowExpression` to accept an optional `Schema`
parameter
2. Added `_get_field_name()` method that converts dotted field paths to
tuples for nested struct fields
3. Updated `expression_to_pyarrow()` to accept and pass the schema
parameter
4. Updated all call sites to pass the schema when available
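
The conversion in step 2 can be sketched roughly as follows. This is not the PyIceberg implementation: `to_field_path` and the nested-dict schema are hypothetical stand-ins, used only to show why a schema helps decide whether a dotted name is a nested path or a top-level column whose name happens to contain a dot.

```python
from typing import Dict, Tuple, Union

# Hypothetical stand-in for a schema: struct columns map to dicts of their
# children, leaf columns map to a type name.
SchemaTree = Dict[str, Union[str, "SchemaTree"]]


def to_field_path(dotted_name: str, schema: SchemaTree) -> Union[str, Tuple[str, ...]]:
    """Return a tuple path when dotted_name walks through nested structs in
    schema; otherwise return the name unchanged."""
    if dotted_name in schema:
        # A top-level column, possibly with a literal dot in its name.
        return dotted_name
    parts = dotted_name.split(".")
    node: Union[str, SchemaTree] = schema
    for part in parts:
        if not isinstance(node, dict) or part not in node:
            # The path does not resolve against the schema; leave it as-is.
            return dotted_name
        node = node[part]
    return tuple(parts) if len(parts) > 1 else dotted_name


example_schema: SchemaTree = {
    "id": "long",
    "parent": {"child": "string"},
    "a.b": "string",
}
print(to_field_path("parent.child", example_schema))
print(to_field_path("a.b", example_schema))
```

The returned tuple could then be unpacked into a nested PyArrow field reference, while plain strings continue to address top-level columns.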
## Changes
- `pyiceberg/io/pyarrow.py`:
  - Modified `_ConvertToArrowExpression` class to handle nested field
    paths
  - Updated `expression_to_pyarrow()` signature to accept schema
  - Updated `_expression_to_complementary_pyarrow()` signature
- `pyiceberg/table/__init__.py`:
  - Updated call to `_expression_to_complementary_pyarrow()` to pass
    schema
- Tests:
  - Added `test_ref_binding_nested_struct_field()` for comprehensive
    nested field testing
  - Enhanced `test_nested_fields()` with issue #953 scenarios
## Example
```python
# Now works correctly:
table.scan(row_filter="parent.child == 'abc123'").to_polars()
```
The fix converts the field reference from:
- ❌ `FieldRef.Name(run_id)` (fails - field not found)
- ✅ `FieldRef.Nested(FieldRef.Name(mazeMetadata) FieldRef.Name(run_id))`
(works!)
---------
Co-authored-by: Yftach Zur <yftach@atlas.security>
Co-authored-by: Claude <noreply@anthropic.com>