atksh commented Nov 8, 2025

This commit addresses multiple precision and validation issues identified
in the codebase analysis:

1. Input Validation

  • Add NaN/Inf validation to insert() methods for both float32 and float64
  • Ensures consistency with constructor validation
  • Prevents invalid data from entering the tree structure
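
As an illustration of the check described above, here is a minimal Python sketch of rejecting non-finite coordinates before a box is stored. The function name `validate_box` mirrors the C++ helper mentioned later in this PR, but the signature, the flat `(mins..., maxs...)` layout, and the error messages are assumptions made for the example, not the library's actual code.

```python
import math


def validate_box(box):
    """Raise ValueError if any coordinate of `box` is NaN or +/-Inf.

    `box` is assumed to be a flat sequence (min_0, ..., max_0, ...);
    that layout is an assumption of this sketch.
    """
    if len(box) == 0 or len(box) % 2 != 0:
        raise ValueError("box must hold an even, non-zero number of coordinates")
    for position, value in enumerate(box):
        # math.isfinite is False for NaN, +Inf and -Inf.
        if not math.isfinite(value):
            raise ValueError(
                f"box coordinate {position} is not finite: {value!r}"
            )
    return box
```

Running the same check in both the constructor and `insert()` is what keeps the two entry points consistent.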

2. Float64 Insert Support

  • Add float64 overload for insert() method
  • Maintains idx2exact map for dynamically inserted items
  • Preserves double-precision refinement capability for inserted boxes
  • Uses explicit py::overload_cast in Python bindings to handle overloads
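
The two-stage idea behind `idx2exact` can be sketched in plain Python. This is a toy model, not the library's implementation: the class `TwoStageIndex`, its methods, and the linear scan standing in for the tree are all hypothetical, and rounding to the nearest float32 is an assumption about how the conversion behaves.

```python
import struct


def to_float32(x):
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def intersects(a, b):
    """Closed-interval overlap test for 2D boxes (xmin, ymin, xmax, ymax)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


class TwoStageIndex:
    """Toy stand-in for the tree: float32 boxes plus a map of exact boxes."""

    def __init__(self):
        self.coarse = {}      # idx -> float32-rounded box (the "tree" data)
        self.idx2exact = {}   # idx -> original double-precision box

    def insert(self, idx, box):
        self.coarse[idx] = tuple(to_float32(v) for v in box)
        self.idx2exact[idx] = tuple(box)

    def candidates(self, box):
        """Stage 1: coarse float32 test, which may report false positives."""
        q32 = tuple(to_float32(v) for v in box)
        return [i for i, b in self.coarse.items() if intersects(b, q32)]

    def query(self, box):
        """Stage 2: re-check each candidate against the exact doubles."""
        return [i for i in self.candidates(box)
                if intersects(self.idx2exact[i], box)]
```

For example, after `insert(0, (0.0, 0.0, 1.0, 1.0))`, a query box starting at `x = 1.0 + 1e-9` is a coarse candidate because that coordinate rounds to `1.0` in float32, but the exact check rejects it.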

3. Precision Testing

  • Add comprehensive tests for NaN/Inf validation in insert operations
  • Add tests for float64 insert() maintaining precision
  • Add tests verifying rebuild() preserves idx2exact
  • Add systematic precision boundary tests (adjusted for float32 limits)
  • Document float32 precision limitations in test comments

Technical Notes

  • Float64 input is converted to float32 for tree structure
  • Double-precision refinement helps reduce false positives
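  • Precision limits: float32 has a relative precision of about 1.2e-7, so for coordinates near 1.0, gaps below ~1e-7 may not be reliably detected
  • At large magnitudes (e.g., 1e6), absolute precision degrades because the spacing between adjacent float32 values grows with magnitude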
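
To make the limits above concrete, the following sketch measures the spacing between adjacent float32 values using only the standard library. The helper names are invented for this example; nothing here is part of the library.

```python
import struct


def to_float32(x):
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def float32_spacing(x):
    """Gap between float32(x) and the next larger float32.

    Valid for positive, finite x well below the float32 maximum.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    here = struct.unpack("<f", struct.pack("<I", bits))[0]
    # Adding 1 to the bit pattern of a positive float gives its successor.
    above = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return above - here


spacing_near_one = float32_spacing(1.0)      # 2**-23, about 1.19e-7
spacing_near_million = float32_spacing(1e6)  # 2**-4 = 0.0625

# A 0.01 gap at magnitude 1e6 vanishes after conversion to float32.
gap_is_lost = to_float32(1e6 + 0.01) == to_float32(1e6)  # True
```

In other words, two boxes separated by 0.01 around `x = 1e6` are indistinguishable from touching boxes once their coordinates are stored as float32.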

Fixes validation gaps in insert operations and maintains precision
capabilities for dynamically updated trees.


Addresses float32 precision issues arising from the precision validation gaps:
- Add adaptive epsilon calculation that scales with coordinate magnitude
- Add configurable precision parameters (relative_epsilon, absolute_epsilon)
- Implement subnormal number detection in validate_box()
- Update both insert() overloads to use adaptive epsilon
- Add precision control methods (setters/getters) to PRTree class
- Expose precision control to Python via pybind11 bindings
- Add comprehensive test suite for adaptive epsilon behavior

This improves precision handling across different coordinate scales,
from small (< 1.0) to large (> 1e6) magnitudes.
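
One plausible shape for an epsilon that scales with coordinate magnitude is sketched below. The parameter names `relative_epsilon` and `absolute_epsilon` come from this commit message, but the formula, the default values, and the helper `has_float32_subnormal` are assumptions for illustration; the actual C++ code may combine the two parameters differently.

```python
# Smallest positive normal float32 value (FLT_MIN in C).
FLOAT32_MIN_NORMAL = 2.0 ** -126


def adaptive_epsilon(box, relative_epsilon=1e-6, absolute_epsilon=1e-8):
    """Tolerance that grows with the largest coordinate magnitude.

    Assumed formula: max(absolute_epsilon, relative_epsilon * scale).
    The defaults are placeholders, not the library's values.
    """
    scale = max(abs(v) for v in box)
    return max(absolute_epsilon, relative_epsilon * scale)


def has_float32_subnormal(box):
    """True if any non-zero coordinate is below the float32 normal range."""
    return any(0.0 < abs(v) < FLOAT32_MIN_NORMAL for v in box)
```

With these placeholder defaults, a box at the origin gets the absolute floor of `1e-8`, while a box near `1e6` gets a tolerance close to `1.0`, which is the scaling behaviour the bullet list describes.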

Note: The current architecture still forces a float32 tree structure on all
users; float64 data is used only for idx2exact refinement. Future work
should consider separate float32/float64 builds or templating the Real type.

This commit eliminates the complex idx2exact post-processing architecture
and replaces it with native float32/float64 template specialization,
significantly simplifying the codebase and optimizing for each precision level.

Key Changes:
- Templated PRTree with Real type parameter (float or double)
- Removed idx2exact map and refine_candidates() complexity entirely
- Exposed 6 separate C++ classes (_PRTree{2D,3D,4D}_{float32,float64})
- Added automatic dtype-based precision selection in Python wrapper
- Propagated Real template parameter through all detail classes:
  - BB (bounding_box.h)
  - DataType (data_type.h)
  - PRTreeNode, PRTreeLeaf, PRTreeElement (nodes.h)
  - PseudoPRTree, PseudoPRTreeNode (pseudo_tree.h)
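
The dtype-based selection in the Python wrapper could look roughly like the sketch below. Only the class-name pattern `_PRTree{2D,3D,4D}_{float32,float64}` is taken from this PR; the function `select_backend_name` is hypothetical, and raising on any other dtype is an assumption (the real wrapper may instead cast or fall back to a default precision).

```python
SUPPORTED_DIMS = (2, 3, 4)
SUPPORTED_DTYPES = ("float32", "float64")


def select_backend_name(dim, dtype_name):
    """Return the name of the C++ class matching a dimension and dtype.

    `dtype_name` is the string form of the input array's dtype,
    for example str(array.dtype) for a NumPy array.
    """
    if dim not in SUPPORTED_DIMS:
        raise ValueError(f"unsupported dimension: {dim}")
    if dtype_name not in SUPPORTED_DTYPES:
        # Assumption of this sketch: reject rather than silently convert.
        raise ValueError(f"unsupported dtype: {dtype_name}")
    return f"_PRTree{dim}D_{dtype_name}"
```

Choosing the class once, at construction time, means the precision is fixed by the template instantiation rather than by any per-query conversion.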

Benefits:
- Eliminates "strange post-processing" that forced float32 on all users
- Each precision level now uses native types throughout
- Simpler, more maintainable codebase
- Better performance through compile-time type optimization
- Users get the precision they request without conversion overhead

All tests pass with the new architecture.