@Chiwendaiyue
Contributor

This PR makes resample operations on a timezone-aware DatetimeIndex more robust across DST transitions, especially in timezones with DST gaps such as Africa/Cairo (#62601).
What this PR does

  1. Fixes NonExistentTimeError triggered by internally generated bin edges
    During resampling, pandas builds bin edges internally with date_range.
    Around some DST transitions (e.g., Africa/Cairo on 2024-04-26, when local clocks skip from 00:00 to 01:00), a generated edge can land on a nonexistent local time and raise NonExistentTimeError, even though the user's data contains no such timestamp.

This PR resolves this by:

  • Integrating the UTC fallback path directly into _get_time_bins
  • Automatically retrying range generation in UTC when encountering a nonexistent time

  2. Adds a new DST test suite
    A new file, pandas/tests/resample/test_dst_handling.py, is added.
    It includes:
  • Africa/Cairo DST transition tests
  • Before-DST and after-DST boundary conditions
  • Tests for nonexistent times & shifted times
  • Cross-timezone DST behavior (London, New York, Cairo)
  • Edge cases: NA values, microseconds, single-point, empty index
  • Parametrized frequency tests (2h, 6h, 12h, D)
  • Tests matching original issue reports

These tests exercise only the public API and do not depend on internal implementation details.

No performance tests are added to the pytest suite; benchmarks belong in asv.
