@allamlobna allamlobna commented Oct 10, 2025

Summary

This PR fixes an inconsistency where read_csv replaced empty
MultiIndex column level values with automatically generated
"Unnamed: x_level_y" placeholders.

Empty values are now preserved as empty strings (""),
matching how empty values in a MultiIndex row index are handled
and giving consistent roundtrip results between to_csv and read_csv.

Changes

  • Added _clean_column_levels() helper to normalize MultiIndex
    column labels in BaseParser.
  • Updated _extract_multi_indexer_columns() to use it.
  • Adjusted test_multi_index_unnamed() expectations.
  • Added regression test for GH#59560 in test_header.py.
  • Added whatsnew entry under Bug Fixes → IO.
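
To make the intent of the normalization step concrete, here is a minimal, pandas-free sketch. It is not the code from this PR: the function name `clean_column_levels`, the regular expression, and the choice to operate on plain tuples are all assumptions made for illustration, and the actual `_clean_column_levels()` helper in BaseParser may work differently.

```python
import re

# Assumed shape of the auto-generated placeholders, e.g. "Unnamed: 0_level_1".
_PLACEHOLDER = re.compile(r"^Unnamed: \d+_level_\d+$")


def clean_column_levels(columns):
    """Hypothetical helper: replace placeholder labels in multi-level
    column tuples with empty strings."""
    cleaned = []
    for col in columns:
        if isinstance(col, tuple):
            # Multi-level label: blank out each level that is a placeholder.
            cleaned.append(
                tuple(
                    "" if isinstance(label, str) and _PLACEHOLDER.match(label) else label
                    for label in col
                )
            )
        else:
            # Single-level label: left untouched.
            cleaned.append(col)
    return cleaned


columns = [("a", "Unnamed: 0_level_1"), ("b", "x")]
print(clean_column_levels(columns))  # [('a', ''), ('b', 'x')]
```

Single-level labels pass through unchanged in this sketch, which mirrors the stated goal of not affecting single-level headers.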

Impact

  • Aligns column and index behavior for MultiIndex CSVs.
  • No change for single-level headers.
  • Both C and Python parsers tested successfully.

@allamlobna allamlobna force-pushed the bugfix/clean-empty-vals-multiindex branch from 6a40ca3 to a399a8e Compare October 10, 2025 19:08
@allamlobna allamlobna marked this pull request as draft October 10, 2025 20:09

Development

Successfully merging this pull request may close these issues.

BUG: inconsistency when read_csv reads MultiIndex with empty values