@parthava-adabala (Contributor, Author)

I thought the fix would be simple, but I ran into several edge cases during testing, so I tried this approach:

  1. Find all unique columns while preserving order.
  2. Iterate through the rows, using the complete column list from step 1 to fill in values, so that both empty and non-empty rows are handled correctly.
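
For readers following along, the two steps can be sketched in plain Python, outside of pandas. This is only an illustration under stated assumptions, not the code in this PR: `rows_to_columns` is a hypothetical helper, rows are assumed to be plain dicts, and missing values are filled with `None`.

```python
def rows_to_columns(data):
    """Hypothetical helper: pivot {index: row_dict} into {column: {index: value}}.

    Also returns the row labels, so that empty rows are not lost even
    when no row contributes a column.
    """
    # Step 1: collect unique columns in first-seen order.
    # A dict is used as an ordered set (insertion order is preserved).
    all_cols = {}
    for row in data.values():
        for col in row:
            all_cols.setdefault(col, None)
    all_cols_list = list(all_cols)

    # Step 2: walk every row with the complete column list, so an empty
    # row still gets an entry (None) under every column.
    new_data = {col: {} for col in all_cols_list}
    for index, row in data.items():
        for col in all_cols_list:
            new_data[col][index] = row.get(col, None)

    return new_data, list(data)


example = {"a": {"x": 1}, "b": {}, "c": {"y": 2, "x": 3}}
columns, index = rows_to_columns(example)
```

Here the empty row `"b"` is kept in `index` and appears with `None` under both `"x"` and `"y"`, and the column order follows first appearance (`"x"` before `"y"`).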

I appreciate any feedback!

@rhshadrach (Member) left a comment

Thanks for the PR! Some questions.


def _from_nested_dict(
    data: Mapping[HashableT, Mapping[HashableT2, T]],
) -> collections.defaultdict[HashableT2, dict[HashableT, T]]:
Member

Why is T -> Any necessary?

        if isinstance(s, (dict, ABCSeries)):
            for col in all_cols_list:
                new_data[col][index] = s.get(col, None)
        elif s is None or is_scalar(s):
Member

For what input is this case hit?

            new_data[col][index] = v
        if isinstance(s, (dict, ABCSeries)):
            for col in all_cols_list:
                new_data[col][index] = s.get(col, None)
Member

For a Series, I believe this will be much slower than the previous implementation. Can you profile this case?
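
One way to profile this is a small `timeit` comparison between per-column `Series.get` lookups and a single pass over `Series.items()`. This is a rough sketch under assumptions: `per_column_get` and `bulk_items` are hypothetical stand-ins for the two loop styles rather than the actual pandas internals, and the Series size and repeat count are arbitrary.

```python
import timeit

import pandas as pd

# Arbitrary test data: one Series acting as a single "row" with 50 columns.
cols = [f"c{i}" for i in range(50)]
s = pd.Series(range(50), index=cols)


def per_column_get(series, columns):
    # Mirrors the new loop: one label lookup per column.
    return {col: series.get(col, None) for col in columns}


def bulk_items(series):
    # Mirrors an items()-style loop: a single pass over the Series.
    return dict(series.items())


t_get = timeit.timeit(lambda: per_column_get(s, cols), number=200)
t_items = timeit.timeit(lambda: bulk_items(s), number=200)
print(f"per-column get: {t_get:.4f}s, items(): {t_items:.4f}s")
```

Both functions build the same mapping for this input, so any timing difference comes from the access pattern. Actual numbers will depend on the pandas version and on the number of rows and columns.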

@rhshadrach added the Constructors, Bug, and IO Data labels and removed the Constructors label on Nov 11, 2025
Labels

Bug, IO Data

Development

Successfully merging this pull request may close these issues.

BUG: DataFrame.from_dict() drops empty rows with orient='index', inconsistently with empty columns
