@ali-aqib ali-aqib commented Oct 23, 2025

Closes #49352

When numeric_only=True is used with a list of aggregation functions in DataFrame.agg() or GroupBy.agg(), non-numeric columns cause a TypeError because filtering isn't applied before the functions run.

This PR filters to numeric columns before applying the list of functions when numeric_only=True is specified.


Changes

  • Modified NDFrameApply and GroupByApply in pandas/core/apply.py to filter numeric columns before aggregation
  • Added tests in test_frame_apply_numeric_only.py and test_aggregate_numeric_only.py
  • Updated whatsnew for v3.0.0

Example

import pandas as pd

df = pd.DataFrame({
    'key': ['A', 'B', 'A', 'B'],
    'num': [1, 2, 3, 4],
    'text': ['a', 'b', 'c', 'd']
})

# With this PR, both calls work when numeric_only=True is passed
result1 = df.agg(['sum', 'mean'], numeric_only=True)
result2 = df.groupby('key').agg(['sum', 'mean'], numeric_only=True)
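
For comparison, the filtering described above can be approximated from user code by selecting the numeric columns explicitly before passing the list of functions. This is only a sketch of the idea, not the code in this PR; `select_dtypes` and column selection on a groupby are existing pandas APIs, while the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "key": ["A", "B", "A", "B"],
    "num": [1, 2, 3, 4],
    "text": ["a", "b", "c", "d"],
})

# Keep only numeric columns, then apply the list of functions.
numeric = df.select_dtypes(include="number")
frame_result = numeric.agg(["sum", "mean"])

# For groupby, select the numeric columns on the grouped object instead,
# so the grouping key (a non-numeric column here) stays available.
numeric_cols = list(numeric.columns)
group_result = df.groupby("key")[numeric_cols].agg(["sum", "mean"])

print(frame_result)
print(group_result)
```

Both calls avoid the non-numeric `text` column entirely, which is the effect the PR aims to get from `numeric_only=True`.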

@mroeschke mroeschke requested a review from rhshadrach October 23, 2025 16:03
@rhshadrach
Member

Thanks for the PR. The text in the OP here is very long and appears to be AI generated. It also contains a lot of information that is entirely redundant. Please do not do this - it is mostly noise and very little signal. A short summary of the PR and any additional details or context that are not found elsewhere would be appropriate.

@ali-aqib
Author

Thanks for the feedback.
I've updated the description to be more concise.

Member

@rhshadrach rhshadrach left a comment

Thanks for the PR!

# GH#49352 - Handle numeric_only with list of functions
# When numeric_only=True is passed with a list of functions, filter
# to numeric columns before processing to avoid TypeError on non-numeric Series
if op_name == "agg" and kwargs.get("numeric_only", False):
Member

Only two of the five paths in DataFrameGroupBy.aggregate hit this code, which introduces inconsistencies. To accept a PR, we would need to do this consistently throughout the op. Take a look at the code in pandas.core.groupby.generic.
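
To make the consistency concern concrete, here is a rough sketch of several call forms that `DataFrameGroupBy.agg` accepts today. The thread does not enumerate the five internal paths, so this is not a mapping onto them. It only illustrates, as an assumption about what "consistent" would mean for users, that the same numeric result should come back whichever form is used. The frame and the `results` dictionary are illustrative, and the numeric column is selected up front so every call runs on current pandas.

```python
import pandas as pd

df = pd.DataFrame({
    "key": ["A", "B", "A", "B"],
    "num": [1, 2, 3, 4],
    "text": ["a", "b", "c", "d"],
})

# Select the numeric column explicitly so no call depends on numeric_only.
grouped = df.groupby("key")[["num"]]

# Several ways of asking for the same per-group sum.
results = {
    "string": grouped.agg("sum"),
    "list": grouped.agg(["sum", "mean"]),
    "dict": grouped.agg({"num": "sum"}),
    "named": grouped.agg(total=("num", "sum")),
    "callable": grouped.agg(lambda s: s.sum()),
}

for form, result in results.items():
    print(form)
    print(result)
```

A fix that handles `numeric_only=True` for only some of these forms would leave the others behaving differently on frames with non-numeric columns.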

Author

You're right, I need to ensure the filtering is applied consistently across all code paths in DataFrameGroupBy.aggregate.
I'll update the PR.


Development

Successfully merging this pull request may close these issues.

BUG: inconsistent DataFrame.agg behavoir when passing as kwargs numeric_only=True

2 participants