BUG: Arrow duration reduction consistency #63170

@rhshadrach

Calling .sum() on a duration[ns][pyarrow] Series returns a pandas.Timedelta, but every other unit (us, ms, s) returns a datetime.timedelta, even though element access returns a Timedelta for all units.

import datetime as dt

import pandas as pd

ser = pd.Series([dt.timedelta(seconds=1)], dtype="duration[ns][pyarrow]")
print(type(ser.iloc[0]), type(ser.sum()))
# <class 'pandas.Timedelta'> <class 'pandas.Timedelta'>

ser = pd.Series([dt.timedelta(seconds=1)], dtype="duration[us][pyarrow]")
print(type(ser.iloc[0]), type(ser.sum()))
# <class 'pandas.Timedelta'> <class 'datetime.timedelta'>

ser = pd.Series([dt.timedelta(seconds=1)], dtype="duration[ms][pyarrow]")
print(type(ser.iloc[0]), type(ser.sum()))
# <class 'pandas.Timedelta'> <class 'datetime.timedelta'>

ser = pd.Series([dt.timedelta(seconds=1)], dtype="duration[s][pyarrow]")
print(type(ser.iloc[0]), type(ser.sum()))
# <class 'pandas.Timedelta'> <class 'datetime.timedelta'>

I suspect we want every sum to return a Timedelta, regardless of unit. I haven't investigated other reductions, but I suspect they behave similarly.

cc @jbrockmendel
