Open
Labels
Bug, Needs Tests (unit test(s) needed to prevent regressions), combine/combine_first/update (NDFrame.combine, combine_first, update), good first issue
Description
Code to reproduce
import numpy as np
import pandas as pd
df_int = pd.DataFrame(
{'col': ['foo', 'bar', np.nan]},
index=[1,2,3]
)
df_obj = pd.DataFrame(
{'col': [np.nan, np.nan, 'baz']},
index=['1', '2', '3']
)
print(df_int)
print(df_obj)
# >>>
# col
# 1 foo
# 2 bar
# 3 NaN
# col
# 1 NaN
# 2 NaN
# 3 baz
# Note that the indices appear identical but have different dtypes (integer labels vs. string labels)
df_int.update(df_obj)
print(df_int)
# Intended output
# >>>
# col
# 1 foo
# 2 bar
# 3 baz
# Actual output
# >>>
# col
# 1 foo
# 2 bar
# 3 NaN
Problem description
Because update aligns on index labels, two DataFrames whose indices have different dtypes may produce no matches at all, even when the user expects them to match, and the user gets no feedback that this has happened. This is particularly surprising when the indices appear identical, as in the example above. A warning should be raised that either:
- tells the user that the indices are not the same type, which may produce unintended results, or
- states that a comparison between types is taking place that will never produce any matches.
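As a rough illustration of the first option, the check could look something like the sketch below. The helper name `index_dtypes_differ` and the warning text are hypothetical and are not part of pandas; the sketch only compares the two index dtypes using `pandas.api.types.is_dtype_equal` and does not call `update` itself.

```python
import warnings

import pandas as pd
from pandas.api.types import is_dtype_equal


def index_dtypes_differ(target: pd.DataFrame, other: pd.DataFrame) -> bool:
    """Hypothetical helper: warn and return True if the index dtypes differ."""
    if is_dtype_equal(target.index.dtype, other.index.dtype):
        return False
    warnings.warn(
        f"Index dtypes differ ({target.index.dtype} vs. {other.index.dtype}); "
        "update() aligns on index labels, so labels may not match.",
        UserWarning,
        stacklevel=2,
    )
    return True


# Same frames as in the report: integer labels vs. string labels.
df_int = pd.DataFrame({"col": ["foo", "bar", None]}, index=[1, 2, 3])
df_obj = pd.DataFrame({"col": [None, None, "baz"]}, index=["1", "2", "3"])

# Record warnings so the result can be inspected programmatically.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    mismatch = index_dtypes_differ(df_int, df_obj)

print(mismatch)
print([str(w.message) for w in caught])
```

A user hitting this today could run a check like this before calling `update`, and cast one index to the other's dtype if the labels are meant to refer to the same rows.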