
Commit f0886de

Merge remote-tracking branch 'upstream/main' into aijams-take-function-invalid-dtype
2 parents 8dcf8b2 + 8e0ae73 commit f0886de

47 files changed: +369 -222 lines changed

doc/make.py

Lines changed: 1 addition & 1 deletion
@@ -105,7 +105,7 @@ def _process_single_doc(self, single_doc):
     @staticmethod
     def _run_os(*args) -> None:
         """
-        Execute a command as a OS terminal.
+        Execute a command as an OS terminal.
 
         Parameters
         ----------

doc/source/whatsnew/v3.0.0.rst

Lines changed: 3 additions & 0 deletions
@@ -939,6 +939,7 @@ Performance improvements
 - Performance improvement in :meth:`RangeIndex.reindex` returning a :class:`RangeIndex` instead of a :class:`Index` when possible. (:issue:`57647`, :issue:`57752`)
 - Performance improvement in :meth:`RangeIndex.take` returning a :class:`RangeIndex` instead of a :class:`Index` when possible. (:issue:`57445`, :issue:`57752`)
 - Performance improvement in :func:`merge` if hash-join can be used (:issue:`57970`)
+- Performance improvement in :func:`merge` when join keys have different dtypes and need to be upcast (:issue:`62902`)
 - Performance improvement in :meth:`CategoricalDtype.update_dtype` when ``dtype`` is a :class:`CategoricalDtype` with non ``None`` categories and ordered (:issue:`59647`)
 - Performance improvement in :meth:`DataFrame.__getitem__` when ``key`` is a :class:`DataFrame` with many columns (:issue:`61010`)
 - Performance improvement in :meth:`DataFrame.astype` when converting to extension floating dtypes, e.g. "Float64" (:issue:`60066`)
@@ -1176,6 +1177,7 @@ Groupby/resample/rolling
 
 Reshaping
 ^^^^^^^^^
+- Bug in :func:`concat` with mixed integer and bool dtypes incorrectly casting the bools to integers (:issue:`45101`)
 - Bug in :func:`qcut` where values at the quantile boundaries could be incorrectly assigned (:issue:`59355`)
 - Bug in :meth:`DataFrame.combine_first` not preserving the column order (:issue:`60427`)
 - Bug in :meth:`DataFrame.explode` producing incorrect result for :class:`pyarrow.large_list` type (:issue:`61091`)
@@ -1235,6 +1237,7 @@ Other
 - Bug in :meth:`DataFrame.query` where using duplicate column names led to a ``TypeError``. (:issue:`59950`)
 - Bug in :meth:`DataFrame.query` which raised an exception or produced incorrect results when expressions contained backtick-quoted column names containing the hash character ``#``, backticks, or characters that fall outside the ASCII range (U+0001..U+007F). (:issue:`59285`) (:issue:`49633`)
 - Bug in :meth:`DataFrame.query` which raised an exception when querying integer column names using backticks. (:issue:`60494`)
+- Bug in :meth:`DataFrame.rename` and :meth:`Series.rename` when passed a ``mapper``, ``index``, or ``columns`` argument that is a :class:`Series` with non-unique ``ser.index`` producing a corrupted result instead of raising ``ValueError`` (:issue:`58621`)
 - Bug in :meth:`DataFrame.sample` with ``replace=False`` and ``(n * max(weights) / sum(weights)) > 1``, the method would return biased results. Now raises ``ValueError``. (:issue:`61516`)
 - Bug in :meth:`DataFrame.shift` where passing a ``freq`` on a DataFrame with no columns did not shift the index correctly. (:issue:`60102`)
 - Bug in :meth:`DataFrame.sort_index` when passing ``axis="columns"`` and ``ignore_index=True`` and ``ascending=False`` not returning a :class:`RangeIndex` columns (:issue:`57293`)

pandas/_config/localization.py

Lines changed: 1 addition & 1 deletion
@@ -79,7 +79,7 @@ def can_set_locale(lc: str, lc_var: int = locale.LC_ALL) -> bool:
         with set_locale(lc, lc_var=lc_var):
             pass
     except (ValueError, locale.Error):
-        # horrible name for a Exception subclass
+        # horrible name for an Exception subclass
         return False
     else:
         return True

pandas/core/arrays/categorical.py

Lines changed: 3 additions & 3 deletions
@@ -269,7 +269,7 @@ class Categorical(NDArrayBackedExtensionArray, PandasObject, ObjectStringArrayMi
         categories are assumed to be the unique values of `values` (sorted, if
         possible, otherwise in the order in which they appear).
     ordered : bool, default False
-        Whether or not this categorical is treated as a ordered categorical.
+        Whether or not this categorical is treated as an ordered categorical.
         If True, the resulting categorical will be ordered.
         An ordered categorical respects, when sorted, the order of its
         `categories` attribute (which in turn is the `categories` argument, if
@@ -1103,7 +1103,7 @@ def set_categories(
         new_categories : Index-like
             The categories in new order.
         ordered : bool, default None
-            Whether or not the categorical is treated as a ordered categorical.
+            Whether or not the categorical is treated as an ordered categorical.
             If not given, do not change the ordered information.
         rename : bool, default False
             Whether or not the new_categories should be considered as a rename
@@ -1277,7 +1277,7 @@ def reorder_categories(self, new_categories, ordered=None) -> Self:
         new_categories : Index-like
             The categories in new order.
         ordered : bool, optional
-            Whether or not the categorical is treated as a ordered categorical.
+            Whether or not the categorical is treated as an ordered categorical.
             If not given, do not change the ordered information.
 
         Returns

pandas/core/arrays/timedeltas.py

Lines changed: 1 addition & 1 deletion
@@ -1212,7 +1212,7 @@ def _objects_to_td64ns(
     data, unit=None, errors: DateTimeErrorChoices = "raise"
 ) -> np.ndarray:
     """
-    Convert a object-dtyped or string-dtyped array into an
+    Convert an object-dtyped or string-dtyped array into a
     timedelta64[ns]-dtyped array.
 
     Parameters

pandas/core/dtypes/base.py

Lines changed: 1 addition & 1 deletion
@@ -517,7 +517,7 @@ class Registry:
     """
     Registry for dtype inference.
 
-    The registry allows one to map a string repr of a extension
+    The registry allows one to map a string repr of an extension
     dtype to an extension dtype. The string alias can be used in several
     places, including
 

pandas/core/dtypes/concat.py

Lines changed: 4 additions & 0 deletions
@@ -161,6 +161,10 @@ def _get_result_dtype(
         # coerce to object
         target_dtype = np.dtype(object)
         kinds = {"o"}
+    elif "b" in kinds and len(kinds) > 1:
+        # GH#21108, GH#45101
+        target_dtype = np.dtype(object)
+        kinds = {"o"}
     else:
         # error: Argument 1 to "np_find_common_type" has incompatible type
         # "*Set[Union[ExtensionDtype, Any]]"; expected "dtype[Any]"
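
Note on the hunk above: the new `elif` branch sends any mix of bool with another dtype kind to `object`, which is what the whatsnew entry for GH#45101 describes (bools no longer cast to integers in `concat`). The rule can be sketched in isolation as below. This is a simplified illustration, not the pandas implementation: `pick_target_kind` and the `"promote"` return value are hypothetical names, and `kinds` is assumed to hold NumPy-style dtype kind codes such as `"b"`, `"i"`, and `"f"`.

```python
def pick_target_kind(kinds: set[str]) -> str:
    """Decide how a set of dtype kinds should be combined.

    Hypothetical, simplified stand-in for the branch added in the diff.
    Returns "o" (object) when bool is mixed with any other kind, and the
    placeholder "promote" when ordinary promotion would apply instead.
    """
    if "b" in kinds and len(kinds) > 1:
        # Mixed bool + something else: fall back to object so that
        # True/False are kept as bools rather than becoming 1/0.
        return "o"
    # Everything else is left to the usual common-type logic,
    # which this sketch does not model.
    return "promote"


print(pick_target_kind({"b", "i"}))  # bool mixed with int
print(pick_target_kind({"b"}))       # bool only
print(pick_target_kind({"i", "f"}))  # int mixed with float
```

Only the mixed case changes; an all-bool input still goes through the regular path.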

pandas/core/dtypes/dtypes.py

Lines changed: 1 addition & 1 deletion
@@ -176,7 +176,7 @@ class CategoricalDtype(PandasExtensionDtype, ExtensionDtype):
         The categories are stored in an Index,
         and if an index is provided the dtype of that index will be used.
     ordered : bool or None, default False
-        Whether or not this categorical is treated as a ordered categorical.
+        Whether or not this categorical is treated as an ordered categorical.
         None can be used to maintain the ordered value of existing categoricals when
         used in operations that combine categoricals, e.g. astype, and will resolve to
         False if there is no existing ordered to maintain.

pandas/core/generic.py

Lines changed: 90 additions & 15 deletions
@@ -1054,6 +1054,10 @@ def _rename(
             if level is not None:
                 level = ax._get_level_number(level)
 
+            if isinstance(replacements, ABCSeries) and not replacements.index.is_unique:
+                # GH#58621
+                raise ValueError("Cannot rename with a Series with non-unique index.")
+
             # GH 13473
             if not callable(replacements):
                 if ax._is_multi and level is not None:
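
Note on the hunk above: a Series used as a rename mapper acts like a lookup from its index labels to its values, so a repeated index label makes the target of a rename ambiguous. The guard can be sketched without pandas as below. The helper names `check_unique_labels` and `rename_labels` are hypothetical, and the mapper is modelled as two plain lists (index labels and values) purely for illustration.

```python
from collections import Counter


def check_unique_labels(labels: list) -> None:
    """Raise ValueError if any label occurs more than once."""
    duplicates = [label for label, n in Counter(labels).items() if n > 1]
    if duplicates:
        raise ValueError(f"Cannot rename with a non-unique index: {duplicates}")


def rename_labels(axis_labels: list, mapper_index: list, mapper_values: list) -> list:
    """Rename axis labels using a mapper given as parallel index/value lists."""
    # Validate first, so an ambiguous mapper fails loudly instead of
    # silently picking one of several candidate replacements.
    check_unique_labels(mapper_index)
    lookup = dict(zip(mapper_index, mapper_values))
    # Labels that are absent from the mapper are left unchanged.
    return [lookup.get(label, label) for label in axis_labels]


print(rename_labels(["a", "b", "c"], ["a", "b"], ["x", "y"]))

try:
    rename_labels(["a", "b"], ["a", "a"], ["x", "z"])
except ValueError as err:
    print(err)
```

Checking before building the lookup matters here: `dict(zip(...))` would otherwise keep only the last value for a duplicated key and hide the problem.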
@@ -8156,7 +8160,6 @@ def asof(self, where, subset=None):
     # ----------------------------------------------------------------------
     # Action Methods
 
-    @doc(klass=_shared_doc_kwargs["klass"])
     def isna(self) -> Self:
         """
         Detect missing values.
@@ -8169,15 +8172,18 @@ def isna(self) -> Self:
 
         Returns
         -------
-        {klass}
-            Mask of bool values for each element in {klass} that
-            indicates whether an element is an NA value.
+        Series/DataFrame
+            Mask of bool values for each element in Series/DataFrame
+            that indicates whether an element is an NA value.
 
         See Also
         --------
-        {klass}.isnull : Alias of isna.
-        {klass}.notna : Boolean inverse of isna.
-        {klass}.dropna : Omit axes labels with missing values.
+        Series.isnull : Alias of isna.
+        DataFrame.isnull : Alias of isna.
+        Series.notna : Boolean inverse of isna.
+        DataFrame.notna : Boolean inverse of isna.
+        Series.dropna : Omit axes labels with missing values.
+        DataFrame.dropna : Omit axes labels with missing values.
         isna : Top-level isna.
 
         Examples
@@ -8225,11 +8231,77 @@ def isna(self) -> Self:
         """
         return isna(self).__finalize__(self, method="isna")
 
-    @doc(isna, klass=_shared_doc_kwargs["klass"])
     def isnull(self) -> Self:
+        """
+        Detect missing values.
+
+        Return a boolean same-sized object indicating if the values are NA.
+        NA values, such as None or :attr:`numpy.NaN`, gets mapped to True
+        values.
+        Everything else gets mapped to False values. Characters such as empty
+        strings ``''`` or :attr:`numpy.inf` are not considered NA values.
+
+        Returns
+        -------
+        Series/DataFrame
+            Mask of bool values for each element in Series/DataFrame
+            that indicates whether an element is an NA value.
+
+        See Also
+        --------
+        Series.isna : Alias of isnull.
+        DataFrame.isna : Alias of isnull.
+        Series.notna : Boolean inverse of isnull.
+        DataFrame.notna : Boolean inverse of isnull.
+        Series.dropna : Omit axes labels with missing values.
+        DataFrame.dropna : Omit axes labels with missing values.
+        isna : Top-level isna.
+
+        Examples
+        --------
+        Show which entries in a DataFrame are NA.
+
+        >>> df = pd.DataFrame(
+        ...     dict(
+        ...         age=[5, 6, np.nan],
+        ...         born=[
+        ...             pd.NaT,
+        ...             pd.Timestamp("1939-05-27"),
+        ...             pd.Timestamp("1940-04-25"),
+        ...         ],
+        ...         name=["Alfred", "Batman", ""],
+        ...         toy=[None, "Batmobile", "Joker"],
+        ...     )
+        ... )
+        >>> df
+           age       born    name        toy
+        0  5.0        NaT  Alfred        NaN
+        1  6.0 1939-05-27  Batman  Batmobile
+        2  NaN 1940-04-25              Joker
+
+        >>> df.isna()
+             age   born   name    toy
+        0  False   True  False   True
+        1  False  False  False  False
+        2   True  False  False  False
+
+        Show which entries in a Series are NA.
+
+        >>> ser = pd.Series([5, 6, np.nan])
+        >>> ser
+        0    5.0
+        1    6.0
+        2    NaN
+        dtype: float64
+
+        >>> ser.isna()
+        0    False
+        1    False
+        2     True
+        dtype: bool
+        """
         return isna(self).__finalize__(self, method="isnull")
 
-    @doc(klass=_shared_doc_kwargs["klass"])
     def notna(self) -> Self:
         """
         Detect existing (non-missing) values.
@@ -8242,15 +8314,18 @@ def notna(self) -> Self:
 
         Returns
         -------
-        {klass}
-            Mask of bool values for each element in {klass} that
-            indicates whether an element is not an NA value.
+        Series/DataFrame
+            Mask of bool values for each element in Series/DataFrame
+            that indicates whether an element is not an NA value.
 
         See Also
         --------
-        {klass}.notnull : Alias of notna.
-        {klass}.isna : Boolean inverse of notna.
-        {klass}.dropna : Omit axes labels with missing values.
+        Series.notnull : Alias of notna.
+        DataFrame.notnull : Alias of notna.
+        Series.isna : Boolean inverse of notna.
+        DataFrame.isna : Boolean inverse of notna.
+        Series.dropna : Omit axes labels with missing values.
+        DataFrame.dropna : Omit axes labels with missing values.
         notna : Top-level notna.
 
         Examples
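
Note on the generic.py hunks above: they drop the `@doc(...)` decorators and replace each `{klass}` placeholder with literal text, which is why `isnull` gains a full docstring of its own. The difference between the two styles can be sketched as follows. `doc_template` is a hypothetical, much-simplified stand-in written for this note and is not the pandas decorator; the function names are likewise illustrative.

```python
def doc_template(**params):
    """Fill ``{name}`` placeholders in a function's docstring.

    Hypothetical helper: it only calls ``str.format`` on ``__doc__``.
    """
    def decorator(func):
        if func.__doc__:
            func.__doc__ = func.__doc__.format(**params)
        return func
    return decorator


@doc_template(klass="DataFrame")
def isna_templated(obj):
    """
    Returns
    -------
    {klass}
        Mask of bool values for each element in {klass}.
    """


def isna_explicit(obj):
    """
    Returns
    -------
    Series/DataFrame
        Mask of bool values for each element in Series/DataFrame.
    """


# The templated docstring is only complete after the decorator runs;
# the explicit one reads the same in the source and at runtime.
print(isna_templated.__doc__)
print(isna_explicit.__doc__)
```

The explicit form costs some duplication, as the size of the `isnull` hunk shows, but the text in the source file is exactly what a reader or a documentation tool sees.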

pandas/core/indexes/range.py

Lines changed: 4 additions & 4 deletions
@@ -88,13 +88,13 @@ class RangeIndex(Index):
 
     Parameters
     ----------
-    start : int (default: 0), range, or other RangeIndex instance
+    start : int, range, or other RangeIndex instance, default None
         If int and "stop" is not given, interpreted as "stop" instead.
-    stop : int (default: 0)
+    stop : int, default None
         The end value of the range (exclusive).
-    step : int (default: 1)
+    step : int, default None
         The step size of the range.
-    dtype : np.int64
+    dtype : np.int64, default None
         Unused, accepted for homogeneity with other index types.
     copy : bool, default False
         Unused, accepted for homogeneity with other index types.
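
Note on the hunk above: the docstring now lists `None` as the default for `start`, `stop`, and `step`, while the removed lines gave 0, 0, and 1. A plain-Python sketch of how `None` defaults can resolve to those values, including the documented rule that a lone int `start` is interpreted as `stop`, is shown below. `resolve_range_args` is a hypothetical helper built on the built-in `range`; it is not `RangeIndex` itself and does not model how pandas handles a call with no arguments at all.

```python
def resolve_range_args(start=None, stop=None, step=None) -> range:
    """Turn None-defaulted arguments into a concrete built-in range.

    Hypothetical illustration of the documented parameter handling.
    """
    if isinstance(start, range):
        # A range passed as "start" already carries all three values.
        return start
    if start is not None and stop is None:
        # A lone int is interpreted as "stop", mirroring range(n).
        start, stop = 0, start
    return range(
        0 if start is None else start,
        0 if stop is None else stop,
        1 if step is None else step,
    )


print(resolve_range_args(3))
print(resolve_range_args(1, 5, 2))
print(resolve_range_args(range(2, 4)))
```

Using `None` as the signature default lets the function tell "argument omitted" apart from an explicit 0, which is what makes the lone-int rule possible.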
