Commit 2d73d62

BUG: Take method of NumpyExtensionArray now returns another extension array with the correct dtype. (#62502)
1 parent ed5fd66 commit 2d73d62

3 files changed: 59 additions, 0 deletions

doc/source/whatsnew/v3.0.0.rst

Lines changed: 1 addition & 0 deletions
@@ -1299,6 +1299,7 @@ ExtensionArray
 - Bug in :class:`Categorical` when constructing with an :class:`Index` with :class:`ArrowDtype` (:issue:`60563`)
 - Bug in :meth:`.arrays.ArrowExtensionArray.__setitem__` which caused wrong behavior when using an integer array with repeated values as a key (:issue:`58530`)
 - Bug in :meth:`ArrowExtensionArray.factorize` where NA values were dropped when input was dictionary-encoded even when dropna was set to False(:issue:`60567`)
+- Bug in :meth:`NDArrayBackedExtensionArray.take` which produced arrays whose dtypes didn't match their underlying data, when called with integer arrays (:issue:`62448`)
 - Bug in :meth:`api.types.is_datetime64_any_dtype` where a custom :class:`ExtensionDtype` would return ``False`` for array-likes (:issue:`57055`)
 - Bug in comparison between object with :class:`ArrowDtype` and incompatible-dtyped (e.g. string vs bool) incorrectly raising instead of returning all-``False`` (for ``==``) or all-``True`` (for ``!=``) (:issue:`59505`)
 - Bug in constructing pandas data structures when passing into ``dtype`` a string of the type followed by ``[pyarrow]`` while PyArrow is not installed would raise ``NameError`` rather than ``ImportError`` (:issue:`57928`)

pandas/core/arrays/numpy_.py

Lines changed: 22 additions & 0 deletions
@@ -48,6 +48,7 @@
         InterpolateOptions,
         NpDtype,
         Scalar,
+        TakeIndexer,
         npt,
     )
 
@@ -365,6 +366,27 @@ def interpolate(
             return self
         return type(self)._simple_new(out_data, dtype=self.dtype)
 
+    def take(
+        self,
+        indices: TakeIndexer,
+        *,
+        allow_fill: bool = False,
+        fill_value: Any = None,
+        axis: AxisInt = 0,
+    ) -> Self:
+        """
+        Take entries from this array at each index in a list of indices,
+        producing an array containing only those entries.
+        """
+        result = super().take(
+            indices, allow_fill=allow_fill, fill_value=fill_value, axis=axis
+        )
+        # See GH#62448.
+        if self.dtype.kind in "iub":
+            return type(self)(result._ndarray, copy=False)
+
+        return result
+
     # ------------------------------------------------------------------------
     # Reductions
 

pandas/tests/arrays/numpy_/test_numpy.py

Lines changed: 36 additions & 0 deletions
@@ -348,6 +348,42 @@ def test_factorize_unsigned():
     tm.assert_extension_array_equal(res_unique, NumpyExtensionArray(exp_unique))
 
 
+@pytest.mark.parametrize(
+    "dtype",
+    [
+        np.bool_,
+        np.uint8,
+        np.uint16,
+        np.uint32,
+        np.uint64,
+        np.int8,
+        np.int16,
+        np.int32,
+        np.int64,
+    ],
+)
+def test_take_assigns_floating_point_dtype(dtype):
+    # GH#62448.
+    if dtype == np.bool_:
+        array = NumpyExtensionArray(np.array([False, True, False], dtype=dtype))
+        expected = np.dtype(object)
+    else:
+        array = NumpyExtensionArray(np.array([1, 2, 3], dtype=dtype))
+        expected = np.float64
+
+    result = array.take([-1], allow_fill=True)
+    assert result.dtype.numpy_dtype == expected
+
+    result = array.take([-1], allow_fill=True, fill_value=5.0)
+    assert result.dtype.numpy_dtype == expected
+
+
+def test_take_preserves_boolean_arrays():
+    array = NumpyExtensionArray(np.array([False, True, False], dtype=np.bool_))
+    result = array.take([-1], allow_fill=False)
+    assert result.dtype.numpy_dtype == np.bool_
+
+
 # ----------------------------------------------------------------------------
 # Output formatting
 
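
The nine dtypes in the parametrized test line up with the `self.dtype.kind in "iub"` check in the new `take` method. The short NumPy sketch below shows which kind code each tested dtype reports. `needs_rewrap` is a hypothetical stand-in for that check, and it assumes the extension dtype's `kind` mirrors the kind of the underlying NumPy dtype.

```python
import numpy as np

# The dtypes listed in the parametrized test of this commit.
TESTED_DTYPES = [
    np.bool_,
    np.uint8,
    np.uint16,
    np.uint32,
    np.uint64,
    np.int8,
    np.int16,
    np.int32,
    np.int64,
]


def needs_rewrap(dtype) -> bool:
    """Hypothetical mirror of the `kind in "iub"` check from the diff."""
    # "i" = signed integer, "u" = unsigned integer, "b" = boolean
    return np.dtype(dtype).kind in "iub"


# Map each dtype name to its one-character NumPy kind code.
kinds = {np.dtype(d).name: np.dtype(d).kind for d in TESTED_DTYPES}
print(kinds)
print(needs_rewrap(np.float64))
```

Under this reading, floating-point data (kind `f`) never takes the re-wrapping branch and goes straight to `return result`.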
