
Commit 56ddf75

Merge branch 'main' into warn-52593

2 parents 4c1a545 + 4075fea
10 files changed: +115 −38 lines

README.md

Lines changed: 1 addition & 1 deletion

@@ -179,7 +179,7 @@ If you are simply looking to start working with the pandas codebase, navigate to

 You can also triage issues which may include reproducing bug reports, or asking for vital information such as version numbers or reproduction instructions. If you would like to start triaging issues, one easy way to get started is to [subscribe to pandas on CodeTriage](https://www.codetriage.com/pandas-dev/pandas).

-Or maybe through using pandas you have an idea of your own or are looking for something in the documentation and thinking ‘this can be improved’...you can do something about it!
+Or maybe through using pandas you have an idea of your own or are looking for something in the documentation and thinking ‘this can be improved’... you can do something about it!

 Feel free to ask questions on the [mailing list](https://groups.google.com/forum/?fromgroups#!forum/pydata) or on [Slack](https://pandas.pydata.org/docs/dev/development/community.html?highlight=slack#community-slack).

doc/source/user_guide/groupby.rst

Lines changed: 1 addition & 1 deletion

@@ -137,7 +137,7 @@ We could naturally group by either the ``A`` or ``B`` columns, or both:

 ``df.groupby('A')`` is just syntactic sugar for ``df.groupby(df['A'])``.

-The above GroupBy will split the DataFrame on its index (rows). To split by columns, first do
+DataFrame groupby always operates along axis 0 (rows). To split by columns, first do
 a transpose:

 .. ipython::
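
The reworded sentence still points readers to the transpose idiom the guide goes on to demonstrate; a minimal sketch of that idiom (the frame and grouping key below are illustrative, not taken from the guide):

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}, index=["x", "y", "z"])

# groupby splits along the rows (axis 0), so to group *columns* we transpose,
# group the transposed frame's rows, then transpose the result back.
by_first_letter = df.T.groupby(lambda col: col[0].lower()).sum().T
print(by_first_letter)
```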

doc/source/whatsnew/v3.0.0.rst

Lines changed: 2 additions & 0 deletions

@@ -961,6 +961,7 @@ Categorical
 ^^^^^^^^^^^
 - Bug in :class:`Categorical` where constructing from a pandas :class:`Series` or :class:`Index` with ``dtype='object'`` did not preserve the categories' dtype as ``object``; now the ``categories.dtype`` is preserved as ``object`` for these cases, while numpy arrays and Python sequences with ``dtype='object'`` continue to infer the most specific dtype (for example, ``str`` if all elements are strings) (:issue:`61778`)
 - Bug in :func:`Series.apply` where ``nan`` was ignored for :class:`CategoricalDtype` (:issue:`59938`)
+- Bug in :func:`bdate_range` raising ``ValueError`` with frequency ``freq="cbh"`` (:issue:`62849`)
 - Bug in :func:`testing.assert_index_equal` raising ``TypeError`` instead of ``AssertionError`` for incomparable ``CategoricalIndex`` when ``check_categorical=True`` and ``exact=False`` (:issue:`61935`)
 - Bug in :meth:`Categorical.astype` where ``copy=False`` would still trigger a copy of the codes (:issue:`62000`)
 - Bug in :meth:`DataFrame.pivot` and :meth:`DataFrame.set_index` raising an ``ArrowNotImplementedError`` for columns with pyarrow dictionary dtype (:issue:`53051`)

@@ -983,6 +984,7 @@ Datetimelike
 - Bug in :meth:`DataFrame.min` and :meth:`DataFrame.max` casting ``datetime64`` and ``timedelta64`` columns to ``float64`` and losing precision (:issue:`60850`)
 - Bug in :meth:`DataFrame.agg` with a DataFrame with missing values resulting in an ``IndexError`` (:issue:`58810`)
 - Bug in :meth:`DateOffset.rollback` (and subclass methods) with ``normalize=True`` rolling back one offset too long (:issue:`32616`)
+- Bug in :meth:`DatetimeIndex.asof` with a string key giving incorrect results (:issue:`50946`)
 - Bug in :meth:`DatetimeIndex.is_year_start` and :meth:`DatetimeIndex.is_quarter_start` not raising on custom business day frequencies bigger than "1C" (:issue:`58664`)
 - Bug in :meth:`DatetimeIndex.is_year_start` and :meth:`DatetimeIndex.is_quarter_start` returning ``False`` on double-digit frequencies (:issue:`58523`)
 - Bug in :meth:`DatetimeIndex.union` and :meth:`DatetimeIndex.intersection` when ``unit`` was non-nanosecond (:issue:`59036`)

pandas/core/frame.py

Lines changed: 1 addition & 1 deletion

@@ -9430,7 +9430,7 @@ def groupby(
             index. If a dict or Series is passed, the Series or dict VALUES
             will be used to determine the groups (the Series' values are first
             aligned; see ``.align()`` method). If a list or ndarray of length
-            equal to the selected axis is passed (see the `groupby user guide
+            equal to the number of rows is passed (see the `groupby user guide
             <https://pandas.pydata.org/pandas-docs/stable/user_guide/groupby.html#splitting-an-object-into-groups>`_),
             the values are used as-is to determine the groups. A label or list
             of labels may be passed to group by the columns in ``self``.
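
The reworded docstring line covers grouping by an array whose length matches the number of rows; a small illustration of that case (the data is made up for the example):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"value": [10, 20, 30, 40]})

# An ndarray (or list) with one entry per row is used as-is as the group labels.
keys = np.array(["a", "b", "a", "b"])
print(df.groupby(keys).sum())
#    value
# a     40
# b     60
```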

pandas/core/indexes/base.py

Lines changed: 1 addition & 1 deletion

@@ -5675,7 +5675,7 @@ def asof(self, label):
                 return self._na_value
         else:
             if isinstance(loc, slice):
-                loc = loc.indices(len(self))[-1]
+                return self[loc][-1]

         return self[loc]
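
The touched branch handles the case where ``get_loc`` returns a slice rather than a single position. A small check mirroring the new ``test_asof_datetime_string`` test further down in this commit (values come from that test; the assertions reflect the fixed behaviour):

```python
import pandas as pd

dti = pd.date_range("2021-08-05", "2021-08-10", freq="1D")
assert dti.asof("2021-08-09") == pd.Timestamp("2021-08-09")

# Mixing in a non-midnight timestamp can make the string key resolve to a
# slice; this path previously returned an incorrect value (GH 50946).
dti2 = pd.DatetimeIndex(list(dti) + ["2021-08-11 00:00:01"])
assert dti2.asof("2021-08-09") == pd.Timestamp("2021-08-09")
```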

pandas/core/indexes/datetimes.py

Lines changed: 4 additions & 2 deletions

@@ -1133,12 +1133,14 @@ def bdate_range(
         msg = "freq must be specified for bdate_range; use date_range instead"
         raise TypeError(msg)

-    if isinstance(freq, str) and freq.startswith("C"):
+    if isinstance(freq, str) and freq.upper().startswith("C"):
+        msg = f"invalid custom frequency string: {freq}"
+        if freq == "CBH":
+            raise ValueError(f"{msg}, did you mean cbh?")
         try:
             weekmask = weekmask or "Mon Tue Wed Thu Fri"
             freq = prefix_mapping[freq](holidays=holidays, weekmask=weekmask)
         except (KeyError, TypeError) as err:
-            msg = f"invalid custom frequency string: {freq}"
             raise ValueError(msg) from err
     elif holidays or weekmask:
         msg = (
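
The added guard accepts the lowercase custom-business-hour alias and turns the old uppercase spelling into a pointed error. A quick sketch of both paths, using the same arguments as the new tests in this commit:

```python
import pandas as pd

# Lowercase "cbh" (CustomBusinessHour) now works with bdate_range.
idx = pd.bdate_range(
    "2009-03-13",
    "2009-03-15",
    freq="cbh",
    weekmask="Mon Wed Fri",
    holidays=["2009-03-14"],
)
print(idx)  # hourly stamps from 09:00 through 16:00 on 2009-03-13

# Uppercase "CBH" is rejected with a hint pointing at the lowercase alias.
try:
    pd.bdate_range("2009-03-13", "2009-03-15", freq="CBH",
                   weekmask="Mon Wed Fri", holidays=["2009-03-14"])
except ValueError as err:
    print(err)  # invalid custom frequency string: CBH, did you mean cbh?
```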

pandas/tests/frame/methods/test_join.py

Lines changed: 24 additions & 0 deletions

@@ -575,3 +575,27 @@ def test_frame_join_tzaware(self):
         tm.assert_index_equal(result.index, expected)
         assert result.index.tz.key == "US/Central"
+
+    def test_frame_join_categorical_index(self):
+        # GH 61675
+        cat_data = pd.Categorical(
+            [3, 4],
+            categories=pd.Series([2, 3, 4, 5], dtype="Int64"),
+            ordered=True,
+        )
+        values1 = "a b".split()
+        values2 = "foo bar".split()
+        df1 = DataFrame({"hr": cat_data, "values1": values1}).set_index("hr")
+        df2 = DataFrame({"hr": cat_data, "values2": values2}).set_index("hr")
+        df1.columns = pd.CategoricalIndex([4], dtype=cat_data.dtype, name="other_hr")
+        df2.columns = pd.CategoricalIndex([3], dtype=cat_data.dtype, name="other_hr")
+
+        df_joined = df1.join(df2)
+        expected = DataFrame(
+            {"hr": cat_data, "values1": values1, "values2": values2}
+        ).set_index("hr")
+        expected.columns = pd.CategoricalIndex(
+            [4, 3], dtype=cat_data.dtype, name="other_hr"
+        )
+
+        tm.assert_frame_equal(df_joined, expected)

pandas/tests/indexes/datetimes/methods/test_asof.py

Lines changed: 16 additions & 0 deletions

@@ -1,6 +1,7 @@
 from datetime import timedelta

 from pandas import (
+    DatetimeIndex,
     Index,
     Timestamp,
     date_range,

@@ -28,3 +29,18 @@ def test_asof(self):

         dt = index[0].to_pydatetime()
         assert isinstance(index.asof(dt), Timestamp)
+
+    def test_asof_datetime_string(self):
+        # GH#50946
+
+        dti = date_range("2021-08-05", "2021-08-10", freq="1D")
+
+        key = "2021-08-09"
+        res = dti.asof(key)
+        exp = dti[4]
+        assert res == exp
+
+        # adding a non-midnight time caused a bug
+        dti2 = DatetimeIndex(list(dti) + ["2021-08-11 00:00:01"])
+        res = dti2.asof(key)
+        assert res == exp

pandas/tests/indexes/datetimes/test_date_range.py

Lines changed: 34 additions & 1 deletion

@@ -1216,7 +1216,7 @@ def test_cdaterange_holidays_weekmask_requires_freqstr(self):
         )

     @pytest.mark.parametrize(
-        "freq", [freq for freq in prefix_mapping if freq.startswith("C")]
+        "freq", [freq for freq in prefix_mapping if freq.upper().startswith("C")]
     )
     def test_all_custom_freq(self, freq):
         # should not raise

@@ -1280,6 +1280,39 @@ def test_data_range_custombusinessday_partial_time(self, unit):
         )
         tm.assert_index_equal(result, expected)

+    def test_cdaterange_cbh(self):
+        # GH#62849
+        result = bdate_range(
+            "2009-03-13",
+            "2009-03-15",
+            freq="cbh",
+            weekmask="Mon Wed Fri",
+            holidays=["2009-03-14"],
+        )
+        expected = DatetimeIndex(
+            [
+                "2009-03-13 09:00:00",
+                "2009-03-13 10:00:00",
+                "2009-03-13 11:00:00",
+                "2009-03-13 12:00:00",
+                "2009-03-13 13:00:00",
+                "2009-03-13 14:00:00",
+                "2009-03-13 15:00:00",
+                "2009-03-13 16:00:00",
+            ],
+            dtype="datetime64[ns]",
+            freq="cbh",
+        )
+        tm.assert_index_equal(result, expected)
+
+    def test_cdaterange_deprecated_error_CBH(self):
+        # GH#62849
+        msg = "invalid custom frequency string: CBH, did you mean cbh?"
+        with pytest.raises(ValueError, match=msg):
+            bdate_range(
+                START, END, freq="CBH", weekmask="Mon Wed Fri", holidays=["2009-03-14"]
+            )
+

 class TestDateRangeNonNano:
     def test_date_range_reso_validation(self):

pandas/tests/io/parser/common/test_file_buffer_url.py

Lines changed: 31 additions & 31 deletions

@@ -97,25 +97,25 @@ def test_nonexistent_path(all_parsers):

 @pytest.mark.skipif(WASM, reason="limited file system access on WASM")
 @td.skip_if_windows  # os.chmod does not work in windows
-def test_no_permission(all_parsers):
+def test_no_permission(all_parsers, temp_file):
     # GH 23784
     parser = all_parsers

     msg = r"\[Errno 13\]"
-    with tm.ensure_clean() as path:
-        os.chmod(path, 0)  # make file unreadable
-
-        # verify that this process cannot open the file (not running as sudo)
-        try:
-            with open(path, encoding="utf-8"):
-                pass
-            pytest.skip("Running as sudo.")
-        except PermissionError:
-            pass
-
-        with pytest.raises(PermissionError, match=msg) as e:
-            parser.read_csv(path)
-        assert path == e.value.filename
+    path = temp_file
+    os.chmod(path, 0)  # make file unreadable
+
+    # verify that this process cannot open the file (not running as sudo)
+    try:
+        with open(path, encoding="utf-8"):
+            pass
+        pytest.skip("Running as sudo.")
+    except PermissionError:
+        pass
+
+    with pytest.raises(PermissionError, match=msg) as e:
+        parser.read_csv(path)
+    assert str(path.resolve()) == e.value.filename


@@ -269,19 +269,19 @@ def test_internal_eof_byte(all_parsers):
     tm.assert_frame_equal(result, expected)


-def test_internal_eof_byte_to_file(all_parsers):
+def test_internal_eof_byte_to_file(all_parsers, temp_file):
     # see gh-16559
     parser = all_parsers
     data = b'c1,c2\r\n"test \x1a test", test\r\n'
     expected = DataFrame([["test \x1a test", " test"]], columns=["c1", "c2"])
     path = f"__{uuid.uuid4()}__.csv"

-    with tm.ensure_clean(path) as path:
-        with open(path, "wb") as f:
-            f.write(data)
-
-        result = parser.read_csv(path)
-        tm.assert_frame_equal(result, expected)
+    path2 = temp_file.parent / path
+    with open(path2, "wb") as f:
+        f.write(data)
+
+    result = parser.read_csv(path2)
+    tm.assert_frame_equal(result, expected)


@@ -372,7 +372,7 @@ def test_read_csv_file_handle(all_parsers, io_class, encoding):
     assert not handle.closed


-def test_memory_map_compression(all_parsers, compression):
+def test_memory_map_compression(all_parsers, compression, temp_file):
     """
     Support memory map for compressed files.

@@ -381,16 +381,16 @@ def test_memory_map_compression(all_parsers, compression):
     parser = all_parsers
     expected = DataFrame({"a": [1], "b": [2]})

-    with tm.ensure_clean() as path:
-        expected.to_csv(path, index=False, compression=compression)
-
-        if parser.engine == "pyarrow":
-            msg = "The 'memory_map' option is not supported with the 'pyarrow' engine"
-            with pytest.raises(ValueError, match=msg):
-                parser.read_csv(path, memory_map=True, compression=compression)
-            return
-
-        result = parser.read_csv(path, memory_map=True, compression=compression)
+    path = temp_file
+    expected.to_csv(path, index=False, compression=compression)
+
+    if parser.engine == "pyarrow":
+        msg = "The 'memory_map' option is not supported with the 'pyarrow' engine"
+        with pytest.raises(ValueError, match=msg):
+            parser.read_csv(path, memory_map=True, compression=compression)
+        return
+
+    result = parser.read_csv(path, memory_map=True, compression=compression)

     tm.assert_frame_equal(
         result,

@@ -442,12 +442,12 @@ def test_context_manageri_user_provided(all_parsers, datapath):


 @skip_pyarrow  # ParserError: Empty CSV file
-def test_file_descriptor_leak(all_parsers):
+def test_file_descriptor_leak(all_parsers, temp_file):
     # GH 31488
     parser = all_parsers
-    with tm.ensure_clean() as path:
-        with pytest.raises(EmptyDataError, match="No columns to parse from file"):
-            parser.read_csv(path)
+    path = temp_file
+    with pytest.raises(EmptyDataError, match="No columns to parse from file"):
+        parser.read_csv(path)


 def test_memory_map(all_parsers, csv_dir_path):
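
The recurring change in this file swaps the ``tm.ensure_clean()`` context manager for a ``temp_file`` pytest fixture and un-indents the test bodies accordingly. The fixture itself is not part of this diff; a hypothetical sketch of a ``tmp_path``-backed fixture with the shape these tests assume (an already-created, empty ``pathlib.Path``):

```python
import uuid

import pytest


@pytest.fixture
def temp_file(tmp_path):
    # Hypothetical: a unique, already-created file in pytest's per-test
    # temporary directory; pytest removes tmp_path afterwards, so tests no
    # longer need an explicit cleanup context manager.
    path = tmp_path / f"{uuid.uuid4()}.csv"
    path.touch()
    return path
```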
