Commit 3125a1d

TST: Add regression test for apply mutation
GH-40673: Adds a test case to prevent regression of a bug where the internal object reused by apply() would corrupt externally stored DataFrames created with .copy(). This test verifies that store[0] and store[1] correctly contain independent copies of their respective groups.
1 parent fa5b90a commit 3125a1d

1 file changed: +29 -0 lines changed

pandas/tests/groupby/test_apply.py

Lines changed: 29 additions & 0 deletions
@@ -1516,3 +1516,32 @@ def f(x):
     ).set_index(["cat1", "cat2"])["rank"]
     result = df.groupby("cat1").apply(f)
     tm.assert_series_equal(result, expected)
+
+
+def test_groupby_apply_store_copy():
+    # GH40673
+    rng = np.random.default_rng(seed=42)
+
+    df = DataFrame(
+        {
+            "A": rng.normal(10, 12, size=(4,)),
+            "B": [1, 2, 1, 2],
+        }
+    )
+
+    # Empty dict to hold the chunks
+    store = {}
+
+    def addstore(x):
+        store[len(store)] = x.copy()
+
+    df.groupby("B").apply(addstore)
+
+    # Output boolean mask
+    out_mask = {0: [True, False, True, False], 1: [False, True, False, True]}
+
+    # The expected output in store dict
+    expected_out = {0: df[out_mask[0]], 1: df[out_mask[1]]}
+
+    tm.assert_frame_equal(store[0], expected_out[0].drop("B", axis=1))
+    tm.assert_frame_equal(store[1], expected_out[1].drop("B", axis=1))
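
For readers who want to try the same check outside the pandas test suite, the following is a minimal standalone sketch, not part of this commit. It makes three changes of its own: it selects `[["A"]]` explicitly, so the result does not depend on whether a given pandas version passes the grouping column to the applied function; it keys the store by each group's first index label instead of by call order; and it uses the public `pd.testing.assert_frame_equal` in place of the internal `tm` alias.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=42)
df = pd.DataFrame(
    {
        "A": rng.normal(10, 12, size=4),
        "B": [1, 2, 1, 2],
    }
)

# Holds one independent copy per group
store = {}


def addstore(group):
    # Assumption of this sketch: key by the group's first index label
    # so the result does not depend on how often apply() calls us.
    store[int(group.index[0])] = group.copy()


# Selecting ["A"] explicitly keeps the grouping column out of each chunk.
df.groupby("B")[["A"]].apply(addstore)

# Rows 0 and 2 belong to B == 1; rows 1 and 3 belong to B == 2.
expected = {0: df.loc[[0, 2], ["A"]], 1: df.loc[[1, 3], ["A"]]}

for key, frame in expected.items():
    pd.testing.assert_frame_equal(store[key], frame)
```

If the stored copies were corrupted by later groups, as described in GH-40673, at least one of the two comparisons would be expected to raise an `AssertionError`.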
