Commit efe1a5c

BUG: Handle non-dict items in json_normalize with max_level (#62848)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent 241f07f commit efe1a5c

3 files changed: 19 additions, 0 deletions

doc/source/whatsnew/v3.0.0.rst

Lines changed: 1 addition & 0 deletions
@@ -1105,6 +1105,7 @@ I/O
 - Bug in :class:`DataFrame` and :class:`Series` ``repr`` of :py:class:`collections.abc.Mapping` elements. (:issue:`57915`)
 - Fix bug in ``on_bad_lines`` callable when returning too many fields: now emits
   ``ParserWarning`` and truncates extra fields regardless of ``index_col`` (:issue:`61837`)
+- Bug in :func:`pandas.json_normalize` inconsistently handling non-dict items in ``data`` when ``max_level`` was set. The function will now raise a ``TypeError`` if ``data`` is a list containing non-dict items (:issue:`62829`)
 - Bug in :meth:`.DataFrame.to_json` when ``"index"`` was a value in the :attr:`DataFrame.column` and :attr:`Index.name` was ``None``. Now, this will fail with a ``ValueError`` (:issue:`58925`)
 - Bug in :meth:`.io.common.is_fsspec_url` not recognizing chained fsspec URLs (:issue:`48978`)
 - Bug in :meth:`DataFrame._repr_html_` which ignored the ``"display.float_format"`` option (:issue:`59876`)
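
Because the new whatsnew entry means a list containing placeholders such as NaN is now rejected, callers with such input may need to filter it first. The snippet below is a minimal sketch of one way to do that. `drop_non_dict_items` is a hypothetical helper written for this illustration and is not part of pandas. Whether silently dropping entries is appropriate depends on the caller's data.

```python
import math
from typing import Any


def drop_non_dict_items(items: list[Any]) -> list[dict]:
    """Hypothetical caller-side filter; not part of pandas."""
    # Keep only the dict records; NaN, None and other placeholders are dropped.
    return [item for item in items if isinstance(item, dict)]


raw = [math.nan, {"id": 12}, {"id": 13}]
records = drop_non_dict_items(raw)
# `records` now holds only dicts and could be passed on to
# pandas.json_normalize(records, max_level=0).
```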

pandas/io/json/_normalize.py

Lines changed: 7 additions & 0 deletions
@@ -501,6 +501,13 @@ def _pull_records(js: dict[str, Any], spec: list | str) -> list:
         # GH35923 Fix pd.json_normalize to not skip the first element of a
         # generator input
         data = list(data)
+        for item in data:
+            if not isinstance(item, dict):
+                msg = (
+                    "All items in data must be of type dict, "
+                    f"found {type(item).__name__}"
+                )
+                raise TypeError(msg)
     else:
         raise NotImplementedError
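
The check added in this hunk can be tried in isolation with a pandas-free sketch. `ensure_dict_records` is a hypothetical name chosen for this illustration. The surrounding branch structure (wrapping a single dict, excluding strings from the iterable case) is an assumption about the enclosing function, since the hunk shows only part of it. Only the loop and the error message are taken from the diff.

```python
from collections.abc import Iterable
from typing import Any


def ensure_dict_records(data: Any) -> list[dict]:
    """Hypothetical stand-in for the validation step shown in the hunk."""
    if isinstance(data, dict):
        # Assumed branch: a single record is wrapped in a list.
        return [data]
    if isinstance(data, Iterable) and not isinstance(data, str):
        # Materialize first so a generator's first element is not skipped.
        records = list(data)
        for item in records:
            if not isinstance(item, dict):
                msg = (
                    "All items in data must be of type dict, "
                    f"found {type(item).__name__}"
                )
                raise TypeError(msg)
        return records
    raise NotImplementedError
```

The loop fails on the first non-dict item it meets, so the reported type name is that of the earliest offending element, which is `float` for a leading NaN.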

pandas/tests/io/json/test_normalize.py

Lines changed: 11 additions & 0 deletions
@@ -511,6 +511,17 @@ def test_max_level_with_records_path(self, max_level, expected):
         expected_df = DataFrame(data=expected, columns=result.columns.values)
         tm.assert_equal(expected_df, result)

+    def test_json_normalize_non_dict_items(self):
+        # gh-62829
+        data_list = [np.nan, {"id": 12}, {"id": 13}]
+        msg = "All items in data must be of type dict, found float"
+
+        with pytest.raises(TypeError, match=msg):
+            json_normalize(data_list, max_level=0)
+
+        with pytest.raises(TypeError, match=msg):
+            json_normalize(data_list)
+
     def test_nested_flattening_consistent(self):
         # see gh-21537
         df1 = json_normalize([{"A": {"B": 1}}])
