fix(anthropic): handle partial JSON chunks in streaming responses #17493
+214
−62
Title
fix(anthropic): handle partial JSON chunks in streaming responses
Relevant issues
Fixes #17473
Pre-Submission checklist
- `tests/litellm/` directory: adding at least 1 test is a hard requirement
- `make test-unit`
Type
🐛 Bug Fix
Changes
Anthropic streaming fails with `JSONDecodeError` when network fragmentation causes SSE data to arrive in partial chunks.
The `ModelResponseIterator` in `handler.py` tries to parse JSON immediately with `json.loads()` without handling the case where a chunk arrives incomplete due to TCP packet fragmentation.
Solution
Added JSON accumulation logic, following the same pattern already used in the Gemini handler (`handle_accumulated_json_chunk`):
- `accumulated_json` buffer and `chunk_type` added to `ModelResponseIterator.__init__`
- `_handle_accumulated_json_chunk()` to accumulate partial JSON until it is valid
- `_parse_sse_data()` to handle both complete and partial chunks
- `__next__` and `__anext__` changed to use the accumulation logic in a loop
How it works
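When streaming, Anthropic sends SSE events like:
data: {"type":"content_block_delta","text":"Hello"}
The network can fragment TCP packets, so a single event can arrive split across several chunks.
Before: `json.loads()` runs on each fragment and raises `JSONDecodeError`.
After (fix): fragments are buffered until they form valid JSON, then parsed as one event.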
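A minimal sketch of the accumulation idea, for illustration only. It is not the code from this PR: the class `AccumulatingSSEIterator` and its `_try_parse` helper are hypothetical names, and the sketch assumes every payload is a JSON object and that fragments arrive in order.

```python
import json
from typing import Iterable, Iterator, Optional


class AccumulatingSSEIterator:
    """Hypothetical stand-in for an iterator that buffers partial JSON."""

    def __init__(self, chunks: Iterable[str]) -> None:
        self._chunks = iter(chunks)
        self._accumulated_json = ""  # fragments of the current, incomplete payload

    def _try_parse(self, fragment: str) -> Optional[dict]:
        # Strip the SSE "data:" prefix only when a new payload starts.
        if not self._accumulated_json and fragment.startswith("data:"):
            fragment = fragment[len("data:"):].lstrip()
        self._accumulated_json += fragment
        try:
            parsed = json.loads(self._accumulated_json)
        except json.JSONDecodeError:
            return None  # still incomplete: keep the buffer and wait for more
        self._accumulated_json = ""  # complete payload: reset for the next event
        return parsed

    def __iter__(self) -> Iterator[dict]:
        return self

    def __next__(self) -> dict:
        # Loop until a complete payload is available. StopIteration from the
        # upstream iterator propagates once the input is exhausted.
        while True:
            parsed = self._try_parse(next(self._chunks))
            if parsed is not None:
                return parsed


if __name__ == "__main__":
    fragments = ['data: {"type":"content_block_del', 'ta","text":"Hello"}']
    for event in AccumulatingSSEIterator(fragments):
        print(event)
```

The loop in `__next__` is what lets one call consume several fragments before returning a single parsed event; an async variant would do the same inside `__anext__`.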
Tests added
- `test_partial_json_chunk_accumulation` - verifies partial chunks accumulate
- `test_complete_json_chunk_no_accumulation` - verifies complete chunks parse immediately
- `test_multiple_partial_chunks_accumulation` - verifies 3+ parts accumulate correctly
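The three cases can be sketched roughly as follows. This is not the test code from the PR: the test names are reused for readability, but `accumulate` is a hypothetical helper standing in for the handler's accumulation method, and the fragment boundaries are made up.

```python
import json
from typing import Optional, Tuple


def accumulate(buffer: str, fragment: str) -> Tuple[str, Optional[dict]]:
    """Hypothetical helper: append a fragment, return (new_buffer, parsed or None)."""
    candidate = buffer + fragment
    try:
        return "", json.loads(candidate)  # complete: clear the buffer
    except json.JSONDecodeError:
        return candidate, None  # incomplete: keep buffering


def test_partial_json_chunk_accumulation() -> None:
    buffer, parsed = accumulate("", '{"type":"content_block_del')
    assert parsed is None and buffer  # nothing emitted yet, fragment is buffered
    buffer, parsed = accumulate(buffer, 'ta","text":"Hello"}')
    assert parsed == {"type": "content_block_delta", "text": "Hello"}
    assert buffer == ""


def test_complete_json_chunk_no_accumulation() -> None:
    buffer, parsed = accumulate("", '{"type":"content_block_delta","text":"Hello"}')
    assert parsed == {"type": "content_block_delta", "text": "Hello"}
    assert buffer == ""  # a complete chunk leaves nothing behind


def test_multiple_partial_chunks_accumulation() -> None:
    parts = ['{"type":"content_', 'block_delta","te', 'xt":"Hello"}']
    buffer, parsed = "", None
    for part in parts:
        buffer, parsed = accumulate(buffer, part)
    assert parsed == {"type": "content_block_delta", "text": "Hello"}
    assert buffer == ""
```

Run with `pytest` or call the functions directly; each one raises `AssertionError` on failure.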