@Chesars Chesars commented Dec 4, 2025

Title

fix(anthropic): handle partial JSON chunks in streaming responses

Relevant issues

Fixes #17473

Pre-Submission checklist

  • I have added testing in the tests/litellm/ directory (adding at least 1 test is a hard requirement)
  • I have added a screenshot of my new test passing locally
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem

Type

🐛 Bug Fix

Changes

Anthropic streaming fails with JSONDecodeError when network fragmentation causes SSE data to arrive in partial chunks.

The ModelResponseIterator in handler.py tries to parse JSON immediately with json.loads() without handling the case where a chunk arrives incomplete due to TCP packet fragmentation.

Solution

Added JSON accumulation logic, following the same pattern already used in the Gemini handler (handle_accumulated_json_chunk):

  • Added accumulated_json buffer and chunk_type to ModelResponseIterator.__init__
  • Added _handle_accumulated_json_chunk() to accumulate partial JSON until valid
  • Added _parse_sse_data() to handle both complete and partial chunks
  • Modified __next__ and __anext__ to use accumulation logic with a loop
  • Added 3 unit tests for partial chunk handling
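
The accumulation pattern described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the class name `AccumulatingSSEIterator` is hypothetical, the method bodies are assumptions that only borrow the names mentioned above, and it assumes each SSE payload is a single JSON object.

```python
import json
from typing import Any, Dict, Iterable, Iterator, Optional


class AccumulatingSSEIterator:
    """Hypothetical stand-in for ModelResponseIterator (not litellm's actual code)."""

    def __init__(self, raw_chunks: Iterable[str]) -> None:
        self._chunks: Iterator[str] = iter(raw_chunks)
        self.accumulated_json = ""  # holds an incomplete JSON payload between chunks

    def _handle_accumulated_json_chunk(self, data: str) -> Optional[Dict[str, Any]]:
        # Append the new piece, then try to parse the whole buffer.
        self.accumulated_json += data
        try:
            parsed = json.loads(self.accumulated_json)
        except json.JSONDecodeError:
            return None  # still incomplete; keep buffering
        self.accumulated_json = ""  # complete payload: reset the buffer
        return parsed

    def _parse_sse_data(self, chunk: str) -> Optional[Dict[str, Any]]:
        # Strip the SSE "data:" prefix only when a new payload starts;
        # continuation fragments are appended to the buffer untouched.
        if not self.accumulated_json and chunk.startswith("data:"):
            chunk = chunk[len("data:"):].lstrip()
        return self._handle_accumulated_json_chunk(chunk)

    def __iter__(self) -> "AccumulatingSSEIterator":
        return self

    def __next__(self) -> Dict[str, Any]:
        # Keep pulling raw chunks until one completes a JSON payload.
        # StopIteration from the underlying iterator ends the stream.
        while True:
            parsed = self._parse_sse_data(next(self._chunks))
            if parsed is not None:
                return parsed


chunks = [
    'data: {"type":"content_block_delta","te',  # fragment 1
    'xt":"Hello"}',                             # fragment 2
    'data: {"type":"message_stop"}',            # complete payload
]
events = list(AccumulatingSSEIterator(chunks))
```

In this sketch a payload that is still incomplete when the stream ends is dropped silently; a real handler would probably want to log or raise in that case.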

How it works

When streaming, Anthropic sends SSE events like:

data: {"type":"content_block_delta","text":"Hello"}

The network can fragment TCP packets, so the client may receive:

data: {"type":"content_block_delta","te    # Chunk 1
xt":"Hello"}                                # Chunk 2

Before:

# Chunk 1 arrives → json.loads('{"type":"content_block_delta","te') → 💥 JSONDecodeError

After (fix):

# Chunk 1 arrives → json.loads fails → save to buffer
# Chunk 2 arrives → buffer + chunk 2 → json.loads('{"type":"content_block_delta","text":"Hello"}') → ✅
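
The before/after behavior can be reproduced without any network using only the standard library. The fragments below are the illustrative ones from this description, and `try_parse` is a hypothetical helper introduced just for this demonstration:

```python
import json
from typing import Any, Optional

fragment_1 = '{"type":"content_block_delta","te'  # chunk 1, cut mid-key
fragment_2 = 'xt":"Hello"}'                        # chunk 2, the remainder


def try_parse(text: str) -> Optional[Any]:
    """Return the parsed JSON value, or None if the text is not valid JSON yet."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


before = try_parse(fragment_1)               # parsing the fragment alone fails
after = try_parse(fragment_1 + fragment_2)   # parsing the joined buffer succeeds
```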

Tests added

  • test_partial_json_chunk_accumulation - verifies partial chunks accumulate
  • test_complete_json_chunk_no_accumulation - verifies complete chunks parse immediately
  • test_multiple_partial_chunks_accumulation - verifies 3+ parts accumulate correctly

Development

Successfully merging this pull request may close these issues.

[Bug]: Anthropic streaming fails with JSONDecodeError on partial chunks